Many organizations have pre-fixed formats for their reporting needs. These formats or Management Information Reports are updated at monthly and quarterly intervals at exactly the same format. However when the spreadsheets become big, analysis becomes tedious in comparing two big spreadsheets due to the sheer number of cells involved.
Using SAS , we can automate this process almost instantly.
We will use proc import to import data from the spreadsheets in such a manner that top row imported consist column headings (sas dataset variables).Note both spreadsheets are exactly in same format.
We will then use proc compare to compare these two datasets.
We can then use the integrated approach to automated reporting in SAS (See Archives- Category Analytics) to further reduce this to a simple batch process.
The relevant codes are –
%let pathfile = “C:\Documents and Settings\” ;
run;
/*CREATING LIBRARY NAME */
libname auto &pathfile;
run;
/*TO CONSERVE SPACE*/
options compress=yes;
/*TO MAKE LOG READABLE */
options macrogen symbolgen;
PROC IMPORT OUT= auto.TEST1
DATAFILE= “C:\Documents and Settings\excel1-full.xls”
DBMS=EXCEL2000 REPLACE;
SHEET=”‘Sales$'”;
/*SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
GETNAMES=YES;
/*TO TAKE VARIABLE NAMES FROM TOP ROW */
RANGE=”A4:AB2000″;
/*SPECIFYING RANGE OF CELLS IN SPREADSHEET TO BE READ */
RUN;
PROC IMPORT OUT= auto.TEST2
DATAFILE= “C:\Documents and Settings\excel2-full.xls”
DBMS=EXCEL2000 REPLACE;
SHEET=”‘Sales$'”;
/*SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
GETNAMES=YES;
/*TO TAKE VARIABLE NAMES FROM TOP ROW */
RANGE=”A4:AB2000″;
/*SPECIFYING RANGE OF CELLS IN SPREADSHEET TO BE READ */
RUN;
/* COMPARING THE TWO SPREADSHEETS */
proc compare base=auto.test1 compare=auto.test2;
var
/*SPECIFYING WHAT VARIABLES TO BE COMPARED */
Applications
Approvals
Disbursals
30dayplus
60dayplus
90dayplus
;
with Branch;
/*SPECIFYING VARIABLE FOR COMPARISON
FOR SAME BRANCH IN THIS CASE */
run;
The output will simply compare and compute the cell by cell difference.
You can then use ods to ouput this in another big spreadsheet 🙂
This is particularly relevant in telecommunications and banks, where they need to compare a lot of metrics across timely intervals.