Comparing Big SpreadSheet A to Big SpreadSheet B

Many organizations have pre-fixed formats for their reporting needs.  These formats or Management Information Reports are updated at monthly and quarterly intervals at exactly the same format. However when the spreadsheets become big, analysis becomes tedious in comparing two big spreadsheets due to the sheer number of cells involved.

Using SAS , we can automate this process almost instantly.

We will use proc import to import data from the spreadsheets in such a manner that top row imported consist column headings (sas dataset variables).Note both spreadsheets are exactly in same format.

We will then use proc compare to compare these two datasets.

We can then use the integrated approach to automated reporting in SAS (See Archives- Category Analytics) to further reduce this to a simple batch process.

The relevant codes are –

%let pathfile = “C:\Documents and Settings\” ;
run;

/*CREATING LIBRARY NAME */

libname auto &pathfile;

run;

/*TO CONSERVE SPACE*/

options compress=yes;

/*TO MAKE LOG READABLE */

options macrogen symbolgen;

PROC IMPORT OUT= auto.TEST1
DATAFILE= “C:\Documents and Settings\excel1-full.xls”
DBMS=EXCEL2000 REPLACE;
SHEET=”‘Sales$'”;

/*SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
GETNAMES=YES;

/*TO TAKE VARIABLE NAMES FROM TOP ROW */

   RANGE=”A4:AB2000″;

/*SPECIFYING RANGE OF CELLS  IN SPREADSHEET TO BE READ */

RUN;

PROC IMPORT OUT= auto.TEST2
DATAFILE= “C:\Documents and Settings\excel2-full.xls”
DBMS=EXCEL2000 REPLACE;
SHEET=”‘Sales$'”;

/*SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
GETNAMES=YES;

/*TO TAKE VARIABLE NAMES FROM TOP ROW */

   RANGE=”A4:AB2000″;

/*SPECIFYING RANGE OF CELLS  IN SPREADSHEET TO BE READ */

RUN;

/* COMPARING THE TWO SPREADSHEETS */

proc compare base=auto.test1 compare=auto.test2;
var

/*SPECIFYING WHAT VARIABLES TO BE COMPARED */
Applications

Approvals

Disbursals

30dayplus

60dayplus

90dayplus

;
with Branch;

/*SPECIFYING VARIABLE FOR COMPARISON

FOR SAME BRANCH IN THIS CASE */
 run;
The output will simply compare and compute the cell by cell difference.

You can then use ods to ouput this in another big spreadsheet 🙂

This is particularly relevant in telecommunications and banks, where they need to compare a lot of metrics across timely intervals.

Author: Ajay Ohri

http://about.me/ajayohri

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s