Model Presentation

Presenting a model is different from building one, because the end audience is non-technical and business-minded. These are some rules of thumb I use when making model presentation templates:

1) Model Lift - How good is the model vs. the current effort? This is best shown by lift curves or KS statistics, where you plot % Population (ranked by model score) on the X axis and cumulative % Responders on the Y axis. The KS statistic is the maximum separation between the cumulative distributions of goods and bads.

2) Model Robustness - What facts back up the statistical validity of the model output/equation? Is there a way to test the model without executing it fully?

3) Model Assumptions - This covers historical assumptions (which event the model is based on) and data assumptions (validation samples, missing value treatment, capping of outliers).
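The KS statistic from point 1 can be obtained directly in SAS. The step below is only a minimal sketch: the dataset work.scored and the variables score and responder are assumed names for illustration, not part of any real project.

```sas
/* Assumed input: work.scored, one row per customer            */
/*   score     = model output (hypothetical variable name)     */
/*   responder = 1 for goods, 0 for bads (hypothetical name)   */
proc npar1way data=work.scored edf;
    class responder;   /* the two groups whose separation is measured */
    var score;         /* the score whose distributions are compared  */
run;
```

The EDF option requests empirical distribution function tests, and the two-sample Kolmogorov-Smirnov D in the output is the maximum gap between the cumulative score distributions of the two groups.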

The best way to convince business audiences is to split the dataset into three random samples of 60%, 30% and 10% for model building, validation and testing.

Then rerun the model equation on another random sample, drawn with a different seed in the RANUNI function. The KS and the other statistics should come out similar.
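A 60/30/10 split with RANUNI could be sketched as follows. The input dataset work.modeldata, the output dataset names and the seed value are assumptions made for illustration.

```sas
/* Hypothetical seed; change it to draw a different random sample */
%let seed = 12345;

data work.build work.validate work.test;
    set work.modeldata;                  /* assumed input dataset       */
    u = ranuni(&seed);                   /* uniform random number (0,1) */
    if u < 0.60 then output work.build;           /* about 60 percent   */
    else if u < 0.90 then output work.validate;   /* about 30 percent   */
    else output work.test;                        /* about 10 percent   */
    drop u;
run;
```

Because each row is assigned independently, the sample sizes are approximately, not exactly, 60/30/10. Rerunning with another seed gives a fresh set of samples on which the KS can be checked again.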

Ultimately models get validated or battered in the marketplace. A 1% difference in response rates can make or lose hundreds of thousands of dollars, especially in mass marketing or credit modeling. Business perspective and buy-in are thus essential, and so is continuous feedback on model performance, because every model eventually deteriorates over time.

Jodhaa Akbar (Movie review) 2008 - Bollywood Movie

The Oscar-nominated director Ashutosh Gowariker (for Lagaan, in the Best Foreign Language Film category) casts Hrithik Roshan (one of India’s most beautiful actors) and ex-Miss World Aishwarya Rai Bachchan in this love story of the Mughal emperor Akbar the Great.

First things first: the movie is as historically true as you can expect entertainment movies to be. So much for history. Hrithik, Aishwarya and the other actors more than compensate for the historical inaccuracies. Akbar didn’t win all his battles by negotiation; as a descendant of the Mongols, he did use massacres selectively for shock, awe and deft political maneuvering.

But the songs are good, the people are lovely and the movie sets sparkle with Mughal grandeur. There is lots of Bollywood masala, and some inspiration from Troy, the Hollywood movie, which is sad actually. Why borrow the climax from the famous Eric Bana/Hector and Brad Pitt/Achilles contest?

Hrithik and Aishwarya might as well act in a Hollywood movie rather than copy one. Their chemistry makes the movie tick, and you might get to see one of the slowest sensual love songs created in new-generation Bollywood.

Watch Jodhaa Akbar: good time pass and well worth your ticket money. But forget the history of both Akbar and Troy and go with lowered expectations, and you will have a reasonably good time.

Good Night Baghdad

Good night Baghdad

We met as soul mates
On Green Zone Island
We left as inmates
From an asylum

And we were sharp
As sharp as knives
And we were so gung ho
To lay down our lives

We came in spastic
Like tameless horses
We left in plastic
As numbered corpses

And we learned fast
To travel and fight
Our armored cars were heavy
But our bellies were tight

We had no home front
We had no one
They sent us John McCain
They gave us Jessica Simpson

We dug in deep
And shot on sight
And prayed to Jesus Christ
With all of our might

We had video cameras
To shoot the landscape
We passed the hash pipe
And played our Eminem tapes

 

And it was dark
So dark in our humvee

And we held on to each other
Like brother to brother

We promised our mothers we’d write
And we would all go down together
We said we’d all go down together
Yes we would all go down together

Remember the Sunnis and Shias

Remember the imprisoned Iraqi

They left their childhood
On every acre

And who was wrong?
And who was right?
It didn’t matter in the thick of the fight

We held the day
In the palm
Of our hand
They ruled the night

And the night
Seemed to last as long as six weeks
On Green Zone Island

We held the street corners
They held the rooftops
And they were sharp
As sharp as knives

They heard the hum of our convoys
They counted the humvees
And waited for us to arrive

And we would all go down together
We said we’d all go down together
Yes we would all go down together

(with credit to Billy Joel’s “Goodnight Saigon”)

Politics in Analytics

Observers of American electoral politics, including the current Presidential campaign, would be struck by the sophisticated degree of analytics involved. This includes the following:

1) Segmentation of likely percentage response rates (vote yes = 1, vote for opponent = 0)

based on

history of voting

response to stimuli (experience vs change)

ethnicity (black, white, Latino)

income groups (< 40,000 USD, > 100,000 USD)

education (college educated)

gender (male, female)

geography (rural, urban, college town)

union affiliation

and even coffee (latte drinkers etc 🙂 )

What is striking is that most of these variables, like race and gender, cannot be used for marketing anything else, such as credit cards or financial services, because doing so would invite charges of discrimination.

What could be really interesting is if they add credit bureau variables and create logistic models (and not just segmentation). Maybe by 2016, there will be a different category of analytics called Quantitative Political analytics.

Another note: what do Ralph Nader, Chaos Theory and the Butterfly Effect have in common?

Chaos theory states that future results can vary a lot based on slight differences in initial conditions.

The Butterfly Effect uses this to say that a small event, like a butterfly fluttering in China, can cause a big event, like a typhoon in the US.

Ralph Nader, entering the race in 2000, got more than 90,000 votes in Florida, most of them arguably siphoned away from Al Gore, who lost Florida, and with it the election, by fewer than 1,000 votes.

Al Gore was against the Iraq war from the beginning, and had he been President, maybe the world would have been greener and there would have been no war in Iraq. Maybe. No offense meant to anyone.

Howard Dean screaming, Bill Clinton comparing Obama’s win to Jesse Jackson’s wins (which is analytically and quantitatively true) or Hillary crying in New Hampshire can be similar butterfly effects. No offense meant to anyone.

Comparing Big SpreadSheet A to Big SpreadSheet B

Many organizations have fixed formats for their reporting needs. These formats, or Management Information Reports, are updated at monthly and quarterly intervals in exactly the same format. However, when the spreadsheets become big, comparing two of them becomes tedious due to the sheer number of cells involved.

Using SAS, we can automate this process almost instantly.

We will use proc import to import data from the spreadsheets in such a manner that the top row imported consists of the column headings (which become the SAS dataset variables). Note that both spreadsheets are in exactly the same format.

We will then use proc compare to compare these two datasets.

We can then use the integrated approach to automated reporting in SAS (See Archives- Category Analytics) to further reduce this to a simple batch process.

The relevant code is:

%let pathfile = "C:\Documents and Settings\";

/* CREATING LIBRARY NAME */

libname auto &pathfile;

/* TO CONSERVE SPACE */

options compress=yes;

/* TO SHOW MACRO CODE AND MACRO VARIABLE VALUES IN THE LOG */

options mprint symbolgen;

PROC IMPORT OUT= auto.TEST1
DATAFILE= "C:\Documents and Settings\excel1-full.xls"
DBMS=EXCEL2000 REPLACE;

/* SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
SHEET="'Sales$'";

/* TO TAKE VARIABLE NAMES FROM TOP ROW */
GETNAMES=YES;

/* SPECIFYING RANGE OF CELLS IN SPREADSHEET TO BE READ */
RANGE="A4:AB2000";

RUN;

PROC IMPORT OUT= auto.TEST2
DATAFILE= "C:\Documents and Settings\excel2-full.xls"
DBMS=EXCEL2000 REPLACE;

/* SPECIFYING WORKSHEET FOR MULTIPLE SHEETS */
SHEET="'Sales$'";

/* TO TAKE VARIABLE NAMES FROM TOP ROW */
GETNAMES=YES;

/* SPECIFYING RANGE OF CELLS IN SPREADSHEET TO BE READ */
RANGE="A4:AB2000";

RUN;

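/* COMPARING THE TWO SPREADSHEETS */

/* AN ID STATEMENT NEEDS BOTH DATASETS SORTED BY THE ID VARIABLE */
proc sort data=auto.test1; by Branch; run;
proc sort data=auto.test2; by Branch; run;

proc compare base=auto.test1 compare=auto.test2;

/* SPECIFYING WHAT VARIABLES TO BE COMPARED */
/* SAS NAMES CANNOT START WITH A DIGIT, SO PROC IMPORT TYPICALLY */
/* PREFIXES SUCH COLUMN HEADINGS WITH AN UNDERSCORE */
var Applications Approvals Disbursals _30dayplus _60dayplus _90dayplus;

/* SPECIFYING VARIABLE FOR MATCHING ROWS, THE SAME BRANCH IN THIS CASE */
id Branch;

run;

The output will simply compare and compute the cell by cell difference.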

You can then use ODS to output this into another big spreadsheet 🙂
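One way to do this, sketched under assumptions, is to wrap the comparison in ODS HTML and give the output file an .xls extension so that Excel opens it. The report file name compare-report.xls is hypothetical, and the step assumes both datasets are already sorted by Branch.

```sas
/* Hypothetical output file; Excel can open HTML saved as .xls */
ods html file="C:\Documents and Settings\compare-report.xls";

proc compare base=auto.test1 compare=auto.test2;
    id Branch;   /* match rows by branch; both datasets must be sorted by Branch */
run;

/* Close the destination so the file is written completely */
ods html close;
```

With no VAR statement, PROC COMPARE checks every variable the two datasets have in common, which suits reports that share one fixed layout.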

This is particularly relevant in telecommunications and banking, where a lot of metrics need to be compared across time periods.