BI, BA and BS

Business intelligence is an over-used term that has had its day, and business analytics is now the differentiator that will allow customers to better forecast the future especially in this current economic climate. Business intelligence doesn’t make a difference to the top or bottom line, and is merely a productivity tool like e-mail.

Quote from Jim Davis, SAS Institute Inc.’s senior vice-president and chief marketing officer.

Pigeon-holing one element or another as backward-looking and another as forward-looking doesn’t even make much marketing sense, let alone being a tenable intellectual position to take. I think it is not unreasonable to expect more cogent commentary from the people at SAS than Mr Davis’ recent statements.

Post from Peter Thomas, Business Intelligence Guru.

Bottom line, it’s all fluff. I don’t like the term business analytics; it doesn’t tell me anything. Frankly, I think business intelligence as a term is downright laughable, too. What does that mean?

Post from Neil Raden, founder of Hired Brains.

Here are my views on this:

  • Is the distinction pure branding or semantics? Or is it a rebranding because SAS is the leader and the biggest vendor in business analytics but would not be the biggest business intelligence vendor, which would mark a tactical and aggressive shift in its strategy?
  • Also, SAS remains the largest independent private business, and the recent consolidation in this industry could be unsettling to people who want to keep it independent.
  • Ultimately, customers vote with their cheque books, whether you call it business intelligence, business analytics or business as usual.

BI, BA or BS

Modeling Visualization Macros

Here is a nice SAS Macro from Wensui’s blog at http://statcompute.spaces.live.com/blog/

It’s particularly useful for modelling folks. I saw a version of this macro some time back that also plotted the curves, but this one is quite nice too.

SAS MACRO TO CALCULATE GAINS CHART WITH KS

%macro ks(data = , score = , y = );

options nocenter mprint nodate;

data _tmp1;
  set &data;
  /* keep only scored records with a valid 0/1 outcome */
  where &score ~= . and &y in (1, 0);
  random = ranuni(1);
  keep &score &y random;
run;

proc sort data = _tmp1 sortsize = max;
  by descending &score random;
run;

data _tmp2;
  set _tmp1;
  by descending &score random;
  i + 1;
run;

proc rank data = _tmp2 out = _tmp3 groups = 10;
  var i;
run;

proc sql noprint;
create table
  _tmp4 as
select
  i + 1       as decile,
  count(*)    as cnt,
  sum(&y)     as bad_cnt,
  min(&score) as min_scr format = 8.2,
  max(&score) as max_scr format = 8.2
from
  _tmp3
group by
  i;

select
  sum(cnt) into :cnt
from
  _tmp4;

select
  sum(bad_cnt) into :bad_cnt
from
  _tmp4;    
quit;

data _tmp5;
  set _tmp4;
  retain cum_cnt cum_bcnt cum_gcnt;
  cum_cnt  + cnt;
  cum_bcnt + bad_cnt;
  cum_gcnt + (cnt - bad_cnt);
  cum_pct  = cum_cnt  / &cnt;
  cum_bpct = cum_bcnt / &bad_cnt;
  cum_gpct = cum_gcnt / (&cnt - &bad_cnt);
  /* KS = gap between cumulative bad% and cumulative good%, in points */
  ks       = (max(cum_bpct, cum_gpct) - min(cum_bpct, cum_gpct)) * 100;

  format cum_bpct percent9.2 cum_gpct percent9.2
         ks       6.2;
  
  label decile    = 'DECILE'
        cnt       = '#FREQ'
        bad_cnt   = '#BAD'
        min_scr   = 'MIN SCORE'
        max_scr   = 'MAX SCORE'
        cum_gpct  = 'CUM GOOD%'
        cum_bpct  = 'CUM BAD%'
        ks        = 'KS';
run;

title "%upcase(&score) KS";
proc print data  = _tmp5 label noobs;
  var decile cnt bad_cnt min_scr max_scr cum_bpct cum_gpct ks;
run;    
title;

proc datasets library = work nolist;
  delete _: / memtype = data;
run;
quit;

%mend ks;    

data test;
  do i = 1 to 1000;
    score = ranuni(1);
    if score * 2 + rannor(1) * 0.3 > 1.5 then y = 1;
    else y = 0;
    output;
  end;
run;

%ks(data = test, score = score, y = y);

/*
SCORE KS              
                                MIN         MAX
DECILE    #FREQ    #BAD       SCORE       SCORE     CUM BAD%    CUM GOOD%        KS
   1       100      87         0.91        1.00      34.25%        1.74%      32.51
   2       100      78         0.80        0.91      64.96%        4.69%      60.27
   3       100      49         0.69        0.80      84.25%       11.53%      72.72
   4       100      25         0.61        0.69      94.09%       21.58%      72.51
   5       100      11         0.51        0.60      98.43%       33.51%      64.91
   6       100       3         0.40        0.51      99.61%       46.51%      53.09
   7       100       1         0.32        0.40     100.00%       59.79%      40.21
   8       100       0         0.20        0.31     100.00%       73.19%      26.81
   9       100       0         0.11        0.19     100.00%       86.60%      13.40
  10       100       0         0.00        0.10     100.00%      100.00%       0.00
*/
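
For readers who want to follow the arithmetic outside SAS, here is a rough sketch of the same gains-table steps in base R. It is only an illustration: the function name ks_table and its arguments are made up for this example, the grouping rule is a simple positional split that may differ slightly from PROC RANK at group boundaries, and the simulated data only imitates the SAS test data set (the random numbers are not the same, so the figures will not match the listing above).

```r
# Hypothetical helper: decile gains table with KS, following the macro's steps.
ks_table <- function(score, y, groups = 10) {
  # Keep only records with a score and a 0/1 outcome
  keep <- !is.na(score) & y %in% c(0, 1)
  score <- score[keep]
  y <- y[keep]

  # Sort by descending score, breaking ties at random
  ord <- order(-score, runif(length(score)))
  score <- score[ord]
  y <- y[ord]

  # Cut the sorted records into roughly equal groups (1 = highest scores)
  n <- length(score)
  decile <- ceiling(seq_len(n) * groups / n)

  cnt     <- as.vector(tapply(y, decile, length))
  bad_cnt <- as.vector(tapply(y, decile, sum))
  min_scr <- as.vector(tapply(score, decile, min))
  max_scr <- as.vector(tapply(score, decile, max))

  # Cumulative share of bads and goods captured down to each group
  cum_bpct <- cumsum(bad_cnt) / sum(bad_cnt)
  cum_gpct <- cumsum(cnt - bad_cnt) / sum(cnt - bad_cnt)

  data.frame(decile = sort(unique(decile)), cnt, bad_cnt, min_scr, max_scr,
             cum_bpct, cum_gpct, ks = abs(cum_bpct - cum_gpct) * 100)
}

# Simulated data in the spirit of the SAS test data set (assumed, not identical)
set.seed(1)
score <- runif(1000)
y <- as.integer(score * 2 + rnorm(1000) * 0.3 > 1.5)
tab <- ks_table(score, y)
print(tab)
```

The KS value reported for a model is usually the largest entry in the ks column, i.e. max(tab$ks).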


Here is another example, a SAS macro for the ROC curve. This one comes from http://www2.sas.com/proceedings/sugi22/POSTERS/PAPER219.PDF

APPENDIX A
Macro
/***************************************************************/;
/* MACRO PURPOSE: CREATE AN ROC DATASET AND PLOT */;
/* */;
/* VARIABLES INTERPRETATION */;
/* */;
/* DATAIN INPUT SAS DATA SET */;
/* LOWLIM MACRO VARIABLE LOWER LIMIT FOR CUTOFF */;
/* UPLIM MACRO VARIABLE UPPER LIMIT FOR CUTOFF */;
/* NINC MACRO VARIABLE NUMBER OF INCREMENTS */;
/* I LOOP INDEX */;
/* OD OPTICAL DENSITY */;
/* CUTOFF CUTOFF FOR TEST */;
/* STATE STATE OF NATURE */;
/* TEST QUALITATIVE RESULT WITH CUTOFF */;
/* */;
/* DATE WRITTEN BY */;
/* */;
/* 09-25-96 A. STEAD */;
/***************************************************************/;
%MACRO ROC(DATAIN,LOWLIM,UPLIM,NINC=20);
OPTIONS MTRACE MPRINT;
DATA ROC;
SET &DATAIN;
LOWLIM = &LOWLIM; UPLIM = &UPLIM; NINC = &NINC;
DO I = 1 TO NINC+1;
CUTOFF = LOWLIM + (I-1)*((UPLIM-LOWLIM)/NINC);
IF OD > CUTOFF THEN TEST="R"; ELSE TEST="N";
OUTPUT;
END;
DROP I;
RUN;
PROC PRINT;
RUN;
PROC SORT; BY CUTOFF;
RUN;
PROC FREQ; BY CUTOFF;
TABLE TEST*STATE / OUT=PCTS1 OUTPCT NOPRINT;
RUN;
DATA TRUEPOS; SET PCTS1; IF STATE="P" AND TEST="R";
TP_RATE = PCT_COL; DROP PCT_COL;
RUN;
DATA FALSEPOS; SET PCTS1; IF STATE="N" AND TEST="R";
FP_RATE = PCT_COL; DROP PCT_COL;
RUN;
DATA ROC; MERGE TRUEPOS FALSEPOS; BY CUTOFF;
IF TP_RATE = . THEN TP_RATE=0.0;
IF FP_RATE = . THEN FP_RATE=0.0;
RUN;
PROC PRINT;
RUN;
PROC GPLOT DATA=ROC;
PLOT TP_RATE*FP_RATE=CUTOFF;
RUN;
%MEND;

Version 9.2 of SAS has a macro called %ROCPLOT: http://support.sas.com/kb/25/018.html

SPSS also supports ROC curves, and there is a nice document on that here:

http://www.childrensmercy.org/stats/ask/roc.asp

Here are some examples from R with the package ROCR from

http://rocr.bioinf.mpi-sb.mpg.de/

 


Using ROCR’s 3 commands to produce a simple ROC plot (predictions and labels are your own vectors of scores and true classes):
library(ROCR)
pred <- prediction(predictions, labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, col=rainbow(10))

The graphics produced by the R package are outstanding.

Citation:

Tobias Sing, Oliver Sander, Niko Beerenwinkel, Thomas Lengauer.
ROCR: visualizing classifier performance in R.
Bioinformatics 21(20):3940-3941 (2005).
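
To see what a ROC curve actually computes, independent of any package, here is a small base R sketch. It is an illustration only: the helper name roc_points is made up, the data are simulated, and this is not how ROCR itself is implemented. It counts the true positive rate and false positive rate at every cutoff, much like the SAS macro above does over a grid of cutoffs, and then estimates the area under the curve with the trapezoid rule.

```r
# Hypothetical helper: ROC points (FPR, TPR) at every distinct cutoff.
roc_points <- function(predictions, labels) {
  # Start above the highest score so the curve begins at (0, 0)
  cutoffs <- c(Inf, sort(unique(predictions), decreasing = TRUE))
  pos <- sum(labels == 1)
  neg <- sum(labels == 0)
  # A case is called positive when its score is at or above the cutoff
  tpr <- sapply(cutoffs, function(cut) sum(predictions >= cut & labels == 1) / pos)
  fpr <- sapply(cutoffs, function(cut) sum(predictions >= cut & labels == 0) / neg)
  data.frame(cutoff = cutoffs, fpr = fpr, tpr = tpr)
}

# Simulated scores: positives tend to score higher than negatives
set.seed(1)
labels <- rbinom(200, 1, 0.4)
predictions <- labels * 0.5 + runif(200)

roc <- roc_points(predictions, labels)

# Area under the curve by the trapezoid rule
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
print(auc)
```

Calling plot(roc$fpr, roc$tpr, type = "l") draws the curve; ROCR adds the colouring by cutoff and many other performance measures on top of this basic idea.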

 

Interview with Anne Milley, SAS II

Anne Milley is director of product marketing at SAS Institute. In part 2 of the interview, Anne talks about immigration in technology areas, open source networks, how she misses coding, and software as a service, especially SAS Institute’s offering. She also gives a preview of SAS’s involvement with R and mentions cloud computing.


Ajay – Labor arbitrage outsourcing versus virtual teams located globally: what is SAS Institute’s position, and your own opinion, on this? What do you feel about the recent debate on H-1B visas and job cuts? How many jobs, if any, is SAS planning to cut in 2009-2010?

Anne – SAS is a global company, with customers in more than 100 countries around the world.  We hire employees in these countries to help us better serve our global customers.  Our workforce decisions are based on our business needs.  We also employ virtual teams–the feedback and insights from our global workforce help us improve and develop new products to meet the evolving needs of our customers.  (As someone who works from her home office in Connecticut, I am a fan of virtual teaming!)  We see these approaches as complementary.

The issue of the H-1B visa is a different discussion entirely.  H-1B visas, although capped, permit US employers to bring foreign employees in “specialty occupations” into this country.   The better question, though, is what is necessitating the need for H-1B visas.  We would submit that the reason the U.S. has to look outside its borders for highly qualified technical workers is because we are not producing a sufficient number of workers with the right skill sets to meet U.S. demand.  In turn, that means that our educational system is not producing students interested or qualified to pursue the STEM (science, technology, engineering or mathematics) professions (either at a K-12 or post-secondary level), or developing the workforce improvement programs that may allow workers to pursue these “specialty occupations.”  Further, any discussion about H-1B visas (or any other type of visa) should include a more comprehensive review of our nation’s immigration policies—are they working, are they not working, how or why are they, are we able to limit illegal immigration and if not, why not, etc.

I am not aware of any planned job cuts at SAS.  In fact, I am aware of a few groups which are actively hiring.

Ajay- Which open source software has SAS Institute worked with in the past, and which does it continue to support financially as well as technologically? Any exciting product releases in 2009-2010 that you can tell us about?

Anne- Open source software provides many options and benefits.  We see many (SAS included) embracing open source for different things.  Our software runs on Linux and we use some open-source tools in development. There are different aspects of open source software in developing SAS software:

-Development with open source tools such as Eclipse, Ant, NAnt, JUnit, etc. to build, test, and package our software

-Using open source software in our products; examples include Apache/Jakarta products such as the Apache Web Server.

-Developing open source software, making changes to an open source codebase, and optionally contributing that source back to the open source project, to adapt an open source project for use in a SAS product or for internal use. Example: Eclipse.

And we plan to do more with open source in the future.  The first step of SAS integrating with R will be shown at SAS Global Forum coming up in DC later this month.  Other announcements for new offerings are also planned at this event. 

Ajay- What do you feel about adopting software as a service for any of SAS Institute’s products? Any new initiatives from SAS on the cloud computing front, especially in terms of helping customers cut down on hardware costs?

Anne- SAS Solutions OnDemand, the division which oversees the infrastructure and support of all our hosted offerings, is expanding in this rapidly growing market.  SAS Solutions OnDemand Drug Development was our first SaaS offering announced in January.  Additional news on new hosted offerings will be announced at SAS Global Forum later this month.  SAS doesn’t currently offer any external cloud computing options, but we’re actively looking at this area.

Ajay- Which software do you personally find best to write code in, and why? Do you miss writing code, and if so, why?

Anne- In my current role, I have limited opportunity to write code.  At times, I do miss the logical thought process coding forces you to adopt (to do the job as elegantly as possible).  I had the opportunity to do a long-term assignment at a major financial services company in the UK last year and did get to use some SAS and JMP, including a little JSL (JMP scripting language).  There’s nothing like real-world, noisy, messy data to make you thankful for the power of writing code!  Even though I don’t write code on a regular basis, I am happy to see continued investment in the languages SAS provides—among the most recent, the addition of an algebraic optimization modeling language in our SAS/OR module contained within the SAS language as “PROC OPTMODEL.”

I have great respect for people who invest in learning (or even getting exposure to) more than one language and who appreciate the strengths of different languages for certain tasks and applications.

Ajay- It is great to see passionate people at work on both the open source and the packaged software sides, and even better to see them collaborate once in a while. Most of our work is based on scientists who came before us (especially in math theory).

Ultimately we are all just students of science anyway.

SAS Global Forum – http://support.sas.com/events/sasglobalforum/2009/

The annual event of SAS language practitioners. The SAS language consists of data steps and proc steps for input and output, which simplifies the syntax for users.

SAS Institute – The leader in analytics software since the 1970s, it grew out of North Carolina State University and provides jobs to thousands of people. The world’s largest privately held software company, it is admired for its huge investments in research and development and criticized for the premium price of its packaged software solutions. It is a recent entrant among corporate users willing to support the R language.

Interview – Anne Milley, SAS Part 1

Anne Milley has been a part of SAS Institute’s core strategy team.

She was in the news recently in an article by the legendary Ashlee Vance on the Bits blog of The New York Times: http://bits.blogs.nytimes.com/2009/02/16/sas-warms-to-open-source-one-letter-at-a-time/

In the article,  Ms. Milley said, “I think it addresses a niche market for high-end data analysts that want free, readily available code. We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.”

To her credit, Ms. Milley addressed some of the critical comments head-on in a subsequent blog post.

This sparked my curiosity to know Anne and her perspective beyond a single-line quote, so here is an interview. This is part 1 of the interview.

Ajay- Describe your career journey, both outside and inside SAS Institute. What advice would you give to young high school students about pursuing careers in science? Do you think careers in science are as rewarding as other careers?

Anne-

Originally, I wanted to major in international business to leverage my German (which is now waning from lack of use!).  I found the marketing and management classes at the time provided little practical value and happily ended up switching to the college of social science in the economics department, where I was challenged with several quantitative courses and encouraged to always have an analytical perspective.  In school, I was exposed to BASIC, SPSS, SHAZAM, and SAS.  Once I began my thesis (bank failure prediction models and the term structure of interest rates) and started working, it was SAS that served as the best software investment, both in banking (Federal Home Loan Bank of Dallas) and in retail (7-Eleven Corp.).  After 5+ years in Dallas, my husband wanted to move back to New England and SAS happened to be opening an office at the time.  From there, I enjoyed a few years as a pre-sales technical consultant, many years in analytical product management, and most recently in product marketing.  All the while, it has been a great motivating factor to work with so many talented people focused on solving problems, revealing opportunities and doing things better—both within and outside of SAS.

For high school and college students, I urge them to invest in studying some math and science, no matter the career they’re pursuing.  Whether they are interested in banking/finance, medicine and the life sciences, engineering or other fields, courses that will help them explore and analyze data, and come up with new approaches, new solutions, new advances based on a more scientific approach will pay off.

Course work in statistics, operations research, computer science and others will help hone skills for today’s data- and analytics-driven world.  One example of this idea in action:  North Carolina State University’s (NCSU) Institute for Advanced Analytics is seeing a huge increase in interest.  Its first graduating class last year saw higher average salaries than other graduate programs and multiple job offers per graduate.  Why?  Because there is still a huge demand for graduates with the ability to manipulate and analyze data in order to make better, more informed decisions.  I personally think careers in math and science are especially rewarding, but we need many diverse skills to make the world go round :o)

Ajay- Big corporations versus startups: where do you think the balance lies between being big in terms of stability and size and being swift and nimble in terms of speed of product rollouts? What are the advantages and disadvantages of being a big corporation in a fast-changing technology field?

Anne-

Ever a balancing act, with continuous learning along the way.  The advantage of being big (and privately held) is that you can be more long-term-oriented.  The challenge with fast-changing technology is to know where to best invest.  While others may go to market faster with new capabilities, we seek to provide superior implementations (we invest in ‘R’ (Research) AND ‘D’ (Development)), making capabilities available on a number of platforms.

In today’s economy, I think the big vs. small comparison is becoming less and less relevant.  Big corporations need to be agile and innovative, like their smaller rivals.  And small- to medium-sized businesses (SMBs) need to use the same techniques and technologies as the “big boys.”

First, on the big side, I’ll use an example with which I’m very familiar:  At SAS, a company founded more than 30 years ago as an entrepreneurial venture, we’ve certainly changed over the decades.  SAS started out in a small office with a handful of people.  It’s now a global company with hundreds of offices and thousands of employees around the world.  Yet one thing has not changed for SAS in all this time:  a laser-like focus on the customer.  This has been the key to SAS’ success and uninterrupted growth.  Not really a “secret sauce.” Just a simple yet profound approach: listen carefully to your customers and their changing needs, and innovate, develop and adapt based on these needs.

Of course, being large has its advantages:  we have more ideas from more people, and creativity and innovation knows no borders.  From Sydney to Warsaw, São Paulo to Singapore, Shanghai to Heidelberg, SAS employees work closely with customers to meet their business needs today and in the future.

SAS provides the stability and proven success that businesses look for, particularly in troubled economic times.  Being large and privately held enables SAS to grow when others are cutting back, and continue to invest in R&D at a high rate – 22% of revenues in 2008.

Yet with our annual subscription licensing model, SAS cannot rest on its laurels.  Each year, customers vote with their checkbooks:  if SAS provided them with business benefits, results and a positive ROI, they renew; if not, they can walk away.  Happily for SAS, the overwhelming majority of customers keep coming back.  But the licensing model keeps SAS on its toes, customer-focused, and always listening and innovating based on customer feedback.

As for SMBs, they are rapidly adopting the technologies used by large companies – such as business analytics – to compete in the global economy.  Two examples of this:

BGF Industries is a manufacturer of high-tech fabrics used in jet fighters, bullet-proof vests, movie-theater screens and surfboards, based in Greensboro, NC. BGF turned to SAS business analytics to help it deal with foreign competition.  BGF created a cost-effective, easy-to-use early-warning system that helps it track quality and productivity.  Per BGF, data is now available in minutes instead of hours.  And in the business world, this speed can be the difference between success and failure.  Per Bobby Hull, a BGF systems analyst: “The early-warning system we built with SAS allowed us to go from nothing to everything.  SAS allows us to focus away from clerical tasks to focus on the quality and process side of the job. Because of SAS, we’re never more than three clicks away from finding an answer.”

For Los Angeles-based The Wine House, installing a SAS-powered inventory-management system helped it discover nearly $400,000 in “lost” inventory sitting on warehouse shelves.  For an SMB with annual sales of $20 million, that was a major find.  Business analytics helps it to compete with major retail and grocery chains.  Per Bill Knight, owner of The Wine House: “The first day the SAS application was live, we identified approximately 1,000 cases of wine that had not moved in over a year. That’s significant cash tied up in inventory.  We had a huge sale to blow it out, and just in time, because in today’s economy, we would be choking on that inventory.”

So regardless of size, businesses must remain agile, listen to their customers, and use technologies like business analytics to make sense of and derive value from their data – whether on the quality of surfboard covers or the number of cases of Oregon Pinot Noir in stock.

3) SAS Institute has been the de facto leader in both market volume share and market value share in the field of data analytics. What are some of the factors that you think have contributed to this enduring success? Who have been the principal challengers over the years? (Any comments on the challenge from the SAS language software WPS, please?)

At SAS, we seek to provide a complete environment for analytics—from data collection, data manipulation, data exploration, data analysis, deployment of results – and the means to manage that whole process.  Competition comes in many forms and it pushes us to keep delivering value.  For me, one thing that sets SAS apart from other vendors is that we care so deeply about the quality of results.  Our Technical Support, Education and consulting services organizations really do partner with customers to help them achieve the best results.  That kind of commitment is deep in the DNA of SAS’ culture.

The good thing about competition is that it forces you to re-examine your value proposition and rethink your business strategy.  Customers value attributes of their analytics infrastructure in varying degrees— speed, quality, support, flexibility, ease of migration, backward and forward compatibility, etc.  Often there are options to trump any one or a subset of these and when that aligns with the customers’ priorities of what they value, they will vote with their pocketbooks.  For some customers with tight batch-processing windows, speed trumps everything.  In tests conducted by Merrill Consultants, an MXG program running on WPS runs significantly longer, consumes more CPU time and requires more memory than the same MXG program hosted on its native SAS platform.

While it’s easy to get caught up in fast-changing technology, one has to also consider history.  Some programming languages come and go; others have stood the test of time.  Even the use of different flavors of analysis ebbs and flows.  For instance, when data mining was all the rage almost a decade ago, many asked the very good question, “Why so much excitement about analyzing so much opportunistic data when design of experiments offers so much more?”  Finally, experimental design is being more readily adopted in areas like marketing.

At the end of the day, innovation is the only sustainable competitive advantage.  As noted above in question 2, SAS has remained firmly committed to customer-driven innovation.  And SAS has “stuck to its knitting” with respect to analytics.  A while back, SAS used to stand for “Statistical Analysis System.” If not literally, then philosophically, Analytics remains our middle name.

(Ajay- to be continued)

An R Package only for SAS Users

Dear All,

I am doing some research into creating an R package for SAS language users.

The name of the beta package is “Anne”, but I am open to suggestions for the name.

The basic idea is to enable SAS language users (especially Windows SAS language users) to try out R without getting overwhelmed by its powerful matrix-level capabilities or its command line interface.

Creating new functions is quite easy, as the following code shows.

The first R code for the “Anne 1.0” package is

procunivariate <- function(x) summary(x)

procimportcsv <- function(x) read.table(x, header=TRUE,

                           sep=",", row.names="id", na.strings="   ")

libname <- function(x) setwd(x)

 

Note that I am tweaking the code as we speak and will try to add one proc per week.

But how do you put functions in an R package?

This is how to create an R package (to be continued).
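
In the meantime, here is a minimal sketch of one way to get started, using the package.skeleton() function that ships with R in the utils package. Treat it as an assumption-laden illustration rather than the finished recipe: the two wrapper functions are simplified stand-ins for the ones above, the environment env is just a container made up for this example, and the package is written to a temporary directory.

```r
# Illustrative wrappers held in their own environment (env is a made-up name)
env <- new.env()
env$procunivariate <- function(x) summary(x)
env$libname <- function(x) setwd(x)

# package.skeleton() writes a source tree: DESCRIPTION, R/, man/ and so on
out_dir <- tempdir()
utils::package.skeleton(name = "Anne", list = ls(env), environment = env,
                        path = out_dir, force = TRUE)

# Look at what was generated
pkg_dir <- file.path(out_dir, "Anne")
print(list.files(pkg_dir))
```

The generated DESCRIPTION file and the help page stubs under man/ still need to be edited by hand before the package can be built with R CMD build and installed with R CMD INSTALL.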

Note- SAS here refers to SAS Language.

 

Learning SAS for SPSS Users

SAS Publishing just came out with a nice and nifty 28-page PDF document, “Coming To SAS FROM SPSS – A programming approach”. It’s a nice read, very useful for people curious about or willing to try SAS after learning SPSS, and very well written by Susan J Slaughter and Lora D, who wrote “The Little SAS Book”, one of the most popular SAS handbooks ever written.

 

You can download it or plainly read it from

http://support.sas.com/publishing/bbu/companion_site/62272.pdf

SPSS of course has a very nice menu-driven interface, while SAS programmers generally prefer writing code, although SAS does have menus in various products.

The big big Analytics Conference

The Predictive Analytics Conference (http://www.predictiveanalyticsworld.com/) starts today at the Hotel Nikko, San Francisco. A whole who’s who of analytics experts is gathering there, including SAS, SPSS, SAP, Click Forensics, Acxiom, Amazon and Google, and there is a big R user conference as well. It is really, really huge, so stay tuned for some exciting announcements from there.
