Best of Decision Stats- Modeling and Text Mining Part3

Here are some of the top articles by views in an area I love: modeling and text mining.

1) Karl Rexer – Rexer Analytics

http://www.decisionstats.com/2009/06/09/interview-karl-rexer-rexer-analytics/

Karl produces one of the most respected surveys capturing emerging trends in data mining and technology. Karl was also one of the most enthusiastic people I have interviewed, and I am thankful for his help in getting me some more interviews.

2) Gregory Piatetsky-Shapiro

One of the earliest and easily the best Knowledge Discoverer of all time, Gregory produces http://www.kdnuggets.com, and its newsletter is easily the must-read newsletter to be on. Gregory was doing data mining while the Google boys were still debating whether to drop out of Stanford or not.

Buying SAS Institute

At the risk of annoying a lot of friendly people, I am going to ask an old question and try to answer it quantitatively.

Who can buy SAS Institute?

Graph from http://www.sas.com/news/preleases/2008Financials.html

[Image: SAS revenue graph]

As you can see from the graph (note the post 2001-2004 period), which is a nice smooth curve resembling the left side of a textbook normal distribution, SAS Institute showed slowed but firm revenue growth during the tough economic year of 2008. However, if you use the same price/revenue multiple as for the SPSS acquisition (1.2 billion USD / 300 million USD in 2008 revenues, or about 4 times revenue), that would put a price of 9.2 billion USD on SAS Institute.

Who has that kind of money? Well, it seems the usual suspects are:
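The multiple arithmetic can be checked with a quick sketch. Only the 1.2 billion, 300 million and 9.2 billion USD figures come from the post; the SAS revenue is not stated here, so it is back-solved from the 9.2 billion estimate, and the helper name `implied_price` is purely illustrative.

```python
# Figures quoted in the post (USD).
spss_price = 1.2e9            # reported SPSS acquisition price
spss_revenue_2008 = 300e6     # approximate SPSS 2008 revenue
sas_price_estimate = 9.2e9    # the post's estimated price for SAS Institute

# Price/revenue multiple paid for SPSS.
multiple = spss_price / spss_revenue_2008

# Assumption: the post does not state SAS revenue, so back-solve the
# revenue that the 9.2 billion estimate implies at the same multiple.
implied_sas_revenue = sas_price_estimate / multiple


def implied_price(revenue: float, price_to_revenue: float) -> float:
    """Hypothetical helper: price implied by a revenue figure and a multiple."""
    return revenue * price_to_revenue


print(f"SPSS price/revenue multiple: {multiple:.1f}x")
print(f"SAS revenue implied by the estimate: {implied_sas_revenue / 1e9:.2f} billion USD")
print(f"Round trip price: {implied_price(implied_sas_revenue, multiple) / 1e9:.1f} billion USD")
```

A revenue multiple is only a rough yardstick; it ignores margins, growth and the fact that SAS Institute is privately held.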

1) HP, from http://h30261.www3.hp.com/phoenix.zhtml?c=71087&p=irol-IRHome

and HewlettPackard_2008_AR.pdf

Cash and cash equivalents of 12.851 billion USD as of April 30, 2009.

2) Oracle: Oracle would be hard pressed to integrate both Sun and SAS in the same year, but may have the financial leverage to do both.

from http://www.oracle.com/corporate/investor_relations/earnings/4q09-pressrelease-june.pdf

Fiscal year 2009 GAAP revenues were up 4% to $23.3 billion, while annual GAAP net income was up 1% to $5.6 billion. Total GAAP new software license revenues for the year were down 5% to $7.1 billion. GAAP software license updates and product support revenues were up 14% to $11.8 billion. GAAP operating income was up 6% to $8.3 billion, and GAAP operating margins were up 80 basis points to 36% in fiscal year 2009.

3) IBM, from ftp://ftp.software.ibm.com/annualreport/2008/2008_ibm_financials.pdf

Cash on hand was 12.7 billion USD as of 31 Dec 2008, and the company repurchased its own stock in 2008.

In the current economic environment, growth can come through acquisitions of newer clients (not much) or new companies. IBM has the capability to acquire BOTH SPSS and SAS Institute and merge their strong R and D facilities.


4) SAP – from http://www.sap.com/germany/about/investor/reports/gb2008/en/our-results/finances.html

various sources of loan capital:

profit after income taxes for 2008 was slightly lower than for the previous year, we increased cash flows from operating activities 12% to € 2,158 million (2007: € 1,932 million) through efficient management of working capital.

  • To finance the acquisition of Business Objects, we entered into an agreement for a credit facility that was originally for € 5 billion and is repayable by December 31, 2009 (amount outstanding on December 31, 2008: € 2.3 billion). We did not draw the full € 5 billion available under the facility because we paid part of the purchase price from available cash.
  • To increase financial flexibility, in November 2004 we obtained a € 1 billion syndicated credit facility through an international group of banks. We already had other lines of credit in place; the new line was arranged to provide additional financial flexibility. As in the previous year, we did not draw on this facility during the year.
  • At the end of 2008, the other, bilateral lines of credit available to SAP AG totaled approximately € 597 million (2007: € 599 million). We did not draw on these facilities during 2008 or 2007. Several subsidiaries in the SAP Group had credit lines in their local currency. These totaled € 52 million (2007: € 44 million), for which SAP AG was guarantor. At the end of the year, the subsidiaries had drawn € 21 million under these facilities (2007: € 27 million).

Given these cash positions, it seems that almost everyone can buy SAS Institute if, and this is a big IF, someone sells it. Microsoft, which some years ago allegedly tried and failed to acquire Yahoo (only to realize huge savings!) and SAS, would be another suitor for SAS. Google, which has the financial and operating synergies along with the best text mining capabilities, could also act as a white knight by merging its Google Applications and Enterprise solutions (especially the cloud-based OS and cloud-based productivity suite) with SAS Institute. I personally would favor a Google-SAS Institute joint venture on enterprise software, solely based on the common history and shared values. (Note: Google has dual ownership stock including class A and class B shares.)

Who is John Galt ?

Another option could be for SAS Institute to go the Google way and opt for a dual ownership IPO, with class A shares for the common public and class B shares for the founders and executives. A substantial endowment to colleges and universities can also be expected in the future, given the philanthropic tradition of SAS Institute owners and executives. SAS could also try to buy SPSS: it would lead to synergies in both software (with the SPSS GUI) and new clients. At the very minimum it would boost the valuation of other stocks in this sector as well as make SPSS more realistically valued.

So who will buy SAS Institute?

I don’t know 🙂 and I am just dusting off my half-a-decade-old financial valuation skills here.

What is the true value of SPSS

A brief study of the charts at http://tr.im/vDA4 (courtesy Google Finance) would suggest IBM is getting a bargain for SPSS Inc.

And Oracle, Microsoft and other companies (even the privately held SAS Institute) would do well to step in and take it away, or at the very minimum make the valuation even steeper for IBM to hold on to.

SPSS reported Total Revenue of $291 million with a Net Income of $33.73 million in 2007, and Total Revenue of $302.91 million with a Net Income of $36.05 million in 2008. Shares of SPSS Inc. (Public, NASDAQ:SPSS) increased from about $35 per share before the announcement to $49.50 per share after the announcement.

Citation-

http://shareholdersfoundation.com/caseinvestigation/spss-inc-takeover-subject-investor-investigations
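As a rough check on the figures above, the sketch below works out the share price jump, the revenue growth and the net margins. It uses only the numbers quoted in this post; the pre-announcement price is stated only as "about $35", so the premium is approximate, and the variable names are illustrative.

```python
# Figures quoted in the post (USD millions, except share prices in USD).
revenue = {2007: 291.00, 2008: 302.91}
net_income = {2007: 33.73, 2008: 36.05}
price_before = 35.00   # "about $35" per the post, so only approximate
price_after = 49.50

# Jump in share price after the announcement, relative to the earlier price.
premium = (price_after - price_before) / price_before

# Year-over-year revenue growth and net margin for each year.
revenue_growth = revenue[2008] / revenue[2007] - 1
net_margin = {year: net_income[year] / revenue[year] for year in revenue}

print(f"Share price premium: {premium:.1%}")
print(f"Revenue growth 2007 to 2008: {revenue_growth:.1%}")
for year in sorted(net_margin):
    print(f"Net margin {year}: {net_margin[year]:.1%}")
```

A jump of this size on modest revenue growth is what one would expect from a takeover premium rather than from a change in the underlying business.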

SPSS chart at http://tr.im/vDA4

Reactions to IBM-SPSS takeover.

The business intelligence, business analytics and data mining industry (or, as James Taylor would say, the Decision Management industry) has had some reactions to the IBM-SPSS deal (which was NOT a surprise to many, including me). Really.

From SAS Institute, Anne Milley

http://blogs.sas.com/sascom/index.php?/archives/557-Analytics-is-still-our-middle-name.html

Besides SAS, SPSS was one of the last independent analytic software companies. A colleague says, “It’s the end of the analytics cold war.”

I’ve been saying all along that analytics is required for success. Yes, data integration, data quality, and query & reporting are important too but, as W. Edwards Deming says, “The object of taking data is to provide a basis for action.”

The end of the analytics cold war, hmm. We all know what the end of the real Cold War brought us: Google, Cloud Computing, and other non-technical issues.

From KXEN, Roger Haddad

“The price paid for SPSS of four times revenues and 25 times earnings shows just how valuable this sector really is,” says Haddad. “But the deal has also created a tremendous opportunity for the sector’s remaining independent vendors that KXEN is well placed to capitalize on.” “There is no For Sale sign hanging in our window,” continues Haddad. “We launched KXEN in 1998 to democratize the benefits of data mining and predictive analytics, making them practical and affordable across the whole enterprise and not just the exclusive preserve of a few specialists. It’s going to take up to two years for the dust to settle following the IBM

“Former SPSS partners, systems integrators and distributors will face uncertainty.”

I think the PE multiple was still low: SPSS was worth more if you count the client base, the active community and the brand itself in the valuation. There are tremendous cross-sell opportunities, and IBM, with its nice research and development, is a good supporter of pure science. Yes, the next two years will see increasing consolidation and more “surprising” news. At 4 times revenues, anyone can be bought in the present market if it is a publicly listed company. 😉

From the rather subdued voices on the SPSS list, some subjective and non-quantitative “strategic” forecasts.

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0907&L=spssx-l&F=&S=&P=36324

I think the Ancient Chinese said it best: “May you live in interesting times”.

Having worked with some flavors of Cognos and SPSS, I think there could be areas for technical integration in querying and GUI-based forecasting as well, apart from financial mergers and administrative readjustments. I mean, people pull data not just to report it, but to estimate what comes next as well.

This could also spell the end of analysts skilled on a single platform. You now need to learn at least two different platforms, like SAS, SPSS or KXEN, R or Cognos, Business Objects, to hedge your chances of getting offshored (Note: I worked in offshoring for almost 4 years in India in data analytics).

Answering what IBM will do with SPSS and its open source commitment to R, and the consequences for employees, customers, vendors and partners, who have more choices now than ever.

…. well it depends. Who is John Galt?

Decisionstats Interviews

Here is a list of interviews that I have published. These are specific to analytics and data mining and include only the most recent interviews. If I have missed any notable recent interview related to analytics and data mining, kindly do let me know. Hat tip to Karl Rexer for this suggestion.

Date      Name of Interviewee          Designation and Organization

09-Jun    Karl Rexer                   President, Rexer Analytics
05-Jun    Jim Davis                    CMO, SAS Institute
04-Jun    Paul van Eikeren             President and CEO, Blue Reference
29-May    David Smith                  Director of Community, REvolution Computing
17-May    Dominic Pouzin               CEO, Data Applied
11-May    Bruno Delahaye               VP, KXEN
04-May    Ron Ramos                    Director, Zementis
30-Apr    Olivier Jouve                VP, SPSS Inc
21-Apr    Fabian Dill                  Co-Founder, Knime.com
18-Apr    Alicia McGreevey             Head Marketing, Visual Numerics
27-Mar    Francoise Soulie Fogelman    VP, KXEN
17-Mar    Jon Peck                     Principal Software Engineer, SPSS Inc
06-Mar    Anne Milley                  Director of product marketing, SAS Institute
04-Mar    Anne Milley                  Director of product marketing, SAS Institute
03-Feb    Phil Rack                    Creator, Bridge to R, and CEO, Minequest
03-Feb    Michael Zeller               CEO, Zementis
31-Jan    Richard Schultz              CEO, Revolution Computing
21-Jan    Bob Muenchen                 Author, R for SAS and SPSS Users
13-Jan    Dr Graham Williams           Creator, Rattle GUI for R
05-Jan    Roger Haddad                 CEO, KXEN
26-Sep    June Dershewitz              VP, Semphonic
04-Sep    Vincent Granville            Head, Analyticbridge

The URLs to specific interviews are also in this sheet:

http://spreadsheets.google.com/pub?key=rWTqcMe9mqwHeFv1e4GS_yg&single=true&gid=0&range=a1%3Ae24&output=html

Interview Karl Rexer -Rexer Analytics

Here is an interview with Karl Rexer of Rexer Analytics. His annual survey is considered a benchmark in the data mining and analytics industry. Here Karl talks of his career, his annual survey and his views on the industry direction and trends.

Almost 20% of data miners report that their company/organizations have only minimal analytic capabilities – Karl Rexer


Ajay- Describe your career in science. What advice would you give to young science graduates in this recession? What advice would you give to high school students choosing between science and non-science careers?

Karl- My interests in science began as a child. My father has multiple science degrees, and I grew up listening to his descriptions of the cool things he was building, or the cool investigative tools he was using, in his lab. He worked in an industrial setting, so visiting was difficult. But when I could, I loved going in to see the high-temperature furnaces he was designing, the carbon-fiber production processes he was developing, and the electron microscope that allowed him to look at his samples. Both of my parents encouraged me to ask why, and to think critically about both scientific and social issues. It was also the time of the Apollo moon landings, and I was totally absorbed in watching and thinking about them. Together these things motivated me and shaped my world-view.

I have also had the good fortune to work across many diverse areas and with some truly outstanding people. In graduate school I focused on applied statistics and the use of scientific methods in the social sciences. As a grad student and young academic, I applied those skills to researching how our brains process language. But on the side, I pursued a passion for using the scientific method and analytics to address ... well, anything I could. We called it “statistical consulting” then, but it often extended to research design and many other parts of the scientific process. Some early projects included assisting people with AIDS outcome studies, psycholinguistic research, and studies of adolescent adjustment.

My first taste of applying these skills outside of an academic environment was with my mentor Len Katz. The US Navy hired us to help assess the new recruits that were entering the submarine school. Early identification of sailors who would excel in this unusual and stressful environment was critical. Perhaps even more important was identifying sailors who would not perform well in that environment. Luckily, the Navy had years of academic and psychological testing on many sailors, and this data proved quite useful in predicting later job performance onboard the submarines. Even though we never got the promised submarine ride, I was hooked on applying measurement, scientific methods, and analytics in non-academic settings.

And that’s basically what I have continued to do – apply those skills and methods in diverse scientific and business settings. I worked for two banks and two consulting firms before founding Rexer Analytics in 2002. Last year we supported 30 clients. I’ve got great staff and they have great quant skills. Importantly, we also don’t hesitate to challenge each other, and we’re continually learning from each other and from each client engagement. We share a love of project diversity, and we seek it out in our engagements. We’ve forecasted sales for medical devices, measured B2B customer loyalty, identified manufacturing problems by analyzing product returns, predicted which customers will close their bank accounts, analyzed millions of tax returns, helped identify the dimensions of business team cohesion that result in better performance, found millions of dollars of B2B and B2C fraud, and helped many companies understand their customers better with segmentations, surveys, and analyses of sales and customer behavior.

The advice I would give to young science grads in this recession is to expand your view of where you can apply your scientific training. This applies to high school students considering science careers too. All science does not happen in universities, labs and other traditional science locations. Think about applying scientific methods everywhere! Sometimes our projects at Rexer Analytics seem far away from what most people would consider “science.” But we’re always asking “what data is available that can be brought to bear on the business issue we’re addressing.” Sometimes the best solution is to go out and collect more data – so we frequently help our clients improve their measurement processes or design surveys to collect the necessary data. I think there are enormous opportunities for science grads to apply their scientific training in the business world. The opportunities are not limited to physics whiz kids making models for Wall Street trading or computer science students moving to Silicon Valley. One of the best analytic teams I ever worked on was at Fleet Bank in the late 90s. We had an economist, two physicists, a sociologist, a psychologist, an operations research guy, and a person with a degree in marketing science. We were all very focused on data, measurement, and analytic methods.

I recommend that all science grads read Tom Davenport’s book Competing on Analytics. It illustrates, with compelling examples, how businesses can benefit from using science and analytics. Several examples in Tom’s book come from Gary Loveman, CEO of Harrah’s Entertainment. I think that Gary also serves as a great example of how scientific methods can be applied in every industry. Gary has a PhD in economics from MIT, he’s worked at the Federal Reserve Bank, he’s been a professor at Harvard, but more recently he runs the world’s largest casino and gaming company. And he’s famously said many times that there are three ways to get fired at Harrah’s: steal, harass women, or not use a control group. Business leaders across all industries are increasingly wanting data, analytics and scientific decision-making. Science grads have great training that enables them to take on these roles and to demonstrate the success of these methods.

Ajay- One more survey- How does the Rexer survey differentiate itself from other surveys out there?

Karl- The Annual Rexer Analytics Data Miner Survey is the only broad-reaching research that investigates the analytic behaviors, views and preferences of data mining professionals. Each year our sample grows — in 2009 we had over 700 people around the globe complete our survey. Our participants include large numbers of both academic and business people.

Another way our survey is differentiated from other surveys is that each year we ask our participants to provide suggestions on ways to improve the survey. Incorporating participants’ suggestions improves our survey. For example, in 2008 several people suggested adding questions about model deployment and off-shoring. We asked about both of these topics in the 2009 survey.

Ajay -Could you please share some sneak previews of the survey results? What impact is the recession likely to have on IT spending?

Karl- We’re just starting to analyze the 2009 survey data. But, yes, here’s a peek at some of the findings that relate to the impact of the recession:

* Many data miners report that funding for data mining projects can sometimes be a problem.
* However, when asked what will happen in 2009 if the economic downturn continues, many data miners still anticipate that their company/organization will conduct more data mining projects in 2009 than in previous years (41% anticipate more projects in 2009; 27% anticipate fewer projects).
* The vast majority of companies conduct their data mining internally, and very few are sending data mining off-shore.

I don’t have a crystal ball that tells me about the trends in overall corporate spending on IT, Business Intelligence, or Data Mining. It’s my personal experience that many budgets are tight this year, but that key projects are still getting funded. And it is my strong opinion that in the coming years many companies will increase their focus on analytics, and I think that increasingly analytics will be a source of competitive advantage for these companies.

There are other people and other surveys that provide better insight into the trends in IT spending. For example, Gartner’s recent survey of over 1,500 CIOs (http://www.gartner.com/it/page.jsp?id=855612 ) suggests that 2009 IT spending is likely to be flat. I’m personally happy to see that in the Gartner survey, Business Intelligence is again CIOs’ top technology priority, and that “increasing the use of information/analytics” is the #5 business priority.

Ajay- I noticed you advise SPSS among others. Describe what an advisory role is for an analytics company and how can small open source companies get renowned advisors?

Karl- We have advised Oracle, SPSS, Hewlett-Packard and several smaller companies. We find that advisory roles vary greatly. The biggest source of variation is what the company wants advice about. Examples include:

* assessing opportunity areas for the application of analytics
* strategic data assessments
* analytic strategy
* product strategy
* reviewing software

Both large and small companies that look to apply analytics to their businesses can benefit from analytic advisors. So can open source companies that sell analytic software. Companies can find analytic advisors in several ways. One way is to look around for analytic experts whose advice you trust, and hire them. Networking in your own industry and in the analytic communities can identify potential advisors. Don’t forget to look in both academia and the business world. Many skilled people cross back and forth between these two worlds. Another way for these companies to obtain analytic advice is to look in their business networks and user communities for analytic specialists who share some of the goals of the company – they will be motivated for your company to succeed. Especially if focused topic areas or time-constrained tasks can be identified, outside experts may be willing to donate their time, and they may be flattered that you asked.

Ajay- What made you decide to begin the Rexer Surveys? Describe some results of last year’s surveys and any trends from the last three years that you have seen.

Karl- I’ve been involved on the organizing committees of several data mining workshops and conferences. At these conferences I talk with a lot of data miners and companies involved in data mining. I found that many people were interested in hearing about what other data miners were doing: what algorithms, what types of data, what challenges were being faced, what they liked and disliked about their data mining tools, etc. Since we conduct online surveys for several of our clients, and my network of data miners is pretty large, I realized that we could easily do a survey of data miners, and share the results with the data mining community. In the first year, 314 data miners participated, and it’s just grown from there. In 2009 over 700 people completed the survey. The interest we’ve seen in our research summaries has also been astounding – we’ve had thousands of requests. Overall, this just confirms what we originally thought: people are hungry for information about data mining.

Here is a preview of findings from the initial analyses of the 2009 survey data:

* Each year we’ve seen that the most commonly used algorithms are decision trees, regression, and cluster analysis.
* Consistently, some of the top challenges data miners report are dirty data and explaining data mining to others. Previously, data access issues were also reported as a big challenge, but in 2009 fewer data miners reported facing this challenge.
* The most prevalent concerns with how data mining is being utilized are: insufficient training of some data miners, and resistance to using data mining in contexts where it would be beneficial.
* Data mining is playing an important role in organizations. Half of data miners indicate their results are helping to drive strategic decisions and operational processes.
* But there’s room for data mining to grow – almost 20% of data miners report that their company/organizations have only minimal analytic capabilities.

Bio-

Karl Rexer, PhD is President of Rexer Analytics, a small Boston-based consulting firm. Rexer Analytics provides analytic and CRM consulting to help clients use their data to make better strategic and tactical decisions. Recent projects include fraud detection, sales forecasting, customer segmentation, loyalty analyses, predictive modeling for cross-sell and attrition, and survey research. Rexer Analytics also conducts an annual survey of data miners and freely distributes research summaries to the data mining community. Karl has been on the organizing committees of several international data mining conferences, including 3 KDD conferences, and BIWA-2008. Karl is on the SPSS Customer Advisory Board and on the Board of Directors of the Oracle Business Intelligence, Warehousing, & Analytics (BIWA) Special Interest Group. Karl and other Rexer Analytics staff are frequent invited speakers at MBA data mining classes and conferences.

To know more, do check out the website at www.rexeranalytics.com

SPSS launches two more PASWs

Just got news from the Chicago school of analytics, or the company known as SPSS. They have decided to launch two more PASW products, and you can see this from the release itself.

SPSS Inc. and the value of Predictive Analytics.

This week we announced PASW Data Collection 5.6 feedback management and survey research software, and PASW Collaboration & Deployment Services 4, our integrated platform to share, manage, automate and integrate analytic assets directly into business processes.

PASW Data Collection 5.6 (formerly Dimensions)

* The use of surveys to capture Voice of the Customer across multiple touch-points is integral to bringing data about people’s attitudes into analytical decision-making to improve customer intimacy.
* PASW Data Collection 5.6 supports the entire survey lifecycle from authoring to managing the data collection process to survey reporting and analysis supporting global, multichannel research and feedback collection.
* New functionality includes data entry capabilities, an enhanced authoring interface suitable for the novice and the research professional, and new phone-based interviewing capabilities designed to shape the modern survey research call center. This release also further extends the enterprise readiness of the data collection platform with enhancements to performance and security.

You can read the press release at http://www.spss.com/press/template_view.cfm?PR_ID=1088

PASW Collaboration and Deployment Services 4 (formerly Predictive Enterprise Services)

* The platform automates analytical processes for greater consistency and control, and deploys results to business users, consumers or directly into operational systems to reduce customer churn, improve marketing campaigns or identify cases of fraud.
* PASW Collaboration and Deployment Services 4 provides the foundation to integrate analytics into key business processes, so the right decisions are made and the best actions are taken on a consistent, repeatable basis.
* New functionality includes enhanced collaboration capabilities that provide more options for publishing analytical results; enhancements to the Automation Service with additional integration options; and a Real-time Scoring Service to deploy analytical scores into existing applications.

You can read the full press release at http://www.spss.com/press/template_view.cfm?PR_ID=1087