R for Business Analytics- Book by Ajay Ohri

So the cover art is ready, and if you are a reviewer, you can reserve online copies of the book I have been writing for the past 2 years. Special thanks to my mentors, detractors, readers and students- I owe you a beer!

You can also go here-



R for Business Analytics

Ohri, Ajay

2012, XVI, 300 p. 208 illus., 162 in color.


ISBN 978-1-4614-4342-1

Due: September 30, 2012


approx. 44,95 €
  • Covers full spectrum of R packages related to business analytics
  • Step-by-step instruction on the use of R packages, in addition to exercises, references, interviews and useful links
  • Background information and exercises are all applied to practical business analysis topics, such as code examples on web and social media analytics, data mining, clustering and regression models

R for Business Analytics looks at some of the most common tasks performed by business analysts and helps the user navigate the wealth of information in R and its 4000 packages.  With this information the reader can select the packages that can help process the analytical tasks with minimum effort and maximum usefulness. The use of Graphical User Interfaces (GUI) is emphasized in this book to further cut down and bend the famous learning curve in learning R. This book is aimed to help you kick-start with analytics including chapters on data visualization, code examples on web analytics and social media analytics, clustering, regression models, text mining, data mining models and forecasting. The book tries to expose the reader to a breadth of business analytics topics without burying the user in needless depth. The included references and links allow the reader to pursue business analytics topics.


This book is aimed at business analysts with basic programming skills for using R for Business Analytics. Note the scope of the book is neither statistical theory nor graduate level research for statistics, but rather it is for business analytics practitioners. Business analytics (BA) refers to the field of exploration and investigation of data generated by businesses. Business Intelligence (BI) is the seamless dissemination of information through the organization, which primarily involves business metrics both past and current for the use of decision support in businesses. Data Mining (DM) is the process of discovering new patterns from large data using algorithms and statistical methods. To differentiate between the three, BI is mostly current reports, BA is models to predict and strategize and DM matches patterns in big data. The R statistical software is the fastest growing analytics platform in the world, and is established in both academia and corporations for robustness, reliability and accuracy.

Content Level » Professional/practitioner

Keywords » Business Analytics – Data Mining – Data Visualization – Forecasting – GUI – Graphical User Interface – R software – Text Mining

Related subjects » Business, Economics & Finance – Computational Statistics – Statistics


Why R.- R Infrastructure.- R Interfaces.- Manipulating Data.- Exploring Data.- Building Regression Models.- Data Mining using R.- Clustering and Data Segmentation.- Forecasting and Time-Series Models.- Data Export and Output.- Optimizing your R Coding.- Additional Training Literature.- Appendix

Quantifying Analytics ROI

I had a brief Twitter exchange with Jim Davis, Chief Marketing Officer, SAS Institute, on Return on Investment on Business Analytics projects for customers. I interviewed Jim Davis last year at https://decisionstats.com/2009/06/05/interview-jim-davis-sas-institute/

Now Jim Davis is a big guy, and he is rushing from the launch of SAS Institute’s Social Media Analytics in Japan- through some arguably difficult flying conditions- to be home in America in time for Thanksgiving. That, and I have not been much of a good Blog Boy recently, more swayed by love of open source than love of software per se. I love equally, given I am bad at both equally.

Anyways, Jim’s contention ( http://twitter.com/Davis_Jim ) was that customers should go in for business analytics only if there is a positive Return on Investment. I am quoting him here-

What is important is that there be a positive ROI on each and every BA project. Otherwise don’t do it.

That’s not the marketing I was taught in my business school- basically it was sell, sell, sell.

However, I see most BI sales vendors also go through the let-me-meet-my-sales-quota-for-this-quarter routine. Quantifying customer ROI is simpler maths than predictive analytics, but there seems to be some information asymmetry in it.
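To show how simple that maths can be, here is a minimal sketch of the usual ROI formula, (benefit - cost) / cost. The function name `roi` and all the figures are hypothetical and for illustration only; they do not come from any vendor or customer.

```python
def roi(total_benefit: float, total_cost: float) -> float:
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Hypothetical figures, for illustration only.
license_cost = 200_000.0         # up-front software licensing
implementation_cost = 50_000.0   # consulting, training, migration
benefit = 300_000.0              # value attributed to the project

project_roi = roi(benefit, license_cost + implementation_cost)
print(f"ROI: {project_roi:.0%}")  # prints "ROI: 20%"
```

The arithmetic is the easy part. The hard part, and where the asymmetry sits, is agreeing on what counts as benefit and what counts as cost.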

Here is a paper from Northwestern University on ROI in IT projects.

Overall, though, it would be in the interest of customers and Business Analytics vendors to publish aggregated ROI.

The opponents of this transparency in ROI would be the market-share leaders, who have locked in their customers through high migration costs (due to complexity) or through contracts.

A recent study listed Oracle as having a large percentage of unhappy customers who would still renew! SAP had problems when it raised licensing prices arbitrarily (that CEO is now CEO of HP and dodging legal notices from Oracle).

Indeed, Jim Davis’s famous, unsettling call to focus on Business Analytics because Business Intelligence is dead has been implemented more aggressively by IBM, through analytical acquisitions, than even by SAS itself, which has been conservative about inorganic growth. Quantifying ROI should theoretically aid open source software the most (since it is cheapest in up-front licensing), or newer technologies like MapReduce/Hadoop (since they are so fast)- but I think the market has a way of factoring in these things, and customers are neither as foolish nor as unaware of the costs versus benefits of migration as one might assume.

The contrary view is that Business Analytics and Business Intelligence are imperfect markets, with a duopoly or big players thriving in the absence of customer regulation.

You get more protection as a customer of a $20 bag of potato chips than as a customer of $200,000 software. Regulators are wary of stepping in to ensure ROI fairness (since most bright techies are either working for the private sector, running their own startup, or invested in startups)- who in Govt understands Analytics and Intelligence well enough to ensure that vendor lock-ins do not happen and market flexibility is preserved? It is also a lower priority for embattled regulators to ensure ROI on enterprise software, unlike the aggressiveness they have shown in retail or online software.

Who will analyze the analysts, and who can quantify the value of quants (or penalize them for shoddy quantitative analytics)? It is an interesting question we expect to see more of.



Open Source Business Intelligence: Pentaho and Jaspersoft

Here are two products that are widely used for Business Intelligence. They are open source and both have free previews.

Jaspersoft-For the Enterprise version click on the screenshot while for the free community version you can go to


Interestingly (and not surprisingly) Revolution Analytics is teaming up with Jaspersoft to use R for reporting along with the Jaspersoft BI stack.





Date: Wednesday, September 22, 2010
Time: 9:00am PDT (12:00pm EDT; 4:00pm GMT)
Presenters: David Smith, Vice President of Marketing, Revolution Analytics
Andrew Lampitt, Senior Director of Technology Alliances, Jaspersoft
Matthew Dahlman, Business Development Engineer, Jaspersoft
Registration: Click here to register now!

R is a popular and powerful system for creating custom data analysis, statistical models, and data visualizations. But how can you make the results of these R-based computations easily accessible to others? A PhD statistician could use R directly to run the forecasting model on the latest sales data, and email a report on request, but then the process is just going to have to be repeated again next month, even if the model hasn’t changed. Wouldn’t it be better to empower the Sales manager to run the model on demand from within the BI application she already uses—daily, even!—and free up the statistician to build newer, better models for others?

In this webinar, David Smith (VP of Marketing, Revolution Analytics) will introduce the new “RevoDeployR” Web Services framework for Revolution R Enterprise, which is designed to make it easy to integrate dynamic R-based computations into applications for business users. RevoDeployR empowers data analysts working in R to publish R scripts to a server-based installation of Revolution R Enterprise. Application developers can then use the RevoDeployR Web Services API to securely and scalably integrate the results of these scripts into any application, without needing to learn the R language. With RevoDeployR, authorized users of hosted or cloud-based interactive Web applications, desktop applications such as Microsoft Excel, and BI applications like Jaspersoft can all benefit from on-demand analytics and visualizations developed by expert R users.

To demonstrate the power of deploying R-based computations to business users, Andrew Lampitt will introduce Jaspersoft commercial open source business intelligence, the world’s most widely used BI software. In a live demonstration, Matt Dahlman will show how to supercharge the BI process by combining Jaspersoft and Revolution R Enterprise, giving business users on-demand access to advanced forecasts and visualizations developed by expert analysts.


Speaker Biographies:

David Smith is the Vice President of Marketing at Revolution Analytics, the leading commercial provider of software and support for the open source “R” statistical computing language. David is the co-author (with Bill Venables) of the official R manual An Introduction to R. He is also the editor of Revolutions (http://blog.revolutionanalytics.com), the leading blog focused on “R” language, and one of the originating developers of ESS: Emacs Speaks Statistics. You can follow David on Twitter as @revodavid.

Andrew Lampitt is Senior Director of Technology Alliances at Jaspersoft. Andrew is responsible for strategic initiatives and partnerships including cloud business intelligence, advanced analytics, and analytic databases. Prior to Jaspersoft, Andrew held other business positions with Sunopsis (Oracle), Business Objects (SAP), and Sybase (SAP). Andrew earned a BS in engineering from the University of Illinois at Urbana Champaign.

Matthew Dahlman is Jaspersoft’s Business Development Engineer, responsible for technical aspects of technology alliances and regional business development. Matt has held a wide range of technical positions including quality assurance, pre-sales, and technical evangelism with enterprise software companies including Sybase, Netonomy (Comverse), and Sunopsis (Oracle). Matt earned a BA in mathematics from Carleton College in Northfield, Minnesota.

The second widely used BI stack in open source is Pentaho.

You can download it here to evaluate it or click on screenshot to read more at



Business Analytics Analyst Relations /Ethics/White Papers

Curt Monash, whom I respect and have tried to interview (unsuccessfully), points out suitable ethical dilemmas and gray areas in Analyst Relations in Business Intelligence here at http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/

If you don’t know what Analyst Relations are, well, it’s like credit rating agencies for BI software. Read Curt and his landscaping of the field here (I am quoting a summary) at http://www.strategicmessaging.com/the-ethics-of-white-papers/2010/08/01/

Vendors typically pay for white papers for one or more of these reasons:

  1. They want to connect with sales prospects.
  2. They want general endorsement from the analyst.
  3. They specifically want endorsement from the analyst for their marketing claims.
  4. They want the analyst to do a better job of explaining something than they think they could do themselves.
  5. They want to give the analyst some money to enhance the relationship.

Merv Adrian (I interviewed Merv here at http://www.dudeofdata.com/?p=2505) has responded well here at http://www.enterpriseirregulars.com/23040/white-paper-sponsorship-and-labeling/

None of the sites I checked clearly identify the work as having been sponsored in any way I found obvious in my (admittedly) quick scan. So this is an issue, but it’s not confined to Oracle.

My 2 cents (not being so well paid 😉) are-

I think Curt was calling out Oracle (which didn’t respond) and not Merv (whose subsequent blog post does much to clarify).

As a comparatively new/younger blogger in this field, I applaud both Curt for trying to bell the cat (or point out what everyone in AR winks at) and Merv for standing by him.

In the long run, it would strengthen analyst relations as a channel if they separated financial payment for content from bias. An example is the credit rating agencies, who forgot to do so in BFSI, and see what happened.

Customers invest millions of dollars in BI systems trusting marketing collateral/white papers/webinars/tests etc. Perhaps it’s time for an industry association for analysts so that individual analysts don’t knuckle down under vendor pressure.

It is easier for someone of Curt’s or Merv’s stature to declare editing policy and disclosures before they write a white paper. It is much harder for everyone else who is not so well established.

White papers can take as much as $25,000 to produce- and I know people in Business Analytics (as opposed to Business Intelligence) who slog for cents per hour cranking out books on R and SAS, webinars and trainings, but there are almost no white papers in BA. Are there any independent analytics analysts who are not biased by R or SAS or SPSS etc.? I am not sure, but this looks like a good line to pursue 😉 – provided ethical checks and balances are established.

Personally, I know of many so-called analytics communities that go all out to please their sponsors, so bias in writing does exist (you can’t praise SAS on an R Blogging Forum or R Users Meet, and you can’t write on WPS at SAS Community.org )

– at the same time someone once told me- It is tough to make a living as a writer, and that choice between easy money and credible writing needs to be respected.

Most sponsored white papers I read are pure advertisements, directed at CEOs rather than the techie community at large.

Almost every BI vendor claims to have the fastest database with 5X speed- and benchmarking in technical terms could be something they could do too.

Unlike gadget sites, which benchmark products freely, you cannot benchmark BI or even BA products, since many licensing terms are written to forbid it.

Probably that is the reason billions are spent in BI and the positive claims are doubtful (except to the sellers). Similarly in Analytics, many vendors would have difficulty justifying their claims or prices if they were subjected to a side-by-side comparison. Unfortunately, the resulting confusion lets shoddy technology come out stronger due to more aggressive marketing.

How they stack up: IDC on Business Analytics

So here is Intelligent Enterprise on the latest IDC rankings of Business Intelligence and Business Analytics vendors. If you ever wondered how big the big boys were- read it at



In 2008, Oracle led the overall market, followed in order by SAP, IBM, SAS and Microsoft, the report said. Rounding out the top 10 were Teradata, Fair Isaac, Informatica, Infor and MicroStrategy, respectively.


IDC divides the business analytics software market into four primary segments: analytic applications, business intelligence tools, data warehousing platform software and spatial information analytics tools.


Fourth-place SAS’ broad portfolio spans all business analytics market segments and is exclusively dedicated to this market. “The company leads in the advanced analytics tools segment and is within the top two vendors in two other market segments,” IDC said.

It’s a brilliant analysis and survey. IDC and Intelligent Enterprise- thanks a tonne for letting us know.