SAS Thought Leader declares war on data scientists on Valentine Eve


It all started because of the Google Guy, Hal Varian

Feb 25, 2009 – “I keep saying the sexy job in the next ten years will be statisticians.” Hal Varian, The McKinsey Quarterly, January 2009.

Then these guys (Thomas H. Davenport and D.J. Patil) made us sexy, and that too in the Harvard Business Review.

Jill Dyche* is a thought leader. That’s what her job title says, and that too at SAS, which took over her start-up, Baseline Consulting. (* In addition to this, she writes forewords for struggling poets here.)

She says here

If the importance of data scientists is growing with the advent of big data, the sooner we understand what exactly it is they do, the better.

That is fair enough. But to add grievous injury to data scientists, she adds

(For fun I wrote a blog post on being a data scientist’s girlfriend.)

Actually the blog post was “Why I Wouldn’t Have Sex with a Data Scientist”:

But there’s no use. The data scientist is preoccupied. Preoccupied with finding, accessing, analyzing, validating, cleansing, integrating, provisioning, modeling, verifying, and explaining data to his management, colleagues, end-users, and friends.

And this is the year of the statistician??

These are bare-knuckle tactics. The art of Vaseline Insulting? Perish the thought. Geeks and Data Scientists rule.

Don’t we? And we are perfect, right?

We statisticians (and data scientists and big dataists and data miners and business analysts and …)

are bringing sexy back!

(and we need a hug too.)

SAS gets awesome revenues

There are 2.87 billion reasons SAS is not going anywhere in the Big Data Analytics space. Yes, that’s the revenue figure they declared.

Of course, I have always wondered how much they earn from SAS Federal LLC (a subsidiary that caters to the lucrative and not very competitive analytics market in Intelligence), and what their revenue breakdown by product looks like (how much they earned from Base SAS licenses versus how much from Cyber Security).

I wonder how many other analytics companies have even realized that they can help cut federal government costs (or even have something close to this).

This year’s revenue breakdown was:

The Americas generated 47 percent of SAS’ total revenue; Europe, Middle East and Africa (EMEA) 41 percent; and Asia Pacific 12 percent.

But last year:

The Americas accounted for 46 percent of total revenue; Europe, Middle East and Africa (EMEA) 42 percent; and Asia Pacific 12 percent

So Americas revenue grew faster than EMEA revenue! Okay.
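To put dollar figures on those shares, here is a quick back-of-the-envelope sketch in Python. It assumes the 2.87 billion (taken as US dollars) is the total that this year's split applies to. Last year's total is not quoted above, so only the change in share (in percentage points) is computed, not actual growth rates. All variable names are mine.

```python
# Back-of-the-envelope: turn the regional shares quoted above into dollars.
# Assumption: the 2.87 billion USD total applies to this year's split.
TOTAL_REVENUE_BN = 2.87

this_year = {"Americas": 47, "EMEA": 41, "Asia Pacific": 12}  # percent of total
last_year = {"Americas": 46, "EMEA": 42, "Asia Pacific": 12}  # percent of total

# Revenue per region this year, in billions of USD.
revenue_bn = {
    region: TOTAL_REVENUE_BN * pct / 100 for region, pct in this_year.items()
}

# Change in share of total revenue, in percentage points.
share_change = {
    region: this_year[region] - last_year[region] for region in this_year
}

for region in this_year:
    print(f"{region}: ${revenue_bn[region]:.2f}bn ({share_change[region]:+d} pts)")
```

A rising share only says that a region grew faster than the company as a whole. Without last year's total, the sketch cannot say by how many dollars.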


Revolution Analytics and Pricing Analytics

Cost of 1 day of Revolution Analytics Training at


1. Intro to R

Price:  Commercial: SGD$500.00

1 Singapore dollar = 0.8197 US dollars

10% Early Bird Discount Deadline: November 13, 2012 @ 12:00PM Pacific Time
Discount code: earlybird

2. Minimalistic Sufficient R (aptly titled… you would think the pricing would be minimalistic too… but)



$100 Early Bird Discount Deadline: November 16, 2012 @ 12:00PM Pacific Time
Discount code: earlybird


Advanced R (Italian)

Price:  Commercial: €680.00
Academic: €480.00

1 euro = 1.2975 US dollars


Big Data Analytics with RevoScaleR

Price:  $500 with 2 month Revolution R Enterprise workstation evaluation.

$700 with 1 year subscription of Revolution R Enterprise workstation ($1500 value)

10% Early Bird Discount Deadline: October 30, 2012 @ 12:00PM Pacific Time
Discount code: early


Revolution R Time Series Training

Price:  Commercial: S$1,200.00

10% Early Bird Discount Deadline: October 30, 2012 @ 12:00PM Pacific Time
Discount code: earlybird

So training costs vary; different strokes for different folks, I guess.
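To compare those prices on one scale, here is a minimal Python sketch that converts each listed price to US dollars using the two exchange rates quoted above. It assumes “S$” means Singapore dollars and that a bare “$” means US dollars. It ignores the early bird discounts and skips the course whose price is not shown. The course labels and the helper name `to_usd` are mine.

```python
# Convert the listed course prices to US dollars using the rates quoted in the post.
# Assumptions: "S$" is Singapore dollars; a bare "$" is US dollars.
RATES_TO_USD = {"SGD": 0.8197, "EUR": 1.2975, "USD": 1.0}

courses = [
    ("Intro to R (commercial)", 500.00, "SGD"),
    ("Advanced R, Italian (commercial)", 680.00, "EUR"),
    ("Advanced R, Italian (academic)", 480.00, "EUR"),
    ("Big Data Analytics with RevoScaleR (2 month evaluation)", 500.00, "USD"),
    ("Revolution R Time Series Training (commercial)", 1200.00, "SGD"),
]


def to_usd(amount, currency):
    """Convert an amount to US dollars, rounded to cents."""
    return round(amount * RATES_TO_USD[currency], 2)


prices_usd = {name: to_usd(amount, currency) for name, amount, currency in courses}

for name, usd in prices_usd.items():
    print(f"{name}: ${usd:,.2f}")
```

The rates are the ones quoted at the time of writing, so the converted figures are only as good as those snapshots.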

BUT me hearties.

Cost of 1 year of Revolution Enterprise = $1000

That’s a flat rate, so Linux and Windows cost the same, and so do 32-bit and 64-bit.


(My comment: Revo should either give away the license for free to enterprises, or rationalize training costs (seriously, how can 2 days of training cost as much as a 1-year license, when the software is definitely quite good?), or create a paid Amazon EC2 AMI so enterprises can rent the Revolution Analytics software (like SAP HANA), or even offer it on Windows Azure if they insist on hugging Microsoft. I am clearly seeing various flavors of Linux beating Windows Server to a pulp in the Big Data market, though I am probably more optimistic about Windows 8 on Surface (because of the hardware, not the software) and about Azure as an alternative to Amazon, given Google’s delayed offering. I don’t even know of many instances of Windows-related HPC or HPA. /end_of_rant)

Annual Subscription (includes software license and technical support)

Revolution R Enterprise Single-User Workstation (64-bit Windows): $1,000.00
Revolution R Enterprise Single-User Workstation (32-bit Windows): $1,000.00
Revolution R Enterprise Single-User Workstation (64-bit Red Hat 6 Enterprise Linux): $1,000.00
Revolution R Enterprise Single-User Workstation (64-bit Red Hat 5 Enterprise Linux): $1,000.00


R and Hadoop #rstats

Lovely ppt from the formidable Jeffrey Bean, whose lucid style in explaining R has made me a big fan of his awesome work!

Take a look at his extensive collection of Big Data with R slides at – they are both very comprehensive and a delightful addition for anyone wishing to go the cloud, Hadoop, R route.
His blog at talks about lots of very relevant topics.

RCOMM 2012 goes live in August

An awesome conference by an awesome software. RapidMiner remains one of the leading enterprise-grade open source software packages; it can help you do a lot of things, including flow-driven data modeling, web mining, web crawling, etc., which even other software can’t.

Presentations include:

  • Mining Machine 2 Machine Data (Katharina Morik, TU Dortmund University)
  • Handling Big Data (Andras Benczur, MTA SZTAKI)
  • Introduction of RapidAnalytics at Telenor (Telenor and United Consult)
  • and more

Here is the complete program:




Training / Workshop 1
Conference 1
Conference 2
Training / Workshop 2
09:00 – 10:30
Introductory Speech
Ingo Mierswa (Rapid-I)

Resource-aware Data Mining or M2M Mining (Invited Talk)
Katharina Morik (TU Dortmund University)


Data Analysis


NeurophRM: Integration of the Neuroph framework into RapidMiner
Miloš Jovanović, Jelena Stojanović, Milan Vukićević, Vera Stojanović, Boris Delibašić (University of Belgrade)

To be announced (Invited Talk)
Andras Benczur 

Recommender Systems


Extending RapidMiner with Recommender Systems Algorithms
Matej Mihelčić, Nino Antulov-Fantulin, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute)

Implementation of User Based Collaborative Filtering in RapidMiner
Sérgio Morais, Carlos Soares (Universidade do Porto)

Parallel Training / Workshop Session

Advanced Data Mining and Data Transformations


Development Workshop Part 2

10:30 – 11:00
Coffee Break
11:00 – 12:30
Data Analysis

Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner
Mennatallah Amer, Markus Goldstein (DFKI)

Customers’ LifeStyle Targeting on Big Data using Rapid Miner
Maksim Drobyshev (LifeStyle Marketing Ltd)

Robust GPGPU Plugin Development for RapidMiner
Andor Kovács, Zoltán Prekopcsák (Budapest University of Technology and Economics)



Optimization Plugin For RapidMiner
Venkatesh Umaashankar, Sangkyun Lee (TU Dortmund University; presented by Hendrik Blom)


Image Mining Extension – Year After
Radim Burget, Václav Uher, Jan Mašek (Brno University of Technology)

Incorporating R Plots into RapidMiner Reports
Peter Jeszenszky (University of Debrecen)

12:30 – 13:30
13:30 – 15:30
Parallel Training / Workshop Session

Basic Data Mining and Data Transformations


Development Workshop Part 1



Introduction of RapidAnalytics Enterprise Edition at Telenor Hungary
t.b.a. (Telenor Hungary and United Consult)


Application of RapidMiner in Steel Industry Research and Development
Bengt-Henning Maas, Hakan Koc, Martin Bretschneider (Salzgitter Mannesmann Forschung)

A Comparison of Data-driven Models for Forecast River Flow
Milan Cisty, Juraj Bezak (Slovak University of Technology)

Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner
Gábor Nagy, Tamás Henk, Gergő Barta (Budapest University of Technology and Economics)



An Octave Extension for RapidMiner
Sylvain Marié (Schneider Electric)


Unstructured Data


Processing Data Streams with the RapidMiner Streams-Plugin
Christian Bockermann, Hendrik Blom (TU Dortmund)

Automated Creation of Corpuses for the Needs of Sentiment Analysis
Peter Koncz, Jan Paralic (Technical University of Kosice)


Demonstration: News from the Rapid-I Labs
Simon Fischer; Rapid-I

This short session demonstrates the latest developments from the Rapid-I lab and will show you how you can build powerful analysis processes and routines by using those RapidMiner tools.

Certification Exam
15:30 – 16:00
Coffee Break
16:00 – 18:00
Book Presentation and Game Show

Data Mining for the Masses: A New Textbook on Data Mining for Everyone
Matthew North (Washington & Jefferson College)

Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems.


Game Show
Did you miss last year’s game show “Who wants to be a data miner?”? Use RapidMiner for problems it was never created for and beat the time and other contestants!

User Support

Get some Coffee for free – Writing Operators with RapidMiner Beans
Christian Bockermann, Hendrik Blom (TU Dortmund)

Meta-Modeling Execution Times of RapidMiner operators
Matija Piškorec, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute)

Conference day ends at ca. 17:00.

Social Event (Conference Dinner)
Social Event (Visit of Bar District)


and you should have a look at

The conference is in Budapest, Hungary, Europe.

(Disclaimer: RapidMiner is an advertising sponsor of this blog, in case you did not notice the two banner-sized ads.)


Making Big Data Analytics an API call away

I have compared some of Amazon’s database-in-the-cloud offerings with Google’s, especially the Google BigQuery API, in my latest article. With more than 2 years of development under its belt, the Google BigQuery API is a good service to test out if you want to reduce your dependence on database vendors.
Read it at
Google BigQuery API Makes Big Data Analytics Easy


I have been busy-

1) Finally my divorce came through. My advice: don’t do it without a pre-nup! Alimony means all the money.

2) Spending time on Quora after getting bored of LinkedIn, Twitter, Facebook, Google Plus, Tumblr, WordPress

See this answer to-

 What are common misconceptions about startups?

1) we will change the world
2) if we get 1% of a billion-people market, we will be rich
3) if we have got funding, most of the job is done
4) let’s pay ourselves high salaries since we got funded
5) our idea is awesome and can’t be copied, improvised, stolen, replicated
6) startups are painless
7) it is a better life than a corporate career
8) long term vision is more important than short term cash burn
9) we will never sell out or exit. never
10) it’s a great idea to make startups with friends

Say hello to me –

3) Writing freelance articles on APIs for Programmable Web

Why write pro? See point 1)

Recent Articles-

4) Writing poetry on It now gets 23000 views a month. I wish I could say my poems were great, but the readers are kind (364 subscribers!) and also Google Image Search is very very kind.

5) Kicking the tires on my next book, “R for Cloud Computing”, and stay tuned for another writing announcement

6) Waiting for Paul Kent, VP, SAS Big Data, to reply to my emails for an interview after HE promised me!! You don’t get to 105 interviews without being a bit stubborn!

7) Sighing over the politics engulfing my American friends, especially with regard to Chick-fil-A and Romney’s gaffes. Now that’s what I call a first world problem! Protesting by eating or boycotting chicken sandwiches! In India we had the world’s biggest blackout two days in a row, and no one is attending the Hunger Fast against corruption protests!

8) Watching the Olympics! Our glorious nation of 1.2 billion very smart people has managed to win 1 Bronze till today!! Michael Phelps has won more medals and more gold than the whole of India has since the Olympic Games began!!

9) Consulting to pay the bills. It includes writing R code and making presentations. Why consult when I have writing to do? See point 1)

10) Reading New York Times to get insights on Big Data and Analytics. Trust them- they know what they are doing!