New Deal in Statistical Training

The United States Government is planning a new initiative aimed at providing employable skills to people, to cope with unemployment.
One skill perpetually in short supply is analytics, along with statistics.

It is time that corporates like IBM SPSS, SAS Institute and Revolution Analytics, as well as offshore companies in India or Asia, ramp up their on-demand trainings, certifications and academic partnership bundles. Indeed, offshoring companies can earn revenue as well as goodwill if they help with trainers available via video-conferencing. The New Deal initiative would require creative thinking as well as direct top management support to focus their best internal brains on developing this new revenue stream. Again, the company that trains the most users (be it Revolution for R, IBM for SPSS-Cognos, SAS Institute for Base SAS-JMP, WPS for SAS language) is going to get a bigger chunk of new users and analysts.

Analytics skills are hot. There is big new demand for hot new skills by millions of unemployed Americans and Asians. How do you think this services market will play out?

If the US government could pump 800 billion into bailouts, how much, in your opinion, should it spend on training programs to help citizens compete globally?

From http://www.nytimes.com/2010/10/03/business/economy/03skills.html?hpw

The national program is a response to frustrations from both workers and employers who complain that public retraining programs frequently do not provide students with employable skills. This new initiative is intended to help better align community college curriculums with the demands of local companies.

SAS recognizes the market –

see http://www.sas.com/news/preleases/aba-tech-engage.html

“In tough economic times, it is more important than ever that companies be able to make better decisions using analytics. SAS is involved in two programs this summer that offer MBAs and unemployed technology workers the opportunity to learn and enhance analytics skills, and increase their marketability.

SAS is a partner in TechEngage, a week-long program of training classes that offer unemployed technology professionals new skills at a low cost to help them compete effectively in the marketplace.”

So does IBM-

http://www-03.ibm.com/press/us/en/pressrelease/28994.wss

“Fordham has a long history of collaboration with IBM that has brought innovative new skills to our curriculum to prepare students for future jobs. With this effort, Fordham is preparing students with marketable skills for a coming wave of jobs in healthcare, sustainability, and social services where analytics can be applied to everyday challenges.”

and R

Well TIBCO and Revolution ….hmmm…mmmm

I am not sure there is even an R analytics certification program, at the least.

Economic: Indian Caste System -Simplification

I am often asked by Western and non-Indian people about the caste system. It trips me up a lot trying to explain its complexity, necessity and current scenario given the history.

Here is an effort- The Indian/Hindu caste system was primarily an economic system to divide labor. In the original Manusmriti, named after King Manu, it was flexible.

The son of a blue-collar worker could become a warrior if he was brave, etc.

A couple of centuries later – the top castes primarily the priests decided to make it rigid. No more social intermingling or marriage between castes, and no more migration of occupation regardless of merit.

This led to a lot of lower caste people leaving Hinduism to join religions like Islam (post 1000 AD, Muslim invasions and Mughal rule) and Christianity (post the arrival of the English).


Post 1947, many of the “lower castes” preferred to remain within Hinduism but adopted Buddhism as their primary worship mechanism. Also, India’s leaders in the 1940s, many of whom were educated in the UK as lawyers (including Mahatma Gandhi, Subhash Chandra Bose, Jawahar Lal Nehru), decided this system had weakened the nation state and divided the energies of India, besides being obviously inhumane and degrading.

The Constitution of India was shepherded in 1950 by an assembly led by Dr. B R Ambedkar, one of the very first educated lower castes (also called Harijan, after Mahatma Gandhi’s name for them, literally Hari-Jan, people of the Lord). That Constitution endures as India remains the finest example of a democracy in the non-Western world.

The Indian constitution established 15% reservation in Government jobs and educational institutes at a college and masters level only for the lowest and most educationally backward castes (hence called scheduled castes), and 7.5% reservation in Government jobs only for tribal people (hence called scheduled tribes). The provision is renewed every 10 years. Think of it as a constitutionally bound affirmative action.

In 1990, another 27% of jobs and educational seats were reserved for castes that were socially okay but educationally backward. This caused some riots, delays and political actions, but was finally implemented by 2007.

Opponents of the new affirmative action say that this is like doing two wrongs to make a right. Supporters say data proves that reservation has led to social advancement (especially in the State of Tamil Nadu). Rollback of the new system is a political impossibility thanks to unity among hitherto repressed classes.

As an upper caste Hindu (embarrassingly enough my caste is both a warrior and a kingly royal caste, which gives me zero benefit in 2010 AD)……..

In God we Trust..All others must bring Data.

Unfortunately, when it comes to politics the same data is either hidden, partially hidden, or interpreted in different ways especially with regards to projecting sampling error or decisions.

Phew…!! That was an analytical layman definition of the Indian Caste System over 2000 years.

Note- The Indian soldier caste is Kshatriyas not Kshatritas..

AsterData gets $30 mill in funding

From the press release, the maker of MapReduce-based BI software gets $30 million in Series C funding. Given the recent valuation of Netezza by IBM, AsterData seems set to cross a billion-dollar valuation within the next 18-24 months, IMO.

Aster Data Closes $30 Million Series C Financing

Explosive Growth and Market Leadership Attracts New and Existing Investors

San Carlos, CA – September 22, 2010 – Aster Data, a market leader in big data management and advanced analytics, today announced that it has closed a $30 million Series C round of financing led by both new and existing investors. The company will use the new funding to accelerate growth, scale operations, and expand its global market share in the $20 billion database market – a market that is experiencing rapid growth as a result of both the explosion in data volumes across organizations and the urgent need to deliver a new class of analytics and data-driven applications. The Series C round of funding includes previous investors Sequoia Capital, JAFCO Ventures, Institutional Venture Partners, Cambrian Ventures, as well as an additional new strategic investor.  Also investing in this round is early investor David Cheriton, who previously backed high-growth companies including Google and VMware, and co-founded several successful technology companies.

Today’s Series C funding announcement underscores a year of strong innovation, execution, and overall momentum for the analytic database company. Key milestones include:

Strong sales growth: Since 2008, Aster Data has doubled revenue year-over-year and secured key customers that leverage Aster Data’s platform to address the big data management problem including MySpace, comScore, Barnes & Noble, and Akamai. Like so many organizations today, Aster Data’s customers are experiencing explosive data growth across their organizations and recognize the need for rich, advanced analytics that give them deeper insights from their data.

Key executive hires: Quentin Gallivan, former CEO of both PivotLink and Postini and EVP of worldwide sales at Verisign, recently joined the company as Chief Executive Officer. In addition, earlier this year, John Calonico, previously at Interwoven, BEA, and Autodesk, joined as Chief Financial Officer; and Nitin Donde, formerly an executive at EMC and 3PAR, joined as Executive Vice President Engineering.  The strength and experience of Aster Data’s management team helps further establish a strong operational foundation for growth in 2010 and beyond.

Industry recognition: Aster Data was positioned in the “Visionaries” Quadrant of Gartner, Inc.’s Data Warehouse Database Management Systems Magic Quadrant, published 2010 *; was recently named 2011 Tech Pioneer by the World Economic Forum; was named “Company to Watch” in the Information Management category of TechWeb’s Intelligent Enterprise 2010 Editors’ Choice Awards; and was awarded the 2010 San Francisco Business Times Technology and Innovation Award in the Best Product and Services Category.

Product Innovation: Aster Data continues to deliver ground-breaking capabilities to address the big data management and advanced analytics market need. Its recent announcement of Aster Data nCluster 4.6 includes a column data store, making it the first hybrid row and column MPP DBMS with a unified SQL and MapReduce analytic framework for advanced analytics on large data sets. This year, Aster Data also delivered the most extensive library of pre-packaged MapReduce analytics totaling over 1000 functions, to ease and accelerate delivery of highly advanced analytic applications.

Aster Data’s analytic database, also called a ‘Data-Analytics Server’ is specifically designed to enable organizations to cost effectively store and analyze massive volumes of data. Aster Data leverages the power of commodity, general-purpose hardware, to reduce the cost to scale to support large data volumes and uniquely allows analysis of all data ‘in-database’ enabling richer and faster processing of large data sets. Aster Data’s in-database analytics engine uses the power of MapReduce, a parallel processing framework created by Google.

“The funding we received in our Series C round is a strong endorsement of Aster Data’s market leadership position and the high growth potential of the big data market,” said Quentin Gallivan, Chief Executive Officer, Aster Data. “The Aster Data team has executed exceptionally well to-date and I am excited to have the resources to accelerate the growth of the company as we expand our operations and execute aggressively across all fronts.”

The Comic Water Games (aka Common Wealth Games)

We in Delhi, India are a tough people. With summer temperatures of up to 46 degrees Celsius (114 degrees Fahrenheit), winter temperatures of 2-3 degrees Celsius (just above freezing), high pollution levels and the worst traffic jams (and highest per capita cars), there is very little that intimidates the average Delhiite.

But the Return of the British Empire is scaring us, and it is called the Common Wealth Games. The Common Wealth is a group of countries that used to be colonized by Britain in her colonial days (the USA is not a member though, as they probably kicked way too much British butt while gaining independence).

And every 4 years they have CommonWealth games (read games for the non US English speaking world). So when our commie neighborhood– the Chinese went and got themselves an Olympics- we decided to get ourselves this CWG games too. Big deal- national pride- rising economic power and all that.

So far the Games have meant the following: lots of roads dug up, lots of stadiums in various degrees of preparation, a total cost of 2 billion USD, and rampant allegations of corruption due to the ten times increase in budget, including rather suspicious looking documents procured by our local press (yes, the Indian press is free, as it is a democracy).

And add divine grace. Delhi has had its wettest monsoon since 1978, it rains cats and dogs in September, and we now have a mini dengue and malaria epidemic. 4 countries have declared the living quarters for athletes uninhabitable, some have walked out, and the inevitable terrorists injured two Taiwanese tourists this weekend (in a semi-ironic email they said they were prepared as the government was prepared; it isn’t).

Today a bridge collapsed-

http://www.nytimes.com/2010/09/22/sports/22iht-GAMES.html?_r=1&hp

On Tuesday afternoon, a bridge next to Jawaharlal Nehru Stadium, the main Games venue, fell apart. The footbridge collapsed into three pieces, taking several workers with it and uprooting one side of the arch that supported it.

A police officer at the scene said that 27 people had been injured, four of them seriously, in the collapse.

“This will not affect the Games,” said Raj Kumar Chauhan, a Delhi minister for development, who spoke on the scene. “We can put the bridge up again, or make a new one.”

and

http://www.nytimes.com/2010/09/20/world/asia/20india.html?ref=sports

“We really need to learn how to plan,” said Vrinda Walavalkar, a public relations executive who is not connected to the Games.

“Maybe we feel we have so many lifetimes to achieve things” that it does not matter if it gets done this time, she said.

Mr. Gupta, the shopkeeper, found a metaphor in Hindu wedding tradition.

The groom’s party, known as the barat, traditionally marches to the bride’s house on horseback with his friends and family, he explained. When the barat appears, the bride has to come to the door, he said.

“If the bride is not ready, you patch her up and try to hide all her defects,” Mr. Gupta said, and then you send her outside.

————————————————————————————————————–

To some this may be shocking. To the average Delhiite battling traffic and rain, this is one more episode in the chaotic capital. As a small solace, Delhi still has the best and cheapest street food in this part of the world, with golgappas, tikki and chaat. If only you can beat the rain to get them!

Also see http://en.wikipedia.org/wiki/Delhi if you would like to know more.

Making NeW R

Tal G in his excellent blog piece talks of “Why R Developers  should not be paid” http://www.r-statistics.com/2010/09/open-source-and-money-why-r-developers-shouldnt-be-paid/

His argument of love is not very original, though: it was first made by these four guys.

I am going to argue that “some” R developers should be paid, while the main focus should be on volunteers’ code. These R developers should be paid as per usage of their packages.

Let me expand.

Imagine the following conversation between Ross Ihaka, Norman Nie and Peter Dalgaard.

Norman- Hey Guys, Can you give me some code- I got this new startup.

Ross Ihaka and Peter Dalgaard- Sure dude. Here is 100,000 lines of code, 2000 packages and 2 decades of effort.

Norman- Thanks guys.

Ross Ihaka- Hey, What you gonna do with this code.

Norman- I will better it. Sell it. Finally beat Jim Goodnight and his **** Proc GLM and **** Proc Reg.

Ross- Okay, but what will you give us? Will you give us some code back of what you improve?

Norman – Uh, let me explain this open core …

Peter D- Well how about some royalty?

Norman- Sure, we will throw parties at all conferences, snacks you know at user groups.

Ross – Hmm. That does not sound fair. (walks away in a huff muttering)-He takes our code, sells it and wont share the code

Peter D- Doesnt sound fair. I am back to reading Hamlet, the great Dane, and writing the next edition of my book. I am glad I wrote a book- Ross didnt even write that.

Norman-Uh Oh. (picks his phone)- Hey David Smith, We need to write some blog articles pronto – these open source guys ,man…

———–I think that sums up what has been going on in the dynamics of R recently. If Ross Ihaka and R Gentleman had adopted an open core strategy, meaning you can create packages for R but not share the original, where would we all be?

At this point if he is reading this, David Smith , long suffering veteran of open source  flameouts is rolling his eyes while Tal G is wondering if he will publish this on R Bloggers and if so when or something.

Let’s bring in another R veteran: Hadley Wickham, who wrote a book on R and also created ggplot. That’s the best quality, most often used graphics package.

In terms of economic utility to the end user, the ggplot package may be as useful, if not more, as the foreach package developed by Revolution Computing/Analytics.

Now http://cran.r-project.org/web/packages/foreach/index.html says that foreach is licensed under http://www.apache.org/licenses/LICENSE-2.0

However, let’s come to open core licensing (read it here http://alampitt.typepad.com/lampitt_or_leave_it/2008/08/open-core-licen.html ), which is where the debate is. Revolution takes code and enhances it (in my opinion) substantially with new formats like XDF for better efficiency, a web services API, and, coming next year, a GUI (thanks in advance, Dr Nie and guys)

and sells this advanced R code to businesses happy to pay (they are currently paying much more to Dr Goodnight and HIS guys).

Why would any sane customer buy it from Revolution- if he could download exactly the same thing from http://r-project.org

Hence the business need for Revolution Analytics to have an enhanced R, as they are using a product-based software model, not a software-as-a-service model.

If Revolution gives away the source code of these new enhancements to the R core team, how will the R core team protect the above-mentioned intellectual property, given they have 2 decades of experience of giving away free code, and going back and forth on just code?

Now Revolution also has a marketing budget, and that’s how they sponsor some R Core events, conferences and after-conference snacks.

How would people decide if they are being too generous or too stingy in their contribution (compared to the formidable generosity of SAS Institute to its employees, stakeholders and even third party analysts)?

Would it not be better if Revolution shifted that aspect of the relationship from its marketing budget to its research and development budget, and came up with some sort of incentive for “SOME” developers? Even researchers need grants, assistantships and scholarships. It could publish a transparent royalty formula: say, 17.5% of NEW R sales goes to an R package developers’ pool, which in turn examines the usage rate of packages and need/merit before allocation. That would require Revolution to evolve from a startup into a more sophisticated corporate, and R Core could use the pool the same way as the John M. Chambers software award/scholarship.
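To make the royalty idea concrete, here is a rough sketch of how such a pool could be split. It is only an illustration: the 17.5% rate is the figure floated above, while the function name, the sales figure and the usage counts are all made up, and the split is by usage alone (the need/merit review mentioned above is left out).

```python
from typing import Dict

# 17.5% is the share suggested in the post; everything else here is hypothetical.
ROYALTY_RATE = 0.175


def allocate_royalty_pool(
    sales: float, usage: Dict[str, int], rate: float = ROYALTY_RATE
) -> Dict[str, float]:
    """Split a royalty pool among packages in proportion to their usage.

    sales: revenue from the enhanced R product (made-up input).
    usage: package name -> usage count, e.g. downloads (made-up metric).
    """
    if sales < 0:
        raise ValueError("sales must be non-negative")
    pool = sales * rate
    total_usage = sum(usage.values())
    if total_usage == 0:
        # Nothing was used, so nothing is paid out.
        return {name: 0.0 for name in usage}
    return {name: pool * count / total_usage for name, count in usage.items()}


if __name__ == "__main__":
    # Illustrative numbers only, not real sales or download data.
    shares = allocate_royalty_pool(
        1_000_000, {"ggplot2": 600, "Hmisc": 300, "foreach": 100}
    )
    for package, amount in shares.items():
        print(f"{package}: {amount:,.2f}")
```

A developer who declines payment could simply be dropped from the usage table before the split, so the pool is shared among those who opt in.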

Don’t pay all developers; it would be an insult to many of them, say Prof Harrell, creator of Hmisc, to accept. But can Revolution expand its dev base (and prospects for future employees) by even sponsoring some R scholarships?

And I am sure that if Revolution opens up some more code to the community, they would find the rest of the world and its help useful. If it can’t trust people like R Gentleman with some source code… well, he is a board member.

——————————————————————————————–

Now to sum up some technical discussions on NeW R

1)  An accepted way of benchmarking efficiencies.

2) Code review and incorporation of efficiencies.

3) Multi threading- Multi core usage are trends to be incorporated.

4) GUIs like R Commander (with its plugins for other packages), Rattle for data mining, or Deducer need focused attention. This may involve hiring User Interface Designers (like from Apple 😉) who will work for love AND money (even the Beatles charge royalty for that song).

5) More support for cloud computing initiatives like Biocep and Elastic R, or Amazon AMIs for using cloud computers. Note that efficiency arguments don’t matter if you just use a Chrome browser and pay 2 cents an hour for an Amazon instance. Probably R core needs more direct involvement from Google (Cloud OS makers) and Amazon, as well as even Salesforce.com (for creating Force.com apps). Note that even more corporates need to be involved here, as cloud computing does not have any free and open source infrastructure (YET).

_______________________________________________________

Debates will come and go. This is an interesting intellectual debate and someday the little guys will win the Revolution-

From Hugh M of Gaping Void-

http://www.gapingvoid.com/Moveable_Type/archives/cat_microsoft_blue_monster_series.html

HOW DOES A SOFTWARE COMPANY MAKE MONEY, IF ALL

SOFTWARE IS FREE?

“If something goes wrong with Microsoft, I can phone Microsoft up and have it fixed. With Open Source, I have to rely on the community.”

And the community, as much as we may love it, is unpredictable. It might care about your problem and want to fix it, then again, it may not. Anyone who has ever witnessed something online go “viral”, good or bad, will know what I’m talking about.

and especially-

http://gapingvoid.com/2007/04/16/how-well-does-open-source-currently-meet-the-needs-of-shareholders-and-ceos/

Source-http://gapingvoidgallery.com/

Kind of sums up what open core licensing is all about.

Movie Review- Peepli Live

A brilliant satire on modern day India and the impact of its progress on agricultural India. The movie lampoons the multiple media channels that have mushroomed, the various issues regarding India’s social welfare ‘schemes’ and, of course, the fact that 100,000 farmers have committed suicide and 8 million farmers have left farming since the economic reforms created progress. It does this without being heavy, while sometimes being cheeky about Indian politicians and bureaucrats in general.

Watch it- it’s a better quality Bollywood movie.
