The $1 Bailout

Psst! Wanna save some money for your organization? Lead a shareholders’ rally to accept a federal bailout of one dollar. That puts a cap of $500,000 on the CEO’s salary. Easily the most profitable $1 your company ever borrowed.

Where would all the CEOs go? Japan, China and India mostly pay less than this, unless you are the original owner of the company.

The top starting salary at my business school used to be $125,000 (that’s for locations outside the US or UK). The top guy used to be hated by the whole batch, as his name got splashed in the newspapers, prompting the rest of the parents to ask their children:

Hey, why does he get so much while you don’t? (Now he won’t.)

On a serious note: if so many companies are declaring losses that they will carry over on their books, they will actually get a tax benefit for the loss, as well as the bailout money at low rates of interest. Oh well, at least the CEO won’t get rich (if you think that half a million dollars is not rich).

There is something wrong in a world where a tea stall owner (like in the movie Slumdog Millionaire)... Wait! There is something strange in a world where a tea stall owner makes more money than Citigroup and General Motors. Combined. But makes a thousand times less than their bosses.

Do you make more money than Bear Stearns and Lehman Brothers? Write in and share your perspectives.

SPSS and R

I rarely use SPSS now, but in college (www.iiml.ac.in) my marketing professors made sure I was buried in it for weeks. Much later I did some ARIMA forecasting in SPSS to predict macroeconomic indicators (details coming up).

However, the SPSS help list (SPSSX-L@LISTSERV.UGA.EDU) is a great one, not just for staying in touch with SPSS but also with the latest statistical modeling techniques. Here is an extract from the list (www.listserv.uga.edu/archives/spssx-l.html) on using SPSS and R together:

 

Assuming version 16 or later, you need to install the R plug-in from Developer Central.  Then your R syntax can be run in the syntax window between

BEGIN PROGRAM R.

and

END PROGRAM R.

The output automatically appears in the SPSS Viewer with two cautions.  1) In version 16, R graphics are written to files and don’t appear in the Viewer.  Version 17 integrates the graphics directly.  2) When using R interactively, expression output appears in your console windows, e.g.,

summary(dta)

displays the summary statistics for a data frame, dta.  In non-interactive mode, which is what you are in when running BEGIN PROGRAM, you need to enclose the expression in a print function for it to display, e.g.,

print(summary(dta))

The documentation for the APIs to communicate between SPSS and R is installed along with the plug-in, and there are examples in the Data Management book linked on Developer Central (www.spss.com/devcentral).

You might also go through the PowerPoint article on Developer Central, "Programmability in SPSS Statistics 17", which you will find on the front page of the site.  It includes a detailed example of using the R Quantreg package in SPSS as an extension command.  There is also a download in the R section on creating an SPSS dialog box that generates an R program directly.  Look for Rboxplot – Creating an R Program from a Dialog.  This has a simple dialog box that generates code for an R boxplot along with an article that explains what is happening.
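To see the print() caution from the extract in action without SPSS, here is a minimal sketch in plain R. It assumes that running a script through source() with echo = FALSE is a fair stand-in for the non-interactive mode of BEGIN PROGRAM R. The helper run_batch and the values in dta are made up for illustration and are not part of the plug-in.

```r
# Small data frame standing in for the `dta` of the list post (values are made up).
dta <- data.frame(age = c(23, 35, 41, 29), income = c(40, 52, 61, 47))

# Hypothetical helper: run a snippet the way a batch run would, i.e. from a
# script file with no echo and no auto-printing, and collect whatever text
# actually reaches the console.
run_batch <- function(code, data) {
  env <- new.env()
  assign("dta", data, envir = env)      # make `dta` visible to the snippet
  script <- tempfile(fileext = ".R")
  writeLines(code, script)
  on.exit(unlink(script))               # clean up the temporary script
  capture.output(source(script, local = env, echo = FALSE))
}

bare    <- run_batch("summary(dta)", dta)          # expression only
wrapped <- run_batch("print(summary(dta))", dta)   # wrapped in print()

cat("Lines displayed without print():", length(bare), "\n")
cat("Lines displayed with print():   ", length(wrapped), "\n")
cat(wrapped, sep = "\n")
```

The bare call displays nothing, while the wrapped one shows the summary table, which matches what the list post describes for code run between BEGIN PROGRAM R and END PROGRAM R.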

 

Ajay’s 2 cents: SPSS treats R as an opportunity rather than a threat, partly because SPSS is much lower-priced software and has been trying in vain to displace SAS for some time now.

SAS (the company, not the language), as the market leader, has the most to lose due to

  • its high market share (which it has maintained both by aggressively pursuing legal action and by investing huge amounts of money in hosting conferences, papers and research, and in keeping alumni and current employees happy and loyal),

and

  • premium pricing (which comes under greater pressure amid a general economic downturn among its preferred customers, especially banks and companies like Amazon, GE Money etc.),

and

  • multi-pronged competition with tacit support from bigger players waiting on the sidelines:
  • IBM has an alliance with WPS, which is almost a de facto Base SAS clone, as it can take in SAS datasets and SAS code and output SAS code and SAS datasets, besides having its own Eclipse-based design for the Workbench
  • Microsoft is expanding data mining capabilities in SQL Server, along with initiatives like Microsoft Azure (an OS for cloud computers) and Microsoft Mesh
  • open source players like R, KNIME and Rapid Miner are getting commercial momentum due to better value for cost (0),

and

  • data and code portability between SAS, SPSS and R due to PMML standards, which means switching barriers are getting lower. There are almost no switching barriers between Base SAS and WPS in my testing experience.

The coming market share battles between SAS, WPS and R will be interesting to watch for analysts and customers, that is, if the current economic crisis doesn’t claim any of the companies or the clients first. Alliances as well as community networking among users and developers could be critical.

Still, innovation flows from creative destruction of old ideas, mindsets, attitudes and, yes, even software.

SAS-L: Get out the Vote

Voting is still on for SAS-L Rookie of the Year.

One of the earliest rookies nominated from India is, ahem, me, which is pretty rare.

Name                # of 2007 posts   # of 2008 posts
Ajay Ohri                         0               351
Joe Matise                        0               250
Karma Dorjetarap                  4               219
Akshaya Nathilvar                 0               169
Scott Bucher                      0               154
Stefan Pohl                       5               148
Richard Wright                    0                79
Choy Junyu                        0                73
Scott Raynaud                     0                69
SAS 9 BI USER                     0                55
Tom Smith                         0                53
Jim Agnew                         0                52
Josh Lee                          0                51

 

If you are on the SAS-L list, you can vote (one vote per person, please) at:
http://ires.ku.edu/~ipsr/SGF2009/saslbof.htm
Voting will end February 12th.

KNIME

Check out KNIME from www.knime.org if you are looking for modular data extensibility and the ability to do exploratory analysis. You can use it for data modeling using decision trees and extend it further. Thanks to Bob Schultz and the REvolution Computing guys, as well as Mike Zeller of www.zementis.com, for pointing me this way.

  • Yes, they (KNIME.org) have a commercial version as well as a free version.
  • No, they won’t charge you hidden costs, including training or learning time.
  • Yes, they do use PMML for porting data between platforms.
  • Best of all, it is a great website with video tutorials and segmented data downloads (German efficiency!!), while the www.r-project.org website is functional but uses HTML 4.0 (which is from the seventies, or the eighties, or whatever).
  • No, they won’t charge you for it!!!

From the website –

Welcome!

KNIME, pronounced [naim], is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.

KNIME was developed (and will continue to be expanded) by the Chair for Bioinformatics and Information Mining at the University of Konstanz, Germany. The group headed by Michael Berthold is also using KNIME for teaching and research at the University. Quite a number of new data analysis methods developed at the chair are integrated in KNIME. Let us know if you are looking for something in particular, not all of those modules are part of the standard KNIME release just yet…

KNIME base version already incorporates over 100 processing nodes for data I/O, preprocessing and cleansing, modelling, analysis and data mining as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all analysis modules of the well known Weka data mining environment and additional plugins allow R-scripts to be run, offering access to a vast library of statistical routines.

KNIME is based on the Eclipse platform and, through its modular API, easily extensible.

[Two KNIME screenshots from www.knime.org]

Coming up: a technical comparison of KNIME with Rapid Miner (http://rapid-i.com/content/view/26/82/), which is similar in both its free version and its commercial licensing. These are both data mining rather than predictive analytics products.

I wish they would host both KNIME and Rapid Miner on the cloud using the Ohri Framework (a joke which began on the SAS Consulting Group) on a 64-bit Windows OS, with remote-desktop-like functionality. Then I just log in, upload the data, press a button, wait for a sec and download the results.

Sigh!!

(All screenshots in this post are courtesy of www.knime.org.)

Trusting Google

If Google is to be believed, the error was a human error in their bad-site list, when someone entered “/” as a bad file. This led to all sites being flagged as malware.

When that happened, users did one of the following during a time window of about 40 minutes:

1) Went ahead and clicked on sites they knew were okay

2) Wrote to Google about the error

3) Clicked on some sites but didn’t click on less trustworthy sites

The data collected from that sample is now being studied by Google. Why are they studying it? Because in some way that click data, including time of search, time of clicks and frequency of repeated searches, can lead to a ranking system for flagging malware sites that use popular keywords through the Adwords system and serve recently discovered viruses (including the ones which create dummy bots) to computers.

Has Adwords been corrupted? Can Adwords be infiltrated? Would Google tell us, or try and fix the problem and then tell us?

As Andy Grove said, “Only the paranoid survive.” Store all the information from your Google searches, your Google Analytics data, your Orkut, your emails and your YouTube for the last nine months, and anyone can have a pretty fair idea of what work, play and hobbies you are up to. Remember, click fraud makes money for Google too, and even a 1% increase in click fraud rates increases Google’s quarterly earnings.

I trust Google and the “Don’t be evil” philosophy. But the philosophy and an apology cannot be the only safeguards for the privacy of billions of humans.

In God we trust. Everyone else has to bring data. Even Google. But guess what: Google won’t share the data, even on how they build the Chinese walls between commercial ads and search results. That’s more like a closed-source route, isn’t it?

Don’t worry. Just trust Google.

OT: Frank Sinatra for Curing the Recession

What if recession is all a matter of sentiment, individual and aggregate? Banks that used to be aggressive are now too cautious to lend, while individuals who used to spend are too cautious to spend adequately even on goods.

The following songs can really work wonders for specific sections of the economy:

1) Let’s Face the Music and Dance

This one is for the financial sector, which is using the bailout money to pay bonuses and shore up its balance sheets while lobbying for a “bad bank” to take care of all toxic assets. They need to start dancing again (as my ex-boss of all bosses, Mr. Chuck Prince, said) and resume taking some moderate risks to start lending again.

2) My Way

The Health Secretary and pharma regulators need to sing “My Way”. The healthcare sector needs to cut down on excessive insurance costs and on profiteering from its patents. Rather than limiting only the salaries of bankers, limiting the salaries of Big Pharma, greater use of generics and broader coverage of people in America, and even in other parts like China and India, would help the workforce to cope.

This will ensure a healthy workplace (which is quite stressed out).

If Venezuela can import 20,000 Cuban doctors a year and export oil, then why can’t the United States?

3) America the Beautiful

This one is for the upper-class and upper-middle-class American consumers who have taken to a savings spree while resisting tax increases.

While spending on credit-fueled items is still not recommended, especially for middle-class chaps, it would be good for some consumers to spend more and save appropriately. The creation of the estate tax would help spread the fertilizer of tax money to everyone in the economy. Anyone earning above a million dollars can choose between paying extra taxes or being made to do community service: compulsory shopping in malls, then giving the purchases away to charity.

We can start with the bonus clawback of bankers.

4) My Funny Valentine

This one is for the diplomats, who need to swallow humble pie without swallowing crap and start making more friends than enemies. Guess what: wars are expensive, and maybe some allies can chip in with the costs if you listen to them.

Inviting foreign leaders for talks and conferences without being rude, and using charming, respectful and firm diplomacy, can lower defense costs considerably.

One fighter plane can pay for a lot of schools and fund a lot of scholarships.

5) I’ve Got You Under My Skin

The immigration, education and tourism sectors need to rethink some of Homeland Security’s tactics, without compromising on safety of course. Taking repeated fingerprints and brisk frisking despite X-rays is not the best way to welcome guests to your country. Millions of illegal immigrants would be happy to pay taxes and contribute positively if they were brought into the legal migrant category. The sons of Irish and African immigrants also need to rethink the policy of capping technically qualified immigrants, mostly Asian, especially in education and software (while keeping a balanced approach to any fraud and abuse in immigrants’ pay).

These are just humble suggestions, and maybe Frank Sinatra isn’t relevant to today’s economy and policy making. I, however, would rather listen to “Send in the Clowns”.

These are the author’s personal views. You are welcome to suggest other songs for the current state.

Interview: Richard Schultz, CEO, REvolution Computing

Here is an interview with the CEO of REvolution Computing, Richard Schultz. Mr. Schultz offers his perspectives on aspects of open source, predictive analytics and cloud computing, as well as his vision for R Commercial.

Note from Ajay: As I blogged previously, commercial establishments now have the option to use R commercially with a full service contract and all the guarantees they expect and get from existing analytics software vendors.

Ajay- Linux has not really succeeded in capturing the Windows desktop operating system market. What are the technical and business reasons you think R will succeed in the analytics desktop software market?

Richard- To start, Linux was never really targeted at the Windows desktop market, but rather at unseating proprietary Unix deployments (particularly in finance), which it did quite successfully.  This is a similar trend to what we’re seeing in the R world: it’s not that R is generally replacing Excel, for instance.  In addition, with the large and growing base of both users and contributors, the vibrancy of the R community has taken on a life of its own.

As to R and Windows, two things are worth noting:

1. Microsoft has moved rapidly to embrace R and REvolution for that matter.

2. Windows is still the predominant operating system in large commercial enterprises. Because we deploy R on multiprocessors, which are now common on all computers including those pre-loaded with Windows, REvolution R is very much at home in Windows, Mac and Linux environments.

Ajay- What are the biggest challenges for REvolution Computing in explaining R Pro to users of traditional statistics software? What are the biggest advantages?

Richard- The biggest challenge is getting the word out that there now exist validated and supported R products designed for commercial use. But that’s changing rapidly, as your own interest in REvolution Computing demonstrates. Our biggest advantages are several:

1. we are focused on building a close and collegial relationship with the open source R community;

2. our company has a deep history in super computing and parallelization;

3. with, by Intel’s estimate, over 1 million R users and growing, there is a large community eager to adopt our products as its members advance their careers in the business and research worlds.

Ajay- Which software do you think will be affected the most by R’s spread across colleges and companies? What do you believe their strategies to compete will be?

Richard – I want to be politic here. Let me say that the programming software likely most affected by the rise of R is probably proprietary.

We see many opportunities to partner and leverage the strengths of REvolution’s products specifically – high performance, handling of large data, validation, IDE / user interface.

Ajay- How do you intend to incorporate cloud computing and the Software as a Service model for R Pro? When, if at all, do you think it will be possible for a person to simply upload a zipped CSV file, work on a remote cloud computer for analytics and forecasting, and just pay for the hired software, hardware and bandwidth?

Richard – We were thinking of something based on the Ohri framework.  ;-). ( Ajay- Touché!)

In fact, we have deployed, and are deploying cloud-based REvolution R for clients, and it’s something we expect to evolve as those technologies evolve.


Ajay- Asian countries have huge demand for analytics and are more price-conscious about software. What would your strategy be to sell in Asia, especially China and India?

Richard – Open source can be a tremendous win for users in Asia / China / India.  The upfront costs are low, the technology is leading-edge, and there is a distribution network for support.  REvolution has partners, and is continuing to build its partner network to be able to reach these markets.  We expect to accelerate our efforts in these regions toward the end of 2009.

Ajay- What has been the story of your career so far? What prompted you to join or start REvolution Computing? What advice would you give to young science graduates in today’s recession?

Richard – My own background is in computer science, business… and music. Through school I held various positions at IBM, and after graduate school, I worked at Dun & Bradstreet in a product management role and developed a taste for entrepreneurship. I’ve started two companies so far, MetaServer, a business intelligence middleware company that catered to the insurance industry, and REvolution Computing. Today, MetaServer is part of Oracle. And I continue to play music – guitar and piano. One of these days we’ll get a REvolution Computing band together.

My advice to young science graduates is the same, recession or no: follow your enthusiasms; find a passion outside of work, like playing music; master open source programming languages, because that is the future and the future is here.

About Richard Schultz, Chief Executive Officer, REvolution Computing

Richard guides REvolution’s long-range business strategy and leads the company’s teams on a daily basis. His experience developing and growing Business Intelligence software companies includes founding and leading Metaserver, Inc., now a part of Oracle, from inception to sale. Richard has been named Innovator of the Year by Business New Haven; served on the board of the Connecticut Venture Group; and been the keynote speaker for CIO Forum and other technology industry events.  A graduate of Washington University with degrees in Computer Science, Business and Music, Richard also holds a Master’s degree in Computer Science from the State University of New York at Stony Brook and has held senior positions at Dun & Bradstreet and IBM.

Ajay- REvolution Computing has been a leader in this field, and going by the latest product launch... well, you can try it yourself and see at http://www.revolution-computing.com