Big Noise on Big Data

Increasingly, Big Data is used in writing where Business Analytics used to be, and data mining is thrown in as a word just to keep liberal arts majors happy that they are reading a scientific article.

Some Big Words I have noticed in my Short life-

Big Data? High Performance Analytics? High Performance Computing? Cloud Computing? Time Sharing? Data Mining? SEMMA? CRISP-DM? KDD? Business Intelligence? Business Analytics and Optimization? (pick a card, any card)

(or Just Moore’s Law catching up with the analytics)

Some examples-

Replace Big Data with Analytics in these articles and let me know if you can make out much of a difference

  • Big Data on Campus

http://www.nytimes.com/2012/07/22/education/edlife/colleges-awakening-to-the-opportunities-of-data-mining.html

  • SAS CMO Jim Davis, the man who famously said BI is dead, is now burying Business Analytics within the new buzzword

How to transform big data from an obstacle into an asset

http://blogs.sas.com/content/corneroffice/2012/07/22/how-to-transform-big-data-from-an-obstacle-into-an-asset/

(Related- Is big data over hyped? by Jim Davis

http://www.sas.com/knowledge-exchange/business-analytics/featured/is-big-data-over-hyped/index.html )

I am sure by 2015, Jim Davis, NYT and the merry men of analytics will find some other buzzwords to rally the troops. In the meantime, let me throw out the flag and call it Big  .

Credit Downgrade of USA and Triple A Whining

As a person trained, deployed and often asked to comment on macroeconomic shenanigans, I have the following observations to make on the downgrade of US debt by S&P.

1) A credit rating is both a mathematical exercise of debt versus net worth and a judgment of the intention to repay. Given the recent deadlock in the United States legislature on the debt ceiling, it is natural and correct to assume that holding US debt is slightly more risky in 2011 than in 2001. That means if US debt was AAA in 2001, it sure is slightly more risky in 2011.

2) Politicians are criticized the world over in democracies including India, the UK and the US. This is natural, healthy and enforced by the checks and balances in the constitution of each country. At the time of writing this, there are protests in India on corruption, in the UK on economic disparities, in the US on debt vs tax vs spending, and in Israel on inflation. It is the maturity of the media as well as the average educational level of the citizenry that amplifies and inflames or dampens sentiment regarding policy and business.

3) Conspicuous consumption has failed both at an environmental and economic level. Cheap debt to buy things you do not need may have made good macro economic sense as long as the things were made by people locally but that is no longer the case. Outsourcing is not all evil, but it sure is not a perfect solution to economics and competitiveness. Outsourcing is good or outsourcing is bad- well it depends.

4) In 1944, the US took on debt to fight Nazism, build atomic power and generally wage a lot of war, with lots of dual use inventions along the way. In 2004-2010 the US took on debt to fight wars in Iraq and Afghanistan and to bail out banks and automobile companies. Some erosion in the values represented by a free democracy has taken place, much to the delight of authoritarian regimes (who have managed to survive Google and Facebook).

5) A Double A rating is still quite a good rating. No one is moving out of US Treasuries. I mean seriously, what are your alternatives for parking your government or central bank assets: euro, gold, oil, rare earth futures, metals or yen??

6) Income disparity as a trigger for social unrest in the UK, France and other parts is an ominous looming threat that may lead to more action than the poor maths of S&P. It has been some time since riots occurred in the United States, and I believe in time series and cycles, especially given the rising Gini coefficients.

Gini indices for the United States at various times, according to the US Census Bureau:

  • 1929: 45.0 (estimated)
  • 1947: 37.6 (estimated)
  • 1967: 39.7 (first year reported)
  • 1968: 38.6 (lowest index reported)
  • 1970: 39.4
  • 1980: 40.3
  • 1990: 42.8
    • (Recalculations made in 1992 added a significant upward shift for later values)
  • 2000: 46.2
  • 2005: 46.9
  • 2006: 47.0 (highest index reported)
  • 2007: 46.3
  • 2008: 46.69
  • 2009: 46.8
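For readers wondering what goes into a number like 46.8, here is a minimal sketch of the textbook Gini calculation on a list of incomes. This is not the Census Bureau's actual procedure, which works on household survey data; the function name `gini` and the sample incomes are made up for illustration. The index quoted above is simply the coefficient multiplied by 100.

```python
def gini(incomes):
    """Textbook Gini coefficient for a list of non-negative incomes.

    Uses the sorted-rank formula:
        G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    where incomes are sorted ascending and ranks start at 1.
    0 means perfect equality; values near 1 mean one person holds nearly everything.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one income and a positive total")
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical incomes, purely for illustration (not real data).
sample_incomes = [12_000, 25_000, 40_000, 58_000, 250_000]
print(f"Gini index: {100 * gini(sample_incomes):.1f}")
```

The further the top income pulls away from the rest, the closer the result moves toward 100 on the index scale.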

7) Again, I am slightly suspicious of an American corporation downgrading American government debt when it failed to reconcile numbers by 2 trillion and famously managed to avoid downgrading Lehman Brothers. What are the political affiliations of the S&P board? What are their backgrounds? Check the facts, Watson.

The Chinese government should be concerned if it is holding >1000 tonnes of gold and >1 trillion of US Treasuries, lest we have a third opium war (as either gold or US Treasuries will burst). Opium in 1850, like US Treasuries in 2010, has no inherent value except for those addicted to it.

8) Ron Paul and Paul Krugman are the two extremes of economic ideology in the US.

Reminds me of the old saying- Robbing Peter to pay Paul. Both the Pauls seem equally unhappy and biased.

I have to read both WSJ and NYT to make sense of what actually is happening in the US as opinionated journalism has managed to elbow out fact based journalism. Do we need analytics in journalism education/ reporting?

9) Panic buying and selling would lead to short term arbitrage positions. People like Warren Buffett made more money in the crash of 2008 than people did in the boom years of 2006-7.

If stocks are cheap, buy on the dips. Acquire companies before they go for IPOs. Go buy your own stock if you are sitting on a pile of cash. Buy some technology patents in cloud, mobile, tablet and statistical computing if you have a lot of cash and need to buy some long term assets.

10) Follow all the advice above at your own risk, with no liability to this author 😉

 

YouTube is coming Home

A continuing series on better design interfaces for my favorite music channel, YouTube

Some things I like.

The shrink- expand button.

The wasted space for advertisement – to the left of the video that is hugely static in terms of changes. It should be rotated more often.

The non-existent average time of play: does everyone watch the whole video, or is the whole video watched 56 million times?

The inability to scroll and zoom into the video analytics.

The completely outdated comments button, which could be better used to create a SOCIAL community, but all it shows is the top ranked comment, and you have to click before it drops down. I liked the NYT approach to segmented comments, including Editors Picks, Most Recommended and Highlights.

The video response feature that can be easily gamed to ensure video views /phishes.

The comments page numbers at the bottom instead of being at the top for the casual scanner of comments.


Facebook is the first button rather than second button in the minimum shared view list. Is that true? Can these buttons be self learning to my preferred social network instead of a default. (hint- use Google prediction API)
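The hint above points at the Google Prediction API; as a much simpler stand-in, here is a rough sketch of what "self learning" buttons could mean: just reorder the share buttons by how often the signed-in user has used each one. The button names, the default order and the function `order_share_buttons` are all hypothetical, not anything YouTube actually exposes.

```python
from collections import Counter

# Hypothetical default order of share buttons (an assumption for illustration).
DEFAULT_ORDER = ["facebook", "twitter", "email"]


def order_share_buttons(share_history, default_order=DEFAULT_ORDER):
    """Return the buttons sorted by how often this user shared via each one.

    share_history is a list of button names, one entry per past share.
    Buttons the user never used keep their relative default order,
    because Python's sort is stable.
    """
    counts = Counter(share_history)
    return sorted(default_order, key=lambda name: -counts[name])


# A user who mostly shares on Twitter sees Twitter first.
print(order_share_buttons(["twitter", "twitter", "facebook"]))
```

A new user with no history simply gets the default order, so nothing changes until there is something to learn from.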

There is no provision to replay a video unless you put it into a playlist, a feature which fortunately has changed quite a bit, even though the URLs for playlists should have a separate URL shortener from youtu.be

A much better recommended playlist of related videos: they should be customized to the eclectic taste of the signed-in user rather than to the actual content. Maybe try something like the iTunes Genius feature.

No provision for a paid, premium channel, even for countries that are blocked en masse from watching certain videos and hence depend on illegal video responses.

Stuxnet DeMystified


An article in the New York Times lays out the fascinating details of the Stuxnet virus, apparently the most successful cyber weapon in recent times.

Given that industrial controllers are a part of everything from factories to missile launch configurations, I believe this is a fascinating area of study for the world’s research scientists, including creating variants of and defenses against it.

https://www.nytimes.com/2011/01/16/world/middleeast/16stuxnet.html

Also a 2008 presentation by Siemens that the NYT was kind enough to link to- (whither Wikileaks ??)

Kill R? Wait a sec

1) Is R efficient? (scripting wise, and performance wise) Depends on how you code it. Some packages like foreach can help, but basic efficiency comes from the programmer. XDF formats from RevoScaleR, the non-open R package, further improve programming efficiency.

2) Should R be written from scratch?

You got to be kidding- It depends on how you define scratch after 2 million users

This has been done with S, then S Plus and now R.

3) What should be the license of R (if it was made a new)?

The GPL license is fine. You need to do a better job of executing the license. Currently interfaces to R exist from SPSS, SAS, KXEN and other companies as well. To my knowledge, royalty payments as well as formal code sharing agreements do not exist.

R core needs to do a better job of protecting the work of 2500 package creators rather than settling for a few snacks at events, sponsorships, Corporate Board Membership for Prof Gentleman, and 4-5 packages donated to it. The only way R developers can currently support their research is to write a book (by Springer mostly).

E.g. ggplot and Hmisc are likely to be used more by the average corporate user. Do their creators deserve royalties if the creators of RevoScaleR are getting them?

If some of the 2 million users gave $1 to R core (compared to the 9 million in the last round of funding for Revolution Analytics), you would have enough money to create a 64 bit optimized R for Linux (missing in Enterprise R), Amazon R APIs (like Karim Chine’s efforts), R GUIs (like Rattle’s commercial version) etc.

The developments are not surprising given that Microsoft and Intel are funding Revolution Analytics http://www.dudeofdata.com/?p=1967

R controversies come and go (this has happened before including the NYT article and shakeup at Revo)

An interesting debate on whether R should be killed to make an upgrade to a more efficient language.

From Tal (creator of R-bloggers), on the R help list:

There is currently a (very !) lively discussions happening around the web, surrounding the following topics:
1) Is R efficient? (scripting wise, and performance wise)
2) Should R be written from scratch?
3) What should be the license of R (if it was made a new)?

Very serious people have taken part in the debates so far.  I hope to let you know of the places I came by, so you might be able to follow/participate
in these (IMHO) important discussions.

The discussions started in the response for the following blog post on
Xi’An’s blog:
http://xianblog.wordpress.com/2010/09/06/insane/


Followed by the (short) response post by Ross Ihaka:
http://xianblog.wordpress.com/2010/09/13/simply-start-over-and-build-something-better/


Other discussions started to appear on Andrew Gelman’s blog:
http://www.stat.columbia.edu/~cook/movabletype/archives/2010/09/ross_ihaka_to_r.html

And (many) more responses started to appear in the hackers news website:
http://news.ycombinator.com/item?id=1687054

I hope these discussions will have fruitful results for our community,
Tal

----------------Contact Details:----------------
Contact me: Tal.Galili@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)

My 0 cents (see, it would be 2 cents, but it’s free)

S A S GOOD LIFE UNDER SIEGE – NYT

There was a time when the word NYT meant the place where we read about news and politics. In 2009, the most happening news on statistical software came from the NYT (KDnuggets and the Journal of Statistical Software are not too happy about that either).

The latest article calls SAS a software giant under siege, with its Good Life under threat.


This reminded me of an old movie poster I saw once. It’s also called Under Siege.

Given that the under-siege SAS earned 2.4 billion dollars last year alone

and the market capitalisation of the New York Times is 1.25 billion dollars,

why doesn’t Dr Goodnight buy the New York Times itself for 600 million dollars and have enough change left over for………. err a Happy Thanksgiving?

—————————————————————————————————————————————————————————————-

LIES

TRUE LIES

AND STATISTICS

 

The New York Times makes an error

Here is a story in the New York Times about a guy I drank beer with while listening to Pink Floyd at full volume, with whom I played football and won the gold medal in business school, and who danced at my wedding. To summarize, I know Mr Sumit Sapra well. The New York Times wrote a story on him, with photographs of him in a variety of poses, to portray him as yet another get rich quick Indian immigrant. They do need to sell more copies of the tree destroying paper edition, but even the online edition got it wrong.

New York Times Article

INDIA, Suddenly Starved for Investment

(Ajay: "starved" is a term that shows the affinity the third world supposedly has for starving; maybe the Chinese get a dip in investment, but India gets starved)

Sumit Sapra is a member of that ambitious, impatient generation of young Indians who rode the crest of the global economy. In five years, he changed jobs three times, quadrupling his salary along the way. Even when satisfied with his position, he kept his résumé posted on job sites, in case better offers came along. And he splurged. In three years, he bought three cars, moving up a notch in luxury each time. For weekend jaunts, he bought a motorcycle.

http://www.nytimes.com/2009/05/05/business/global/05rupee.html?_r=1

Sumit’s Rebuttal

http://saprasumit.blogspot.com/

Five years, three jobs, seems wild, doesn’t it? What they conveniently forgot to mention is that a few of these changes were due to circumstances and the need to make a livelihood. Most of us post our resumes on to various recruiting channels like websites, consultants, etc. when we are looking for a job. Those resumes stay there even after we find a job, does that imply that we are constantly on the lookout for another one?

And more

The article talks about the fact that I bought three cars in three years, though I bought four and not three! – "In three years, he bought three cars, moving up a notch in luxury each time." It fails to point out that all of these were used cars, bought at about one third of their original price and also that I am an automobile enthusiast and I do this primarily because of my love for cars. In fact a few of these cars were bought at prices lower than what I sold my previous car for! If you can call buying a USD 8-10k car splurging on luxury then what the heck, I did splurge! the piece de resistance of this article is that it talks about me buying a motorbike for weekend jaunts, not realizing that this is India, not the United States, where people buy motorbikes to commute and not for fun. I’ve had this so-called "weekend jaunt" motorbike for more than 3 years, I bought it before I could afford to buy a car, you see and I didn’t see the need to sell it.

And on a common ex-employer, with whom I also started my analytics career:

Despite being laid off, at some level I also feel sorry for my ex employers, General Electric Co. as even they have not been spared by these sensation seeking merchants, or so called journalists. Yes, things are bad and I am the first one to realize that the going is not as good as it used to be, but that does not give the license to anyone to go around the world proclaiming doomsday is around the corner. As a wise man once said, "With great power, comes great responsibility", though to be honest I have heard this in a movie, I guess most of you know which one!

I read the New York Times from the age of 19 to 32. And Mr Friedman, the mustached Pulitzer pulverizer, actually stole the term The World is Flat from Nandan Nilekani (who said the world is getting flattened).

The New York Times has portrayed India in a semi sarcastic light before. Read here my earlier response to a very sensitive portrayal of India after terrorists attacked us (before the Mumbai blasts; note the date)

India RATTLED by Blasts

NYT thinks India is rattled after the blasts

Ajay Ohri on August 13th, 2008

Sent to The NYT Editor- After a headline that said India rattled after blasts to describe a series of blasts that killed 60 people in two days of consecutive blasts.
Subject: Unsolicited Submission From an Unknown, Unrattled Indian
Dear NY Times.com  Editor,
I am glad you used the word rattled to describe India, a nation of 1 billion

 

May the Good Lord above forgive the New York Times its sins. They know not what they were doing. Having bankrupted themselves fighting a general election on O’s behalf, they trust Mexican Carlos Slim for loans but not American money (even from a Warren Buffett). And the Indians? Weren’t the Indians the guys who drive the taxis there?

 

Please help save the New York Times from itself by joining the Facebook Cause Save the New York Times here

 

http://apps.facebook.com/causes/170855/8347178

Saving the New York Times from itself

Ajay Ohri on May 2nd, 2009

The iconic newspaper New York Times, flagpole for progressive, liberal and communist thinkers (depending on where you stand) is under attack again.
It is under attack from the stupidity of its old school old fashioned presses, who believe cutting thousands of trees every year to make loss making newspapers is better than just putting the News […]

 

As for my old friend Sumit Sapra, who has been laid off by General Electric: brother, you deserved this article. Next time play Russian music while we sip vodka, as beer, Pink Floyd and buying old cars are too much for these New Yorkers to bear. They have had a grudge against anything named Indian ever since the Indians beat the Yankees 14-0.