Happy July 4th

To all my American friends.


Information Asymmetry is the most evil business

What is information asymmetry?

Information asymmetry deals with the study of decisions in transactions where one party has more or better information than the other. This creates an imbalance of power in transactions which can sometimes cause the transactions to go awry, a kind of market failure in the worst case. Examples of this problem are adverse selection,[1] moral hazard, and information monopoly.

Most commonly, information asymmetries are studied in the context of principal–agent problems. Information asymmetry causes misinforming and is essential in every communication process.

Adverse selection, anti-selection, or negative selection refers to a market process in which undesired results occur when buyers and sellers have asymmetric information (access to different information); the “bad” products or services are more likely to be selected.

The principal–agent problem or agency dilemma occurs when one person or entity (the “agent”) is able to make decisions that impact, or on behalf of, another person or entity: the “principal”. The dilemma exists because sometimes the agent is motivated to act in his own best interests rather than those of the principal.

Monopolies of knowledge arise when ruling classes maintain their political power through their control of key communications technologies.[3] An example of this occurs in ancient Egypt where a complex writing system conferred a monopoly of knowledge on literate priests and scribes.


  1. This is especially true in enterprise software
  2. and online advertising and spam
  3. and commodities across the globe (oil spikes after Iraq, oil slumps after heating oil data, climate data, or even releases from strategic reserves)
  4. and internet spying, which may be for economic espionage or trade negotiations but is justified as looking for terrorists
  5. and inflation in the developing and poor countries
  6. and lobbying in the developed and rich countries


People who enable information asymmetry are corrupted people, misled by their own greed and by agent-employees into decisions that run counter to the principles on which they founded their corporations.

Do you think information asymmetry is evil? Or do you think we should jump on the bandwagon and play the game? Click those ads, while we share your data with the government!


Latest Interview – Rapid Miner CEO Ingo Mierswa

Here is an interview I did with the CEO of Rapid Miner, Ingo Mierswa. Ingo, who is something of a prodigy with multilingual capabilities and a stellar academic and business record, talks about navigating the journey of an open source startup.

Popularized by Michael (Monty) Widenius, one of the founders of MySQL and an investor in RapidMiner, business source is a commercial software license model that offers many of the benefits of open source, but with a built-in time delay on users being able to access new versions of our products.



  1. Guide to Data Science Cheat Sheets 2014/05/12
  2. Book Review: Data Just Right 2014/04/03
  3. Exclusive Interview: Richard Socher, founder of etcML, Easy Text Classification Startup 2014/03/31
  4. Trifacta – Tackling Data Wrangling with Automation and Machine Learning 2014/03/17
  5. Paxata automates Data Preparation for Big Data Analytics 2014/03/07
  6. etcML Promises to Make Text Classification Easy  2014/03/05
  7. Wolfram Breakthrough Knowledge-based Programming Language – what it means for Data Science? 2014/03/02

10 for 10 – Packt lowers cost of books for students and researchers alike

The high cost of textbooks and science books is an open scandal. Despite this, publishers are barely profitable, and the ecosystem is ripe for disruption.

Packt is one such player. I have reviewed many books for them (in return I get ebooks and books, some of which I give to my students).

Now they have an intriguing offer.

As you are aware, this month, Packt is celebrating 10 years of success with over 2000 Titles in its Library. To celebrate this huge milestone, we have come up with an exciting opportunity for collaboration which you might be interested in.

Packt is offering all of its eBooks and Videos at just $10 each. This campaign is specifically aimed towards thanking all our customers for their support and opening up our comprehensive range of titles just for $10 each. This promotion covers every title and customers can stock up on as many copies as they like until July 5th. I hope you find this as a great opportunity to explore what’s new and maintain your personal and professional development.

Interested- you can see

Disclosure: The author was offered 2 free ebooks as part of this campaign on social media. Books are one thing he is willing to blog for ;)

Google Trends for Game of Thrones and Lost

Is Game of Thrones more popular than Lost?

Not quite. But it's getting there!



Analysing Google Plus posts using R language #rstats

Here is a short post on retrieving information from the Google+ API using R, and then analysing it.

To create an API key:

  1. Go to the Google Developers Console.
  2. Create or select a project.
  3. In the sidebar on the left, select APIs & auth.
  4. In the displayed list of APIs, find the Google+ API and set its status to ON.
  5. In the sidebar on the left, select Credentials.
  6. Create an API key by clicking Create New Key. Select the appropriate kind of key: Server key. Then click Create.


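And the R code (this assumes the plusser package, which provides the harvest and parse functions used below):

library(RCurl)    # ships the CA bundle referenced in RCurlOptions
library(plusser)  # assumed source of harvestProfile(), harvestPage(), parsePost()
setAPIkey("YOUR_API_KEY")  # placeholder: paste the key created in the steps above
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
myProfile <- harvestProfile("+AjayOhri", parseFun = parseProfile)
myposts <- harvestPage("+AjayOhri", parseFun = parsePost, results = 1, nextToken = NULL, cr = 1)
plot(myposts$ti, myposts$nC) # number of comments
plot(myposts$ti, myposts$nP) # number of likes or +1s
plot(myposts$ti, myposts$nR) # number of reshares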
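
If you are curious where the API key ends up, here is a rough sketch of the kind of request URL involved. It assumes the Google+ REST endpoint for a user's public activities; the helper name build_activities_url and the placeholder key are made up for illustration and are not part of any package.

```r
# Hypothetical helper (not from any package): builds the URL for listing
# a user's public Google+ activities, with the API key as a query parameter.
build_activities_url <- function(user, api_key, max_results = 20) {
  sprintf(
    "https://www.googleapis.com/plus/v1/people/%s/activities/public?maxResults=%d&key=%s",
    utils::URLencode(user, reserved = TRUE),    # "+" must be percent-encoded
    as.integer(max_results),
    utils::URLencode(api_key, reserved = TRUE)
  )
}

# "YOUR_API_KEY" is a placeholder, not a real key
request_url <- build_activities_url("+AjayOhri", "YOUR_API_KEY")
print(request_url)
```

The key travels in plain sight in the query string, which is one reason to keep it out of any code you publish.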


You can also see the RPubs document here. Now you can do text analysis and sentiment analysis on myposts$msg, and do social media analysis on what kind of content people like.

For better results, use a Google+ ID (page or person) which has a lot of PUBLIC posts!
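
As a starting point for that text analysis, here is a minimal base R sketch. The small data frame is a made-up stand-in for myposts (only the msg and nP columns are assumed), and the helpers tokenize and mean_plus_for are hypothetical names, so treat it as an illustration rather than a finished analysis.

```r
# Made-up stand-in for the data frame returned by harvestPage();
# only the msg (post text) and nP (+1 count) columns are assumed here.
myposts <- data.frame(
  msg = c("R is great for analytics",
          "Analytics with R and ggplot2",
          "Happy July 4th"),
  nP = c(5, 12, 1),
  stringsAsFactors = FALSE
)

# Tokenise: lower-case, replace punctuation with spaces, split on whitespace
tokenize <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9 ]", " ", x)
  words <- unlist(strsplit(x, "\\s+"))
  words[nchar(words) > 0]
}

# Word frequencies across all posts, most frequent first
word_counts <- sort(table(tokenize(myposts$msg)), decreasing = TRUE)
print(head(word_counts, 10))

# Crude engagement signal: mean number of +1s for posts mentioning a term
mean_plus_for <- function(term, posts) {
  hits <- grepl(term, tolower(posts$msg), fixed = TRUE)
  if (!any(hits)) return(NA_real_)
  mean(posts$nP[hits])
}
print(mean_plus_for("analytics", myposts))
```

Swapping the stand-in for a real myposts should work as long as msg holds the post text; proper sentiment scoring would need a word list or a dedicated package on top of this.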


ggvis is awesomeness personified #rstats


Hu ha! Latest sexy software from our man Dr Hadley Wickham and his ninjas at RStudio. Now YOU can make a Business Intelligence software for FREE. How good is it? Time will tell whether someone can use it to give Tableau Software and QlikView a run for their money.

Seriously, I would like to see ONE implementation of RHadoop and Shiny with ggplot2 and d3.

(Big data analytics indeed ;) )



ggvis is a data visualization package for R which lets you:

  • Declaratively describe data graphics with a syntax similar in spirit to ggplot2.
  • Create rich interactive graphics that you can play with locally in RStudio or in your browser.
  • Leverage shiny’s infrastructure to publish interactive graphics usable from any browser (either within your company or to the world).

The goal is to combine the best of R (e.g. every modelling function you can imagine) and the best of the web (everyone has a web browser). Data manipulation and transformation are done in R, and the graphics are rendered in a web browser, using Vega. For RStudio users, ggvis graphics display in a viewer panel, which is possible because RStudio is a web browser.

Please note that the API has changed significantly between ggvis 0.1 and 0.3. Documentation for the old version is here.
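
To give a flavour of the declarative syntax, here is a minimal sketch assuming ggvis 0.3 or later is installed. It uses R's built-in mtcars data purely for illustration; the object names p_static and p_interactive are arbitrary.

```r
library(ggvis)  # also makes the %>% pipe available

# Static scatter plot: ~ marks a variable looked up in the data,
# and layer_points() adds the point marks.
p_static <- mtcars %>%
  ggvis(x = ~wt, y = ~mpg) %>%
  layer_points()

# Interactive version: two sliders control the smoother span and point size.
# := sets a property to a raw value instead of mapping it through a scale.
p_interactive <- mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_smooths(span = input_slider(0.5, 1, value = 1)) %>%
  layer_points(size := input_slider(100, 1000, value = 100))

# Printing an object renders it (RStudio viewer pane or a browser), e.g.:
# p_static
```

The plots are built as plain R objects first, so nothing opens until you print them. The interactive one needs a live R session behind it, which is where shiny comes in.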

