I have been busy-

1) Finally my divorce came through. My advice – don’t do it without a pre-nup! Alimony means all the money.

2) Spending time on Quora after getting bored with LinkedIn, Twitter, Facebook, Google Plus, Tumblr and WordPress

See this answer to-

 What are common misconceptions about startups?

1) we will change the world
2) if we get 1% of a billion people market, we will be rich
3) if we have got funding, most of the job is done
4) let’s pay ourselves high salaries since we got funded
5) our idea is awesome and can’t be copied, improved, stolen, replicated
6) startups are painless
7) it is a better life than a corporate career
8) long term vision is more important than short term cash burn
9) we will never sell out or exit. never
10) it’s a great idea to make startups with friends

Say hello to me – http://www.quora.com/Ajay-Ohri/answers

3) Writing freelance articles on APIs for Programmable Web

Why write pro? See point 1)

Recent Articles-




4) Writing poetry on http://poemsforkush.com/. It now gets 23,000 views a month. I wish I could say my poems were great, but the readers are kind (364 subscribers!) and also Google Image Search is very very kind.

5) Kicking the tires on my next book, “R for Cloud Computing”. Stay tuned for another writing announcement.

6) Waiting for Paul Kent, VP, SAS Big Data to reply to my emails for an interview after HE promised me!! You don’t get to 105 interviews without being a bit stubborn!

7) Sighing over the politics engulfing my American friends, especially with regard to Chick-fil-A and Romney’s gaffes. Now that’s what I call a first world problem! Protesting by eating or boycotting chicken sandwiches! In India we had the world’s biggest blackout two days in a row, and no one is attending the Hunger Fast against corruption protests!

8) Watching the Olympics! Our glorious nation of 1.2 billion very smart people has managed to win 1 Bronze so far!! Michael Phelps has won more medals and more golds than the whole of India has since the Olympic Games began!!

9) Consulting to pay the bills. This includes writing R code and making presentations. Why consult when I have writing to do? See point 1)

10) Reading New York Times to get insights on Big Data and Analytics. Trust them- they know what they are doing!

SAS and Hadoop

Awesomely informative post on sascom magazine (whose editor I have interviewed before, at http://www.decisionstats.com/interview-alison-bolen-sas-com/ )

Great piece by Michael Ames, SAS Data Integration Product Manager.



Also see SAS’s big data thingys here at


Solutions and Capabilities Using SAS® In-Memory Analytics

  • High-Performance Analytics – Get near-real-time insights with appliance-ready analytics software designed to tackle big data and complex problems.
  • High-Performance Risk – Faster, better risk management decisions based on the most up-to-date views of your overall risk exposure.
  • High-Performance Liquidity Risk Management – Take quick, decisive actions to secure adequate funding, especially in times of volatility.
  • High-Performance Stress Testing – Make faster, more precise decisions to protect the health of the firm.
  • Visual Analytics – Explore big data using in-memory capabilities to better understand all of your data, discover new patterns and publish reports to the Web and iPad®.

(Ajay- I liked the Visual Analytics piece especially for Big Data )



Who made Who in #Rstats

While Bob M, my old mentor and fellow TN man, maintains the website http://r4stats.com/ on how popular R is across various forums, I am interested in who within the R community of 3 million (give or take a few) is contributing more. I am very sure that by 2014 we can have a new fork of R called Hadley R, in which all packages would be made by Hadley Wickham and you won’t need anything else.

But jokes apart, I didn’t have the time to

1) scrape CRAN for all package authors

2) scrape for lines of code across all packages

3) allocate lines of code (itself a dubious software productivity metric) to various authors of R packages

or to

1) scrape the entire R help list, and 2011’s posts in particular

2) determine who is the most frequent R question and answer user (a la SAS-L’s annual MVP and rookie of the year awards)

So I did the following to at least see who is talking about R across easily scrapable Q and A websites.

Stack Overflow still rules over all.

http://stackoverflow.com/tags/r/topusers shows the statistics on who made whom in R on Stack Overflow

All in all, initial ardour seems to have slowed for #Rstats on Stack Overflow? Or is it just summer?

No, the answer (credit to Rob J Hyndman) is that most(?) activity is shifting to Stats Exchange


You could also paste this in Notepad and make some graphs on Average Score / Answer, or even make a social network graph if you had the time.
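If Notepad is not your thing, the Average Score / Answer idea can be sketched in a few lines of Python. This is only a rough sketch under assumptions: the three-column, tab-separated layout (user, total score, number of answers) is my guess at what a pasted top-users table would look like, the function name is made up, and the rows are hypothetical placeholders, not real Stack Overflow numbers.

```python
import csv
import io

# Hypothetical placeholder rows (NOT real Stack Overflow figures).
# Assumed layout: user name, total score, number of answers, tab-separated.
PASTED = "user_a\t120\t40\nuser_b\t90\t45\nuser_c\t30\t5\n"


def average_score_per_answer(text):
    """Return (user, score / answers) pairs, highest average first."""
    result = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        user, score, answers = row[0], int(row[1]), int(row[2])
        if answers == 0:
            continue  # avoid dividing by zero for users with no answers
        result.append((user, score / answers))
    return sorted(result, key=lambda pair: pair[1], reverse=True)


for user, avg in average_score_per_answer(PASTED):
    print(f"{user}: {avg:.2f} points per answer")
```

Sorting by the average rather than the total is the whole point: a user with few but highly rated answers floats to the top, which the raw score table hides.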

Do NOT (Go/Bi) search for Stack Overflow API or web scraping stack overflow- it gives you all the answers on the website but 0 answers on how to scrape these websites.

I have added a new website called Meta Optimize to this list based on Tal G’s interview of Joseph Turian, at http://www.r-statistics.com/2010/07/statistical-analysis-qa-website-did-stackoverflow-just-lose-it-to-metaoptimize-and-is-it-good-or-bad/


There are only 17 questions tagged R but it seems a lot of views are being generated.

I also decided to add views from Quora since it is a Q and A site (and one which I really like)


Again, very few questions but a lot of followers

Interview: Hjálmar Gíslason, CEO of DataMarket.com

Here is an interview with Hjálmar Gíslason, CEO of DataMarket.com. DataMarket is an active marketplace for structured data and statistics. Through powerful search and visual data exploration, DataMarket connects data seekers with data providers.


Ajay- Describe your journey as an entrepreneur and techie in Iceland. What are the 10 things that surprised you most as a tech entrepreneur?

HG- DataMarket is my fourth tech start-up since I started at age 20 in 1996. The previous ones have been in gaming, mobile and web search. I come from a technical background but have been moving more and more to the business side over the years. I can still prototype, but I hope there isn’t a single line of my code in production!

Funny you should ask about the 10 things that have surprised me the most on this journey, as I gave a presentation – literally yesterday – titled: “9 things nobody told me about the start-up business”

They are:
* Do NOT generalize – especially not to begin with
* Prioritize – and find a work-flow that works for you
* Meet people – face to face
* You are a sales person – whether you like it or not
* Technology is not a product – it’s the entire experience
* Sell the current version – no matter how amazing the next one is
* Learn from mistakes – preferably others’
* Pick the right people – good people is not enough
* Tell a good story – but don’t make them up

I obviously elaborate on each of these points in the talk, but the points illustrate roughly some of the things I believe I’ve learned … so far 😉



Ajay- Both Amazon and Google have entered the public datasets space. Infochimps has 14,000+ public datasets. The US has http://www.data.gov/

So the space is clearly competitive, and yet the demand for public data repositories is still underserved.

How does DataMarket intend to address this market in a unique way to differentiate itself from others?

HG- DataMarket is about delivering business data to decision makers. We help data seekers find the data they need for planning and informed decision making, and help data publishers reach this audience. DataMarket.com is the meeting point, where data seekers can come to find the best available data, and data publishers can make their data available whether for free or for a fee. We’ve populated the site with a wealth of data from public sources such as the UN, Eurostat, World Bank, IMF and others, but there is also premium data that is only available to those that subscribe to and pay for the access. For example we resell the entire data offering from the EIU (Economist Intelligence Unit) (link: http://datamarket.com/data/list/?q=provider:eiu)

DataMarket.com allows all this data to be searched, visualized, compared and downloaded in a single place in a standard, unified manner.

We see many of these efforts not as competition, but as valuable potential sources of data for our offering, while others may be competing with parts of our proposition, such as easy access to the public data sets.


Ajay- What are your views on data confidentiality and access to data owned by Governments funded by tax payer money?

HG- My views are very simple: Any data that is gathered or created with taxpayers’ money should be open and free of charge unless higher priorities such as privacy or national security indicate otherwise.

Reflecting that, any data that is originally open and free of charge is still open and free of charge on DataMarket.com, just easier to find and work with.

Ajay- How is the technology entrepreneurship and venture capital scene in Iceland? What things work and what things can be improved?

HG- The scene is quite vibrant, given the small community. Good teams with promising concepts have been able to get the funding they need to get started and test their footing internationally. When the rapid growth phase is reached outside funding may still be needed.

There are positive and negative things about any location. Among the good things about Iceland from the stand point of a technology start-up are highly skilled tech people and a relatively simple corporate environment. Among the bad things are a tiny local market, lack of skills in international sales and marketing and capital controls that were put in place after the crash of the Icelandic economy in 2008.

I’ve jokingly said that if a company is hot in the eyes of VCs it would get funding even if it was located in the jungles of Congo, while if they’re only lukewarm towards you, they will be looking for any excuse not to invest. Location can certainly be one of them, and in that case being close to the investor communities – physically – can be very important.

We’re opening up our sales and marketing offices in Boston as we speak. Not to be close to investors though, but to be close to our market and current customers.

Ajay- Describe your hobbies when you are not founding amazing tech startups.

HG- Most of my time is spent working – which happens to be my number one hobby.

It is still important to step away from it all every now and then to see things in perspective and come back with a clear mind.

I *love* traveling to exotic places. My wife and I have done quite a lot of traveling in Africa and S-America: safari, scuba diving, skiing, enjoying nature. When at home I try to do some sports activities 3-4 times a week at least, and – recently – play with my now 8-month-old son as much as I can.




Hjálmar Gíslason, Founder and CEO: Hjalmar is a successful entrepreneur, founder of three startups in the gaming, mobile and web sectors since 1996. Prior to launching DataMarket, Hjalmar worked on new media and business development for companies in the Skipti Group (owners of Iceland Telecom) after their acquisition of his search startup – Spurl. Hjalmar offers a mix of business, strategy and technical expertise. DataMarket is based largely on his vision of the need for a global exchange for structured data.


To know more, have a quick look at http://datamarket.com/

Facebook IPO- Do you feel lucky?

2 Jan 2011 dealbook.nytimes.com

Facebook has raised $500 million from Goldman Sachs and a Russian investor in a transaction that values the company at $50 billion

29 Jan 2011 - www.bloomberg.com - $82.9 billion

14 Jun 2011 - CNBC - $100 billion

27 Jun 2011 - news.cnet.com - $70 billion

27 Sep 2011 - Venturebeat.com - $82.5 billion

$100 billion valuation divided by 1,000 million subscribers

= $100 net present value of ad profit per subscriber (note that an $80 billion valuation with 800 million subscribers gives the same figure)

= $250 net present value of ad revenue per subscriber (assuming 40% profitability)

= $2,500 net present value of online purchases per Facebook ad-clicking customer

(assuming advertisers dedicate 10% of revenue to advertising on Facebook)
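The back-of-the-envelope chain above is easy to replay with other valuations. Here is a minimal Python sketch of it; the function name and parameter names are my own, and the 40% profitability and 10% advertising share are just the assumptions stated above, not anything Facebook has published.

```python
def per_subscriber_values(valuation_usd, subscribers,
                          profit_margin=0.40, ad_share_of_revenue=0.10):
    """Replay the per-subscriber arithmetic from the post.

    profit_margin and ad_share_of_revenue are the post's own assumptions.
    """
    # Valuation spread over subscribers: NPV of ad profit per subscriber.
    ad_profit = valuation_usd / subscribers
    # If profit is 40% of revenue, revenue is profit divided by 0.40.
    ad_revenue = ad_profit / profit_margin
    # If advertisers spend 10% of their revenue on ads, the purchases
    # needed to justify that ad spend are ad revenue divided by 0.10.
    purchases = ad_revenue / ad_share_of_revenue
    return ad_profit, ad_revenue, purchases


for valuation, subscribers in [(100e9, 1000e6), (80e9, 800e6)]:
    profit, revenue, purchases = per_subscriber_values(valuation, subscribers)
    print(f"${valuation / 1e9:.0f}B / {subscribers / 1e6:.0f}M subscribers: "
          f"profit ${profit:,.0f}, revenue ${revenue:,.0f}, "
          f"purchases ${purchases:,.0f}")
```

Both scenarios land on the same per-subscriber numbers, because the valuation and the subscriber count scale together. Try plugging in a lower profitability to see how quickly the required purchases per customer balloon.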

And the lucky Russian investor who invested at a $50 billion valuation only to see it double in six months, where else has he invested?


Digital Sky Technologies co-founder Yuri Milner, who co-invested in the Goldman-Facebook deal, enviably poised in the middle. DST has been investing early and aggressively in some of the biggest names in the tech bubble boom like Facebook (DST first invested in May 2009), Zynga (the company that makes Farmville and Cityville for Facebook), and Groupon (the dudes that just turned down Google’s $6 billion).

NOTE - Both Groupon and Zynga IPO investors lost money, as both stocks are now below their IPO prices.


More on Digital Sky Tech and Yuri Milner and the free internet in Putin’s Russia

Digital Sky got particular attention because of its broad control of the Russian Internet. DNI noted that the company is “a dominant force in the Runet,” owning the most popular Websites in the former Soviet Union, including Russia, Ukraine, Kazakhstan, Georgia, and Armenia as well as others in the Czech Republic and Poland. By some estimates it reported “over 70 percent of all page views in the Russian-language Internet are on its companies’ Websites.”



From Wall Street Journal-

May 1, 2011


Last month, a private-market transaction of 100,000 shares of Facebook Class B Common Stock priced at $32.00 apiece gave the website a valuation of $80 billion. Two months ago, Facebook was valued at $65 billion, when investment firm General Atlantic reportedly bought 0.1 percent of Facebook by purchasing roughly 2.5 million Facebook shares from former Facebook employees. Three months ago, Kleiner Perkins Caufield & Byers (KPCB) invested $38 million in Facebook, which was only worth 0.073 percent of the social network, but still resulted in a valuation of $52 billion.
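These private-market numbers can be cross-checked against each other. The sketch below is my own arithmetic on the figures quoted above, with made-up helper names; it assumes every deal is priced on the same total share count, which the quote does not actually say.

```python
def implied_share_count(valuation_usd, price_per_share):
    """Total shares implied by a valuation and a per-share price."""
    return valuation_usd / price_per_share


def stake_fraction(amount_usd, valuation_usd):
    """Fraction of the company bought for amount_usd at a given valuation."""
    return amount_usd / valuation_usd


# $80 billion valuation at $32.00 per share.
shares_from_price = implied_share_count(80e9, 32.00)

# General Atlantic: roughly 2.5 million shares said to be 0.1 percent.
shares_from_ga = 2.5e6 / 0.001
ga_price_per_share = 65e9 / shares_from_ga

# KPCB: $38 million at a $52 billion valuation.
kpcb_fraction = stake_fraction(38e6, 52e9)

print(f"Shares implied by the $32 trade: {shares_from_price / 1e9:.2f} billion")
print(f"Shares implied by General Atlantic: {shares_from_ga / 1e9:.2f} billion")
print(f"General Atlantic price per share: ${ga_price_per_share:.2f}")
print(f"KPCB stake: {kpcb_fraction:.5f} as a fraction, "
      f"{kpcb_fraction * 100:.3f} percent")
```

The two share counts agree at about 2.5 billion, and dividing $38 million by $52 billion gives a fraction of about 0.00073, which is roughly 0.073 percent of the company.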





Something’s gotta give?

Go ahead, please. Buy Facebook stock!

Do you feel lucky?





Statistics on Social Media

Some official statistics on social media from the owners themselves

1) Facebook-


Date -17 Nov 2011


People on Facebook

Rcpp Workshop in San Francisco Oct 8th


Following the successful one-day master class on Rcpp preceding this year’s R/Finance conference, a full-day master class on Rcpp and related topics will be held on Saturday, October 8, in San Francisco.

Join Dirk Eddelbuettel for six hours of detailed and hands-on instruction and discussion around Rcpp, inline, RInside, RcppArmadillo, RcppGSL, RcppEigen and other packages, in an intimate small-group setting.

The full-day format allows combining an introductory morning session with a more advanced afternoon session while leaving room for sufficient breaks. We plan on having about six hours of instructions, a one-hour lunch break and two half-hour coffee breaks (and lunch and refreshments will be provided).

Morning session: “A Hands-on Introduction to R and C++”

The morning session will provide a practical introduction to the Rcpp package (and other related packages).  The focus will be on simple and straightforward applications of Rcpp in order to extend R and/or to significantly accelerate the execution of simple functions.

The tutorial will cover the inline package which permits embedding of self-contained C, C++ or FORTRAN code in R scripts. We will also discuss  RInside, to easily embed the R engine code in C++ applications, as well as standard Rcpp extension packages such as RcppArmadillo and RcppEigen for linear algebra (via highly expressive templated C++ libraries) and RcppGSL.

Afternoon session: “Advanced R and C++ Topics”

The afternoon tutorial will provide a hands-on introduction to more advanced Rcpp features. It will cover topics such as writing packages that use Rcpp, how Rcpp modules and the new R ReferenceClasses interact, and how Rcpp sugar lets us write C++ code that is often as expressive as R code. Another possible topic, time permitting, may be writing glue code to extend Rcpp to other C++ projects.

We also expect to leave some time to discuss problems brought by the class participants.

October 8, 2011 – San Francisco

AMA Executive Conference Center
@ the Marriott Hotel
55 4th Street, 2nd Level
San Francisco, CA 94103
Tel. 415-442-6770


Instructor Bio

Dirk Eddelbuettel has been contributing packages to CRAN for nearly a decade. Among these are RQuantLib, digest, littler, random, RPostgreSQL, as well as the Rcpp family of packages comprising Rcpp, RInside, RcppClassic, RcppExamples, RcppDE, RcppArmadillo and RcppEigen. He maintains the CRAN Task Views for Finance as well as High-Performance Computing, and is a founding co-organiser of the annual R / Finance conferences in Chicago. He has a Ph.D. in Financial Econometrics from EHESS (Paris), and works in Chicago as a Quantitative Strategist.