Google’s dream has been lost: Rise of American Cyber Imperialism

Google was built on burying the concept of information asymmetry and spreading knowledge. Yes, the ads were there, but they were just a way to make money without being evil.

It was just a continuation of the process that the Gutenberg press and the Internet brought about. Products like AdSense and acquisitions like YouTube, Android and Blogger proved that Google not only helped you find content, it helped you create content too.

But if the old desktop monopoly was thwarted, a new, more sinister monopoly has been born. With an increasingly corrupted political system manipulated by lobbying, search engine queries are now logged and neatly transferred to the American government’s non-elected branches. With loopholes in existing law being exploited, we now face a frightening future in which the novel 1984 is more likely to be a Google-sponsored movie coming soon to a computer screen near you.

Every time you click an ad that Google shows, remember you are helping fund the NSA as well. The rise of a neo-imperialism led primarily by US / Anglo-Saxon / Western alliances shows why the Chinese and Russian governments are actually right in being skeptical about the glasnost and perestroika that the American establishment offered.

Power tends to corrupt, and absolute power corrupts absolutely. Today, data is power, and the biggest collector of data has chosen to hide behind a decade-long slogan: Trust me, I am not evil.

Dude, Seriously!

The Galactic Empire is being built on Data ……


Some tips on creating a useful blog for beginners

1) Blog post title should be self explanatory

2) Use categories and tags for better navigation

3) Use a theme which attracts not distracts

4) Simple language in blog writing works best

5) Useful blogs get more traffic than autobiographical blogs. Unless you are a celebrity.

6) People who enjoy writing blogs create better blogs

7) Writing a blog is like jogging. Do it every day, even when it’s boring and painful. Or do it as much as your schedule permits.


Interview: Dr Eric Siegel, Author of Predictive Analytics

Here is an interview with Dr Eric Siegel, founding chair of Predictive Analytics Conference and author of the recent bestseller in analytics, Predictive Analytics.

Ajay- What has been the response to your book?

Eric- Since its launch in February, Predictive Analytics has held the #1 bestseller slot in two Amazon categories (planning & forecasting and econometrics), and I have been gratified to see it receive positive reviews. Amazon readers have mostly rated it 5 stars; the inevitable tail of negative reviews has almost all come from more technically inclined readers looking for a “how to” or more mathematical book. (They bought the wrong book and blame the book!) I’ve found most such readers are more than capable of understanding, after a few minutes of conversation, that there’s a place in the world for a book about their field written for a broader readership (I explain this in “5 reasons the book matters to experts”), and in fact the industry overview, new case studies, and treatment of uplift modeling are often of great interest to even senior hands-on experts.
Ajay- You lead an extremely busy life with conferences travel and consulting. Do you plan to write another book and on what topic?
Eric- It’s likely to be a long while, since Predictive Analytics achieved my goal to introduce the field, provide a broad industry overview, and cover the advanced topics that interest me most (in a conceptual manner, but with copious citations for the more technical readers to drill down further). My attention now turns back to improving and broadening the coverage of Predictive Analytics World conference agendas.
In the meantime, I’d suggest readers check out Kaiser Fung’s new book Numbersense, as well as forthcoming books from Dean Abbott.
Ajay- How do you think PAW has positively impacted the analytics fraternity throughout the world?
Eric- The conference has been a central place to engender and catalyze positive industry movement. Predictive Analytics World covers all the bases for both expert practitioners as well as newcomers. As the universal, cross-vendor meeting place that brings together the who’s who of predictive analytics, PAW presents not only unique opportunities to gain knowledge, but the industry’s premier networking event.
Eric Siegel, Ph.D., founder of Predictive Analytics World and Text Analytics World, and Executive Editor of the Predictive Analytics Times, makes the how and why of predictive analytics understandable and captivating. In addition to being the author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Eric is a former Columbia University professor who used to sing to his students, and a renowned speaker, educator, and leader in the field.
Both Predictive Analytics Conference and Dr Eric have been supporters of this website for the past three years.

Different Forks of R- Will SAS create a new version of R too #rstats


A quick and dirty list….

1) Revolution R – Revolution R Community is Revolution Analytics’ free distribution of the open source R programming language, enhanced for users looking for faster performance and greater stability. It’s perfect for learning R and basic analysis.

2) Oracle R Enterprise

Integrates the Open-Source Statistical Environment R with Oracle Database 11g
Oracle R Enterprise allows analysts and statisticians to run existing R applications and use the R client directly against data stored in Oracle Database 11g—vastly increasing scalability, performance and security. The combination of Oracle Database 11g and R delivers an enterprise-ready, deeply integrated environment for advanced analytics. Users can also use analytical sandboxes, where they can analyze data and develop R scripts for deployment while results stay managed inside Oracle Database.

3) Tibco Enterprise Runtime for R

TERR, a key component of Spotfire Predictive Analytics, is an enterprise-grade analytic engine that TIBCO has built from the ground up to be fully compatible with the R language, leveraging our long-time expertise in the closely related S+ analytic engine. This allows customers to continue to develop in open source R, but to then integrate and deploy their R code on a commercially-supported and robust platform—without the need to rewrite their code.

Prototypes are often developed in R, but then typically re-implemented in another language for production purposes because R was not built for enterprise usage. TERR brings enterprise-class scalability and stability to the agile R-language, and enables statisticians to broadly share their analyses through TIBCO Spotfire Statistics Services or by directly embedding the TERR engine.

4) pqR - You gotta love Radford Neal throwing down the gauntlet to the old sleepyheads! At JSM, Montreal, an R Core member announced they have agreed to incorporate his changes, signalling a major departure in the way changes have been handled at R.

pqR is a new version of the R interpreter. It is based on R-2.15.0, distributed by the R Core Team, but improves on it in many ways, mostly in ways that speed it up, but also by implementing some new features and fixing some bugs.

One notable improvement is that pqR is able to do some numeric computations in parallel with each other, and with other operations of the interpreter, on systems with multiple processors or processor cores.


5) Renjin - Renjin is a JVM-based interpreter for the R language for statistical computing. It is a new implementation of the R language and environment for the Java Virtual Machine (JVM), whose goal is to enable transparent analysis of big data sets and seamless integration with other enterprise systems such as databases and application servers.

Renjin is still under development, with a target of a version “1.0” in late 2013, but in the meantime it is being used in production for a number of our client projects, and supports most CRAN packages, including some with C/Fortran dependencies.

6) Riposte (?)

Riposte, a fast interpreter and JIT for R.

Justin Talbot, Zach DeVito

We only do development on OSX and Linux. It’s unlikely that our JIT will work on Windows.

Planned work for July-December 2013. The first three bullet points are currently in progress on the library branch. Partial work will be integrated to main by the end of July.

  • [x] Load the standard base R library without errors
    • This will require support for about 15 primitive and external functions
  • [ ] Support all R primitive operators (~200, 50 supported as of July 2013)
    • [x] The most common 40 or so will appear as bytecodes in the Riposte VM, primarily control flow operators and a small set of common arithmetic
    • [ ] The rest will be implemented in the Riposte core library
    • [x] Implement new .Map, .Scan, or .Fold FFI functions to allow vector fusion through primitives implemented as external calls in the core library
  • [ ] Support for the 200 most common internal functions (out of ~580, 30 supported as of July 2013)


SAP and IBM Netezza already have specialized packages for R.

The question is whether SAS, which supports interaction with R through SAS/IML, even Base R, and JMP, is willing to go the extra mile for customers and create SAS/R. The fact that they made their products compatible with R shows they acknowledge and respect R’s appeal (contrary to old sleepyheads who think all SAS is good and all base R is divine).

SAS/R can be the third major product for the SAS Institute after the SAS and JMP platforms. Any takers, ladies and gentlemen?


Are we in an Analytics Recession- a decline in SAS is a decline for all #rstats

I was intrigued by David Smith’s blog post and played with some of the terms associated with analytics and data science.
Some points on that-
1) The term SAS is broader than the statistical software.
2) The term R is even broader. Accordingly I searched for “R language”, which again is a narrower term.
3) Even by David’s own graph, SAS jobs have declined by 33% over two years, while R jobs have increased by 50%. However, some jobs list both SAS and R, so they will be counted twice.
4) Even by David’s graph, SAS jobs are still twice as many as R jobs. So the overall market for analytics jobs has declined.
5) I have no way of giving an exact conclusion unless I have access to the data, or I fire up a scraper myself.
6) Jobs remain a key area of concern for students and for future growth.
7) Python statistical packages may need to be included shortly. I think sometimes it is easier to teach applied statistics (and data mining) to talented coders than teach scripting to talented statisticians.
8) Is hypothesis testing dead in the era of Big Data? What is a t test or chi square for a million rows? Almost all the better theory for such is locked in Bing or Google Research.
9) The continued existence of Microsoft OS should be a sobering thought to people claiming ultimate victory too soon. Software needs to be sold, and sometimes the better sold software triumphs over the better designed software.
10) I would think that R has completely dominated the academic statistical market the same way SAS was doing it for business analytics some years back. However there exists much work to be done given some limitations of that software.
11) However, SAS Institute revenue continues to grow. One reason for that could be the Institute’s work for big-money clients, including the US Government. Despite Drew Conway, I have yet to come across too many cases of the Big Fed using R.
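
Point 8 above asks what a t test means for a million rows. A minimal sketch in R, using simulated data only: the sample size, the tiny shift of 0.01 and the variable names are assumptions made for illustration, not figures from any real job or sales data.

```r
# Simulated illustration: two groups of one million rows each whose
# true means differ by a practically negligible amount (assumed 0.01).
set.seed(42)                          # fixed seed so the run is repeatable
n <- 1e6
a <- rnorm(n, mean = 0,    sd = 1)
b <- rnorm(n, mean = 0.01, sd = 1)

res <- t.test(a, b)                   # Welch two-sample t test

# Standardised effect size (Cohen's d) using the pooled standard deviation
effect <- (mean(b) - mean(a)) / sqrt((var(a) + var(b)) / 2)

cat("p-value:", res$p.value, "\n")
cat("effect size (Cohen's d):", effect, "\n")
```

With this many rows the p-value tends to come out very small even though the effect size is close to zero, which is one reason to look at effect sizes and confidence intervals rather than significance alone at this scale.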
[Graph: SAS Jobs]

How to be a better writer


Background- I wrote this by accident while trolling on Quora. I was not confident of what I wrote; in fact I wrote it anonymously, except people kept asking me why! It was pure serendipity: I wrote it in less than 4 minutes and submitted it without thinking. Then I edited it once based on feedback.

Someone clearly smarter than me made my tips for writing into a picture

and it went popular on Tumblr just like it did on Quora!

Apparently if some guy like Wil Wheaton likes your words, it can go viral!  It has 41799 notes ( reblogs+hearts) on Tumblr as of now.

Words . Reposted by a member of STAR TREK:NG. I can now die a happy Geek! The Internet is a funny thing!

Thank you everyone! Now if only Google learnt to include OCR for Images as part of text search!

  1. Write 50 words. That’s a paragraph.
  2. Write 400 words. That’s a page.
  3. Write 300 pages. That’s a manuscript.
  4. Write every day. That’s a habit.
  5. Edit and rewrite. That’s how you get better.
  6. Spread your writing for people to comment. That’s called feedback.
  7. Don’t worry about rejection or publication. That’s a writer.
  8. When not writing, read. Read from writers better than you. Read and Perceive.

But overall, just write more to get better.

1887+ votes on Quora!! 🙂 Probably my most viewed content ever!

61036 people  have viewed this answer!

Also it got a mention here-

Ajay Ohri: The 8 Rules of Writing

Now I think I should take some of my own advice and get back to writing

Using ifelse in R for creating new variables #rstats #data #manipulation

The ifelse function is simple and powerful and can help in data manipulation within R. Here I create a categorical variable from specific values in a numeric variable.

> data(iris)

> iris$Type=ifelse(iris$Sepal.Length<5.8,"Small Flower","Big Flower")
> table(iris$Type)

  Big Flower Small Flower
          77           73

The parameters of ifelse are quite simple:

ifelse(test, yes, no)

test: an object which can be coerced to logical mode.

yes: return values for true elements of test.

no: return values for false elements of test.
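
As a small sketch of how the three arguments of ifelse(test, yes, no) behave, the example below uses a made-up vector of lengths (the object names and cut-offs are only illustrative, not taken from the iris data). It shows that the test is evaluated element by element, that a missing value in the test stays missing in the result, and that calls can be nested when more than two categories are needed.

```r
# Hypothetical sepal lengths, including one missing value
len <- c(4.9, 5.8, 6.3, NA)

# Two categories: 'test' is checked for each element in turn
size <- ifelse(len < 5.8, "Small Flower", "Big Flower")
print(size)

# Three categories: nest a second ifelse inside the 'no' argument
size3 <- ifelse(len < 5.0, "Small",
                ifelse(len < 6.0, "Medium", "Large"))
print(size3)
```

For many categories, nested ifelse calls get hard to read, and base R's cut() function may be a tidier choice for binning a numeric variable.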

