2010 in review and WP-Stats

The following is an auto-generated post, courtesy of the WordPress.com stats team. Clearly they have got some stuff wrong:

1) The speedometer is not defined quantitatively

2) The busiest day numbers are plain wrong (2 views??)

3) There is still no geographic data in WordPress.com stats (unlike Google Analytics), and I can't enable Google Analytics on a WordPress.com-hosted site.

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 97,000 times in 2010. If it were an exhibit at The Louvre Museum, it would take 4 days for that many people to see it.

In 2010, there were 367 new posts, growing the total archive of this blog to 1191 posts. There were 411 pictures uploaded, taking up a total of 121mb. That’s about 1 pictures per day.
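
A quick sanity check of the arithmetic above, as a minimal sketch in R. The only inputs are the figures quoted in the summary; the assumption (mine, not WordPress.com's) is that visitors and views are spread evenly over a 365-day year.

```r
# Figures quoted in the stats summary above
louvre_visitors_per_year <- 8.5e6
blog_views_2010 <- 97000
pictures_uploaded <- 411

# Assumption: everything is spread evenly over a 365-day year
louvre_visitors_per_day <- louvre_visitors_per_year / 365
days_to_match_views <- blog_views_2010 / louvre_visitors_per_day
pictures_per_day <- pictures_uploaded / 365
average_views_per_day <- blog_views_2010 / 365

round(days_to_match_views, 1)   # about 4.2 days
round(pictures_per_day, 2)      # about 1.13 pictures per day
round(average_views_per_day)    # about 266 views per day
```

On these numbers the "4 days" and "1 picture per day" claims hold up, while an average of roughly 266 views per day makes a busiest day of 2 views look implausible.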

The busiest day of the year was September 22nd with 2 views. The most popular post that day was Top 10 Graphical User Interfaces in Statistical Software.

Where did they come from?

The top referring sites in 2010 were r-bloggers.com, reddit.com, rattle.togaware.com, twitter.com, and Google Reader.

Some visitors came searching, mostly for libre office, facebook analytics, test drive a chrome notebook, test drive a chrome notebook., and wps sas lawsuit.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

Top 10 Graphical User Interfaces in Statistical Software, April 2010
8 comments and 1 Like on WordPress.com

Wealth = function (numeracy, memory recall), December 2009
1 Like on WordPress.com

Matlab-Mathematica-R and GPU Computing, September 2010
1 Like on WordPress.com

About DecisionStats, July 2008

The Top Statistical Softwares (GUI), May 2010
1 comment and 1 Like on WordPress.com


Privacy Browsing Extensions in Google Chrome

Using two Chrome extensions, Disconnect and AdBlock, you can be sure of a very clean browsing experience. They are recommended especially if you don't like the auto-sharing of your personal preferences and cannot be bothered with the Byzantine maze of social media privacy fine print.

https://chrome.google.com/extensions/detail/jeoacafpbcihiomhlakheieifhpjdfeo

Disconnect by Brian Kennish

(184) – 44,284 users – Weekly installs: 24,086

Stop major third parties and search engines from tracking the webpages you go to and searches you do.

* Search depersonalization is now optional and off by default. Click the “d” button then the “Depersonalize searches” checkbox to turn this feature on (or back off in case you have trouble getting to Google or Yahoo services). For help with anything else, see the known issues below and ask questions at http://j.mp/dnewgroup.

§

If you’re a typical web user, you’re unintentionally sending your browsing and search history with your name and other personal information to third parties and search engines whenever you’re online.

Take control of the data you share with Disconnect!

From the developer of the top-10-rated Facebook Disconnect extension, Disconnect lets you:

• Disable tracking by third parties like Digg, Facebook, Google, Twitter, and Yahoo, without requiring any setup or significantly degrading the usability of the web.

• Truly depersonalize searches on search engines like Google and Yahoo (by blocking identifying cookies not just changing the appearance of results pages), while staying logged into other services — e.g., so you can search anonymously on Google and access iGoogle at once.

• See how many resource and cookie requests are blocked, in real time

and

https://chrome.google.com/extensions/detail/gighmmpiobklfepjocnamgkkbiglidom

AdBlock

(6937) - 1,615,373 users - Weekly installs: 153,032

The most popular Chrome extension, with over 1.5 million users! Blocks ads all over the web.

Verified author: chromeadblock.com

New in version 2.1: Translated into dozens of languages!
New in version 2.0: Ads are blocked from downloading, instead of just being removed after the fact!

The official AdBlock For Chrome!  Block all advertisements on all web pages.  Your browser is automatically updated with additions to the filter: just click Install, then visit your favorite website and see the ads disappear!

FAQs:

1. This is the official AdBlock extension: the original ad blocker written from the ground up to be optimized in Chrome.  There's an unrelated, older Firefox project called Adblock Plus, and they're working on making a Chrome version out of the old AdThwart codebase.  At the moment AdBlock blocks some ads that AdThwart only hides, but they're working to improve it.  It's available at bit.ly/id2Gqx; if you have trouble with AdBlock, they're good guys and a fine alternative!


Test Drive a Google Chrome Notebook: Last Two Days left

Test drive a Chrome notebook.

We have a limited number of Chrome notebooks to distribute, and we need to ensure that they find good homes. That’s where you come in. Everything is still very much a work in progress, and it’s users, like you, that often give us our best ideas about what feels clunky or what’s missing. So if you live in the United States, are at least 18 years old, and would like to be considered for our small Pilot program, please fill this out. It should take about 15 minutes. We’ll review the requests that come in and contact you if you’ve been selected.

This application will be open until 11:59:59 PM PST on December 21, 2010.

What type of user are you?

https://services.google.com/fb/forms/cr48advanced/

Business

Education

Non-Profit

Developer

Individual


Google Books Ngram Viewer

Here is a terrific data visualization from Google, based on their digitized books collection. How does it work? Basically, you can chart the frequency of various words across time periods from the 1700s to 2010.

For example, the frequency and intensity of kung fu vs yoga, or pizza versus hot dog. The underlying datasets cover millions, even billions, of words.

Here is my yoga vs kung fu vs judo graph.

http://ngrams.googlelabs.com/info

What’s all this do?

When you enter phrases into the Google Books Ngram Viewer, it displays a graph showing how those phrases have occurred in a corpus of books (e.g., “British English”, “English Fiction”, “French”) over the selected years. Let’s look at a sample graph:

This shows trends in three ngrams from 1950 to 2000: “nursery school” (a 2-gram or bigram), “kindergarten” (a 1-gram or unigram), and “child care” (another bigram). What the y-axis shows is this: of all the bigrams contained in our sample of books written in English and published in the United States, what percentage of them are “nursery school” or “child care”? Of all the unigrams, what percentage of them are “kindergarten”? Here, you can see that use of the phrase “child care” started to rise in the late 1960s, overtaking “nursery school” around 1970 and then “kindergarten” around 1973. It peaked shortly after 1990 and has been falling steadily since.

(Interestingly, the results are noticeably different when the corpus is switched to British English.)
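
To make the y-axis description above concrete, here is a toy sketch in R: the share of all bigrams in a corpus that equal a given phrase. The corpus is four made-up sentences of mine, not Google Books data, and the sketch ignores the per-year breakdown that the viewer applies; the helper names make_bigrams and bigram_share are hypothetical.

```r
# Toy corpus: made-up sentences, NOT Google Books data
corpus <- c(
  "the nursery school opened early",
  "child care costs rose again",
  "the child care center is near the nursery school",
  "child care matters"
)

# Split each sentence into lowercase words
tokens <- strsplit(tolower(corpus), " ", fixed = TRUE)

# Build all bigrams (adjacent word pairs) within one sentence
make_bigrams <- function(words) {
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}
bigrams <- unlist(lapply(tokens, make_bigrams))

# Percentage of all bigrams in the corpus that equal the given phrase
bigram_share <- function(phrase) 100 * mean(bigrams == phrase)

round(bigram_share("child care"), 2)      # 16.67 (3 of 18 bigrams)
round(bigram_share("nursery school"), 2)  # 11.11 (2 of 18 bigrams)
```

A unigram such as "kindergarten" would be handled the same way, but divided by the count of all single words rather than all bigrams.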

Corpora

Below are descriptions of the corpora that can be searched with the Google Books Ngram Viewer. All of these corpora were generated in July 2009; we will update these corpora as our book scanning continues, and the updated versions will have distinct persistent identifiers.

Informal corpus name	Persistent identifier	Description
American English	googlebooks-eng-us-all-20090715	Same filtering as the English corpus but further restricted to books published in the United States.
British English	googlebooks-eng-gb-all-20090715	Same filtering as the English corpus but further restricted to books published in Great Britain.


2011 Forecast-ying

I recently asked some friends from my Twitter lists for their take on 2011. At least 3 of them responded with an answer, 1 said they were still working on it, and 1 cited a recent office event.

Anyway, I take note of this view of forecasting from

http://www.uiah.fi/projekti/metodi/190.htm

The most primitive method of forecasting is guessing. The result may be rated acceptable if the person making the guess is an expert in the matter.

Ajay: People will forecast at the end of 2010 and in 2011. Many of them will get forecasts wrong, some very wrong, but by Dec 2011 most of them will be writing forecasts for 2012. Almost no one will get called out by irate users or readers ("Hey, you got 4 out of 7 wrong in last year's forecast!"). It just won't happen. People thrive on hope. So does marketing. In 2011, and before.

and some forecasts from Tom Davenport’s The International Institute for Analytics (IIA) at

http://iianalytics.com/2010/12/2011-predictions-for-the-analytics-industry/

Regulatory and privacy constraints will continue to hamper growth of marketing analytics.

(I wonder how privacy and analytics can coexist in peace forever. One view is that model building can use anonymized data. Suppose your IP address was anonymized using a standard secret formula, like the Coca-Cola recipe: then whatever model does get built would not be of concern to you individually, as your privacy is protected by the anonymization formula.)

Anyway, back to the question I asked:

What are the top 5 events in your industry (events as in things that occurred, not conferences), and what are the top 3 trends in 2011?

I define my industry as online technology writing and research (with a heavy skew towards statistical computing).

My top 5 events for 2010 were-

1) Consolidation: The Big 5 software providers in BI and analytics bought more, sued more, and consolidated more. The valuations rose, and rose, leading to even more smaller players entering. Thus consolidation proved an oxymoron, as the total number of influential AND disruptive players grew.

2) Cloudy computing: Computing shifted away from the desktop, but towards mobile, and more to the tablet than to the cloud. An iPad front end with an Amazon EC2 back end: yup, it happened.

3) Open source grew louder: Yes, it got more clients, and more revenue. Did it get more market share? That depends on whether you define market share by revenues or by users.

Both open source and closed source had a good year. The pie grew faster and bigger, so no one minded as long as their slices grew bigger.

4) We didn't see that coming:

Technology continued to surprise with events (that's what we love, the surprises!).

Revolution Analytics broke through R's Big Data barrier, Tableau Software created a big buzz, and WikiLeaks and Chinese firewalls gave technology an entirely new dimension (though not a universally popular one).

People fought wars on emails, servers, and social media. Unfortunately, the ones fighting real wars in 2009 continued to fight them in 2010 too.

5) Money:

SAP, SAS, IBM, Oracle, Google, and Microsoft made more money than ever before. Only Facebook got a movie made about it. Venture capitalists pumped money into promising startups, as if in a hurry to park money before tax cuts expired in some countries.

2011 Top Three Forecasts

1) Surprises: Expect to be surprised at least 10% of the time by business events. As the internet grows, the communication cycle shortens, the hype cycle amplifies buzz, and more unstructured data is created (especially for marketing analytics), leading to enhanced volatility.

2) Growth: Yes, we predict technology will grow faster than the automobile industry. Game changers may come in the form of Chrome OS (really, it's Linux, guys) and customer adaptability to new USER INTERFACES. Design will matter much more in technology: on your phone, on your desktop, and on your internet. Packaging sells.

False Top Trend 3) I will write a book on business analytics in 2011. Yes, it is true, and I am working with a publisher. No, it is not really going to be a top 3 event for anyone except me, the publisher, and the lucky guys who read it.

3) Creating technology and technically enabling creativity will converge at an accelerated rate. Use of widgets, GUIs, snippets, and IDEs will ensure creative left brains can code more easily, and right brains can design faster and better, thanks to a global supply chain of techie and artsy professionals.


Trying out Google Prediction API from R

So I saw the news at the NY R Meetup and decided to have a go at the Prediction API package, which first started off as a blog post at

http://onertipaday.blogspot.com/2010/11/r-wrapper-for-google-prediction-api.html

1) My OS was Ubuntu 10.10 Netbook

Ubuntu has a slight glitch, plus a workaround, for installing the RCurl package, on which the Google Prediction API package depends: you first need to install the Ubuntu package libcurl4-gnutls-dev for RCurl to install.

Once you have installed that using Synaptic, simply start R.

2) Install the packages rjson and RCurl using install.packages and choosing a CRAN mirror

3) Since googlepredictionapi is not yet on CRAN, download that package from

https://code.google.com/p/google-prediction-api-r-client/downloads/detail?name=googlepredictionapi_0.1.tar.gz&can=2&q=

4) Copy this downloaded package to your “first library” folder

When you start R, simply run

.libPaths()[1]

and that's the folder to which you copy the googlepredictionapi package you downloaded.
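
Since the tarball now sits in .libPaths()[1], one way to make the install step independent of R's working directory is to build the full path to it. A minimal sketch follows; the helper name install_prediction_client is hypothetical (it is not part of the client package), and it assumes the tarball keeps its download name.

```r
# Hypothetical helper: install the downloaded source tarball by full path
install_prediction_client <- function(
    tarball = file.path(.libPaths()[1], "googlepredictionapi_0.1.tar.gz")) {
  # Fail early with a clear message if the file was not copied there
  if (!file.exists(tarball)) {
    stop("Tarball not found: ", tarball)
  }
  # Dependencies named in this post; install them only if missing
  for (pkg in c("rjson", "RCurl")) {
    if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
  }
  # repos = NULL tells R to install from the local file, not a repository
  install.packages(tarball, repos = NULL, type = "source")
  invisible(tarball)
}
```

Calling install_prediction_client() with no arguments looks in the first library folder; pass another path if you saved the download elsewhere.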

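5) Now the following line works, provided R's working directory is that same folder (install.packages looks for a bare file name relative to the working directory)

At the R prompt,

> install.packages("googlepredictionapi_0.1.tar.gz", repos=NULL, type="source")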

6) Uploading data to Google Storage using the GUI (rather than gsutil)

Just go to https://sandbox.google.com/storage/

and that's the Google Storage manager.

Notes on training data:

Use a CSV file.

The first column is the score column (like 1,0 or a prediction score).

There are no headers, so delete the headers from the data file and move the dependent variable to the first column. (Note: I used data from the Kaggle contest for R package recommendation at

http://kaggle.com/R?viewtype=data )
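
The training data notes above can be sketched in a few lines of base R. The data frame here is made up for illustration (it is not the Kaggle data), and the column name installed is a hypothetical stand-in for whatever your dependent variable is called.

```r
# Hypothetical example data; "installed" stands in for the dependent variable
raw <- data.frame(
  package   = c("ggplot2", "plyr", "rjson"),
  downloads = c(120, 80, 15),
  installed = c(1, 1, 0)
)

# Move the dependent variable to the first column
target <- "installed"
training <- raw[, c(target, setdiff(names(raw), target))]

# Write a CSV with no header row and no row names
out_file <- file.path(tempdir(), "training_data.csv")
write.table(training, out_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Inspect the result: each line should start with the score
readLines(out_file)
```

The resulting file is what you would then upload through the Google Storage manager.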

7) The good stuff:

Once you type in the basic syntax, the first time it will ask for your Google credentials (email and password).

It then starts showing you the time elapsed for training.

Now you can disconnect and go off. (Actually, I got disconnected by accident and came back after about 5 minutes, so this is only my guess at what happened and why; don't blame me, test it for yourself.)

When you come back (hopefully before the token expires), you can see the status of your request (see below).

> library(rjson)
> library(RCurl)
Loading required package: bitops
> library(googlepredictionapi)
> my.model <- PredictionApiTrain(data="gs://numtraindata/training_data")
The request for training has sent, now trying to check if training is completed
Training on numtraindata/training_data: time:2.09 seconds
Training on numtraindata/training_data: time:7.00 seconds

Note that I changed the format of the URL where my data is located. Simply go to your Google Storage Manager and right-click on the file name to get the link address (https://sandbox.google.com/storage/numtraindata/training_data.csv),

then change it to gs://numtraindata/training_data (that helps avoid syntax errors).
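
That rewrite is easy to get wrong by hand, so here is a minimal sketch that does it with two substitutions. The function name to_gs_path is hypothetical, and dropping the .csv extension simply mirrors what I did above; I have not checked whether the API requires it.

```r
# Hypothetical helper: turn a Storage Manager link into a gs:// path
to_gs_path <- function(url) {
  # Drop the host prefix used by the sandbox Storage Manager
  path <- sub("^https://sandbox\\.google\\.com/storage/", "", url)
  # Drop a trailing .csv extension, as in the example above
  path <- sub("\\.csv$", "", path)
  paste0("gs://", path)
}

to_gs_path("https://sandbox.google.com/storage/numtraindata/training_data.csv")
# "gs://numtraindata/training_data"
```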

8) Going by the rather high-level instructions at https://code.google.com/p/google-prediction-api-r-client/, you could also try this on a local file

Usage

## Load googlepredictionapi and dependent libraries
library(rjson)
library(RCurl)
library(googlepredictionapi)

## Make a training call to the Prediction API against data in the Google Storage.
## Replace MYBUCKET and MYDATA with your data.
my.model <- PredictionApiTrain(data="gs://MYBUCKET/MYDATA")

## Alternatively, make a training call against training data stored locally as a CSV file.
## Replace MYPATH and MYFILE with your data.
my.model <- PredictionApiTrain(data="MYPATH/MYFILE.csv")

At the time of writing my data was still getting trained, so I will keep you posted on what happens.


Test drive a Chrome notebook.

Wanna test out the new Chrome OS?

Go to https://services.google.com/fb/forms/cr48basic/

and fill in the form.

Test drive a Chrome notebook.

We have a limited number of Chrome notebooks to distribute, and we need to ensure that they find good homes. That’s where you come in. Everything is still very much a work in progress, and it’s users, like you, that often give us our best ideas about what feels clunky or what’s missing. So if you live in the United States, are at least 18 years old, and would like to be considered for our small Pilot program, please fill this out. We’ll review the requests that come in and contact you if you’ve been selected.

https://services.google.com/fb/forms/cr48basic/

