Personal- Google's bADSENSE

You may have noticed that I removed the Google AdSense ads from this site. The reason is that I found no correlation at all between what I was writing and the kind of ads I saw.

Maybe it is my location (India), but after watching ads for career and job sites, video games, marketing networks and computer training alongside some of my writing, I decided to split with the big G and call an end to Google’s bad AdSense.

The irony of a data mining blog failing to get relevant data mining ads from a data mining search engine.

(NOT coming up- Decisionstats T Shirts with quotes from interviews)

Now back to coding and research.

Bring it on Bing

A few notes on Bing

[Screenshot: Bing search for Ajay Ohri in Mozilla Firefox]

  • The design is better (read: newer). Google still thinks design is something they studied and forgot in semester 1 of engineering, but the iPod-like design is cool.
  • I like the link preview feature: just hover the mouse to get a sleek preview of the page a search result goes to. I think it saves A LOT of time.
  • Surprisingly, there are more results, and in a different order, than on Google.
  • The image results were again different from Google's, but I liked the image options in the left margin.
  • Google's results are still more pertinent (but not by much) on the first page, but Bing’s archive seemed fresher (for example, it caught my LinkedIn profile's changed URL while Google gave an error).

Overall summary: it is NEW and DIFFERENT and GOOD. Good enough to add to the toolbar. But not great enough to leave 8-year-old habits of Googling it. Unless the Google guys really bung it up.

Citation- http://bing.com

[Screenshot: Bing image search for Ajay Ohri in Mozilla Firefox]

More R please

some R news

0 The R Foundation Website: I guess the http://www.r-project.org team is busy prettifying the website before the annual R users conference kicks in (I was told it has the aesthetic appeal of a dead cat splattered on the autobahn: a very HTML 4.0 kind of retro look).

I can't believe the R site and R core honchos find the following image the prettiest one to represent the graphical abilities of R.

The R core site has tremendous functionality and demand, though I wonder if they could just put up some ads and get some funding, or a two-way research tie-up with Google. Google uses R extensively, can help with online methods as well, and is listed as a supporting organization at http://www.r-project.org/foundation/memberlist.html

The R archives are a collection of emails and that's not documentation at all – but

1 The REvolution R website, and particularly David Smith’s blog at http://blog.revolution-computing.com/, is a great way to stay updated on R news.

I have covered REvolution R before, and they are truly impressive.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

It seems the domain name revolutioncomputing.com was squatted (by NC?), so that's why the web name is hyphenated. It is a very lucid website, though I do request them to put up more videos and podcasts, and a "Tweet this" button would be great :))

and another more techie post here

http://blog.revolution-computing.com/2009/05/verifying-zipfs-powerdistribution-law-for-cities.html

Another great source is Twitter: it seems that R users on Twitter use the hashtag #rstats to tag and search for R news and code. That should help R bloggers and, at a later date, users.

Click here to check it out:

http://search.twitter.com/search?q=%23rstats
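
A note on the search link above: in a URL, a bare # starts the fragment part, which the browser does not send to the server, so a hashtag inside a query string has to be percent-encoded as %23. Here is a minimal shell sketch, assuming the #rstats hashtag and the search.twitter.com address mentioned in this post (the variable names are just for illustration):

```shell
# Hashtag to search for (taken from the post; the variable names are illustrative).
tag='#rstats'

# Percent-encode the '#' so it stays part of the query instead of starting a URL fragment.
encoded=$(printf '%s' "$tag" | sed 's/#/%23/g')

# Build the search URL and print it.
url="http://search.twitter.com/search?q=${encoded}"
echo "$url"
```

This only handles the # character; a search term containing spaces or other reserved characters would need fuller encoding.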

2 Some more R forums and sites

Forum for R Enterprise Users http://www.revolution-computing.com/forum

An R tips site http://onertipaday.blogspot.com/

The R Journal ( yes there is a journal for all hard working R fans) http://journal.r-project.org/

R on Linkedin http://www.linkedin.com/groups?about=&gid=77616

and the Analytic Bridge community group for R

http://www.analyticbridge.com/group/rprojectandotherfreesoftwaretools

3 Here is a terrific post by Robert Grossman at http://blog.rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

I liked the way he built the case for using R on Amazon EC2 (a business case, not a use case) and then proceeded to a step-by-step tutorial: a simple and powerful blog post. I hope R comes out with a standardized online R doc like that, a single-point searchable archive for code, something like the SAS online doc (which remains free for WPS users 😉), but the way the web is evolving it seems the present mishmash method will continue.

Here are the main steps from his post to use R on a pre-configured AMI.

Set up.
The set up needs to be done just once.

1. Set up an Amazon Web Services (AWS) account by going to aws.amazon.com. If you already have an Amazon account for buying books and other items from Amazon, then you can also use this account for AWS.
2. Log in to the AWS console.
3. Create a “key-pair” by clicking on the link “Key Pairs” in the Configuration section of the Navigation Menu on the left hand side of the AWS console page.
4. Click on the “Create Key Pair” button, about a quarter of the way down the page.
5. Name the key pair and save it to your working directory, say /home/rlg/work.

Launching the AMI. These steps are done whenever you want to launch a new AMI.

1. Login to the AWS console. Click on the Amazon EC2 tab.
2. Click the “AMIs” button under the “Images and Instances” section of the left navigation menu of the AWS console.
3. Enter “opendatagroup” in the search box and select the AMI labeled “opendatagroup/r-timeseries.manifest.xml”, which is AMI instance “ami-ea846283”.
4. Enter the number of instances to launch (1), the name of the key pair that you have previously created, and select “web server” for the security group. Click the launch button to launch the AMI. Be sure to terminate the AMI when you are done.
5. Wait until the status of the AMI is “running.” This usually takes about 5 minutes.

Accessing the AMI.

1. Get the public IP address of the new AMI. The easiest way to do this is to select the AMI by checking the box. This provides some additional information about the AMI at the bottom of the window. You can copy the IP address there.
2. Open a console window and cd to your working directory which contains the key-pair that you previously downloaded.
3. Type the command:
ssh -i testkp.pem -X root@ec2-67-202-44-197.compute-1.amazonaws.com

Here we assume that the name of the key-pair you created is “testkp.pem.” The flag “-X” starts a session that supports X11. If you don’t have X11 on your machine, you can still login and use R but the graphics in the example below won’t be displayed on your computer.
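
One practical aside that is not part of the tutorial itself: OpenSSH generally refuses to use a private key file that other users can read, so it can help to tighten the permissions on the downloaded key before connecting. Here is a minimal sketch, assuming the key name testkp.pem and the example host name used above (your own instance will have a different host name):

```shell
# Key file and host from the example above; substitute your own values.
key_file="testkp.pem"
host="ec2-67-202-44-197.compute-1.amazonaws.com"

# Make the key readable by its owner only (skipped if the file is not in this directory).
if [ -f "$key_file" ]; then
  chmod 400 "$key_file"
fi

# Assemble the same ssh command as in step 3 and print it for review before running it.
ssh_cmd="ssh -i $key_file -X root@$host"
echo "$ssh_cmd"
```

The sketch only prints the command, so you can check the key and host name before pasting it into the console.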

Using R on the AMI.

1. Change your directory and start R

#cd examples
#R
2. Test R by entering an R expression, such as:

> mean(1:100)
[1] 50.5
>
3. From within R, you can also source one of the example scripts to see some time series computations:

> source('NYSE.r')
4. After a minute or so, you should see a graph on your screen. After the graph is finished being drawn, you should see a prompt:

CR to continue

Enter a carriage return and you should see another graph. You will need to enter a carriage return 8 times to complete the script (you can also choose to break out of the script if you get bored with all the graphs).
5. When you are done, exit your R session with a control-D. Exit your ssh session with an “exit” and terminate your AMI from the Amazon AWS console. You can also choose to leave your AMI running (it is only a few dollars a day).

Acknowledgements: Steve Vejcik from Open Data Group wrote the R scripts and configured the AMI.

Ajay: Terrific R companies, blogs, tweets, research and sites, but do let me know your feedback. Just un-other R day.

Google Custom Search

Here is a revised version of the Custom Search Engine that I first talked of last year. This year it also includes Business Intelligence sites.

Try it out and let me know if you want to help create a customized Data Mining Engine. Note that it already has 800-plus analytics and Business Intelligence sites.

I got much better results than Google when searching for R, but that's to be expected 🙂

For the past week I have been fending off repeated denial-of-service attacks on Decisionstats.com.

My email addresses are widely known thanks to my participation on technical forums and my willingness to reach out and make friends (8500+ on LinkedIn alone).

The attacks were particularly vicious, involving email accounts, server changes, IP addresses used to deny certain geographies, poisoned RSS feeds and scripts, and scripts in folders to redirect posts.

Thanks to the fine team supporting me and some friendly advice from bloggers at Google, the worst is over. I have not hesitated to bash Google at will over its big size and even added a few cartoons on them (even now I think they should sell Orkut to Facebook and add Twitter to create a spinoff, Google Social Media, for an IPO and take that particularly search and advertising).

But Google Search, Gmail, Google Analytics and particularly YouTube are free, and thanks to them knowledge has spread all around the world. Working behind the scenes, the Google team also spends a huge amount fighting spam, all for free. Or for a bit of advertising on a website.

Thanks a lot, Google men, and here is a YouTube video (which you may have seen many times before, but still) that celebrates the oneness of a global online community (despite the occasional Sith attacks).

saP or saS or sasR or saaS

Some pending news and posts: it appears that the company SAP is moving closer to major acquisitions. This includes launching more and more applications that are analytical in nature, as well as coming together in an alliance with hardware major Teradata. Teradata, of course, is a very close partner of SAS Institute. So could SAP and SAS and/or Teradata be moving closer to a major announcement on BI and BA merging?

The open source database movement with Hadoop is the one which can be the real game changer in the managed database industry and AsterData is the company to watch here.

However, R with its modular extensions is a different paradigm in language development, and SAS no longer has the nimbleness or flexibility to create such apps. At the same time it has lost a fair deal of credibility among young academics (due to R) as well as cost-sensitive consumers (due to WPS).

The succession issue of Jim Goodnight continues to be the biggest problem for SAS Institute. Jim is not getting younger, and his second line is not expected to be of the same class as the Sall/Goodnight partnership. Among the heads of the major software companies, Jim Goodnight stood alone in keeping his company private, and thus managed to escape the distractions of share prices while building up the franchise. Surviving oil shocks, cold wars and three recessions, Mr Goodnight has cared for his local community as well, despite being active in SAS and fending off sustained attempts by open source languages.

An automatic partner for Mr Goodnight should have been Google or even Google Labs, with the Brin/Page duo being the top data miners (commercially) of this generation, as Sall/Goodnight were 30 years ago.

SAP may spend a lot of its cash but the supply chain paradigm is best served by SaaS and exemplified by Salesforce.com and Force.com developers.

As the ancient Chinese said- May you live in interesting times.