REvolution Computing Releases Commercial R: The Analytics Market Just Got Better

I just downloaded REvolution Computing’s latest release of REvolution R. The individual 32-bit Windows version is free, while the Enterprise version, which adds 64-bit support, is sold by subscription. Tech support is included in the services contract for the software, which should help any corporation willing to take R on a trial basis.

 

From the press release:

REvolution Computing Makes High Performance ‘REvolution R’ Available For Download

New Haven, CT – January 28, 2009 – REvolution Computing, a leading provider of open source predictive analytics solutions, today announced that it has made a public version of its commercial grade REvolution R program available for download from its website. REvolution R is REvolution Computing’s distribution of the popular R statistical software, optimized for use in commercial environments.

With the latest release of REvolution R, REvolution Computing has added significant performance enhancements to the base system, which can prove to be of great value in both commercial and research settings. A key feature includes the use of powerful optimized libraries capable of boosting performance by a factor of 5 or 10 for commonly used operations. In addition, REvolution R has been put through a quality process designed to meet regulatory agency audit standards, making the subscription version reliable for use in mission critical research and production.

“In making our latest release of REvolution R available for download, REvolution Computing is providing all R users the ability to take advantage of optimized and validated software previously available only to commercial users,” said REvolution Computing CEO, Richard Schultz. “In a true commercial open source way, we have reached the point in our development that we are able to offer significant value to both sets of our community users – REvolution R for all users, and REvolution R Enterprise, with additional commercial-grade capabilities and support, available by annual subscription.”

REvolution’s commercial distribution, REvolution R Enterprise, features advanced functionality, including ParallelR, which speeds deployment across both multiprocessor workstations and clusters to enable the same codes to be used for prototyping and production. REvolution R Enterprise is functional with 64-bit platforms and Linux enterprise platforms and provides for telephone support and response guarantees.

Some background on the company, from the company itself:

 

About REvolution Computing

New Haven, Connecticut-based REvolution Computing is the leading commercial provider of software and support for the statistical computing language known as “R.” 

Our products, including REvolution R and REvolution R Enterprise, enable statisticians, scientists and others to create superior predictive models and derive meaning from large sets of mission-critical data in record time. REvolution Computing works closely with the R community to incorporate the latest developments in open source R, and with our clients to support their efforts to produce groundbreaking innovations in life sciences, financial services, defense technology and other industries where high-level analytics are crucial to success. At REvolution Computing, “We do the math.”

The product names “RPro,” “ParallelR,” “REvolution R,” and “REvolution R Enterprise,” are trademarks of REvolution Computing.

 

This basically gives the company first-mover advantage in commercial R. The timing is also fortunate, as companies across the world look to cut costs (unfortunately, labor costs are being cut faster than software costs) and to move beyond the traditional analytics software that performed oh so well in the subprime prediction market.

REvolution R is available for download on Windows and Intel MacOS X, both in 32-bit mode at http://www.revolution-computing.com/downloads/revolution-r.php
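
A claim like “a factor of 5 or 10 for commonly used operations” is easy to check on your own machine. The sketch below is my own illustration, not something from the press release: it times a matrix cross-product in base R, so running it once under stock R and once under REvolution R gives two numbers to compare. The helper name time_crossprod and the matrix size are assumptions chosen for the example.

```r
# Time a dense matrix cross-product, an operation that leans heavily on
# the underlying linear algebra library.
# time_crossprod is a hypothetical helper name, not part of any package.
time_crossprod <- function(n = 500, reps = 3) {
  set.seed(1)                              # same matrix on every run
  x <- matrix(rnorm(n * n), nrow = n)
  elapsed <- numeric(reps)
  for (i in seq_len(reps)) {
    elapsed[i] <- system.time(crossprod(x))[["elapsed"]]
  }
  median(elapsed)                          # median is less noisy than one run
}

secs <- time_crossprod()
cat("Median seconds for a 500 x 500 crossprod:", secs, "\n")
```

Timings this small are noisy, so it may be worth raising n or reps before drawing any conclusion about a speed-up.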

Using Google Docs for Web Scraping

While trying to scrape some data from a website, I chanced upon the importXML function, which is pretty neat: it lets you import the XML feed of a webpage and then parse the data appropriately.

 

Here is an example.

Using the importXML function, I parsed all the links in the Google search results for “analytics consultant in India”.

The importXML function works as follows (from the support page):

Functions:

=importXML("URL","query")

  • URL – the URL of the XML or HTML file
  • query – the XPath query to run on the data given at the URL. For example, "//a/@href" returns a list of the href attributes of all <a> tags in the document (i.e. all of the URLs the document links to). For more information about XPath, please visit http://www.w3schools.com/xpath/
  • Example: =importXml("www.google.com", "//a/@href"). This returns all of the href attributes (the link URLs) in all the <a> tags on www.google.com home page
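
The same XPath query can be tried outside the spreadsheet. This is a minimal sketch in R, assuming the XML package is installed. The HTML snippet and the variable names are made up for illustration, and it parses an inline string rather than fetching a live page.

```r
library(XML)  # assumed to be installed; provides htmlParse() and xpathSApply()

# A small made-up HTML document standing in for a downloaded page.
html <- paste0(
  "<html><body>",
  "<a href='http://www.r-project.org/'>R</a>",
  "<a href='http://www.w3schools.com/xpath/'>XPath</a>",
  "<p>No link here</p>",
  "</body></html>"
)

doc <- htmlParse(html, asText = TRUE)

# "//a/@href" selects the href attribute of every <a> tag in the document.
links <- unname(xpathSApply(doc, "//a/@href"))
print(links)
```

Swapping the query string for another XPath expression, such as "//p", is a quick way to experiment before putting the query into a spreadsheet formula.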

 

You can see it here:

http://spreadsheets.google.com/pub?key=pS9vSxWuwOllXHdueY0TDdg


 

Web Crawling Automation

Apart from the various ways you can use Perl or other scripting languages for automated web crawling, here are some relatively low-technology solutions for people who want to download web pages or web data. Some people also call this web scraping.

 

The first method uses the RCurl package (from the R-help archives).

The R-help list archive can also be found at http://www.nabble.com/R-help-f13820.html.

 

> library(RCurl)
> my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
> getURL(my.url)

A variation is the following line of code, which follows any redirects the server sends:

getURL(my.url, followlocation = TRUE)

To see the information being sent from R and received by R from the server, use:

getURL(my.url, verbose = TRUE)
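
getURL() returns the raw HTML as one long string, so a parsing step is usually needed before the text is useful. The sketch below is one way to do that, assuming the XML package is installed alongside RCurl. The helper names extract_paragraphs and fetch_paragraphs are hypothetical. The demonstration runs on an inline snippet so that it works without a network connection.

```r
library(XML)  # assumed installed; provides htmlParse(), xpathSApply(), xmlValue()

# Pull the text of every <p> element out of an HTML string.
extract_paragraphs <- function(html) {
  doc <- htmlParse(html, asText = TRUE)
  paras <- xpathSApply(doc, "//p", xmlValue)
  trimws(as.character(paras))
}

# Download a page with RCurl and extract its paragraphs.
# Needs the RCurl package and network access, so it is defined but not called here.
fetch_paragraphs <- function(url) {
  html <- RCurl::getURL(url, followlocation = TRUE)
  extract_paragraphs(html)
}

sample_html <- "<html><body><h1>Title</h1><p> First paragraph. </p><p>Second.</p></body></html>"
paras <- extract_paragraphs(sample_html)
print(paras)
```

Calling fetch_paragraphs(my.url) would apply the same extraction to a live page, though real pages often need a more specific XPath query than "//p".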

The second method uses the RDCOMClient package in R (Windows only, since it drives Internet Explorer through COM):

> library(RDCOMClient)
> my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
> ie <- COMCreate("InternetExplorer.Application")
> txt <- list()
> ie$Navigate(my.url)
NULL
> while(ie[["Busy"]]) Sys.sleep(1)
> txt[[my.url]] <- ie[["document"]][["body"]][["innerText"]]
> txt
$`http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2`

[1] "Skip to article Try Electronic Edition Log …

The third way (a personal favorite) is to use the Firefox add-in iMacros from www.iopus.com, if you need to extract huge amounts of data and copy and paste it into text and Excel files. The add-in works almost the same way as the Record Macro feature in Excel, with the difference that it records all the clicks, downloads, URLs and so on from the browser.

It can even automate website testing, and data entry tasks.

While the Firefox add-in is free, the Internet Explorer version costs 49 USD.

Happy Republic Day

India celebrates its 59th year as a Republic. (It took us three years to write, debate and finalize our constitution, until 1950, when we finally got the Constitution and a President as the figurehead of a republican democracy.)

59 years,

One billion people,

300 million slum dogs,

many billions of software exports,

more billions of oil imports,

one cricket world cup,

one high tech unmanned mission to Moon from us and

many low tech manned terrorist strikes inflicted on us

later-

The Indian Republic still stands as the only democracy in its neighborhood with substantial secular rights to minority religions and viewpoints.

May the republic still shine, freely.

Amen.

Tweet (Updated): Using Twitter for Better Marketing

A relatively late entrant to the www.twitter.com phenomenon, I started uploading my blog posts to my Twitter account. Here are some insights I saw in action. Maybe they are common knowledge, but here goes:

1) Twitter automatically converts links into www.tinyurl.com links, so it shortens even the longest link that you have.

2) Uploading your address book, including anyone who ever wrote an email to you as part of a discussion or reading group, takes a tiny amount of time. Then click follow all (or at least those for a particular profile, here analytics and data) and you are off.

3) Twitter manners seem to consider it customary to follow people who are following you. Thus an audience, or initial leads, are assured. Beyond that, content is king.

4) Reading tweets (or Twitter messages) is a great break, as it gives you real-time insight into what is happening within your domain, or among people who share your profession or personal profile. However, writing personal tweets takes time and a healthy dose of self-love.

5) Twitter is free. And there are enough Twitter tools to ensure it gets updated from your RSS feed automatically, so it is one more tool to ensure publicity for yourself or your organization.

6) Search for people who give or receive the same services as you provide, to get the best response from your target audience.

7) Link up your Facebook and your Yahoo instant messenger with Twitter using applications built exactly for this.

No, LinkedIn does not have a Twitter app, but that should change soon.

 

8) Watch out for useless spam from people whom you don’t know well. Spamming, or just being reported, leads to suspended accounts and much useless grief.

Happy twittering with tweets on www.twitter.com (what a tongue twister!)

 

And an update from my favorite tech blog http://bits.blogs.nytimes.com/

Starbucks dishes out updates on special offers and nutritional and store information using Twitter. The online retailer Zappos, Comcast and Southwest Airlines have also created official accounts on Twitter to interact with consumers and respond directly to complaints.

Bank of America’s Twitter stream is maintained by David Knapp, a representative in Phoenix.

And why is http://bits.blogs.nytimes.com/ my favorite?

It shows that blogs with a better command of English than of technology make better reading than blogs with a superb grasp of technology but not of English.

 

In case you want to say hi, tweet or shout, this is where: my Twitter sit’er

http://twitter.com/decisionstats

R in a CorpoRate Environment

Any concerns about using R in a corporate environment, especially for compliance reasons, can be mitigated by reading the following documents.

R: Regulatory Compliance and Validation Issues A Guidance Document for the Use of R in Regulated Clinical Trial Environments

and

Keeling & Pavur’s "A comparative study of the reliability of nine statistical software packages", May 1, 2007, Computational Statistics & Data Analysis, Vol. 51, pp. 3811-3831.

 

Thanks to Bob for pointing this out on the R-Help list.


Slum Dogs Come

Young slum dogs chipping away,

writing code, plugging away.

Take the place under shiny sun someday,

Slum puppies won’t go away.

You let them in,

They are hungry for more, they stay,

Nobody ever gave them a break on the way,

Grew up fast, slum childhood wasn’t child’s play.

Still here they are firing away,

Full steam ahead, and

Damn, no torpedoes to dissuade.

Before you could pause, object

Cut them short saying Boy hey.

Slum dog walks away,

In his teeth, the shiny bone of the day.

Blood on his fur, it’s there

Long enough to stay.

The dog has seen much worse,

Much tougher days.

His brain the only weapon,

he chooses to play.

Brain red hot, it keeps firing away.

That dog won’t roll over, play dead, no way.

Been through much pain already this way,

Now numb, The Slum Dogs come here to stay.