Running R through environments in PiCloud

PiCloud has made an interesting announcement: it already supported non-Python software through custom environments, but R now comes pre-built in a new base environment.

http://blog.picloud.com/2012/10/24/new-base-environment-ubuntu-precise/

Enter Ubuntu Precise 12.04

Our latest environment is pre-configured with many of the latest libraries, making it easier than ever to move your computation to the cloud. Here are some of the notable packages:

  • NumPy 1.6.2
  • SciPy 0.11
  • Pandas 0.9.0
  • Scikits Learn 0.8.1
  • OpenCV 2.4.2
  • Java 7
  • R 2.14.1
  • Ruby 1.9.1
  • PHP 5.3.10

To use Precise, specify the environment of a job as ‘base/precise’. In Python:

cloud.call(f, _env='base/precise')
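Since R is pre-installed in this environment, one way to reach it from a Python job is to shell out to Rscript. The sketch below is only an illustration under assumptions: the helper names (rscript_command, run_r, submit_to_picloud) are made up for this post, it presumes Rscript is on the PATH inside base/precise, and the last function needs the PiCloud cloud client library plus an account, so it is not exercised here.

```python
import shutil
import subprocess


def rscript_command(expr):
    """Build the argument list that evaluates one R expression with Rscript."""
    return ["Rscript", "-e", expr]


def run_r(expr):
    """Evaluate an R expression and return whatever R printed to stdout.

    Intended to run where R is installed (assumed: inside base/precise).
    Raises RuntimeError if Rscript cannot be found on the PATH.
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    done = subprocess.run(
        rscript_command(expr), capture_output=True, text=True, check=True
    )
    return done.stdout


def submit_to_picloud(expr):
    """Hypothetical helper: run run_r as a PiCloud job in base/precise."""
    import cloud  # PiCloud client library (assumption: installed and configured)

    job_id = cloud.call(run_r, expr, _env='base/precise')
    return cloud.result(job_id)
```

Locally, run_r("cat(mean(c(1, 2, 3)))") can be tried on any machine with R installed before sending the same call to the cloud.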

BigML creates a marketplace for Predictive Models

BigML has created a marketplace for selling Datasets and Models. This may be a first, as the closest thing to a market for Predictive Analytics until now was Rapid Miner’s marketplace for extensions (at http://rapidupdate.de:8180/UpdateServer/faces/index.xhtml).

From http://blog.bigml.com/2012/10/25/worlds-first-predictive-marketplace/

SELL YOUR DATA

You can make your Dataset public. Mind you: the Datasets we are talking about are BigML’s fancy histograms. This means that other BigML users can look at your Dataset details and create new models based on this Dataset. But they can not see individual records or columns or use it beyond the statistical summaries of the Dataset. Your Source will remain private, so there is no possibility of anyone accessing the raw data.

SELL YOUR MODEL

Now, once you have created a great model, you can share it with the rest of the world. For free or at any price you set. Predictions are paid for in BigML Prediction Credits. The minimum price is ‘Free’ and the maximum price indicated is 100 credits.

White Box Models

Clicking on the white open lock will open up your model to the rest of the world. Anyone can now buy your model, explore it, and use it to make predictions.

Black Box Models

If you choose the black box setting (the black open lock icon), other BigML users will NOT be able to view or clone your model, but they will be able to use it to make predictions.

——

DOWNLOAD YOUR MODEL

BigML.com has added downloads to its models. Simply choose the format you want and you can copy/paste the code or text. There is a range of formats that they offer currently: JSON PML, PMML, Python, Ruby, Objective-C, Java, the rules of the decision tree in plain text and a Summary overview of your model. Around the corner are MS Excel downloads and R (of course!).

PUBLICIZE YOUR MODEL

There’s also an ‘embed’ function, so now you can embed the little poster of your model in your blog post or website, so it is easy to share it in your own environment.

————————————————————————————————————————–

It is nice to see Models and Data getting the APPY treatment. Hopefully it will encourage other vendors, like Google with its Prediction API, to put more thought and effort into rewarding data mining individuals directly, without going through corporate intermediaries, while still ensuring intellectual property safeguards.

An R package market for enterprises? One for Python libraries? JMP add-ins? A market for SAS macros? Who knows what the future holds. But overall, this is a very positive step by the BigML.com team. The app marketplace has helped revolutionize mobile and desktop computing, and hopefully it will do the same for Business Analytics.

 

httR by Hadley #rstats

The awesome Hadley Wickham has just released the next version of the httr package. Prof. Hadley is currently on leave from Rice University and working with the tremendous geeks at RStudio. New things in the httr package:

 

http://blog.rstudio.org/2012/10/14/httr-0-2/

httr, a package designed to make it easy to work with web APIs. Httr is a wrapper around RCurl, and provides:

  • functions for the most important http verbs: GET, HEAD, PATCH, PUT, DELETE and POST.
  • support for OAuth 1.0 and 2.0. Use oauth1.0_token and oauth2.0_token to get user tokens, and sign_oauth1.0 and sign_oauth2.0 to sign requests. The demos directory has six demos of using OAuth: three for 1.0 (linkedin, twitter and vimeo) and three for 2.0 (facebook, github, google).

I especially like the OAuth functionality, as I occasionally got flummoxed with existing R OAuth packages, and this should hopefully lead to awesome new social media analytics posts by the larger R blogger community. OAuth also matters because Twitter allows far more authenticated API requests than unauthenticated ones (see https://dev.twitter.com/docs/rate-limiting ):

  • Unauthenticated calls are permitted 150 requests per hour. Unauthenticated calls are measured against the public facing IP of the server or device making the request.
  • OAuth calls are permitted 350 requests per hour and are measured against the oauth_token used in the request.
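To put those two limits in perspective, here is a small back-of-the-envelope sketch. The constants come straight from the limits quoted above; the function names and the 1,000-request workload are just assumptions for illustration.

```python
# Hourly limits quoted from Twitter's rate-limiting page above.
UNAUTHENTICATED_PER_HOUR = 150
OAUTH_PER_HOUR = 350


def seconds_between_requests(requests_per_hour):
    """Average spacing needed to spread requests evenly over one hour."""
    return 3600.0 / requests_per_hour


def hours_needed(total_requests, requests_per_hour):
    """Whole hourly windows needed to issue total_requests (rounded up)."""
    return -(-total_requests // requests_per_hour)


if __name__ == "__main__":
    for label, limit in [("unauthenticated", UNAUTHENTICATED_PER_HOUR),
                         ("OAuth", OAUTH_PER_HOUR)]:
        print(label,
              round(seconds_between_requests(limit), 1), "s between requests,",
              hours_needed(1000, limit), "hourly windows for 1000 requests")
```

At 150 requests per hour that is one request every 24 seconds and 7 hourly windows for a hypothetical 1,000 requests; at 350 per hour it is roughly one every 10.3 seconds and 3 windows.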

 

With those higher limits, some creative use cases should lead to an incredible amount of analysis across several social media channels at a time, not just one.

R for Social Media Analytics? Watch this space... 😉

 

 

 

StatSoft is offering free software in Greece, Portugal and Spain

StatSoft is offering free software to companies in Greece, Spain and Portugal. The software is available free to all businesses in the three countries, not just to NGOs and governmental groups.

—-

StatSoft, Inc. (headquartered in Tulsa, OK, with 30 offices around the world) is providing FREE SOFTWARE to GREECE, PORTUGAL, and SPAIN as economic assistance to these countries of Continental Europe, to assist them in getting back into “more normal economic positions”…

If you’d like to comment on this, you can do so on the following Yahoo Finance page:

http://finance.yahoo.com/news/statsoft-aid-struggling-european-economies-100023370.html;_ylt=A2KJjakoL39QN0IAMHPQtDMD

 

Alternatively, here is the “StatSoft to Aid Struggling European Economies with Free Enterprise Analytics Software” press release, linked to the StatSoft page.

Online Education- MongoDB and Oracle R Enterprise

I really liked the course developed by 10gen for MongoDB (there are two tracks, for Developers and DBAs, at https://education.10gen.com/).

The interface is very nice and is a step up from Coursera’s ( https://www.coursera.org/ ) pioneering work (and even from http://www.codecademy.com/#!/exercises/0 ): each video has a small question, the videos are not cluttered, and the voice and transcription quality is impeccable. Lastly, people who score at least 65% get a certificate, which acts as an academic incentive.

Yes, it is free.

 

Oracle recently launched a series of nicely made R tutorials at https://apex.oracle.com/pls/apex/f?p=44785:24:0::NO::P24_CONTENT_ID,P24_PREV_PAGE:6528,1 but I wish Oracle R had some certifications too!

If only more tech companies like SAS Institute (expensive SAS training), IBM (cluttered website), Revolution Analytics (expensive partners in certification) and Google (unpolished Python lectures) put effort into polished e-learning interfaces rather than depending on external partners or internal gurus... Interfaces matter, especially in education.

Well, anyways!!

Happy Mongo DBing/ Oracle R!

 

Databases in the cloud

One more day of me mucking around MySQL and Amazon (hoping to get to the R)

R now part of Amazon Linux AMI

According to this post, Amazon has decided to bundle R with the Amazon Linux AMI:

http://aws.typepad.com/aws/2012/10/amazon-linux-ami-201209-now-available.html

R 2.15: Also coming from your requests, we have added the R language to the Amazon Linux AMI.  We are here to serve your statistical analysis needs!  Simply yum install R and off you go.

PS: Back to work. Sorry for the delayed posts. I am working on book 2 for Springer, “R for Cloud Computing”. If you have any case studies of R on the Amazon, Google, Oracle or Azure clouds, please let me know.

PPS: At 48 MB, is R too big to bundle by default in the many Linux distros? Thoughts?