SAS Data Loader for Hadoop is now a 90-day free trial

From:

http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html

 

SAS Data Loader for Hadoop eliminates the complexities of writing MapReduce code, with a simple, point-and-click interface that empowers business analysts to prepare, integrate and cleanse big data faster and easier than ever. In addition, data scientists and programmers can run SAS code on Hadoop in parallel for better performance and greater productivity.

 


Get Started

  1. Download and install Cloudera QuickStart VM for CDH 5.3.x.
  2. Download and install either VMware Player 6.0 or later (for Windows) or VMware Fusion 6.0 for OS X (for Mac).
  3. Download and install your 90-day free trial of SAS Data Loader for Hadoop.

And from:

http://www.sas.com/en_us/software/data-management/data-loader-hadoop.html

 

SAS for R Users

I recently managed to get a copy of SAS University Edition.

Screenshot from 2015-03-04 19:54:34

1) Here are some problems I had to resolve. The download is a 1.5 GB zipped file (a virtual machine image). Since my broadband connection is in India, it took many failed attempts before I could get it. The unzipped file is almost 3.5 GB. You can get the download file here: http://www.sas.com/en_us/software/university-edition/download-software.html

Secondly, the hardware needs to be 64-bit, so I upgraded my Dell computer. This was a useful upgrade for me anyway.

2) You can use a download manager to resume the download in case your Internet connection has trouble fetching a 1.5 GB file in one go. For Linux, see http://flareget.com/download/

and for Windows, http://www.internetdownloadmanager.com/download.html
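
For the curious, the resume trick these download managers rely on can be sketched in a few lines of Python. This is only a minimal illustration, not how FlareGet or Internet Download Manager are actually implemented: it assumes the server supports HTTP Range requests, and the function names `range_header` and `resume_download` are my own.

```python
import os
import urllib.request


def range_header(partial_path):
    """Return the HTTP Range header needed to resume a partial download.

    An empty dict means there is nothing on disk yet, so start from byte 0.
    """
    if os.path.exists(partial_path):
        already = os.path.getsize(partial_path)
        if already > 0:
            return {"Range": "bytes=%d-" % already}
    return {}


def resume_download(url, dest_path, chunk_size=1024 * 1024):
    """Download url to dest_path, continuing from any bytes already on disk."""
    request = urllib.request.Request(url, headers=range_header(dest_path))
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honoured the Range request;
        # a plain 200 means it is sending the whole file again.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling `resume_download(url, path)` again after a dropped connection picks up where the file on disk ends. If the file is already complete, most servers answer with an error status, which this sketch does not handle.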

 

3) I chose VMware Player for Linux because I am much more comfortable with it (the free desktop version). I got it (~200 MB) from here: https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0

Screenshot from 2015-03-04 19:39:17

4) Finally, I installed VMware Player and used Open an Existing Virtual Machine to boot up SAS University Edition.

Screenshot from 2015-03-04 19:43:08

I was able to open SAS Studio at the IP address provided.

Screenshot from 2015-03-04 21:52:32

5) I downloaded a dataset from this collection:

https://archive.ics.uci.edu/ml/datasets/Adult
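
Before uploading, it can help to peek at the file outside SAS. Below is a small Python sketch that reads rows in the comma-plus-space layout of the dataset's adult.data file. The column names follow the dataset's description file as I understand it, the label `income` for the last column is my own choice, and the single sample record is there only to show the format.

```python
import csv
import io

# Column names as described in the dataset's documentation; the name
# "income" for the final column is an assumption made for this sketch.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def read_adult(lines):
    """Turn raw adult.data lines into a list of dicts keyed by column name."""
    reader = csv.reader(lines, skipinitialspace=True)
    rows = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


# One record in the file's format, used here only as an illustration.
sample = io.StringIO(
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)
rows = read_adult(sample)
print(rows[0]["education"], rows[0]["income"])
```

To read the real file, pass an open file handle instead of the `io.StringIO` sample. All values come back as strings, so numeric columns such as `age` still need converting.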

 

6) Then I uploaded it into SAS Studio.

Screenshot from 2015-02-28 17:06:34

Screenshot from 2015-03-03 12:29:07

7) Lastly, I was able to run some basic commands.

Screenshot from 2015-03-03 12:27:48

Screenshot from 2015-03-04 21:54:06

I was really impressed by the enhancements made to the interface: the ability to search command help through a drop-down, the color-coded editor, and of course the case-insensitive SAS language. (Though I am not a fan of the semicolon, I loved using Ctrl + / for easy commenting and uncommenting.)

For a SAS-turned-R-turned-SAS coder, here are some views:

  1. SAS has different windows for coding, log and output. R generally has one.
  2. SAS is case insensitive while R is case sensitive. This is a blessing, especially for variable and dataset names.
  3. SAS works with datasets, which can be considered the equivalent of R's data frame.
  4. SAS does not really match R's flexibility in data types, but it is quite fast.
  5. SAS has a Macro Language for repeatable tasks.
  6. SQL is embedded within SAS as Proc SQL, and in R through the sqldf package.
  7. You have to pay for each upgrade in the SAS ecosystem. The pricing is not transparent to me: I am not clear on which component does what, or whether there is a cloud option for renting by the hour. How about one web page that lists product descriptions and prices?
  8. SAS University Edition is an OS-agnostic tool. That by itself is quite impressive compared to, say, the Academic Edition of Revolution Analytics.
  9. R is object oriented and uses [] and $ notation for sub-objects. SAS is divided into two main parts, data and proc steps, and uses the . notation and var system.
  10. The SAS language has a few basic procs but many, many options.
  11. How good a SAS coder you are often depends on what you can do with data manipulation in the SAS Data Step.
  12. Graphics are still better in R's ggplot. But the SAS speed is thrilling.
  13. RAM is limited to 1 GB in the University Edition, but I found it still quite fast. However, I can upload only a 10 MB file to SAS Studio in the University Edition, which I found reasonable for teaching purposes.
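
Given the 10 MB upload limit noted above, a quick size check before uploading saves a failed attempt. Here is a tiny Python sketch. I am assuming "10 MB" means 10 * 1024 * 1024 bytes, since the exact threshold SAS Studio enforces is not something I verified, and `fits_upload_limit` is just an illustrative name.

```python
import os

# Assumption: "10 MB" is taken as 10 * 1024 * 1024 bytes; the exact
# threshold enforced by SAS Studio was not verified.
UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def fits_upload_limit(path, limit=UPLOAD_LIMIT_BYTES):
    """Return True if the file at path is small enough to upload."""
    return os.path.getsize(path) <= limit
```

If the check fails, sampling rows or splitting the file is the obvious workaround.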

 

 

Comprehensive Learning Path in R

I have built a comprehensive learning path for professionals, students and researchers at http://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/learning-path-r-data-science/

Rather than simply listing resources, I have tried to create a structured path that is agnostic to any one source and instead takes in the best sources for each step or phase of the analytics workflow.

There are links to resources by Hadley Wickham, Revolution, DataCamp, videos, live projects, slideshares and tutorials, arranged in a systematic manner.

Have a look and let me know how this can be made better.

LeaRning Path on R – Step by Step Guide to Learn Data Science on R

Screenshot from 2015-03-04 09:17:13

Predicting Oscars

And the Oscars SHOULD go to

Birdman or (The Unexpected Virtue of Ignorance) – Alejandro González Iñárritu, John Lesher, and James W. Skotchdopole

Movie Review Birdman

This is an awesome movie; the term that comes to mind is a mind-fish movie. Superb cast, superb acting, and the camera work makes the movie look as if it was done in one take.

Birdman shall set us free. It talks of art, sexuality, theatre, fragile egos, and our obsession with fantasy and comic book movies that is strangling art slowly and surely.

birdmanposter

Microsoft acquires leading vendor of enterprise R: Revolution Analytics

The news:

http://blog.revolutionanalytics.com/2015/01/revolution-acquired.html

Revolution Analytics joins Microsoft

by David Smith, Chief Community Officer

On behalf of the entire Revolution Analytics team I am excited to announce that Revolution Analytics is joining forces with Microsoft to bring R to even more enterprises. Microsoft announced today that it will acquire Revolution Analytics.

and

http://blogs.microsoft.com/blog/2015/01/23/microsoft-acquire-revolution-analytics-help-customers-find-big-data-value-advanced-statistical-analysis/

Microsoft to acquire Revolution Analytics to help customers find big data value with advanced statistical analysis

I’m very pleased to announce that Microsoft has reached an agreement to acquire Revolution Analytics. Revolution Analytics is the leading commercial provider of software and services for R, the world’s most widely used programming language for statistical computing and predictive analytics. We are making this acquisition to help more companies use the power of R and data science to unlock big data insights with advanced analytics.

——-

Detailed Note to follow, but in the meantime.

Training in R on the WeekendR in February

I have agreed to teach R on the weekend. As a change from my usual online trainings, these sessions will be in a classroom. I am collaborating with http://weekendr.in/r-training.html

The introductory price is Rs 5500 (~100 USD) for 8 sessions of 3 hours each in the classroom. This is only for New Delhi, India as of now.

You can review the course here: http://weekendr.in/r-training.html

Screenshot from 2015-01-23 13:40:41