SAS for R Users

I recently managed to get a copy of SAS University Edition.

1) Here are some problems I had to resolve. The download is a 1.5 GB zipped file (a virtual machine image). Since my broadband connection is based in India, it took many failed attempts before I could get it. The unzipped file is almost 3.5 GB. You can get the download file here: http://www.sas.com/en_us/software/university-edition/download-software.html

Secondly, the hardware needed is 64-bit, so I upgraded my Dell computer. This was a useful upgrade for me anyway.

2) You can get an Internet download manager to resume downloading in case your Internet connection has trouble downloading a 1.5 GB file in one go. For Linux you can see http://flareget.com/download/

and for Windows http://www.internetdownloadmanager.com/download.html
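What a download manager does when it resumes is ask the server only for the bytes it does not have yet, using an HTTP Range request. Here is a minimal sketch of that idea in Python. It is not how flareGet or Internet Download Manager are actually implemented, and the function names `range_header` and `resume_download` are made up for illustration. It assumes the server supports range requests.

```python
import os
import urllib.request


def range_header(bytes_on_disk: int) -> dict:
    """Build the HTTP Range header that asks for the rest of a file."""
    if bytes_on_disk <= 0:
        return {}  # nothing downloaded yet, so request the whole file
    return {"Range": f"bytes={bytes_on_disk}-"}


def resume_download(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Fetch the missing part of `url` into `dest` (hypothetical helper)."""
    have = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url, headers=range_header(have))
    with urllib.request.urlopen(request) as response:
        # 206 means the server honoured the Range header, so we append.
        # Any other success status means it sent the whole file, so we overwrite.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling `resume_download(url, "image.zip")` again after a dropped connection should pick up where the partial file ends. If the file is already complete, a server will typically answer with an error status, which `urlopen` raises as an exception; a real tool would handle that case.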

 

3) I chose VM Player for Linux (the free desktop version) because I am much more comfortable with it. I got it (about 200 MB) from here: https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0

4) Finally I installed VM Player and used Open an Existing Virtual Machine to boot up SAS University Edition. I was able to open SAS Studio at the IP address provided.

5) I downloaded a dataset from this collection:

https://archive.ics.uci.edu/ml/datasets/Adult
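One thing to watch with this dataset: the raw `adult.data` file in that collection has no header row, and its values are separated by a comma followed by a space. The post does not say which file was used or how it was prepared, so the following is only a sketch of one way to add a header before uploading. The function name `add_header` is hypothetical, and the column names use underscores instead of the hyphens in the dataset description on the assumption that they should be valid SAS variable names.

```python
import csv
import io

# Column names from the dataset description, with hyphens replaced by
# underscores (an assumption, to keep them usable as SAS variable names).
ADULT_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]


def add_header(raw_text: str) -> str:
    """Return CSV text with a header row and the padding spaces removed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(ADULT_COLUMNS)
    # skipinitialspace drops the space that follows each comma in the raw file
    reader = csv.reader(io.StringIO(raw_text), skipinitialspace=True)
    for row in reader:
        if len(row) != len(ADULT_COLUMNS):
            continue  # skip blank or malformed lines
        writer.writerow(row)
    return out.getvalue()
```

To use it on a downloaded file, read the file's text, pass it to `add_header`, and write the result to a new `.csv` file for upload.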

 

6) Then I uploaded it into SAS Studio.

7) Lastly, I was able to run some basic commands.

I was really impressed by the enhancements made to the interface: the ability to search command help through a drop-down, the color-coded editor, and of course the case-insensitive SAS language. (I am not a fan of the semicolon, but I loved using Ctrl + / for easy commenting and uncommenting.)

For a SAS turned R turned SAS coder, here are some views:

  1. SAS has different windows for code, log and output. R generally has one.
  2. SAS is case insensitive while R is case sensitive. This is a blessing, especially for variable and dataset names.
  3. SAS deals with datasets, which can be considered the equivalent of R's data frame.
  4. SAS is not really comparable to R in the flexibility of its data types, but it is quite fast.
  5. SAS has a macro language for repeatable tasks.
  6. SQL is embedded within SAS as Proc SQL, and in R through the sqldf package.
  7. You have to pay for each upgrade in the SAS ecosystem. I am not clear on the pricing, which component does what, or whether they have a cloud option for renting by the hour. How about one web page that lists each product's description and price?
  8. SAS University Edition is an OS-agnostic tool. For that alone it is quite impressive compared to, say, the Academic Edition of Revolution Analytics.
  9. R is object oriented and uses [] and $ notation for sub-objects. SAS is divided into two main parts, data steps and proc steps, and uses the . notation and the var system.
  10. The SAS language has a few basic procs but many, many options.
  11. How good a SAS coder you are often depends on what you can do with data manipulation in the SAS Data Step.
  12. Graphics are still better in R with ggplot. But the SAS speed is thrilling.
  13. RAM is limited to 1 GB in the University Edition, but I still found it quite fast. However, I can upload only a 10 MB file to SAS Studio in the University Edition, which I found reasonable for teaching purposes.

 

 

Comprehensive Learning Path in R

I have built a comprehensive learning path for professionals, students and researchers at http://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/learning-path-r-data-science/

Rather than simply putting up a list of resources, I have tried to create a structured path that is not tied to any one source and instead takes in the best sources for each step or phase in the analytics workflow.

There are links to resources by Hadley Wickham, Revolution and Data Camp, as well as videos, live projects, slideshares and tutorials, arranged in a systematic manner.

Have a look and let me know how this can be made better.

LeaRning Path on R – Step by Step Guide to Learn Data Science on R


Predicting Oscars

And the Oscars SHOULD go to

Birdman or (The Unexpected Virtue of Ignorance) – Alejandro González Iñárritu, John Lesher, and James W. Skotchdopole

Interviewed on my analytics adventures

I just got interviewed rather extensively at http://www.analyticsvidhya.com/blog/2015/02/interview-expert-ajay-ohri-founder-decisionstats-com/

Interview with Industry expert – Ajay Ohri

Kunal: You started data science career much before people would have heard about it and it became one of the hottest field around. What were the challenges that you faced during the initial stages of your professional career?

Ajay: Cool question man. Yeah it used to be called business analytics, then data analytics and now it's data science. What will they call it next?

Initial challenges: R was raw (this was 2007), SAS was expensive, even Open Office was not as good as it is now. Getting a pipeline of work, leads for clients, converting leads to contracts and chasing people to pay me after the work was done were initial challenges.

You can read the rest of the interview at http://www.analyticsvidhya.com/blog/2015/02/interview-expert-ajay-ohri-founder-decisionstats-com/

Hacking Climate Change - ultimate data challenge

Hacking climate change is more of a data challenge as we try to backtest and forecast our models. Unfortunately the science is hostage to the politics, but sharing data openly, including cutting-edge results, without worrying about ancient national interests would be the first step in a planet-wide effort to save the planet.

Movie Review Birdman

This is an awesome movie; the term that comes to mind is a mind-fish movie. Superb cast, superb acting, and the camera work makes the movie look as if it was done in one take.

Birdman shall set us free. It talks of art, sexuality, theatre, fragile egos, and our obsession with fantasy and comic book movies that is strangling art slowly and surely.

A Writer’s Dilemma A Data Scientist’s Decision

Writing sucks as a way of making money. You have to constantly ask for gigs and favours so you can pay the bills, till your publisher sends you the royalties for writing statistics books (which are not much).

Recently I was approached by someone to do research on Indian nuclear policy. I mentioned my billing rate as 80 pounds per hour, but said person wanted me to raise it to 110 pounds per hour.

Only one small hitch. The sub-part of Indian nuclear policy that I was asked to write a report on was to locate, interview and identify India's top nuclear scientists for small reactors.

India has of course a big research interest in nuclear energy but we seemed to be going in for big reactors and thorium reactors.

There are only two small nuclear reactors in India, and one of them is in INS Arihant (India's nuclear submarine, launched last year). In effect I was being asked to make a list of the top 20 likely candidates who had helped India with a nuclear submarine reactor.

I have said no to the person. I have been subjected to verbal insults, threats and innuendo.

But Data Scientists trust in God. Everybody else has to work harder for the data.

I hope this is a lesson for fellow researchers and data scientists because, as I said, writing is a lousy way of making money.

Was I wrong? Am I just living in a fantasy land? Do you believe I am a criminal and a thug?
