How do I shift to a data science career?

I get this question a lot: how do I shift to a data science career? I have been doing data analysis since 2004 (in SAS), when we used to call it business analytics, in R since 2007, and in Python since 2014, by which time we had rebranded business analytics as data science. So here are a few basics for people trying to shift to data science.

My answer is: learn coding, learn math, and most importantly know when to use what for insights. Data scientists are only as good as the insights they create or miss, not the code they write.

See this first

A SlideShare deck I put together last year for Summer School

Do this self-examination:

  1. What are you good at: programming, stats, or business?
  2. What are you bad at: programming, stats, or business?
  3. What can you learn, and to what proficiency?

Learning Programming

Learning R, SAS, or Python is easy, but there is a confusing clutter of resources out there on the internet.

SAS language: learn it from SAS University Edition and for the SAS Certification Exam.

Don't wanna be SAS certified? (Psst, it's just $100.)

Here is some free SAS Training by Decisionstats

There is no certification in R or Python, though Hadoop has one, just as SAS does.

For R: learn R and RStudio until you have mastered some of the code here


Or see all the R packages at CRAN Task Views

For Python:

A shorter tutorial on Python by the author is here

Learn pandas and scikit-learn (example)
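To give a flavour of what those two libraries do together, here is a minimal sketch, not taken from the linked tutorial. The toy data (hours studied against exam score) and the column names are made up purely for illustration. It assumes pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data, invented for this example:
# the score rises by exactly 2 points per hour studied.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 54, 56, 58, 60],
})

# pandas: quick summary statistics for every numeric column.
summary = df.describe()

# scikit-learn: fit a straight line, score = intercept + slope * hours.
model = LinearRegression()
model.fit(df[["hours"]], df["score"])

slope = float(model.coef_[0])
intercept = float(model.intercept_)

# Predict the score for a new, unseen value of hours.
predicted = float(model.predict(pd.DataFrame({"hours": [6]}))[0])

print(summary)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The pattern is the point rather than the model: pandas holds and summarises the table, scikit-learn fits and predicts, and you still have to decide whether a straight line is the right tool for the insight you are after.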


Learning Statistics and Techniques


Data Mining in R

Where to learn machine learning

Learning Business

This comes with experience, domain research, and study.


I hope this helps. I will follow up with specific answers to specific career questions in data science soon.



Some thoughts on the Revolution Sellout

The revolution will not be televised, brother –Gil Scott-Heron


Veteran R community members will recall R co-founder Ross Ihaka's warning that Revolution Analytics was not truly open source,

and although the sale to Microsoft will keep Revolution R open source for the time being,

it did prove that Ross Ihaka was right.

How you help create an open source revolution in statistics by selling a company to Microsoft beats me.

And how do you just take 6000 packages for free from the open source community, add 6-9 packages of your own, and then repackage the bundle as a new innovation?

Even though Revolution Analytics created 3 CEO jobs (including one for SPSS founder Norman Nie), 1 name change (from Computing to Analytics), and 1 mass firing (with a 50% layoff they won't be winning the best employer award), in the end what drives software is lots of sales and not lots of blogs

(quoting Larry Ellison on his purchase of Sun).

In addition, love for computing, and not hypocrisy about love for money, should drive science.

A potato is a potato,

in Australia or Seattle or San Francisco.

SAS and Jupyter work well together now


While the R community continues to move ahead with RStudio (open source still) and other interfaces,

SAS is moving forward to embrace Jupyter in its free University Edition. The word Jupyter itself is made from Julia, Python, and R. Note: whether you are an R fan, a Py fan, or a SAS fan, you should compare and contrast the quality of the blogs, the documentation, and the interface on your own. As a blogger and data scientist (?) I actually love all science.


Using Jupyter and SAS together with SAS University Edition

A few months ago I shared the news about Jupyter notebook support for SAS. If you have SAS for Linux, you can install a free open-source project called sas-kernel and begin running SAS code within your Jupyter notebooks. In my post, I hinted that support for this might be coming in the SAS University Edition. I’m pleased to say that this is one time where my crystal ball actually worked — Jupyter support has arrived!

(Need to learn more about SAS and Jupyter? Watch this 7-minute video from SAS Global Forum.)


How do I run Jupyter Notebook in SAS University Edition using VirtualBox?

In order to run Jupyter Notebook in SAS University Edition, you must first add the SAS University Edition vApp to VirtualBox. When you specify the URL to run Jupyter Notebook, you must specify the port number for Jupyter Notebook.

  1. Follow the steps to add the SAS University Edition vApp to VirtualBox.
    Note: If you want to access files from or save files to your local computer from Jupyter Notebook in SAS University Edition, you must also set up a shared folder. For more information, see the following topics:

  2. If you downloaded a new version in July 2016, the additional port is automatically added for you. Skip this step and proceed to step 3.

Battleground states prime point of digital attacks to delay election results

In a normal election cycle, battleground states are the prime areas where an election is won or lost. Yet as election campaigning, electoral fundraising, and even voting itself have gone digital, the ease of use has not been matched by security.

This is due to the systematic denial of funds for digital security by CTOs and CIOs, both for campaigns and for Federal and local cyber agencies. Can the USA prevent cyber attack interference in key battleground states in this election cycle?

Just as 3D printing has evolved to make guns and will evolve further, electronic manipulation of voting machines has evolved further, but security budgets and priorities have not. What about postal ballots? Can they be tampered with or intercepted?

Who benefits when doubt is sown in the minds of voters? Ask Al Gore. He invented the internet.

Who benefits when a few districts in Ohio show electronic tampering?

Quo Vadis? (Where are you going?) Quis custodiet ipsos custodes? (Who guards the guardians?)

Rumours of rigging would just use the divide-and-rule algorithm (dīvide et īmpera) and, if backed by even a few slivers of actual cyber attacks and tampering, would undermine the election even more. Yes, there is no way to protect ALL the voting systems, so it's a cyber football game of interception, and the current lack of big-time offensive weapons as a rebuttal to cyber attacks makes remote attacks on election systems both possible and plausible.

There are 9,000 jurisdictions in the United States that have a hand in carrying out the balloting, many of them with different ways of collecting, tallying and reporting votes.

(sighs and goes back to watching the Olympics)

Should agencies like the Secret Service protect digital assets of leaders or friends or family during elections or later?

The revelation that Russian Intel (or Snowden working with Russian Intel) hacked into H R Clinton's campaign surveys should be both bad news and good news for cyber activists.

First of all, it increases the demand for jobs in legal cyber security.

Secondly, it pinpoints the need for security as an important component for decision makers at a time when they are most likely to pay attention (Oh! My website got hacked! You got my attention!).

However, the bad news is:

Digital assets of protected members would be secured by the same agencies (or different agencies): a jurisdiction nightmare.

Hacking into friends of friends or family after official government protection is over is a crime, but when governments hack, it is difficult for a teenager (or a friend of a teenager) who hacked their Facebook. The same goes for Senate staffers or staffers combating cyber crime.

Will cyber crime and cyber war between nations make things personal, not just business, due to the ease, low cost, and plausible deniability?

(Note: this is a strategic what-if scenario. No Pokemon were hurt during the making of this post.)

Psalm 2

Why do the nations conspire
    and the peoples plot in vain?
The kings of the earth rise up
    and the rulers band together

(unrelated bonus