What exactly does a Data Scientist do as a job?

I got an interesting query on LinkedIn:

  • What are the roles of a data scientist?

    I have spent almost 14 years doing work related to data science, since before data science became a term on Wikipedia (https://en.wikipedia.org/wiki/Data_science), so here are my views.

    A data scientist is simply a person who can

      write code

          in what:  R, Python, Java, SQL, Hadoop (Pig, HQL, MR), etc.

          for what: data storage, querying, summarization, visualization

          how:      efficiently and on time (fast results)

          where:    on databases, on the cloud, on servers

      and understand enough statistics

      to derive insights from data

      so that the business can make decisions.
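As a rough illustration of that chain (store, query, summarize, then apply a little statistics to reach something a business could act on), here is a minimal Python sketch. The table name, the regions, and the order amounts are all made up for illustration, and SQLite from the standard library stands in for whatever database, cloud service, or server a real team would use.

```python
import sqlite3
import statistics

# Hypothetical sample data: (region, order_amount). Not from any real system.
ORDERS = [
    ("north", 120.0), ("north", 80.0), ("north", 100.0),
    ("south", 200.0), ("south", 220.0), ("south", 180.0),
]


def summarize_orders(rows):
    """Store rows in an in-memory SQLite table and summarize them per region."""
    conn = sqlite3.connect(":memory:")
    try:
        # Data storage
        conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        # Querying and summarization
        cursor = conn.execute(
            "SELECT region, COUNT(*), AVG(amount) FROM orders "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


summary = summarize_orders(ORDERS)

# A little statistics: compare each region with the overall mean.
overall_mean = statistics.mean(amount for _, amount in ORDERS)

for region, n, avg in summary:
    position = "above" if avg > overall_mean else "below"
    print(f"{region}: n={n}, avg={avg:.2f} ({position} the overall mean of {overall_mean:.2f})")
```

The printed lines are the kind of summary that could then go into a presentation for decision makers; the real work is usually in choosing which comparison matters to the business.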

    It involves coding, presenting insights, and gathering requirements like a consultant. So you need the following:

    ability to write complex SQL queries

    ability to move, create, and delete files from the command prompt in Linux

    ability to code in Python, R, and SAS

    ability to do machine learning (in R with the caret/party/e1071 packages, in Python with scikit-learn, in Spark with MLlib, and in SAS Enterprise Miner)

    ability to learn new languages and tools quickly (Hadoop, Hive, PySpark)

    ability to do analysis on small data using statistics (R/Python/SAS) and on big data

    ability to present insights to senior management
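To make the "analysis on small data using statistics" item a bit more concrete, here is a minimal Python sketch that compares two small groups using Welch's t-statistic. The two samples are invented for illustration, and the helper name `welch_t` is hypothetical. In practice this would more likely be done in R, in SAS, or with a Python statistics library; the sketch sticks to the standard library so it runs anywhere.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t-statistic for the difference in means (a minus b)."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical measurements, e.g. a metric before and after a change.
control = [10.0, 12.0, 11.0, 13.0, 14.0]
variant = [14.0, 15.0, 13.0, 16.0, 17.0]

t_stat = welch_t(variant, control)
print(f"mean(control)={statistics.mean(control):.2f}, "
      f"mean(variant)={statistics.mean(variant):.2f}, t={t_stat:.2f}")
```

A large absolute t value suggests the difference is unlikely to be noise, but with samples this small the result should be presented to management as a hint and not as proof.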

    > Lots of roles for a single term: data scientist

Author: Ajay Ohri


