Got an interesting query on LinkedIn:
- What exactly does a Data Scientist do as a job?
- What are the roles of a data scientist?
I have been doing something related to data science for almost 14 years, since before data science became a term on Wikipedia (https://en.wikipedia.org/wiki/Data_science), so here are my views.
A data scientist is simply a person who can
write code
= in R, Python, Java, SQL, Hadoop (Pig, HQL, MR), etc.
= for data storage, querying, summarization, visualization
= how: efficiently and in time (fast results)
= where: on databases, on the cloud, on servers
and understand enough statistics
to derive insights from data
so the business can make decisions.
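To make the definition above concrete, here is a minimal sketch of the "storage, querying, summarization" part using Python's built-in sqlite3 module. The sales figures, the `sales` table, and the `summarize_sales` function are all made up for illustration; real work would more likely run against a production database or a Hadoop cluster.

```python
import sqlite3

# Hypothetical sales records: (region, amount). Illustration data only.
SALES = [
    ("north", 120.0),
    ("north", 80.0),
    ("south", 200.0),
    ("south", 150.0),
    ("south", 250.0),
    ("east", 90.0),
]


def summarize_sales(rows):
    """Store rows in an in-memory SQLite table and return per-region totals."""
    conn = sqlite3.connect(":memory:")
    try:
        # Storage: create a table and load the rows.
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        # Querying and summarization: aggregate per region, largest total first.
        cursor = conn.execute(
            """
            SELECT region,
                   COUNT(*)    AS orders,
                   SUM(amount) AS total,
                   AVG(amount) AS average
            FROM sales
            GROUP BY region
            ORDER BY total DESC
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, orders, total, average in summarize_sales(SALES):
        print(f"{region}: {orders} orders, total {total:.2f}, average {average:.2f}")
```

The summary table is the kind of output that then gets turned into an insight for the business, for example which region deserves more attention.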
It involves coding, it involves presenting insights, and it involves gathering requirements like a consultant. So you need the following:
ability to write complex SQL queries
ability to move, create, and delete files from the command prompt in Linux
code in Python, in R, and in SAS
do machine learning (in R with the caret/party/e1071 packages, in Python with scikit-learn, in Spark with MLlib, and in SAS Enterprise Miner)
ability to learn new languages and tools quickly (Hadoop, Hive, PySpark)
do analysis on small data using statistics (R/Python/SAS) and on big data
make presentations on insights to senior management
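As one illustration of the "analysis on small data using statistics" item in the list above, here is a small sketch that compares two groups with Welch's t statistic, using only Python's standard library. The choice of Welch's test, the `welch_t` function, and the two samples are assumptions made for the example, not a prescribed method.

```python
import math
import statistics

# Hypothetical order values for two versions of a web page. Illustration data only.
VARIANT_A = [10, 12, 11, 13, 14]
VARIANT_B = [8, 9, 10, 9, 9]


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for the difference in means of two samples."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Standard error of the difference, without assuming equal variances.
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    t_value = welch_t(VARIANT_A, VARIANT_B)
    print(f"Welch's t statistic: {t_value:.2f}")
```

A large absolute value of the statistic suggests the difference between the groups is unlikely to be noise. Turning it into a p-value needs the t distribution, which in practice would come from a package such as SciPy or from R or SAS.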
> Lots of roles for a single term: data scientist
Author: Ajay Ohri
http://about.me/ajayohri