Here is a brief talk (two talks, actually) I gave at Allianz, Trivandrum, Kerala, India on 9 November 2016. Forgive the typos.
Year: 2016
Turn Turn Turn
PESTLE Analysis of Barack H Obama’s Presidency
In November 2008 I wrote an article on using PESTLE to analyze the 43rd POTUS.
https://decisionstats.com/2008/11/02/pestle-analysis-of-george-w-bushs-presidency-2/
The article tries to use the PESTLE (details below) analysis framework to analyze the 43rd President of the United States. It may be a bit biased, as all articles written by men are. It will try its best to be faithful to the facts, and not to the opinions surrounding this President.
The PESTLE framework stands for –
- Political
- Economic
- Social
- Technological
- Legal
- Environmental.
It will rate the Presidency as a Presidency; that is, it will evaluate his tenure as leader of the United States, not of the free world or anything else he was not elected or declared by the Supreme Court to be (pun unintended). Scores are from 1 to 10, but you are free to comment with your own rating scores.
In summary,
Rexer Analytics Data Mining Survey Results
Latest Results from Rexer Analytics
HIGHLIGHTS from the 2015 Data Science Survey:
- SURVEY & PARTICIPANTS: 59-item online survey conducted in 2015. Participants: 1,220 analytic professionals from 72 countries. This is the 7th survey in this ongoing research program.
- CORE ALGORITHM TRIAD: Regression, Decision Trees, and Cluster analysis remain the most commonly used algorithms in the field.
- THE ASCENDANCE OF R: 76% of respondents report using R. This is up dramatically from just 23% in 2007. More than a third of respondents (36%) identify R as their primary tool.
- JOB SATISFACTION: Job satisfaction in the field remains high, but has slipped since the 2013 survey. A number of factors predict Data Scientist job satisfaction levels.
- DEPLOYMENT: Deployment continues to be a challenge for organizations, with less than two thirds of respondents indicating that their models are deployed most or all of the time. Getting organizational buy-in is the largest barrier to deployment, with real-time scoring and other technology issues also causing significant deployment problems.
- TERMINOLOGY: The term “Data Scientist” has surged in popularity with over 30% of us describing ourselves as data scientists now compared to only 17% in 2013.
Please contact us if you have any questions about the attached report or this research program. Summary reports for all 7 surveys (2007-2015) are free and available by contacting DataMinerSurvey@RexerAnalytics.com. Additional information about the surveys is available at www.RexerAnalytics.com, including verbatim best practices and insights shared by respondents in previous surveys (see the Data Miner Survey links in the top right corner of our home page).
Information about Rexer Analytics and our consulting services is also available at www.RexerAnalytics.com.
Internships for Data Scientists at DecisionStats in New Delhi India
http://www.letsintern.com/internship/IT-internships/Decisionstats/Data-Science-Interns/68307
It’s your chance to work as a Data Science Intern with Decisionstats in an IT internship.
About the Internship: The internships are unpaid.
The data science intern will create and edit data science research and assist in writing.
The intern will be given on-the-job training for data science and analytics in Python, SAS and R.
The data science intern will create and edit schedules and assist in coordination.
The data science intern will be given on-the-job training for managing in a start-up environment, web analytics and search engine marketing, as well as an understanding of digital business.
The data science intern will also proofread, edit and write content, including blog posts and social media.
The data science intern will be given on-the-job training for social media, web analytics and search engine optimization, as well as an understanding of digital business.
Number of Internships available: 5
Perks:
Certificate, Letter of recommendation, Flexible work hours, Informal dress code, 5 days a week.
Who can apply:
Only those candidates can apply who:
1. are available for a full-time (in-office) internship.
2. can start the internship between 30th Sep’16 and 30th Oct’16.
3. are available for a minimum duration of 2 months.
4. are living or staying in Delhi.
5. are pursuing any degree but have relevant skills and interest.
6. are currently in any year of study or are recent graduates.
International students can also apply.

Interview Kiran Rama India’s Number One Data Scientist
Here is an interview with Kiran Rama. He is currently Director, Data Sciences & Advanced Analytics at VMware. I have chosen Kiran as India’s number one data scientist for the following reasons:
- He has both an impeccable academic record and steady work experience across multiple companies
- He has demonstrated his expertise in competitions like Kaggle and KDD Cup (which is tougher)
- He spends more time doing and expanding data science in India

Here is the interview with Kiran Rama, India’s Number One Data Scientist for 2016, as per Decisionstats.com
- “2012 India Innovator of the Year” Award from Michael Dell
- 3 patents filed at US PTO on various aspects of e-commerce and marketing analytics
- World Quality Day Finalist in 2010
- Won the Best Project Award in Global Consumer & Small Business Analytics for 4 consecutive quarters
- Software Errors: Predict which line in software code is likely to be an error for a US based startup
- Accident evaluation analysis for a US semi sized startup
- Predict which music label to recommend to a startup
- Trying to predict futures prices in the stock market for a US Startup
- HLA Imputation of Genomic data
- Leveraging Data Sciences to come up with customer segments for Flipkart’s digital properties
- Coming up with an email rules engine to determine the best customers to target per category
- Setting up mobile app analytics at Flipkart
- First ever digital buyer journey data sciences project at VMW
- “Propensity to Buy” models for several products of VMW, for the Technical Account Manager organization,..
- “Propensity to Sell” models for the partner organization of VMW
- “Propensity to Respond” models
- Deployment models
- …
- Debugging Skills: You cannot give up as a data scientist; you should be someone who can sit in one place and debug continuously for hours. Using data science techniques involves installations, OS issues, nitty-gritty aspects of the code, etc.
- Programming Skills: You cannot be a data scientist if you cannot program. You need to be good at programming. Comments like “code is available on the net and I will copy-paste” do not work. I judge a data scientist by different parameters, and one of the most important ones is the quality of the code!
- Knowledge of a programming language that has a machine learning library (R and Python are examples. R has access to many libraries on the CRAN repository, while Python has the world-beating scikit-learn package)
- Strong understanding of the mathematical, computer science and statistical background behind data science techniques
- Ability to translate a business problem into a data sciences problem. This involves key decisions like which is the target, is this a prediction or classification problem, what is the right cross-validation technique, what algorithms to use for data mining, what should be the right evaluation criteria, how the model will likely be deployed,…
- Strong business/domain understanding can lead to great feature engineering and great success during deployment.
- Ability to present the results to stakeholders and get buy-in for implementation is very important as well
- Python and R have better and wider machine learning libraries than SAS
- Most of the academic work and latest advancements are in Python & R
- Python is better than R because there are more things you can do in Python, including software development. Trust me: there is no money in machine learning libraries. There is money only in applications, and the closer you are to software development + machine learning, the better
- Most of the high paying startups and young firms use Python/R and not SAS
- It is easier to learn Python/R and then if you happen to work for an old behemoth that is a SAS shop, pick up SAS as well
- Python/R are actual programming languages and better than SAS. SAS uses macros and not functions. SAS uses a proprietary dataset format that is largely inefficient. SAS requires you to know different syntax for different methods and also for different types of plots. On the other hand, the interface to call any function in R or Python is the same. Example: the predict function in R. Since everything is returned as an object in R & Python, it is easier to examine results (contrast looking at an object's sub-objects with running multiple commands in SAS to find the output datasets: the infamous “ods trace on” in SAS, etc.)
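The point above about a uniform calling interface can be illustrated with a minimal Python sketch. The two model classes here are hypothetical toys written for this illustration (they are not from scikit-learn or any other library); what matters is that both expose the same `fit` and `predict` methods and return plain objects whose fitted state can be inspected directly.

```python
class MeanModel:
    """Hypothetical toy model: predicts the training mean for every input."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class NearestNeighbourModel:
    """Hypothetical toy model: returns the target of the closest training point."""

    def fit(self, X, y):
        self.X_ = list(X)
        self.y_ = list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Index of the training point closest to x (1-D inputs assumed).
            idx = min(range(len(self.X_)), key=lambda i: abs(self.X_[i] - x))
            preds.append(self.y_[idx])
        return preds


X_train = [1.0, 2.0, 3.0, 4.0]
y_train = [10.0, 20.0, 30.0, 40.0]

# The same two calls work for every model, whatever happens inside it.
fitted = [m.fit(X_train, y_train) for m in (MeanModel(), NearestNeighbourModel())]
results = {type(m).__name__: m.predict([2.2, 3.9]) for m in fitted}

for model in fitted:
    # Fitted state lives on the object itself and can be examined directly.
    print(type(model).__name__, vars(model), results[type(model).__name__])
```

Because every model answers to the same method names, swapping one learner for another does not change the surrounding code.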
- “The Art of R Programming” by Norman Matloff
- “Python for Data Analysis” by Wes McKinney
- “Learning from Data” by Yaser Abu-Mostafa
- “Applied Data Mining” by Paolo Giudici
- “Machine Learning” by Tom Mitchell

- Build your own repository of functions and methods that you can re-use
- Understand what the winners of prior competitions did. For example: my code above
- Keep yourself current with the latest techniques. For example: xgboost
- Choose the right cross-validation technique. Else, you will overfit
- Be paranoid about leakage and look for ways to fix leakage in everything including data preparation, feature engineering and modeling
- Feature Engineering is the key. Even with less data, better features will do better than big data
- Try varied methods. Example: one learner can be tree-based, one bagging-based, one boosting-based, one a neural network, etc.
- Always ensemble. It can give 2-5% lift
- Regularized Logistic Regression (glmnet in R)
- Bagging technique: Random Forest
- Boosting Technique: Gradient Boosting Machine, Extreme Gradient Boosting
- Collaborative Filtering Techniques: libFM
- Non linear learners like Neural Networks
- Bayesian Methods like BayesTree, bartMachine
- Support Vector Machines – LIBSVM library
- Fast learners like Vowpal Wabbit
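The advice above on cross-validation, leakage and ensembling can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Kiran's competition code: the data is synthetic, the two learners (`fit_mean`, `fit_line`) are hypothetical toys, and the fold scheme is a simple interleaved k-fold. Each model is fitted only on the training part of a fold and scored on the held-out part, and the ensemble is a plain average of the two models' out-of-fold predictions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, valid_idx) pairs; sample i is held out in fold i % k."""
    for fold in range(k):
        valid = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, valid


def fit_mean(xs, ys):
    """Toy learner: always predicts the mean of the training targets."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Toy learner: ordinary least-squares line through the training points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def out_of_fold_predictions(fit, xs, ys, k):
    """Predict each sample with a model that never saw it (guards against leakage)."""
    preds = [None] * len(xs)
    for train, valid in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in valid:
            preds[i] = model(xs[i])
    return preds


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Synthetic data (an assumption for illustration): a line plus alternating noise.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]

oof_mean = out_of_fold_predictions(fit_mean, xs, ys, k=3)
oof_line = out_of_fold_predictions(fit_line, xs, ys, k=3)

# Simplest possible ensemble: average the out-of-fold predictions.
ensemble = [(a + b) / 2 for a, b in zip(oof_mean, oof_line)]

for name, preds in [("mean", oof_mean), ("line", oof_line), ("ensemble", ensemble)]:
    print(name, round(mse(preds, ys), 3))
```

Averaging helps most when the base learners are individually strong and make different kinds of errors; blending in a weak learner, as the mean model is here, can make the result worse, which is why the out-of-fold scores should be compared before settling on an ensemble.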
About-
Kiran is a Data Sciences Leader with more than 12 years of experience across marketing, digital (web/mobile), retail, pricing, partner and sales, spanning B2C, e-commerce and B2B data sciences. One of the Top 10 players worldwide on Kaggle (a data mining competition platform) in 2013 and half of 2014, Kiran is also a KDD 2014 Prize Winner and holder of 3 US patents. He received the 2012 Innovator of the Year award from Michael Dell.
You can read about him here https://www.linkedin.com/in/rkirana
Saving Private Snowden
No apologies no pardons
Full disclosure on what was stolen and what was not
Cooperating to help make the system more secure
Unconditional surrender
Obama is a lawyer and so is Clinton. They don’t get privacy but they do get popular opinion. So a popular campaign for a Snowden pardon needs to be organized.
Personally I doubt if Edward can come home before 2020.