Ajay Ohri interviews Dr Bradley Jones for StatisticsViews.com

I had the good fortune and privilege to interview a genuine statistical hero, Dr Bradley Jones.

http://www.statisticsviews.com/details/feature/8510051/For-me-the-fun-of-working-with-scientists-and-engineers-is-helping-them-generate.html

He holds a patent on the use of DOE for minimizing registration errors in the manufacture of laminated circuit boards and is the inventor of the prediction profile plot for interactive exploration of multiple input and output response surfaces. In both 2009 and 2011, he received the American Society for Quality’s Brumbaugh Award for the paper making the largest contribution to industrial quality control. He also won the 2010 Lloyd S. Nelson Award for the article having the greatest immediate impact to practitioners. Jones is the Editor-in-Chief of the Journal of Quality Technology, a Fellow of the ASA and co-author, with Peter Goos, of the award-winning Optimal Design of Experiments.

Typically, DOE is taught by rote using pre-packaged designs. This makes it hard for an engineer to see the practical applicability of DOE. In addition, most DOE texts devote most of their pages to analysis rather than the core principles of design. Students do not learn how to evaluate and compare prospective designs for their appropriateness to a specific problem. The textbooks (and professors) need to catch up with the software.

You can read the complete article at http://www.statisticsviews.com/details/feature/8510051/For-me-the-fun-of-working-with-scientists-and-engineers-is-helping-them-generate.html

Famous in X but Failure in Y

Some of my friends on the internet and in real life love food. Note the distinction between internet friends and real life friends. There is more to genuine, long-lasting relationships than exchanging engaging bits and bytes and moving your mouse over icons to say I love this, I plus one that, I really adore it.

Well, my friends and I, we love food. Some of us, in fact most of us, eat food. Some of us click pictures and share them on the anti-social media. Anti-social because it is anti-real life socializing. A few of us cook food. One or two write recipes. Occasionally one of us tries to make food his business by floating the idea of opening a restaurant. This is despite the fact that just eating food is, ahem, easy and running a restaurant business is inherently risky.

Making your passion into a business is a dream and a privilege offered to very few of us: athletes, technology startup founders, drug lords.

Occasionally one may be a success in one line of the business. Someone who loves food can write good books, but will you exchange your mom’s apple pie for that telegenic chef? Can someone who writes good books on food automatically run a very good restaurant? No. Good in books does not mean good in business, even when both are about the same thing.

Genius doesn't travel. If you are reading this, probability says you are not a genius anyway.

 

 

Twitter Analysis Redefined

Because code keeps changing on Twitter


#dev.twitter.com and apps.twitter.com to generate these tokens
#install.packages("twitteR")
#install.packages("ROAuth")
#PACKAGES
library(twitteR)
library(ROAuth)
#ACCESS URLS
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"

#ACCESS KEYS
consumerKey <- "4LEjfrnbzMQvxpJzRKnx6v0JM"
consumerSecret <- "aCsJA6jEHhpqFioKmxwtu9BzMm0TnOFQyZv6mgCUo1j82PzRIn"
access_token="3232641518-IFIlyB5oJ7QbFXT3arO218BWbycGMA6q5NO1b7k"
access_secret='XPfpH3l6QjCnRxpZwHtMbRMqnwmhmxlZqFZxnxgEg35K4'

#HANDSHAKE
#Reuses the variables defined under ACCESS KEYS instead of repeating the literals
setup_twitter_oauth(consumerKey,
consumerSecret,
access_token = access_token,
access_secret = access_secret)

a=searchTwitter("delhi", n=2000)
tweets_dfa = twListToDF(a)
tweets_dfa
b=searchTwitter("mumbai", n=200)
tweets_dfb = twListToDF(b)
c=searchTwitter("bangalore", n=200)
tweets_dfc = twListToDF(c)
tweets=rbind(tweets_dfa,tweets_dfb,tweets_dfc)
#tweets
write.csv(tweets,file="tweets.csv")
head(tweets)
library(tm)
library(wordcloud)
b=Corpus(VectorSource(tweets$text), readerControl = list(language = "eng"))
b=tm_map(b, PlainTextDocument)
inspect(b)
b<- tm_map(b, content_transformer(tolower))
#Changes case to lower case
b<- tm_map(b, stripWhitespace) #Strips White Space
b <- tm_map(b, removePunctuation) #Removes Punctuation
inspect(b)
tdm <- TermDocumentMatrix(b)
m1 <- as.matrix(tdm)
v1<- sort(rowSums(m1),decreasing=TRUE)
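d1 <- data.frame(word = names(v1), freq = v1)
#Keeps only the words that occur more than 30 times
d2 <- d1[d1$freq > 30, ]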
wordcloud(d2$word,d2$freq,colors =brewer.pal(7,"Set1"))


Workflows in R compared to Workflows in Python

A workflow consists of an orchestrated and repeatable pattern of business activity enabled by the systematic organization of resources into processes that transform materials, provide services, or process information.

Both R and Python have similar workflows but slightly different syntax. One of the biggest differences is how they refer to parts of an object ($ and [] in R, [] in Python) and how they apply functions (fun(object) in R, object.fun() in Python).
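A minimal sketch of the R side of that difference, with rough Python (pandas) equivalents noted in the comments. The data frame `visits` and its columns are made up for this illustration and do not come from either linked workflow.

```r
# Hypothetical data frame, invented for this illustration
visits <- data.frame(city = c("delhi", "mumbai", "bangalore"),
                     tweets = c(2000, 200, 200),
                     stringsAsFactors = FALSE)

# Referring to parts of an object: $ and [] in R
visits$tweets                        # pandas: visits["tweets"]
second_city <- visits[2, "city"]     # pandas: visits.loc[1, "city"] (Python counts from 0)
busy <- visits[visits$tweets > 500, ]  # pandas: visits[visits["tweets"] > 500]

# Applying functions: fun(object) in R
total <- sum(visits$tweets)          # pandas: visits["tweets"].sum()
top2  <- head(visits, 2)             # pandas: visits.head(2)
```

The workflow itself (load, subset, summarise) is the same in both languages; only the spelling changes.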

 

a workflow in Python

http://nbviewer.ipython.org//gist/decisionstats/4142e98375445c5e4174


a workflow in R

http://rpubs.com/ajaydecis/rworkflow


Garbage Collection in a technology startup

Garbage collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. That's garbage collection from Jimmypedia. When you don't do enough garbage collection in a program you can end up with Stack Overflow.

Tech Startups have garbage collectors too. A garbage collector looks only for garbage. In the edges, on the floor, below the carpet, under the stairs. In meetings, in team discussions, in stock options, in business plans, in cash burn projections. These are negative, anti-social, emotionally dysfunctional people. Their over-abundant IQ (Intelligence Quotient) is balanced by their teenager-like EQ (Emotional Quotient). To balance the cynical Einstein, you generally need a shiny-eyed startup founder who dreams every night of ringing the NASDAQ bell.

Tech Startups also have unicorn caterpillars. These are people who think they will shed their legs, wrap themselves in a silky cocoon and become unicorn butterflies with wings. The shiny-eyed founder can become a unicorn butterfly very fast, till the garbage is collected from the cocoon and the oyster returns to being an oyster rather than turning into a pearl.

Tech Startups have over-caffeinated engineers and under-caffeinated salesmen. I wonder how the industry would react if mandatory drug testing were introduced for startups. Maybe we will all migrate to Canada under a Trudeauian utopia of hemp and grass.

 

Web Analytics is funny statistics

I have a simple question for my Web Analytics software. I want to know who is reading what, and how much they are being impacted.

In return, my Web Analytics software (Google Analytics included) gives me dashboards of line charts, bar plots and path diagrams.

Some questions for my Web Analytics to answer:
  • Will it count 500 CEOs reading my blog as less significant than 5000 college students? That's not a problem if I am on a social network, or is it?
  • I get 15000 unique viewers every month. How many people is that? Does that mean the same 500 people visited every day? Or does it mean a different 500 people visited every day? Yes, I know Google Analytics has some kind of (horrible) pie chart split into returning and new users, but HOW MANY PEOPLE DID I reach?
  • What did they do after they read my blog? Where did they go? Google shares AdSense revenue. Can it share data too? Let's call it DataSense. It could even create a new internet data bureau (like the credit data bureaus we have for financial data).
  • How can I use the web analytics software to give me a forecast of future traffic (by a time series plot with an added regressor of number of posts per category type)?
  • How can I get some ANALYTICS to take a decision from the web analytics? (A Siri for Web Analytics? "You last posted X days ago. Please consider posting." Or "Please consider delaying posting to a more appropriate time.")
  • Is there more to life for a blogger than views and visitors? Is there some way we can measure satisfaction?
  • Is there an SEO penalty for boasting about blog traffic when meeting another blogger? Is there an SEO incentive for openly sharing your web statistics?
  • Can Google Analytics give a big data dump for open data analytics (sigh)? Can you use custom JS libraries for making your own dashboard with GA?
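The forecasting question in the list above can be sketched in base R. This is only one possible approach, not something Google Analytics offers: a regression of monthly views on the number of posts, with AR(1) errors, fitted with `arima()` from the stats package. All numbers are simulated, and the names `posts`, `views` and `planned_posts` are hypothetical.

```r
set.seed(42)  # simulated data only; no real traffic figures are used

# Hypothetical inputs: 24 months of page views and posts published per month
months <- 24
posts  <- rpois(months, lambda = 8)
noise  <- as.numeric(arima.sim(list(ar = 0.5), n = months, sd = 300))
views  <- 10000 + 400 * posts + noise

traffic <- ts(views, frequency = 12)

# Regression on post counts with AR(1) errors
fit <- arima(traffic, order = c(1, 0, 0),
             xreg = cbind(posts = posts), method = "ML")

# Forecast the next 3 months under an assumed posting plan
planned_posts <- cbind(posts = c(6, 10, 12))
fc <- predict(fit, n.ahead = 3, newxreg = planned_posts)

fc$pred  # point forecasts
fc$se    # standard errors of the forecasts
```

To use posts per category type, as the question asks, `xreg` would carry one column per category and `newxreg` would need the same columns for the planned months.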


Choices

You wake up every day with a bank balance of 12 hours of productive work time. Every night, as you go to sleep, the balance goes to zero. You wake up every day with a finite energy balance of a few kilojoules waiting to be expended. How you spend the balance is up to you, but it cannot be carried over to the next day.

You wake up with choices and you go to sleep having made decisions on those choices. You can focus on what lies ahead and stay positive. Or you can swallow the negativity and be swallowed in its swamp.

Intelligent men can make bad choices. Choices that you make today will be with you in the future.