I was recently interviewed by Bigstep as part of their Expert Interview program. Click here to read the interview and let me know what you think!
Accelerating R: RStudio and the new R Consortium
To paraphrase Yogi Berra, “Predicting is hard, especially about the future”. In 1993, when Ross Ihaka and Robert Gentleman first started working on R, who would have predicted that it would be used by millions in a world that increasingly rewards data literacy? It’s impossible to know where R will go in the next 20 years, but at RStudio we’re working hard to make sure the future is bright.
Today, we’re excited to announce our participation in the R Consortium, a new 501(c)(6) nonprofit organization. The R Consortium is a collaboration between the R Foundation, RStudio, Microsoft, TIBCO, Google, Oracle, HP and others. It’s chartered to fund and inspire ideas that will enable R to become an even better platform for science, research, and industry. The R Consortium complements the R Foundation by providing a convenient funding vehicle for the many commercial beneficiaries of R to give back to the community, and…
Install Package in Python from Github
You can use
pip install git+https://github.com/yhat/ggplot.git
or
pip install --upgrade https://github.com/yhat/ggplot/tarball/master
Summer School in Analytics
This is the brochure for the Summer School in Analytics, a ten-day intensive classroom program in Delhi, India, covering the R, Python and SAS languages.
KDnuggets Poll: Is RapidMiner Used 3 Times More Than SAS?
16th annual KDnuggets Software Poll continued to get huge attention from analytics and data mining community and vendors, attracting about 2,800 voters, who chose from a record number of 93 different tools.
from http://www.kdnuggets.com/2015/05/poll-r-rapidminer-python-big-data-spark.html
What follows looks like a rather disquieting sampling error:
RapidMiner remains the most popular suite for data mining/data science, but it got fewer votes than last year
The top 10 tools by share of users were
- R, 46.9% share (38.5% in 2014)
- RapidMiner, 31.5% (44.2% in 2014)
- SQL, 30.9% (25.3% in 2014)
- Python, 30.3% (19.5% in 2014)
- Excel, 22.9% (25.8% in 2014)
- KNIME, 20.0% (15.0% in 2014)
- Hadoop, 18.4% (12.7% in 2014)
- Tableau, 12.4% (9.1% in 2014)
- SAS, 11.3% (10.9% in 2014)
I really don't think RapidMiner has three times as many users as SAS. I have no doubts about the credibility of the poll, but there seems to be either sampling bias or something plain wrong here!
And 44.2% of users used RapidMiner last year (I don't think nearly one in two data miners uses RapidMiner).
So there is some error here, or maybe just different ways of counting who is a user!
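To put rough numbers on that hunch, here is a small sketch in Python that works only from the shares quoted above. The variable and function names are my own, and the voter count of 2,800 is the approximate figure from the poll summary, so the implied head counts are estimates, not poll data.

```python
# Shares of voters (in percent) as quoted from the KDnuggets poll above.
share_2015 = {"RapidMiner": 31.5, "SAS": 11.3}
share_2014 = {"RapidMiner": 44.2, "SAS": 10.9}

# Approximate number of voters quoted for the 2015 poll ("about 2,800").
voters_2015 = 2800


def ratio(shares, a, b):
    """Return how many times larger the share of tool a is than that of tool b."""
    return shares[a] / shares[b]


ratio_2015 = ratio(share_2015, "RapidMiner", "SAS")
ratio_2014 = ratio(share_2014, "RapidMiner", "SAS")

# Rough head counts implied by the 2015 shares (estimates only,
# because the total number of voters is itself approximate).
implied_votes = {
    tool: round(voters_2015 * pct / 100) for tool, pct in share_2015.items()
}

print(f"2015 RapidMiner/SAS ratio: {ratio_2015:.2f}")
print(f"2014 RapidMiner/SAS ratio: {ratio_2014:.2f}")
print(implied_votes)
```

On these figures the poll puts RapidMiner at roughly 2.8 times the SAS share in 2015 and about 4 times in 2014, which is the gap I find hard to believe for the wider market.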
Moobhi Review- Piku Emotion in Motion
Shoojit Sircar has written a love poem to the saga of probashi Bongalis, the longing for Kolkata, and the fine yet quixotic and sometimes insular Bong culture. He has relied on shortcuts and stereotypes to finish the story in the time allotted. Deepika looks great with kajal-laced Bengali eyes, but someone needs to tell her to get accent training. Irrfan can act better with his eyes and mouth closed than Karan Johar can act with his entire body.
Amitabh Bachchan just disappears into his role as Bhaskar Da. Moushumi Chatterjee lifts the occasional sag in the story's pace. What a nice story! If only non-Bengalis knew more about Bengali culture than just the sweets.
Dealing with zip files in R #rstats
> setwd("/home/ajay/Downloads")
> a=dir()
> class(a)
[1] "character"
> grep(".zip",a)
[1]  37  38  41  43  88  96 133
> b=grep(".zip",a)
> a[b]
[1] "alissa-coming-soon-v2-0(1).zip"
[2] "alissa-coming-soon-v2-0.zip"
[3] "CAX_EMC_Journalist_Data.zip"
[4] "CAX_EMC_Racer_Data.zip"
[5] "matlab_R2015a_glnxa64.zip"
[6] "Photos.zip"
[7] "unvbasicvapp__9411003__vmx__en__sp0__1.zip"
> unzip("CAX_EMC_Racer_Data.zip")
> c=dir()
> library(Hmisc)
> c[c %nin% a]
[1] "CAX_EMC_Racer_Garmin_Camera.csv"
[2] "CAX_EMC_Racer_Garmin_Watch_Data.csv"
[3] "CAX_EMC_Racer_Motorcycle_Data.csv"
PS: I know Hadley's convenient wrappR packages are all the rage now, but nothing, I repeat nothing, beats Frank Harrell's and Ripley's cool packages.
