Here is an interview with Aaron Rangel, CEO and creator of BlueSky Statistics, an open source statistical software package based on R.
Ajay- Describe your career in statistical computing
Aaron- I was first exposed to the power of predictive analytics as a graduate student. Being a software industry professional and working for a startup, most of my early projects in statistical computing were around analyzing web and financial data as a hobbyist using R. This fascination led me to join SPSS as a Product Manager. At SPSS, I was very fortunate to be exposed to how predictive analytics and business intelligence were driving better decision making in a wide variety of industries. My experience at both iManage and SPSS, where I built intuitive applications with graphical user interfaces, convinced me of the value of creating a powerful GUI-based application for R, which had been soaring in popularity. For me it was a no-brainer: R, the lingua franca of statistical analysis, when married with a powerful, intuitive user interface (typically found in commercial enterprise applications), would provide unprecedented value for the analyst and open source R community.
Ajay- Describe why and how you created this product
Aaron- I created the product for the following reasons:
- I wanted to make learning and using R easier. Even though R is extremely powerful in terms of both the breadth and depth of analytics offered, as a beginner several years ago I was intimidated by the number of packages, the idiosyncrasies of R syntax, and the fact that I had to write or modify code for some of the simplest tasks. I strongly believe that an intuitive application with a point-and-click graphical user interface that automates R syntax generation and offers attractive output for the top 100 frequently used analytical functions will save time on repetitive exploratory analysis, data preparation and standard modeling. BlueSky Statistics does not prevent analysts from writing R code and fully supports creating and executing R functions. Our goal is to automate routine tasks with a GUI and reserve hand-written R code for value-adding analytics. The bottom line is that analysts will be more efficient and will have more time for creative, value-adding work.
- I wanted to create a one-stop shop for the best work in the R community. With more than 6,200 packages, and a lot of capabilities duplicated across them, I wanted to create an analytics workbench that showcases the best packages and best practices that R has to offer for analysts and programmers across levels of expertise.
- I wanted to increase the adoption of R in both the analyst and business user communities by focusing on ease of use.
Ajay- Why did you choose R for the back end?
Aaron- Without a doubt, the openness and extensibility. In fact, at BlueSky Statistics we have made every effort to preserve this openness and flexibility. BlueSky Statistics is available in both open source and commercial editions. Additionally, whether you want to create a regression dialog with several options for a sophisticated analyst or a simple regression dialog for a statistics 101 class, BlueSky Statistics allows you to throttle the level of sophistication you want to expose. More importantly, you can do this without writing a single line of code. Targeted applications, with analytical functions trimmed down so that analysts pick the right options or students have a focused application for learning, are very simple to deliver.
Ajay- What are your plans for this product?
Aaron- We have already delivered a comprehensive set of data preparation, exploratory analysis, data modeling and data visualization capabilities. We will continue to build our modeling and machine learning capabilities over the next few months. Our longer term goal is to create a collaborative open source analytics platform through which specialized analytics can be accessed to address a wide variety of business problems across industry verticals, all powered by R.
Ajay- Who do you think is the target audience for this?
Aaron-
- The non-programmer analyst community, who are accustomed to and need the kind of easy-to-use interface available in the commercial statistics marketplace, but at much higher price points than BlueSky Statistics. We want the adoption of R to proliferate amongst all analysts, not just the savvy R programmers.
- Newly minted data scientists and machine learning practitioners who are looking to learn R and want to accelerate the R learning curve, as well as avail themselves of the efficiency of a rich GUI at their workplace.
- Analysts and R programmers across the experience spectrum. The benefits here are manifold:
  - Efficiencies realized by automating routine data preparation, exploratory analysis, reporting and modeling.
  - An easy way to keep abreast of the latest statistical techniques, visualizations and data preparation methods in the R community. Our goal is to provide a one-stop shop for the best packages available in the R community, with easy-to-use GUIs that automate syntax generation, which in turn makes learning easy and accelerates productivity.
  - Sophisticated analysts can use the dialog editor program to build a rich GUI for any function in any R package. BlueSky Statistics makes it easy to create and share custom modules that represent new analytical techniques or best practices with other users in their organization, resulting in better collaboration and efficiency.
- As we add data mining and machine learning capabilities, we would like to see adoption amongst that community as well.
Ajay- Can analytics companies afford one more software to the stack?
Aaron- Being open source and 100% R based will be advantageous for us, as will the fact that university graduates across a wide variety of disciplines are already trained in R. Additionally, with the increasing adoption of R amongst users of commercial statistical applications, we hope that more and more of these users will view us as the preferred alternative because of the large R community, the huge contribution base, and a pace of innovation that no commercial statistical vendor can match.
BlueSky Statistics is a software product based on R that aims to make analytics easier through a menu-driven graphical user interface. It has both a free and a commercial version.