Category Archives: Interviews
Below is an interview with Jeroen Ooms, a pioneer in R and web development. Jeroen contributes to R by developing packages and web applications for multiple projects.
Ajay- What are you working on these days?
Jeroen- My research revolves around challenges and opportunities of using R in embedded applications and scalable systems. After developing numerous web applications, I started the OpenCPU project about 1.5 years ago, as a first attempt at a complete framework for proper integration of R in web services. As I work on this, I run into challenges that shape my research, and sometimes become projects in their own right. For example, the RAppArmor package provides the security framework for OpenCPU, but can be used for other purposes as well. RAppArmor interfaces to some methods in the Linux kernel, related to setting security and resource limits. The GitHub page contains the source code, installation instructions, video demos, and a draft of a paper for the Journal of Statistical Software. Another example of a problem that appeared in OpenCPU is that applications that used to work were breaking unexpectedly later on due to changes in dependency packages on CRAN. This is actually a general problem that affects almost all R users, as it compromises reliability of CRAN packages and reproducibility of results. In a paper (forthcoming in The R Journal), this problem is discussed in more detail and directions for improvement are suggested. A preprint of the paper is available on arXiv: http://arxiv.org/abs/1303.2140.
I am also working on software not directly related to R. For example, in project Mobilize we teach high school students in Los Angeles the basics of collecting and analyzing data. They use mobile devices to upload surveys with questions, photos, GPS, etc., using the ohmage software. Within Mobilize and Ohmage, I am in charge of developing web applications that help students to visualize the data they collaboratively collected. One public demo with actual data collected by students about snacking behavior is available at: http://jeroenooms.github.com/snack. The application allows students to explore their data by filtering, zooming, browsing, comparing, etc. It helps students and teachers to access and learn from their data, without complicated tools or programming. This approach would easily generalize to other fields, like medical data or BI. The great thing about this application is that it is fully client side; the backend is simply a CSV file. So it is very easy to deploy and maintain.
Ajay- What’s your take on the difference between OpenCPU and RevoDeployR?
Jeroen- RevoDeployR and OpenCPU both provide a system for development of R web applications, but in a fairly different context. OpenCPU is open source and written completely in R, whereas RevoDeployR is proprietary and written in Java. I think Revolution focuses more on a complete solution in a corporate environment. It integrates with the Revolution Enterprise suite and their other big data products, and has built-in functionality for authentication, managing privileges, server administration, support for MS Windows, etc. OpenCPU on the other hand is much smaller and should be seen as just a computational backend, analogous to a database backend. It exposes a clean HTTP API to call R functions to be embedded in larger systems, but is not a complete end-product in itself.
OpenCPU is designed to make it easy for a statistician to expose statistical functionality that will be used by web developers who do not need to understand or learn R. One interesting example is how we use OpenCPU inside OpenMHealth, a project that designs an architecture for mobile applications in the health domain. Part of the architecture are so-called “Data Processing Units”, aka DPUs. These are simple, modular I/O units that do various sorts of data processing, similar to Unix tools, but then over HTTPS. For example, the mobility DPU is used to calculate distances between GPS coordinates via a simple HTTP call, which OpenCPU maps to the corresponding R function implementing the haversine formula.
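As an editorial aside, the haversine formula mentioned above can be sketched in a few lines. This is an illustration in Python, not the R function used by the mobility DPU; the function name `haversine_km` and the Earth radius constant are assumptions chosen for this example.

```python
import math

# Mean Earth radius in kilometres (assumed value for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)

    # Clamp to 1.0 to guard against floating-point rounding before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

A service in the spirit of a DPU would accept two coordinate pairs in an HTTP request and return the result of a function like this one; the request format itself is not described in the interview.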
Ajay- What are your views on Shiny by RStudio?
Jeroen- RStudio seems very promising. Like Revolution, they deliver a more full featured product than any of my projects. However, RStudio is completely open source, which is great because it allows anyone to leverage the software and make it part of their projects. I think this is one of the reasons why the product has gotten a lot of traction in the community, which has in turn provided RStudio with great feedback to further improve the product. It illustrates how open source can be a win-win situation. I am currently developing a package to run OpenCPU inside RStudio, which will make developing and running OpenCPU apps much easier.
Ajay- Are you still developing excellent RApache web apps (which IMHO could be used for visualization like business intelligence tools)?
Jeroen- The OpenCPU framework was a result of those webapps (including ggplot2 for graphical exploratory analysis, lme4 for online random effects modeling, stockplot for stock predictions and irttool.com, an R web application for online IRT analysis). I started developing some of those apps a couple of years ago, and realized that I was repeating a large share of the infrastructure for each application. Based on those experiences I extracted a general purpose framework. Once the framework is done, I’ll go back to developing applications.
Ajay- You have helped build web apps, OpenCPU, RAppArmor, Ohmage, Snack, mobility apps. What’s your thesis topic?
Jeroen- My thesis revolves around all of the technical and social challenges of moving statistical computing beyond the academic and private labs, into more public, accessible and social places. Currently statistics is still done mostly manually by specialists using software to load data, perform some analysis, and produce results that end up in a report or presentation. There are great opportunities to leverage the open source analysis and visualization methods that R has to offer as part of open source stacks, services, systems and applications. However, several problems need to be addressed before this can actually be put in production. I hope my doctoral research will contribute to taking a step in that direction.
Ajay- R is RAM constrained but the cloud offers lots of RAM. Do you see R increasing in usage on the cloud? Why or why not?
Jeroen- Statistical computing can greatly benefit from the resources that the cloud has to offer. Software like OpenCPU, RStudio, Shiny and RevoDeployR all provide some approach to moving computation to centralized servers. This is only the beginning. Statisticians, researchers and analysts will increasingly share and publish data, code and results on social cloud-based computing platforms. This will address some of the hardware challenges, but also contribute towards reproducible research and further socialize data analysis, i.e. improve learning, collaboration and integration.
Here is an interview with Pranay Agrawal, Executive Vice President- Global Client Development, Fractal Analytics – one of India’s leading analytics services providers and one of the pioneers in analytics services delivery.
Ajay- Describe Fractal Analytics’ journey as a startup to a pioneer in the Predictive Analytics Services industry. What were some of the key turning points in the field of analytics that you have noticed during these times?
Pranay- In 2000, Fractal Analytics started as a pure-play analytics services company in India with a focus on financial services. Five years later, we spread our operation to the United States and opened new verticals. Today, we have the widest global footprint among analytics providers and have experience handling data and deep understanding of consumer behavior in over 150 countries. We have matured from an analytics service organization to a productized analytics services firm, specializing in consumer goods, retail, financial services, insurance and technology verticals.
We are at the forefront of a massive inflection point with Big Data Analytics at the center. We have witnessed the transformation of analytics within our clients from a cost center to the most critical division that drives competitive advantage. Advances are quickly converging in computer science, artificial intelligence, machine learning and game theory, changing the way analytics is consumed by B2B and B2C companies. Companies that use analytics well are poised to excel in innovation, customer engagement and business performance.
Ajay- What are analytical tools that you use at Fractal Analytics? Are there any trends in analytical software usage that you have observed?
Pranay- We are tool-agnostic and serve our clients using whatever platforms they need, to ensure they can quickly and effectively operationalize the results we deliver. We use R, SAS, SPSS, SpotFire, Tableau, Xcelsius, Webfocus, Microstrategy and Qlikview. We are seeing an increase in adoption of open source platforms such as R, and specialized tools for dashboards like Tableau/Qlikview, plus an entire spectrum of emerging tools to process, manage and extract information from Big Data that support Hadoop and NoSQL data structures.
Ajay- What are Fractal Analytics’ plans for Big Data Analytics?
Pranay- We see our clients being overwhelmed by the increasing complexity of the data. While they are all excited by the possibilities of Big Data, the on-the-ground struggle to realize its full potential continues. The analytics paradigm is changing in the context of Big Data. Our solutions focus on making it super-simple for our clients to use the analytics sophistication that Big Data makes possible.
Let’s take our Customer Genomics solution for retailers as an example. Retailers are collecting information about shopper behaviors through every transaction. Retailers want to transform their business to make it more customer-centric but do not know how to go about it. Our Customer Genomics solution uses advanced machine learning algorithms to label every shopper across more than 80 different dimensions. Retailers use these to identify which products they should deep-discount depending on what price-sensitive shoppers buy. They are transforming the way they plan their assortment, planogram and targeted promotions armed with this intelligence.
We are also building harmonization engines using Concordia to enable real-time update of Customer Genomics based on every direct, social, or shopping transaction. This will further bridge the gap between marketing actions and consumer behavior to drive loyalty, market share and profitability.
Ajay- What are some of the key things that differentiate Fractal Analytics from the rest of the industry? How are you different?
Pranay- We are one of the pioneering pure-play analytics firms, with over a decade of experience consulting with Fortune 500 companies. What clients most appreciate about working with us includes:
- Experience managing structured and unstructured Big Data (volume, variety) with a deep understanding of consumer behavior in more than 150 countries
- Advanced analytics leveraging supervised machine-learning platforms
- Proprietary products, for example: Concordia for data harmonization, Customer Genomics for consumer insights and personalized marketing, Pincer for pricing optimization, Eavesdrop for social media listening, Medley for assortment optimization in the retail industry and Known Value Item for retail stores
- Deep industry expertise that enables us to leverage cross-industry knowledge to solve a wide range of marketing problems
- Lowest attrition rates in the industry and a very selective hiring process, which make us a great place to work
Ajay- What are some of the initiatives that you have taken to ensure employee satisfaction and happiness?
Pranay- We believe happy employees create happy customers. We are building a great place to work by taking a personal interest in grooming people. Our people are highly engaged as evidenced by 33% new hire referrals and the highest Glassdoor ratings in our industry.
We recognize the accomplishments and contributions made through many programs such as:
- FractElite – where peers nominate and defend the best of us
- Recognition board – where anyone can write a visible thank you
- Value cards – where anyone can acknowledge great role model behavior in one or more values
- Townhall – a quarterly all hands where we announce anniversaries and FractElite awards, with an open forum to ask questions
- Employee engagement surveys – to measure and report out on satisfaction programs
- Open access to managers and leadership team – to ensure we understand and appreciate each person’s unique goals and ambitions, coach for high performance, and laud their success
Ajay- How happy are Fractal Analytics customers quantitatively? What is your retention rate- and what plans do you have for 2013?
Pranay- As consultants, delivering value with great service is critical to our growth, which has nearly doubled in the last year. Most of our clients have been with us for over five years and we are typically considered a strategic partner.
We conduct client satisfaction surveys during and after each project to measure our performance and identify opportunities to serve our clients better. In 2013, we will continue partnering with our clients to define additional process improvements from applying best practice in engagement management to building more advanced analytics and automated services to put high-impact decisions into our clients’ hands faster.
Pranay Agrawal - Pranay co-founded Fractal Analytics in 2000 and heads client engagement worldwide. He has an MBA from the Indian Institute of Management (IIM) Ahmedabad and a Bachelor's in Accounting from Bangalore University, and is a Certified Financial Risk Manager from GARP. He is also available online at http://www.linkedin.com/in/pranayfractal
Fractal Analytics is a provider of predictive analytics and decision sciences to financial services, insurance, consumer goods, retail, technology, pharma and telecommunication industries. Fractal Analytics helps companies compete on analytics and in understanding, predicting and influencing consumer behavior. Over 20 Fortune 500 financial services, consumer packaged goods, retail and insurance companies partner with Fractal to make better data-driven decisions and institutionalize analytics inside their organizations.
Fractal sets up analytical centers of excellence for its clients to tackle tough big data challenges, improve decision management, help understand, predict & influence consumer behavior, increase marketing effectiveness, reduce risk and optimize business results.
Here is an interview with Anne Milley, Sr Director, Analytic Strategy, JMP.
Ajay- Review – How was the year 2012 for Analytics in general and JMP in particular?
Anne- 2012 was great! Growing interest in analytics is evident—more analytics books, blogs, LinkedIn groups, conferences, training, capability, integration…. JMP had another good year of worldwide double-digit growth.
Ajay- Forecast- What is your forecast for analytics in terms of top 5 paradigms for 2013?
Anne- In an earlier blog, I had predicted we will continue to see more lively data and information visualizations—by that I mean more interactive and dynamic graphics for both data analysts and information consumers.
We will continue to hear about big data, data science and other trendy terms. As we amass more and more data histories, we can expect to see more innovations in time series visualization. I am excited by the growing interest we see in spatial and image analysis/visualization and hope those trends continue—especially more objective, data-driven image analysis in medicine! Perhaps not a forecast, but a strong desire, to see more people realize and benefit from the power of experimental design. We are pleased that more companies—most recently SiSoft—have integrated with JMP to make DOE a more seamless part of the design engineer’s workflow.
Ajay- Cloud- Cloud Computing seems to be the next computing generation. What are JMP’s plans for cloud computing?
Anne- With so much memory and compute power on the desktop, there is still plenty of action on PCs. That said, JMP is Citrix-certified and we do see interest in remote desktop virtualization, but we don’t support public clouds.
Ajay- Events- What are your plans for the International Year of Statistics at JMP?
Anne- We kicked off our Analytically Speaking webcast series this year with John Sall in recognition of the first-ever International Year of Statistics. We have a series of blog posts on our International Year of Statistics site that features a noteworthy statistician each month, and in keeping with the goals of Statistics2013, we are happy to:
- increase awareness of statistics and why it’s essential,
- encourage people to consider it as a profession and/or enhance their skills with more statistical knowledge, and
- promote innovation in the sciences of probability and statistics.
Both JMP and SAS are doing a variety of other things to help celebrate statistics all year long!
Ajay- Education Training- How does JMP plan to leverage the MOOC paradigm (massive open online course) as offered by providers like Coursera etc.?
Anne- Thanks to you for posting this to the JMP Professional Network on LinkedIn, where there is some great discussion on this topic. The MOOC concept is wonderful—offering people the ability to invest in themselves, enhance their understanding on such a wide variety of topics, improve their communities…. Since more and more professors are teaching with JMP, it would be great to see courses on various areas of statistics (especially since this is the International Year of Statistics!) using JMP. JMP strives to remove complexity and drudgery from the analysis process so the analyst can stay in flow and focus on solving the problem at hand. For instance, the one-click bootstrap is a great example of something that should be promoted in an intro stats class. Imagine getting to appreciate the applied results and see the effects of sampling variability without having to know distribution theory. It’s good that people have options to enhance their skills—people can download a 30-day free trial of JMP and browse our learning library as well.
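As an editorial aside, the idea behind the bootstrap mentioned above (seeing sampling variability without distribution theory) can be sketched in code. This is a minimal Python illustration, not JMP's implementation; the function names, the default of 1000 resamples and the simple percentile rule are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Resample `sample` with replacement and return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n))
            for _ in range(n_resamples)]


def percentile_interval(values, level=0.95):
    """Approximate percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lo_idx = int((1 - level) / 2 * last)
    hi_idx = int((1 + level) / 2 * last)
    return ordered[lo_idx], ordered[hi_idx]


if __name__ == "__main__":
    data = [4.1, 5.3, 2.8, 6.0, 4.7, 5.1, 3.9, 4.4]  # made-up numbers for illustration
    means = bootstrap_means(data)
    print(percentile_interval(means))
```

The spread of the resampled means gives a direct picture of how much the sample mean would vary from sample to sample, which is the point an intro stats class would draw out.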
Ajay- Product- What are some of the exciting things JMP users and fans can look forward to in the next releases this year?
Anne- There are a number of enhancements and new capabilities planned for new releases of the JMP family of products, but you will have to wait to hear details…. OK, I’ll share a few! JMP Clinical 4.1 will have more sophisticated fraud detection. We are also excited about releasing version 11 of JMP and JMP Pro this September. JMP’s DOE capability is well-known, and we are pleased to offer a brand new class of experimental design—definitive screening designs. This innovation has already been recognized with The 2012 Statistics in Chemistry Award to Scott Allen of Novomer in collaboration with Bradley Jones in the JMP division of SAS. You will hear more about the new releases of JMP and JMP Pro at Discovery Summit in San Antonio—we are excited to have Nate Silver as our headliner!
Anne Milley directs analytic strategy in JMP Product Marketing at SAS. Her ties to SAS began with bank failure prediction at FHLB Dallas. Using SAS continued at 7-Eleven Corporation in Strategic Planning. She has authored papers and served on committees for SAS Education conferences, KDD, and SIAM. In 2008, she completed a 5-month assignment at a UK bank. Milley completed her M.A. in Economics from Florida Atlantic University, did post-graduate work at RWTH Aachen, and is proficient in German.
Introduced in 1989, JMP has grown into a family of statistical discovery products used worldwide in almost every industry. JMP is statistical discovery software that links dynamic data visualization with robust statistics, in memory and on the desktop. From its beginnings, JMP software has empowered its users by enabling interactive analytics on the desktop. JMP products continue to complement – and are often deployed with – analytics solutions that provide server-based business intelligence.
Here is an interview with Naveen Gattu, COO and co-founder of Gramener, one of the most happening data science companies.
Ajay- Describe the story so far for Gramener. What have been the key turning points?
Naveen- All founders of Gramener are first-generation entrepreneurs. We started our careers with IBM and were very successful in our corporate jobs, with hefty pay packages, but always at the back of our minds was the question: can’t we work for ourselves and have FUN?
With this thought in mind, six of us got together in 2010 to lay the foundation for Gramener. With our consulting experience we wanted to get into business analytics, but we soon realized that many people were doing great analytics without an effective way of presenting it. We wanted to establish a niche for ourselves and create an offering that makes “Data Consumption” easy and joyful.
Our significant milestone was Airtel (more…)
Ajay- Describe how you started using R. What are some of the benefits you noticed on moving to R?
Jeff- I began using R in an internship while working on my undergraduate degree. I was provided with some unformatted R code and asked to modularize the code then wrap it up into an R package for distribution alongside a publication.
To be honest, as a Computer Science student with training more heavily emphasizing the big high-level languages, R took some getting used to for me. It wasn’t until after I concluded that initial project and began using R to do my own data analysis that I began to realize its potential and value. It was the first scripting language which really made interactive use appealing to me — the experience of exploring a dataset in R was unlike anything (more…)