Interview Jim Davis SAS Institute

Here is an interview with Jim Davis, SAS Institute SVP and Chief Marketing Officer.

“Traditional business intelligence (BI) as we know it is outdated and insufficient.” Jim Davis, SAS Institute

Ajay – Please describe your career in science leading up to your present position. What advice would you give to young science graduates in this recession? What advice would you give to entrepreneurs in these challenging economic times?
Jim – After earning a degree in computer science from North Carolina State University, I embarked on a career path that ultimately brought me to SAS and my role as senior VP and CMO. Along the way I’ve worked in software development, newspaper and magazine publishing and IT operations. In 1994, I joined SAS, where I worked my way up the ranks from enterprise computing strategist focused on IT issues to program manager for data warehousing to director of product strategy, VP of marketing and now CMO. It’s been an interesting path.

My advice to new graduates embarking on a career is to leave no stone unturned in your search, particularly in this economy, but also consider adding to your skill set. A local example here in the Research Triangle area is at N.C. State University’s Institute for Advanced Analytics, which offers a master’s degree that combines business and analytical skills. These skills are very much in demand. SAS CEO Jim Goodnight helped establish this 10-month degree program, whose first 23 graduates all found solid jobs within four months at an average salary of $81,000. Many of this year’s class, facing the worst economy since the Great Depression, have already found jobs. For entrepreneurs today, my advice is simple: make absolutely sure you’re creating a product or service that people want. And especially given the challenging economic environment, resolve to improve your decision making. Regardless of industry or company size, business decisions need to be based on facts, on data, on science. Not on hunches and guesswork. Business analytics can help here.

Ajay – What are some of the biggest challenges that you have faced and tackled as a marketing person for software? What continues to be your biggest focus area for this year?

Jim – Among the biggest challenges that the SAS marketing team has worked to overcome is the perception that analytical software – advanced forecasting, optimization and data mining technologies – is way too complex, difficult to use, and only useful to a small band of highly trained statisticians and other quantitative experts, or “quants.” With lots of hard work, we’ve been able to show the marketplace that powerful tools are available in business solutions designed to solve industry issues.

The biggest marketing challenge now is showing the market how SAS offers unique value with its broad and integrated technologies. The industry terminology is confusing, with some companies selling Business Intelligence tools that, when you scratch the surface, are limited to reporting and query operations. Other SAS competitors only provide data integration software, and still others offer analytics. SAS is the only vendor offering an integrated portfolio of these three very important technologies, as well as cross-industry and industry-specific intelligent applications. This combination, which we and others are calling Business Analytics, is a very powerful set of capabilities. Our challenge is to demonstrate the real value of our comprehensive portfolio. We’ll get there but we have some work to do.

Ajay – It is rare to find a major software company that has zero involvement with the open source movement (or, as I call it, peer-reviewed code). Could you name some of SAS Institute’s contributions to open source? What further plans are there to enhance this position with the global community of scientists?

Jim – SAS does support open source and open standards too. Open standards typically guide open source implementations (e.g., the OASIS work is guiding some of the work in Eclipse Cosmos, some of the JCP standards guide the Tomcat implementation, etc.).

Some examples of SAS’s contributions to open source and open standards include:

Apache Software Foundation – a senior SAS developer has been a committer on multiple releases of the Apache Tomcat project, and has also acted as Release Coordinator.

Eclipse Foundation — SAS developers were among the early adopters of Eclipse. One senior SAS developer wrote a tutorial whitepaper on using Eclipse RCP, and was named “Top Ambassador” in the 2006 Eclipse Community Awards. Another is a committer on the Eclipse Web Tools project. A third proposed and led Eclipse’s Albireo project. SAS is a participant in the Eclipse Cosmos project, with three R&D employees as committers. Finally, SAS’ Rich Main served on the board of directors of the Eclipse Foundation from 2003 to 2006, helping write the Eclipse Bylaws, Development Process, Membership Agreement, Intellectual Property Policy and Public License.

Java Community Process — SAS has been a Java licensee partner since 1997 and has been active in the Java Community Process. SAS has participated in approximately 25 Java Specification Requests spanning both J2SE and J2EE technology. Rich Main of SAS also served on the JCP Executive Committee from 2005 through 2008.

OASIS — A senior SAS developer serves as secretary of the OASIS Solution Deployment Descriptor (SDD) Technical Committee. In total, six SAS employees serve on this committee.

XML for Analysis — SAS co-sponsored XML for Analysis standard with Microsoft and Hyperion.

Others — A small SAS team developed Cobertura, an open source coverage analysis tool for Java. SAS (through our database access team) is one of the top corporate contributors to Firebird, an open source relational database. Another developer contributes to Slide WebDav. We’ve had people work on HtmlUnit (another testing framework) and FreeBSD.

In addition, there are dozens if not hundreds of contributed bug reports, fixes/patches from SAS developers using open source software. SAS will continue to expand our work with and contribute to open-source tools and communities.

For example, we know a number of our customers use R as well as SAS. So we decided to make it easier for them to access R by making it available in the SAS environment. Our first interface to R, which enables users to integrate R functionality with IML or SAS programs, will be in an upcoming version of SAS/IML Studio later this summer. We’re also working on an R interface that can be surfaced in the SAS server or via other SAS clients.

Ajay – What are business intelligence and business analytics, as per you? SAS is the first IT vendor that comes up in the non-sponsored links when I search for “business intelligence” in Google. How well do you think the SAS Business Intelligence Platform rates against platforms from SAP, Oracle, IBM and Microsoft?

Jim – Traditional business intelligence (BI) as we know it is outdated and insufficient.

The term BI has been stretched and widened to encapsulate a lot of different techniques, tools and technologies since it was first coined decades ago. Essentially, BI has always been about information delivery, be it in static rows and columns, graphical representations of information, or the modern and hyper-interactive dashboard with dials and widgets.

BI technologies have also evolved to include intuitive ad-hoc query and analysis with the ability to drill down into the details within context. All of these capabilities are great for reacting to business problems after they have occurred. But businesses face diverse and complex problems, global competition grows exponentially, and increasingly restrictive regulations are just around the corner. They need to anticipate and manage change, drive sustainable growth and track performance.

Now they also have to operate in the midst of a ruinous global credit and liquidity crisis. Reactionary decision making is just not working. Now more than ever, progressive organizations are looking to leverage the power of analytics, specifically business analytics. Why? Real business value comes from capitalizing on all available information assets and selecting the best outcome based on every possible scenario.

Proactive evidence-based decisions – not just information delivery – should drive informed decisions. That is business analytics and that is what SAS provides its customers.

Businesses require robust data integration, data quality, data and text mining, predictive modeling, forecasting and optimization technologies to anticipate what might happen, avoid undesired outcomes and course correct.

These capabilities need to be in synch and integrated from the ground up rather than cobbled together through acquisitions. More importantly, they cannot be part of a monolithic platform that requires 2-3 years before any real value is derived.

They must be part of an agile framework that enables an organization to address its most critical business issues now and then add new functionality over time. A business analytics framework — like the one SAS provides — enables strategic business decisions that optimize performance across an organization.

Ajay – For 4 decades SAS Institute has created, nurtured and sustained the SAS language, often paying from its own pocket for conferences and papers. To this day, SAS language code on your website is free and accessible to all without registration, unlike at other software companies. What do you have to say about third-party SAS language compilers like “Carolina” and “WPS”?

Jim – There is no doubt that much of the power and flexibility behind our framework for business analytics is derived from our SAS language. At its core, the Base SAS language offers an easy-to-learn syntax and hundreds of language elements, pre-built SAS procedures and re-usable functions. Our focus on listening and adapting to customers’ changing needs has helped us, over the years, to sustain and continuously improve the SAS language and the SAS products that leverage it.

Competition comes in many forms and it pushes us to innovate and keep delivering value for our customers. Language compilers or code interpreters like Carolina and WPS are no exception.

One thing that sets SAS apart from other vendors is that we care so deeply about the quality of results. Our Technical Support, Education and Consulting Services organizations really do partner with customers to help them achieve the best results.

As Anne Milley, SAS’ director of technology product marketing, told DecisionStats this March, customers have varied and specific requirements for their analytics infrastructure. Desired attributes include speed, quality, support, backward and forward compatibility, and others. Certain customers only care about one or two of these attributes, other customers care about more. With our broad and deep analytics portfolio, SAS can uniquely provide the analytics infrastructure that meets a customer’s specific requirements, whether for one or many key attributes. Because of this, an overwhelming majority vote with their pocketbooks to select or retain SAS.

For example, as Anne noted, for some customers with tight batch-processing windows, speed trumps everything. In tests conducted by Merrill Consultants, an MXG program running on WPS runs significantly longer, consumes more CPU time and requires more memory than the same MXG program hosted on its native SAS platform.

At SAS, we provide a complete environment for analytics — from data collection, manipulation, exploration and analysis to the deployment of results. One example of our continuous innovation, and where we are devoting R&D and sales resources, is the SAS In-Database Processing Initiative. Through in-database analytics, customers can move computational tasks (e.g., SAS code, SQL) to execute inside a database. This streamlines the analytic data preparation, model development and scoring processes. Customers needing to leverage their investments in mixed workload relational database platforms will benefit from this SAS initiative. It will help them accelerate their business processes and drive decisions with greater confidence and efficiency.

Ajay – Are you going to move closer for an acquisition? Or be acquired? Which among the existing BI vendors are you most comfortable with in synergy of products and philosophy?

Jim – SAS is in an enviable position as the largest independent provider of business intelligence (BI) software, and the leader in the rapidly emerging field of business analytics, which combines BI with data integration and advanced analytics. We have no plans for SAS to be acquired, nor have we had any talks about it.

As for SAS acquiring another company, we continuously look for technologies complementary to our wide and deep lineup of business analytics solutions, many of which are targeted at the specific needs of industries ranging from banking, insurance and pharma to healthcare, telecom, manufacturing and government.

Last year, SAS made two acquisitions, IDeaS Revenue Optimization, the premier provider of advanced revenue-management and optimization software for the hospitality industry, and Teragram, a leader in natural language processing and advanced linguistic technology. IDeaS delivers to SAS and our hotel and hospitality customers software sold as a service that meets a critical need in this industry. Teragram’s exciting technology has enhanced SAS’ own robust text mining offerings.

Ajay – Jim Goodnight is a legend in philanthropy, inventions, and as a business leader (obviously he has a fine team supporting him). Who will be the next Jim Goodnight?

Jim – I think Jim Goodnight best addressed the question of succession plans at SAS a few years ago when he noted that the business world often places undue emphasis on the CEO and forgets about the CTO, CMO, CFO and other senior leaders who play a key role in any company’s success. SAS has a very strong executive management team that runs a two billion-dollar software company very effectively. If a “next Jim Goodnight” is needed in the future, SAS will be ready and will continue to provide our customers with the business analytics software they need.

Biography

Jim Davis, Senior Vice President and Chief Marketing Officer for SAS, is responsible for providing strategic direction for SAS products, solutions and services and presenting the SAS brand worldwide. He helped develop the Information Evolution Model and co-authored “Information Revolution: Using the Information Evolution Model to Grow Your Business.” By outlining how information is managed and used as a corporate asset, the model enables organizations to evaluate their management of information objectively, providing a framework for making improvements necessary to compete in today’s global arena.

SAS (www.sas.com) is the leader in business analytics software and services, and the largest independent vendor in the business intelligence market. Through innovative solutions delivered within an integrated framework, SAS helps customers at more than 45,000 sites improve performance and deliver value by making better decisions faster. Since 1976 SAS has been giving customers around the world The Power to Know®.

Interview Paul van Eikeren Inference for R

Here is an interview with Paul van Eikeren, President and CEO of Blue Reference, Inc. Paul heads up a startup company addressing the need of information workers to have easier-cheaper-faster access to high-end data mining, analysis and reporting capabilities from software like R, S-PLUS, MATLAB, SAS, SPSS, Python and Ruby. His recent product, Inference for R, has been making waves within the analytical fraternity across both R users and SAS users, especially given that it is quite well designed, has a great GUI, and is priced rather reasonably.

A few weeks ago, rumour had it that the SAS Institute was buying out the Inference for R product (note the merger and acquisition question below).

Rather curious to know about this company, I happened to meet Ben Hincliffe at the http://www.analyticbridge.com site (which, with 5000 members, has the largest number of data analytics members, and many business intelligence members as well). Ben, who recently authored a guest post for Sandro at Data Mining Blog, then put across my request for an interview to Paul, the CEO of Blue Reference. Existing products from Blue Reference include additional analytical packages like Inference for MATLAB.

Paul is an extremely seasoned person with years in the analytical fraternity and with a PhD from MIT. Here is Paul’s vision on his company and analytics product development.

Ajay: Describe your career journey. What advice would you give to today’s young people about following careers in science?

Paul: I have been blessed with an extremely productive and diversified career journey. After receiving undergraduate and graduate degrees in chemistry, I taught chemistry and carried out research as a college professor for 14 years. I spent the next 12 years heading R&D teams at three different startup companies focused on the application of novel processing technology for use in drug discovery and development. Using that wealth of acquired experience, I have had the good fortune to successfully co-found and develop, with my son Josh, two startup companies (IntelliChem and Blue Reference) directed at the use of informatics to drive more efficient and effective Research, Development, Manufacturing and Operations.

In my journey I have had the opportunity to counsel many young people regarding their career choices. I have offered two principal pieces of advice: one, for the right person, science represents an outstanding opportunity for a productive and satisfying career; and two, a science education provides an outstanding stepping stone to careers in other fields. A study disclosed in a recent Wall Street Journal article (Sarah E. Needleman, “Doing the Math to Find the Good Jobs,” 26 January 2009) revealed that mathematicians land the top spot in the new rankings of the best occupations. Science-linked occupations took 7 out of the top 20 spots.

These ratings suggest that the problem solving and innovation aspects of scientific occupations are much less stressful than other occupations, which leads to high job satisfaction. But does one have to be a genius to have a successful career in science? An interesting read on this subject is the book by Robert Weisberg (Creativity: Beyond the Myth of the Genius) in which he dispels the myth of genius being the result of a genetic gift. Weisberg argues, convincingly, that a genius exhibits three elements: (1) a basic intellectual capacity; (2) a high level of motivation/determination, which enables the genius to remain focused; and (3) immersion in their chosen field, typically represented by over 10,000 hours of study/practice/experience. It turns out that the latter element is the principal differentiator, and fortunately, it is something one has control over.

Ajay: Describe the journey that Blue Reference has made leading to its current product line, including Inference for R.

Paul: The Inference product suite represents a natural extension beyond the Electronic Laboratory Notebook (ELN) product we developed at our previous company, IntelliChem. ELNs are used by scientists and technicians to document research, experiments and procedures performed in a laboratory. The ELN is a fully electronic replacement of the paper notebook. IntelliChem (sold to Symyx in 2004) was a leader in deployment of ELNs at global pharmaceutical companies.

After seeing the successful adoption of ELNs in the laboratory, we saw an opportunity to improve upon the utility of ELN documents and the data contained therein. Essentially, we developed Inference to be a platform for enabling MS Office documents with powerful, flexible, and transparent analytic capabilities – what we call “dynamic documents” or “document mashups”. Executable code from high-level scripting languages like R, MATLAB, and .NET, is combined with data and explanatory text in the document canvas to transform it from a static record into an analytic application.

The pharmaceutical industry, in cooperation with the FDA, has begun to look at ways to implement quality by design (QbD) practices as an alternative to quality by end-testing. QbD comprises a systematic application of predictive analytics to the drug R&D process such that development timelines and costs are reduced while drug safety and efficacy are improved.

Statistical modeling and analysis plays a key role in QbD as a tool for identifying critical quality attributes and confining their variability to a specified design space. Dynamic documents fit nicely into this paradigm, and we’re currently using Inference as a platform to develop an enterprise solution for QbD. You can visit http://www.InferenceForQbD.com for more information about our QbD product.

Along the way, we recognized the need for Inference outside of the pharmaceutical industry. The Inference for R, Inference for MATLAB, and Inference for .NET versions are meant to serve users of these technical computing languages who have analysis, publishing, reporting, collaboration, and reproducible research needs that are best served by a document-centric environment. By using Microsoft Word, Excel and PowerPoint as the “front end,” we can serve the 500 million users that use Microsoft Office as their principal desktop productivity application.

Ajay: What is the pricing strategy for Inference for MATLAB and Inference for R, and how do you see the current recession as an opportunity for analytical products?

Paul: Our strategy is to reach out to the market of Microsoft Office users that would benefit from easy access to data mining and predictive analytics capabilities within their principal desktop productivity tool. Accordingly, we have offered the Inference product at the low price of $199 for a single-user, one-year subscription. Additionally, because it is implemented on top of an existing installation of Microsoft Office, the costs of training, support and maintenance are expected to be minimal.

[Image: Create a simple user interface for your R application]

[Image: R code directly in Excel to customize your analysis]

[Image: Graphical output in an Excel tab]

Ajay: Your product seems to be a nice fit where both open source and proprietary packages from Microsoft (.NET) work together to give the customer a nice solution. Do you believe it is possible that big companies and big open source communities can work together to create software rather than just be at loggerheads?

Paul: Absolutely. We’re seeing momentum build for open source analytic solutions as the economy impacts companies, both small and large. We saw this take place in the back office with implementation of Linux and Apache Web servers, and now we’re starting to see it in the front office. Smart IT teams are looking for creative ways to stretch their resources, forcing them to look beyond established, but expensive, software products.

We’ve encountered concrete evidence of this in the financial industry. Fresh on the heels of the credit crisis, investment banks and hedge funds have begun to realize that their risk models and supporting software infrastructure are inadequate. In response, quantitative finance and risk analysts are increasingly turning to the open source R statistical computing environment for improved predictive analytics.

R has a core group of devotees in academia that drive innovation, making it a comprehensive venue for development of leading-edge data analysis methods. In order to leverage these tools, banks need a way for R to play nicely with their existing personnel and IT infrastructure. This is where Inference for R produces real value. It transforms MS Office into a platform for the development, distribution, and maintenance of R-based quantitative tools, enabling production-level predictive analytics.

Commercial distributions of R address issues of scalability and support, which might otherwise be subjects of concern. For example, REvolution Computing distributes an optimized, validated and supported distribution of R, providing peace of mind to corporate IT. REvolution also offers Enterprise R, a distribution of R for 64-bit, high performance computing.

Ajay: Please name any successful customer testimonials for Inference for R.

Paul: We have been working with the director of quantitative analytics at a large international bank. He reported that he has successfully distributed R applications to his team of research analysts and portfolio managers based on Inference in Excel. Use of this strategy eliminated the need to code complex models in Visual Basic for Applications, which is time-consuming and error-prone.

Ajay: Also, are there any issues with licensing and IP when mixing open source code and proprietary code?

Paul: The licensing issues with open source R pertain to distributing R. There are no licensing restrictions in using R. Accordingly, we do not distribute R. Rather, our customers install R separately and Inference recognizes the installation.

Ajay: So R is free and I can get Open Office for free. What are the five specific uses where Inference for R can score an edge over this and make me pay for the solution?

Paul: R is free, and many R enthusiasts would argue that all you need for R is a Linux operating system like Ubuntu, a text editor such as Emacs, and R’s command line interface. For some highly-skilled R users this is sufficient; for the new and average R user this is a nightmare.

Many people think that the largest fraction of the cost of implementing new software is the cost of the license. In actuality, and especially in the corporate world, it is the cost of training, user support, software maintenance, and the costs of switching the user base to the new software. Free open source software does not help here. Hence there is a strong ROI argument to be made for building new software applications on top of existing systems that have worked well.

Additionally, successful implementation of open source software like R requires a baseline of integration with existing systems. The fact is that Microsoft operating systems dominate the business world, as does Microsoft Office. If one is serious about using R to address the analytic needs of big business, tight integration with these systems is imperative.

Ajay: Any plans for a web hosted SaaS version for Inference for R soon?

Paul: The natural progression of Inference for R to SaaS will coincide with the next release of Office (Office 2010 or Office 14), which we expect to be largely SaaS enabled.

Ajay: Name some alliances and close partners working with Blue Reference, and what we can expect from you in terms of product launches in 2009.

Paul: We have created a product development consortium in partnership with ‘top ten’ global pharmaceutical companies. The consortium is guiding the development of an enterprise solution for Quality by Design (QbD), using Inference for R as the platform.

We are working with several consulting firms specializing in IT solutions for specialized markets like risk management and predictive analytics.

We are also working with several technology partners who have complementary products and where integration of their products with Inference provides clear and significant value to customers.

Ajay: Any truth to the rumors of an acquisition by a BIG analytics company?

Paul: Our business strategy is centered on growth through partnerships with others. Acquisition is one means to execute that strategy.

Ajay: How do you see this particular product (for R) shaping up down the years?

Paul: R’s success can be attributed, in large part, to the support of its loyal open source community. Its enthusiastic use in academia bodes very well for its growth as a cutting-edge analytics tool. It is just a matter of time before commercial analytic solutions powered by R become de rigueur. We’re happy to be at the tip of the spear.

Ajay: Any Asia plans for Blue Reference, or are you still happy with the Oregon location? How do you plan to interact with graduate schools and academia for your products?

Paul: Although we don’t have a major private university in our backyard, Oregon State University has opened a campus here. And we’ve been in dialogue with the global academic community from day one. Over 100 academic institutions around the world use Inference through our academic licensing program. Inference is a great tool for preparing dynamic lessons and publishing reproducible research.

Our Central Oregon location is home to a growing high-tech sector that we’ve been a part of for decades. We’ve had success building large and profitable companies here. Bend attracts Silicon Valley types who come here for vacation and don’t want to leave – they just can’t seem to resist the quality of life and bountiful recreational opportunities that this area offers. It’s a good mix of work and play.

Biography

Paul van Eikeren is President and CEO of Blue Reference, Inc. He is responsible for guiding the strategic direction of the company through novel products and services development, partnerships and alliances in the realm of application of informatics to faster-cheaper-better research, development, manufacturing and operations. Van Eikeren is a successful serial entrepreneur whose ventures include the co-founding of IntelliChem with his son Josh and its ultimate sale to Symyx Technologies. He has headed up R&D at several startup companies focused on drug discovery and development including Sepracor Inc., Argonaut Technologies, Inc., and Bend Research, Inc. He served as Professor of Chemistry and Biochemistry at Harvey Mudd College of Science and Engineering. He is author/co-author and inventor/co-inventor of over 50 scientific articles and patents directed at the application of chemical, biochemical and computational technologies. Van Eikeren holds a BA degree in Chemistry from Columbia University and a PhD in Chemistry from MIT.

Ajay- To know more, I recommend checking out the free evaluation at http://inferenceforr.com/ especially if you need to rev up your MS Office installation with greater graphics and analytics juice.

Interview David Smith REvolution Computing

Here is an interview with REvolution Computing’s Director of Community David Smith.

“Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR.” David Smith

Ajay- Tell us about your journey in science. In particular tell us what attracted you to R and the open source movement.

David- I got my start in science in 1990 working with CSIRO (the government science organization in Australia) after I completed my degree in mathematics and computer science. Seeing the diversity of projects the statisticians there worked on really opened my eyes to statistics as the way of objectively answering questions about science.

That’s also when I was first introduced to the S language, the forerunner of R. I was hooked immediately; it was just so natural for doing the work I had to do. I also had the benefit of a wonderful mentor, Professor Bill Venables, who at the time was teaching S to CSIRO scientists at remote stations around Australia. He brought me along on his travels as an assistant. I learned a lot about the practice of statistical computing helping those scientists solve their problems (and got to visit some great parts of Australia, too).

Ajay- How do you think we should help bring more students to the fields of mathematics and science?

David- For me, statistics is the practical application of mathematics to the real world of messy data, complex problems and difficult conclusions. And in recent years, lots of statistical problems have broken out of geeky science applications to become truly mainstream, even sexy. In our new information society, graduating statisticians have a bright future ahead of them which I think will inevitably draw more students to the field.

Ajay- Your blog at REvolution Computing is one of the best technical corporate blogs, in particular the monthly roundup of new packages, R events and product launches, all written in a lucid style. Are there any plans for a REvolution Computing community or network as well, instead of just the blog?

David- Yes, definitely. We recently hired Danese Cooper as our Open Source Diva to help us in this area. Danese has a wealth of experience building open-source communities, such as for Java at Sun. We’ll be announcing some new community initiatives this summer. In the meantime, of course, we’ll continue with the Revolutions blog, which has proven to be a great vehicle for getting the word out about R to a community that hasn’t heard about it before. Thanks for the kind words about the blog, by the way — it’s been a lot of fun to write. It will be a continuing part of our community strategy, and I even plan to expand the roster of authors in the future, too. (If you’re an aspiring R blogger, please get in touch!)

Ajay- I get confused about what exactly 32-bit and 64-bit computing mean in terms of hardware and software. What is the deal there? How do Enterprise solutions from REvolution take care of 64-bit computing? How exactly do parallel computing and optimized math libraries in REvolution R help as compared to other flavors of R?

David– Fundamentally, 64-bit systems allow you to process larger data sets with R — as long as you have a version of R compiled to take advantage of the increased memory available. (I wrote about some of the technical details behind this recently on the blog.)  One of the really exciting trends I’ve noticed over the past 6 months is that R is being applied to larger and more complex problems in areas like predictive analytics and social networking data, so being able to process the largest data sets is key.
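The memory point can be made concrete with some rough arithmetic. The sketch below is an illustration added here, not something from the interview: it assumes each numeric value is stored as an 8-byte double (as R numerics are) and computes only the theoretical ceiling of the address space. Real limits are lower, since the operating system and the program itself also need memory. The function names are hypothetical.

```python
# Rough address-space arithmetic. These are theoretical ceilings only:
# the operating system and the program itself also consume memory.
BYTES_PER_DOUBLE = 8  # assumption: one numeric value = one 64-bit double


def addressable_bytes(pointer_bits: int) -> int:
    """Upper bound on bytes addressable with pointers of this width."""
    return 2 ** pointer_bits


def max_doubles(pointer_bits: int) -> int:
    """Upper bound on how many 8-byte values fit in that address space."""
    return addressable_bytes(pointer_bits) // BYTES_PER_DOUBLE


if __name__ == "__main__":
    for bits in (32, 64):
        gib = addressable_bytes(bits) / 2 ** 30
        print(f"{bits}-bit: {gib:,.0f} GiB addressable, "
              f"at most {max_doubles(bits):,} doubles")
```

A 32-bit process tops out at 4 GiB of address space, which is why a data set of a few hundred million numeric values cannot be held in memory there regardless of how much RAM the machine has.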

One common misperception is that 64-bit systems are inherently faster than their 32-bit equivalents, but this isn’t generally the case. To speed up large problems, the best approach is to break the problem down into smaller components and run them in parallel on multiple machines. We created the ParallelR suite of packages to make it easy to break down such problems in R and run them on a multiprocessor workstation, a local cluster or grid, or even cloud computing systems like Amazon’s EC2.
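The "break it down and run the pieces in parallel" idea can be sketched in a few lines. This is a minimal illustration in Python, not ParallelR and not REvolution's implementation; the function names are made up for the example, and a thread pool is used only to keep the sketch short. For CPU-bound work in Python one would more likely swap in `concurrent.futures.ProcessPoolExecutor` or a cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The per-chunk work; stands in for any independent computation."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Run each chunk on its own worker, then combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

The pattern only pays off when the chunks are independent of one another, so that the sole coordination needed is combining the partial results at the end.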

While the core R team produces versions of R for 64-bit Linux systems, they don’t make one for Windows. Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR. We’re excited by the scale of the applications our subscribers are already tackling with a combination of 64-bit and parallel computing.

Ajay- Command line is oh so commanding. Please describe any plans to support or help any R GUI like Rattle or R Commander. Do you think REvolution R can get more users if it supports a GUI?

David- Right now we’re focusing on making R easier to use for programmers by creating a new GUI for programming and debugging R code. We heard feedback from some clients who were concerned about training their programmers in R without a modern development environment available. So we’re addressing that by improving R to make the “standard” features programmers expect (like step debugging and variable inspection) work in R and integrating it with the standard environment for programmers on Windows, Visual Studio.

In my opinion, R’s strength lies in its combination of high-quality statistical algorithms with a language ideal for applying them, so “hiding” the language behind a general-purpose GUI negates that strength a bit. On the other hand, it would be nice to have an open-source “user-friendly” tool for desktop statistical analysis, so I’m glad others are working to extend R in that area.

Ajay- Companies like SAS are investing in SaaS and cloud computing. Zementis offers scored models on the cloud through PMML. Any views on just building the model or analytics on the cloud itself?

David- To me, cloud computing is a cost-effective way of dynamically scaling hardware to the problem at hand. Not everyone has access to a 20-machine cluster for high-performance computing, and even those that do can’t instantly convert it to a cluster of 100 or 1000 machines to satisfy a sudden spike in demand. REvolution R Enterprise with ParallelR is unique in that it provides a platform for creating sophisticated data analysis applications distributed in the cloud, quickly and easily.

Using clouds for building models is a no-brainer for parallel-computing problems: I recently wrote about how parallel backtesting for financial trading can easily be deployed on Amazon EC2, for example. PMML is a great way of deploying static models, but one of the big advantages of cloud computing is that it makes it possible to update your model much more frequently, to keep your predictions in tune with the latest source data.

Ajay- What are the major alliances that REvolution has in the industry?

David- We have a number of industry partners. Microsoft and Intel, in particular, provide financial and technical support allowing us to really strengthen and optimize R on Windows, a platform that has been somewhat underserved by the open-source community. With Sybase, we’ve been working on combining REvolution R and Sybase RAP to produce some exciting advances in financial risk analytics. Similarly, we’ve been doing work with Vhayu’s Velocity database to provide high-performance data extraction. On the life sciences front, Pfizer is not only a valued client but in many ways a partner who has helped us “road-test” commercial grade R deployment with great success.

Ajay- What are the major R packages that REvolution supports and optimizes and how exactly do they work/help?

David- REvolution R works with all the R packages: in fact, we provide a mirror of CRAN so our subscribers have access to the truly amazing breadth and depth of analytic and graphical methods available in third-party R packages. Those packages that perform intensive mathematical calculations automatically benefit from the optimized math libraries that we incorporate in REvolution R Enterprise. In the future, we plan to work with authors of some key packages to provide further improvements; in particular, to make packages work with ParallelR to reduce computation times in multiprocessor or cloud computing environments.

Ajay- Are you planning to lay off people during the recession? Does REvolution Computing offer internships to college graduates? What do people at REvolution Computing do to have fun?

David- On the contrary, we’ve been hiring recently. We don’t have an intern program in place just yet, though. For me, it’s been a really fun place to work. Working for an open-source company has a different vibe than the commercial software companies I’ve worked for before. The most fun for me has been meeting with R users around the country and sharing stories about how R is really making a difference in so many different venues — over a few beers of course!


David Smith
Director of Community

David has a long history with the statistical community. After graduating with a degree in Statistics from the University of Adelaide, South Australia, David spent four years researching statistical methodology at Lancaster University (United Kingdom), where he also developed a number of packages for the S-PLUS statistical modeling environment. David continued his association with S-PLUS at Insightful (now TIBCO Spotfire) where for more than eight years he oversaw the product management of S-PLUS and other statistical and data mining products. David is the co-author (with Bill Venables) of the tutorial manual, An Introduction to R, and one of the originating developers of ESS: Emacs Speaks Statistics. Prior to joining REvolution, David was Vice President, Product Management at Zynchros, Inc.

Ajay- To know more about David Smith and REvolution Computing, do visit http://www.revolution-computing.com and http://www.blog.revolution-computing.com

Also see the interview with Richard Schultz, CEO of REvolution Computing, here:

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

White Riders

Here is a nice company started by a fellow batchmate from the Indian Institute of Management, Kaustubh Mishra. It is called White Riders, and it is a relative pioneer in adventure travel. Note that these bikers are well-behaved MBAs who impart team-building management lessons along the way. I caught up with Kaustubh long enough for him to tell me why he chose the adventure travel business.

Ajay – What has been the story of your career and what message would you like to send to young people aspiring for MBAs or just starting their careers?

Kaustubh- My first job was as a peon with SPCA, handling paperwork, dishes, etc. My father wanted to see me buy a bicycle with my own money, and that is why it happened. Thanks to Papa, I learnt some important lessons while serving people. During graduation I was doing odd jobs such as teaching at a computer institute, freelance programming, etc.

My first experience of a large organization came at Bharti Telecom, where I did my summer internship. It was a market research project and I remember sleeping in an interviewee’s cabin during a survey. After my PGDM from IIML, I got into Tech Pacific, then ICICI, then ABN AMRO. Please visit my LinkedIn profile for more details.

My message to people doing their MBA is simple – an MBA is not the end, it is just a via media for you to get into a good career. Get into an MBA because YOU want to do it and not because everyone else is doing it. There are so many career options in front of you, follow your heart.

For people starting their careers, just 7 words – realize the power within & follow your dreams.

Ajay- Why did you create a startup? Why did you name it White Collar Company (there was an ad for a business school reunion which had the same name)? What is your vision for White Collar Company?

Kaustubh- When I was doing my job, I was always overachieving targets, but after some time a rut sets in. I also realized that complete freedom and maximum returns for my efforts were absent. There were so many things, ideas, etc. simmering inside me but I could not do anything with them inside a company. To do all that, I had to venture out on my own, and venture I did. So the biggest reason I started my own company was to put my ideas into practice.

White Collar is a name generally associated with knowledge. I first wanted to name it ‘white’ but the name, domain name, trademarks etc were not available. White denotes knowledge. Our goddess of knowledge and learning ‘Saraswati’ is dressed in white. As all my ventures are essentially about knowledge and learning, so white collar. And White Collar Biker sounds cool and very oxymoronish.

I see White Collar Company to be known as the cradle of new ideas, innovation and creativity in the field of knowledge. A university is next in some years.

Ajay- What are the key lessons you have learnt in this short period? Name some companies in the United States that are similar to your company. What do you think is the market potential of this segment?

Kaustubh- We are in 3 industries – adventure tourism, corporate training and HR advisory. In the first and the last there are people doing nearly the same thing (I would not say exactly, because we do have our USPs), but in corporate training, White Collar Company is the only company in the world conducting management training through motorcycles.

With innovation and RoI being extremely important in training, the market potential is huge. In adventure tourism the potential is also great, as we are waking up to it. In consultancy, as we operate in the SME space, the potential again is very large.

It has been too short a period for big lessons, but I have been applying what I learned in my previous jobs, such as vendor management and marketing channel management. But yes, I learnt the art of hard bargaining and negotiation during this short period.

Ajay- Is an MBA (IIM or otherwise) necessary for success? Comments please.

Kaustubh- Ajay, your question here says success. Before answering this question, I would first differentiate between the 2 kinds of success we are talking about. Success in corporate life is different from success as an entrepreneur.

For being successful as a corporate executive, an MBA is good to a certain extent. It gives you a certain kind of thought process and also a platform for future success.

However, if we talk of a successful entrepreneur, I personally do not think an MBA will matter much. In fact I often talk of the ‘1st of the month’ syndrome – this is the comfort of getting a handsome amount deposited as your salary every month. When you get into that comfort zone, it becomes very hard to come out. The larger the amount, the harder it gets. For a successful entrepreneur, perseverance, self-belief, the ability to trust and the ability to take risks are very important. I doubt if any MBA is going to give you that. The very same thought processes and ways of thinking that help you succeed in corporate life need to be challenged as an entrepreneur.

Ajay- What’s your vision for your web site? Which website is a good analogy for it? Why should anyone visit the website?

Kaustubh- I am not a technical person, but having said that, I see my website as the focal point of my business. I built my website myself using widgets, etc., and going forward all my business will happen from the site. By 2010, we will put a strong CRM and PRM on the website, thus enabling all business processes to be routed through the website. Like I said, I am not a techie, but I think Web 2.0, the participative nature of the internet and cloud computing are going to help me save and optimize. We already have an online chat built into the site, so any customer can come and get more details about our programs.

Going forward, customers will be able to do bookings themselves on the site. Vendors will be able to log in and do all necessary business through the website, and we plan to implement SFA for our employees. I believe this answers the vision and why anyone should visit my site.

Ajay- What is your favorite incident in this short period of your startup? What were the key lessons? Are you seeking venture capital funds?

Kaustubh- I thought the typical customer would be a young male, so I was delighted when a woman became our first customer. We have tweaked our marketing strategy and positioning after that.

At this stage my baby is too young and fragile. If I give her crutches to walk, she will never be able to stand up herself and be counted. So while we will go for external funding at some point of time, that time is not now. With our kind of business model, right now we are not ready for the interference of a venture capitalist.


So if you always wanted to travel to India and have an adventure as well, contact Kaustubh at http://www.wccindia.com/rider/R_kaustubh.html and he will show you how to be a White Rider too.


Read more about his company here – http://www.wccindia.com/rider/whywhite.html

Interview Dominic Pouzin Data Applied

Here is an interview with Dominic Pouzin, CEO of http://www.data-applied.com which is a startup making waves in the fields of Data Visualization.

Ajay – Describe your career in applied science. What made you decide to pursue a career in science? Some people think that careers in science are boring. How would you convince a high school student to choose a career in science?

Dominic- It’s important to realize that we are surrounded by products of science and engineering. By products of science, I mean bridges we cross on our way to work, video games we play for entertainment, or even the fabric of clothes we wear. Anyone who is curious should want to know how things really work. In that case, a scientific education makes sense, because it provides the tools necessary to understand and improve our world. I would also argue that a scientific training can also be a stepping stone towards high levels of achievements in other fields. For example, to become a financial wizard, a top patent attorney, or direct large clinical trials, a scientific education serves as a strong foundation. In addition, it’s probably easier to switch from science to another field than the other way round. Who wants to learn about matrix calculus in their forties? In my case, I graduated with a Masters in Computer Science degree, and spent 10 years at Microsoft leading software development teams for the Windows server, Exchange server, and Dynamics CRM product lines. I wish that, along the way, I had found time for a PhD in data mining, but years of practical software engineering experience also has its advantages.

Ajay- What advice would you give to someone who just got laid off, and is pondering whether he should / should not start a business?

Dominic- Working for a large company used to mean trading some autonomy for more stability and access to a wide array of resources. However, in this economy, the terms of the equation have changed. Many workers who lost their jobs found that this stability had disappeared. Others found that resources have become scarcer due to shrinking budgets. With this shift in the balance, entrepreneurship starts becoming more appealing.

Creating your own business might sound daunting, but for example creating a US Washington State LLC takes about 15 minutes, costs 200 dollars, and only requires an Internet connection. Managing payroll may sound like a big headache, but again specialized companies can handle all payroll matters on your behalf for only a few dollars a month. So while this part is relatively easy, you also need two things which are more difficult to come by:

a/ an unshakable belief in what you are trying to achieve, and

b/ a willingness to handle anything that comes your way.

You need to think like a commando soldier who just landed on a beach: you’ve got great skills, but you’re alone, and can’t afford to fail. Practically, you may find yourself working for weeks or months with little or no income, and friends and family thinking that you are wasting your time. So, if necessary, try finding a co-founder so you can boost each other’s confidence and motivation. Also, unless you want to spend most of your time chasing people for money, personal savings are a must.

Ajay- So describe your company. How does data visualization work? What differentiates your company from so many data visualization companies?

Dominic- We’re trying to stir things up a bit in terms of making it easier for regular business users to benefit from data mining. For example, we enable new “BI in the cloud” scenarios by allowing users to simply point a browser to access analysis results, or by allowing applications to submit and analyze data using an XML-based API. Built-in collaboration features, and more interactive visualizations, are also definitely part of our story.

Finally, while we focus on data mining (ex: time series forecasting, association rule mining, decision trees, etc.), we also make available other things such as pivot charts or tree maps. No data mining algorithm there, but why should business users care as long as the insight is there?


To answer your question about visualization, most packages offer basic features such as the ability to pick colors, or to change labels, etc. For differences to emerge, you have to ask the right questions.

* Access: does visualization require an application to be installed on each computer? Our visualizations work directly from a web page, so there is nothing to install (and upgrades are automatic).

* Search: can visualization results be searched, so as to enable drill-down scenarios? In the age of Google, we enable search everywhere, so that views can be constrained to what the user is looking for.

* Collaboration: can visualization results be tagged using comments, or shared with other users while securely controlling access, etc.? Visualization is only a starting point – chances are that you will need to talk to someone before analysis is complete – so we offer plenty of collaboration features.

* Export: how easy is it for a business user to present analysis results to management in a way that is understandable? We make it easy to export visualization content to a shared gallery, and as presentation-ready images.

There are a couple of other things we do as well in terms of interaction (ex: zoom, select, focus, smart graph layout), and a couple we don’t have yet (ex: geo-mapping, export to PDF).

But in conclusion, I would say that useful data visualization is as much about the way you present data (and that must be compelling!), as it is about how one accesses, searches, secures, shares, or exports visualizations.

Ajay- The technology sector was hit the hardest by the immigration of skilled workers. As a technology worker, what do you have to say about immigration? What do you have to say about outsourcing? Do you have any plans for selling your products outside the United States?

Dominic- I am a US permanent resident, half French, half British, and my wife is Indian. So you won’t find it surprising to hear that I am in favor of immigration. In 1996, as an engineering student in France, I made the unusual choice to study one year at the Indian Institute of Technology (Delhi).

In fact, I was the only one in my engineering college (France’s largest) to select India as a destination (my friends all went to the US, UK, Australia, Germany, etc.). Now that India has become a recognized player in the IT field, several dozen students from the same engineering college chose India as a destination. So I guess the immigration is starting to flow both ways!

Also, among the people I used to work with at Microsoft and who left to start a company, a good proportion are immigrants. So it’s important to recognize that immigrants not only help fill high-tech positions, but also create jobs.

Finally, as an entrepreneur trying to keep costs low, outsourcing is a tool you can’t afford to ignore. For example, websites such as http://www.elance.com provide easy access to the global marketplace. For those worried about quality, it’s possible to review customer ratings and portfolios. We keep track of visitors coming to our website, and the majority of the visitors to date have been from outside the US.

Ajay- What is the basic science used by your company’s product?

Dominic – We use a client / server model. On the server, at the lowest level, we use SQL databases (accessed using ODBC), acting as data and configuration repositories.

Immediately above that sits a computing layer, which offers scalable, distributed data mining algorithms. We implement algorithms which scale well with the number of rows and attributes, but also properly handle a mix of discrete / numeric / missing values.

For example, just for clustering, the literature has some incredibly powerful algorithms (ex: WaveCluster, an algorithm based on wavelet transforms), but which also fail as soon as you enter real-world situations (ex: some fields are discrete).
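To make the "mix of discrete / numeric / missing values" problem concrete, here is a minimal sketch of one common way to compare records with mixed field types: a Gower-style distance. This is an illustration added here, not Data Applied's algorithm, and every name in it (`mixed_distance`, `numeric_ranges`, the sample fields) is hypothetical. Numeric fields contribute a range-scaled difference, discrete fields contribute a match/mismatch score, and fields missing from either record are skipped.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance between two records with mixed-type fields.

    a, b: dicts mapping field name -> value (number, string, or missing).
    numeric_ranges: dict mapping each numeric field -> (max - min) in the data.
    Fields missing in either record are skipped; the result is the mean
    per-field distance over the fields that could be compared.
    """
    total, used = 0.0, 0
    for field in a:
        x, y = a[field], b.get(field)
        if is_missing(x) or is_missing(y):
            continue  # skip fields we cannot compare
        if field in numeric_ranges:
            span = numeric_ranges[field]
            total += abs(x - y) / span if span else 0.0
        else:
            total += 0.0 if x == y else 1.0  # discrete: match or mismatch
        used += 1
    return total / used if used else float("nan")
```

For instance, comparing `{"age": 30, "city": "Paris", "income": None}` with `{"age": 40, "city": "Delhi", "income": 50000}` under `numeric_ranges = {"age": 50, "income": 100000}` skips income, scores age as 10/50 and city as a mismatch, and averages the two. A clustering method built on a distance like this keeps working when some fields are discrete or missing, which is the kind of real-world robustness described above.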

On top of the computing layer sits a rich, secure web-based XML API, which allows users to manipulate analysis and collaboration objects, while enforcing security.

For the client, we built a web-based visualization application using Microsoft Silverlight. To ensure client / server communications are as efficient as possible, we use a fair amount of data compression and caching.

Ajay- Who are your existing clients and what is the product launch plan for next year?

Dominic- We’re only in alpha mode right now, so our next customers are in fact beta testers. We’re still busy adding new features. It’s good to be small and nimble, it allows us to move quickly. Sorry, I can’t confirm any launch date yet!

Ajay- What does the CEO of a startup company do, when he has free time (assuming he has any)?

Dominic- When you spend most of your time working on analytics, it’s sometimes hard to leave your analytical brain at work.

For example, I am sure that readers who come to your website and visit a casino can’t help themselves and immediately start calculating the exact odds of winning (instead of just having fun).

Among other things, I enjoy challenging friends to programming puzzles (actually, they’re recycled Microsoft interview questions). My current bedtime reading is a book about data compression. I think you got the picture!

******************************************************************

Dominic is currently making promising data visualization products at http://data-applied.com/ . To read more about him, please visit his profile page http://www.analyticbridge.com/profile/DominicPouzin

The World's Largest Analytics Networker

1) What prompted you to take up a career in science, and why have you stuck to it and been a success in it?

I was doing mathematics for fun at a very young age when my friends were interested in sports, cars and movies. When I finished my master’s, I was approached by one of the professors to pursue a PhD program. It was in statistics (image analysis, Bayesian clustering), and I thought that choosing statistics rather than number theory or numerical analysis would increase my chances of getting a job after presenting my thesis. At that time, my favorite subject was indeed number theory – I was even published in J. of Number Theory. After earning my PhD, I moved to Cambridge, then North Carolina, then the Internet industry – with a very interesting detour into finance and risk management / fraud detection between 2002 and 2005.

2) AnalyticBridge is the world’s largest network for analytic professionals. What prompted you to build it, what were the critical milestones, and what is your vision for it?

It is a convergence of multiple factors: the feeling that the startup I was involved with at that time wasn’t doing well, the fact that I had a large network (thanks in part to LinkedIn), and my discovery of Ning.com (while browsing recruiter networks) on February 16th, 2008 – the date AnalyticBridge was born. I decided to create and grow AnalyticBridge very fast, through networking, quality content, and significant paid advertising. I hope that within 5 years it will be five times bigger in terms of members, and even more profitable.


Interview Alan Churchill Savian

An interview with Alan Churchill, SAS consultant and alumnus of SAS Institute.

Ajay- What trends do you see in computer programming over the next year and the next three to five years?

Alan- Silverlight and Flex will be huge and will really enable much more SaaS. The current web simply needs wholesale replacement to make it more usable for business applications. These new RIAs will allow us, as developers, to take it to a whole new level. Expect a massive influx of dollars into web redesign and redevelopment.

Ajay- Tell us how you came into this field of work, and what factors made you succeed.

Alan- I got into computers in high school (this was very early computing). I loved the sense of challenge that computers offered: they were a big crossword puzzle. I succeeded because I never viewed a problem the way a typical computer person or scientist would view them. As a history guy, I took a more holistic approach to problems. Heck, if you don’t know about a particular theory, you won’t be constrained by it. If you do know it, sometimes ignore it to get the job done, even if it isn’t as pretty.

Ajay-  Most challenging and fun project you ever did (anonymous details)

Alan- I have had many, many rewarding projects. As a consultant, every job is different. However, the spare-time project I am currently working on (figuring out the layout of the SAS dataset) is perhaps my favorite due to the complexity.

Ajay- Advice to people wanting to join computer programming as a career- Positive Things, Challenges, Skill Requirements.

Alan- First of all, programming is hard so be prepared to work to be good. Never ever stop evolving and looking for the next thing: you are only as good as your last 18 months of experience.

The career is very rewarding since you are continuously facing challenges that must be overcome. Computers have no patience for mistakes, so they require a lot of patience from programmers.

Always, always, always think outside of the box. Approach problems differently. If you hit an obstacle, move around it rather than always trying to burrow through. At the end of the day, it is all about getting the job done at the speed of business, not finding a cool, nifty new algorithm: do that in your spare time.

Ajay- Would you like to visit India for work/travel?

Alan- I honestly don’t like to travel long distances. After a long corporate career flying over a million miles, travel is simply taxing to me and takes me away from what I love to do: programming. As a history major, I love various cultures and would enjoy the beauty and history that India provides but would dread the flight ;-]

Bio:

Alan Churchill has been coding in SAS for over 20 years and worked at SAS as a senior consultant for 5 1/2 years. At SAS, Alan worked on the Microsoft-SAS Alliance and helped SAS customers integrate with .NET. He is also responsible for coding the engine for SAS’s web analytics product. Currently, he is the owner of Savian, which specializes in Microsoft-SAS solutions. He lives and works in Colorado Springs, Colorado.
