Interview SPSS Olivier Jouve

SPSS recently launched a major series of products in its text mining and data mining portfolio and rebranded the line as the PASW series. In an exclusive and extensive interview, Olivier Jouve, Vice President, Corporate Development at SPSS Inc., talks about science careers, the recent launches, SPSS's support for open source R, cloud computing and business intelligence.

Ajay: Describe your career in Science. Are careers in science less lucrative than careers in business development? What advice would you give to people re-skilling in the current recession on learning analytical skills?

Olivier: I have a Master of Science in Geophysics and a Master of Science in Computer Science, both from Paris VI University. I have always tried to combine science and business development in my career, as I like to experience every aspect – from idea to concept to business plan to funding to development to marketing to sales.

There was a study published earlier this year that said two of the three best jobs are related to math and statistics. This is reinforced by three converging societal forces – better use of mathematics to drive decision making, the tremendous growth and storage of data, and, especially in this economy, the ability to deliver ROI. With more and more commercial and government organizations realizing the value of Predictive Analytics to solve business problems, being equipped with analytical skills can only enhance your career and provide job security.

Ajay: So SPSS has launched new products within its Predictive Analytics Software (PASW) portfolio – Modeler 13 and Text Analytics 13. Is this old wine in a new bottle? What is new in technical terms? What is new for customers looking to mine textual information?

Olivier: Our two new products – PASW Modeler 13 (formerly Clementine) and PASW Text Analytics 13 (formerly Text Mining for Clementine) – extend and automate the power of data mining and text analytics to the business user, while significantly enhancing the productivity, flexibility and performance of the expert analyst.

The PASW Modeler 13 data mining workbench has new and enhanced functionality that quickly takes users through the entire data mining process – from data access and preparation to model deployment. Some of the newest features include Automated Data Preparation, which conditions data in a single step by automatically detecting and correcting quality errors; Auto Cluster, which gives users a simple way to determine the best clustering algorithm for a particular data set; and full integration with PASW Statistics (formerly SPSS Statistics).
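The "Auto Cluster" idea – fitting several candidate algorithms and ranking them on a quality measure – can be sketched in a few lines of Python. This is purely an illustration using scikit-learn and the silhouette score, not how PASW Modeler is actually implemented:

```python
# Illustrative "auto cluster" sketch: fit several clustering algorithms
# on the same data and rank them by silhouette score.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering, Birch
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

candidates = {
    "kmeans": KMeans(n_clusters=4, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=4),
    "birch": Birch(n_clusters=4),
}

scores = {}
for name, model in candidates.items():
    labels = model.fit_predict(X)          # cluster assignments per row
    scores[name] = silhouette_score(X, labels)

best = max(scores, key=scores.get)         # highest silhouette wins
print(best, round(scores[best], 3))
```

The appeal of automating this step is that a business user never has to know what a silhouette coefficient is; the tool simply surfaces the best-scoring model.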

With PASW Text Analytics 13, SPSS provides the most complete view of the customer through the combined analysis of text, web and survey data. While other companies provide only the text component, SPSS couples text with existing structured data, permitting more accurate results and better predictive modeling. The new version includes pre-built categories for satisfaction surveys and advanced natural language processing techniques, and it supports more than 30 languages.

Ajay: SPSS has supported open source platforms – Python and R – since before it became fashionable to do so. How has this helped your company?

Olivier: Open source software helps the democratization of the analytics movement and SPSS is keen on supporting that democratization while welcoming open source users (and their creativity) into the analytics framework.

Ajay: What are the differences and similarities between Text Analytics and Search Engines? Can we mix the two as well using APIs?

Olivier: Search Engines are fundamentally top-down, in that you know what you are looking for when launching a query. Text Analytics, however, is bottom-up, uncovering hidden patterns, relationships and trends locked in unstructured data – including call center notes, open-ended survey responses, blogs and social networks. Now businesses have a way of pulling out key concepts and extracting customer sentiments – such as emotional responses, preferences and opinions – and grouping them into categories.

For instance, a call center manager will have a hard time discovering why customers are unhappy and churn by running a search engine over millions of call center notes. What would the query be? But by using Text Analytics, that same call center manager will discover the main reasons why customers are unhappy, and be able to predict whether they are going to churn.
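The bottom-up extraction Olivier describes – pulling concepts out of free text and grouping them into categories – can be illustrated with a deliberately naive keyword-based categorizer. The category names and lexicons below are invented; a real text analytics engine would use full NLP pipelines (lemmatization, entity extraction, sentiment models) rather than plain keyword matching:

```python
# Toy bottom-up categorization of call-center notes: count how many
# notes touch each hypothetical concept category.
from collections import Counter

CATEGORIES = {                       # illustrative category lexicons
    "billing": {"bill", "charge", "invoice", "overcharged"},
    "service_quality": {"rude", "slow", "unhelpful", "wait"},
    "churn_risk": {"cancel", "switch", "competitor", "leaving"},
}

notes = [
    "Customer was overcharged on last bill and wants to cancel",
    "Long wait time, agent was unhelpful",
    "Thinking of switching to a competitor over billing charges",
]

counts = Counter()
for note in notes:
    words = set(note.lower().replace(",", "").split())
    for category, lexicon in CATEGORIES.items():
        if words & lexicon:          # any lexicon word present in note
            counts[category] += 1

print(counts.most_common())
```

Even this toy version shows the contrast with search: no query was written, yet the dominant themes (here, churn risk) emerge from the notes themselves.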

Ajay: Why is Text Analytics so important? How will companies use it now and in the future?

Olivier: Actually, the question you should ask is, “Why is unstructured data so important?” Today, more than ever, people love to share their opinions – through the estimated 183 billion emails sent, the 1.6 million blog posts, millions of inquiries captured in call center notes, and thousands of comments on diverse social networking sites and community message boards. And let’s not forget all the data that flows through Twitter. Companies today would be short-sighted to ignore what their customers are saying about their products and services, in their own words. Those opinions – likes and dislikes – are essential nuggets, and carry much more insight than demographic or transactional data for reducing customer churn, improving satisfaction, fighting crime, detecting fraud and increasing marketing campaign results.

Ajay: How is SPSS venturing into cloud computing and SaaS?

Olivier: SPSS has been at the origin of the PMML standard, allowing organizations to provision their computing power in a very flexible manner – just like provisioning computing power through cloud computing. SPSS strongly believes in the benefits of a cloud computing environment, which is why all of our applications are designed with Service Oriented Architecture components. This gives SPSS the flexibility to meet the demands of the market as they change with respect to delivery mode. We are currently analyzing business and technical issues related to running SPSS technologies in the cloud, such as the scoring and delivery of analytics. As regards SaaS, we currently offer hosted services for our PASW Data Collection (formerly Dimensions) survey research suite of products.

Ajay: Do you think business intelligence is an overused term? Why do you think BI and Predictive Analytics failed in mortgage delinquency forecasting and reporting, despite the financial sector being a big spender on BI tools?

Olivier: There is a big difference between business intelligence (BI) and Predictive Analytics. Traditional BI technologies focus on what’s happening now or what’s happened in the past, primarily using financial or product data. For organizations to take the most effective action, they need to know and plan for what may happen in the future by using people data – and that’s harnessed through Predictive Analytics.

Another way to look at it: Predictive Analytics covers the entire capture, predict and act continuum – from the use of survey research software to capture customer feedback (attitudinal data), to creating models to predict customer behavior, to acting on the results to improve business processes. Predictive Analytics, unlike BI, provides the secret ingredient and answers the question, “What will the customer do next?”

That being said, financial institutions didn’t need Predictive Analytics to see that some lenders sold mortgages to unqualified individuals likely to default. Predictive Analytics is an excellent tool for detecting fraud, waste and abuse. Companies in the financial services industry can focus on mitigating their overall risk by creating better predictive models that encompass not only richer data sets, but also better rules-based automation.

Ajay: What do people do at SPSS to have fun when they are not making complex mathematical algorithms?
Olivier: SPSS employees love our casual, friendly atmosphere, our professional and talented colleagues, and our cool, cutting-edge technology. The fun part comes from doing meaningful work with great people, across different groups and geographies. Of course, being French, I have ensured that my colleagues are fully educated on the best wine and cuisine. And being based in Chicago, there is always a spirited baseball debate between the Cubs and White Sox. However, I have yet to convince anyone that rugby is a better sport.

Biography

Olivier Jouve is Vice President, Corporate Development, at SPSS Inc. He is responsible for defining SPSS's strategic directions and growth opportunities through internal development, mergers and acquisitions, and tactical alliances. A pioneer in the field of data and text mining for the last 20 years, he created the foundation of the Text Analytics technology for analyzing customer interactions at SPSS. Jouve is a successful serial entrepreneur and has published internationally in the areas of analytical CRM, text mining, search engines, competitive intelligence and knowledge management.

Interview KNIME Fabian Dill

We have covered KNIME.com’s open source platform earlier. On the eve of its new product launch, KNIME.com co-founder Fabian Dill reveals his thoughts in an exclusive interview.

From the Knime.com website

The modular data exploration platform KNIME, originally developed solely at the University of Konstanz, Germany, enables the user to visually create data flows – or pipelines – execute selected analysis steps, and later investigate the results through interactive views on data and models. KNIME already has more than 2,000 active users in diverse application areas, ranging from early drug discovery and customer relationship analysis to financial information integration.

Ajay – What prompted you personally to be part of KNIME rather than joining a big technology company? What does the future hold for KNIME in 2009-10?

Fabian – I was excited when I first joined the KNIME team in 2005. Back then, we were working exclusively on the open source version, backed by some academic funding. Being part of the team that put together such a professional data mining environment from scratch was a great experience. Growing this into a commercial support and development arm has been a thrill as well. The team, the diverse experience gained from helping get a new company off the ground, and being involved in everything it takes to make it successful made it unthinkable for me to work anywhere else.

We continue to develop the open source arm of KNIME and many new features lie ahead: text, image, and time series processing, as well as better support for variables. We are constantly working on adding new nodes. KNIME 2.1 is expected in the fall, and some of the ongoing development can already be found on the KNIME Labs page (http://labs.knime.org).

The commercial division is providing support and maintenance subscriptions for the freely available desktop version. At the same time we are developing products which will streamline the integration of KNIME into existing IT infrastructures:

  • the KNIME Grid Support lets you run your compute-intensive (sub-) workflows or nodes on a grid or cluster;

  • KNIME Reporting makes use of KNIME’s flexibility to gather the data for your report and provides simplified views (static, or interactive dashboards) of the resulting workflow and its results; and

  • the KNIME Enterprise Server facilitates company-wide installation of KNIME and supports collaboration between departments and sites by providing central workflow repositories, scheduled and remote execution, and user rights management.

Ajay – Software as a Service and cloud computing are the next big things in 2009. Are there any plans to put KNIME on the cloud and charge clients by the hour, so they can build models on huge data sets without buying any hardware, just renting the time?

Fabian – Cloud computing is an agile and client-centric approach and therefore fits nicely into the KNIME framework, especially considering that we are already working on support for distributed computing of KNIME workflows (see above). However, we have no immediate plans for KNIME workflow processing on a per-use charge or similar. That’s an interesting idea, though. The way KNIME nodes are nicely encapsulated (and often even distributable themselves) would make this quite natural.

Ajay – What differentiates KNIME from other products such as RPro and RapidMiner, for example? What are the principal challenges you have faced in developing it? Why do customers like and dislike it?

Fabian – Every tool has its strengths and weaknesses, depending on the task you actually want to accomplish. The focus of KNIME is to support users in their quest to understand large and heterogeneous data and make sense of it. For this task you cannot rely only on classical data mining techniques wrapped in a command line or otherwise configurable environment; simple, intuitive access to those tools is required, in addition to support for visual exploration with interactive linking and brushing techniques.

By design, KNIME is a modular integration platform, which makes it easy to write your own nodes (with the easy-to-use API) or integrate existing libraries or tools.

We integrated Weka, for example, because of its vast library of state-of-the-art machine learning algorithms; the open source program R, in order to provide access to a rich library of statistical functions (and of course much more); and parts of the Chemistry Development Kit (CDK). All these integrations follow the KNIME requirements for easy and intuitive usage, so the user does not need to understand the details of each tool in great depth.

A number of our commercial partners, such as Schroedinger, Infocom, Symyx and Tripos, among others, also follow this paradigm and similarly integrate their tools into KNIME. Academic collaborations, such as the one with ETH Zurich, Switzerland, on the High Content Screening platform HC/DC, represent another positive outcome of this open architecture. We believe that this strictly result-oriented approach, based on a carefully designed and professionally coded framework, is a key factor in KNIME's broad acceptance. I guess this is another big differentiator: right from the start, KNIME has been developed by a team of software developers with decades of industrial software engineering experience.

Ajay – Are there any Asian plans for KNIME? Any other open source partnerships in the pipeline?

Fabian – We have a Japan-based partner, Infocom, which operates in the life sciences field. But we are always open to other partnerships, supporters, or collaborations.

In addition to the open source integrations mentioned above (Weka, R, CDK, HC/DC), there are many other projects in the works and partnerships under negotiation. Keep an eye on our blog and on our KNIME Labs page (labs.knime.org).

ABOUT

KNIME – development started in January 2004. Since then: 10 releases; approx. 350,000 lines of code; 25,000 downloads; an estimated 2000 active users. KNIME.com was founded in June 2008 in Zurich, Switzerland.

Fabian Dill – has been working for and with KNIME since 2005; co-founder of KNIME.com.

Interview Visual Numerics Alicia McGreevey


Here is an interview with the head of marketing of Visual Numerics, Alicia McGreevey.

Visual Numerics® is the leading provider of data analysis software, visualization solutions and expert consulting for technical, business and scientific communities worldwide (see http://www.vni.com ).

Ajay – Describe your career in science so far. How would you explain embeddable analytics to a high school student who has to decide between getting an MBA or a science degree?

Alicia – I think of analytics as analyzing a situation so you can make a decision. To do that objectively, you need data about your situation. Data can be anything: foreign currency exchange rates, the daily temperature here in Houston, or Tiger Woods's record at the Masters tournament when he's not leading after the third round.

Embedding analytics is simply making the analysis part of an application close to, or embedded with, your data. As an example, we have a customer in Germany, GFTA (Gesellschaft Fuer Trendanalysen), who has built an application that embeds analytics to analyze historic and live tick foreign exchange rate data. Their application gives treasuries and traders predictions on what is about to happen to exchange rates so they can make good decisions on when to buy or sell.
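The idea of embedding the analysis right next to the data can be illustrated with a toy example: a function that computes a simple moving-average crossover signal on a stream of exchange-rate ticks. This is a hypothetical sketch with invented numbers, not GFTA's actual methodology:

```python
# Toy embedded analytic: emit a buy/sell signal when a short moving
# average of exchange rates crosses a long one.
def moving_average(rates, window):
    return sum(rates[-window:]) / window

def signal(rates, short=3, long=5):
    """Return 'buy', 'sell', or 'hold' for the latest tick."""
    if len(rates) < long:
        return "hold"                       # not enough history yet
    s = moving_average(rates, short)
    l = moving_average(rates, long)
    if s > l:
        return "buy"                        # short-term trend rising
    if s < l:
        return "sell"                       # short-term trend falling
    return "hold"

ticks = [1.30, 1.31, 1.29, 1.32, 1.34, 1.36]  # e.g. EUR/USD quotes
print(signal(ticks))
```

The point is the architecture, not the model: the analytic function lives inside the application that receives the ticks, so a decision is available the moment new data arrives.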

Embedding analytics is as much a business discipline as it is science. Historically, our analytics have been used predominantly by the government and scientific community to perform heavy science and engineering research. As business intelligence becomes increasingly important to compete in today’s marketplace, our analytics can now be found driving business decisions in industries like financial services, healthcare and manufacturing. Partners like Teradata and SAP are embedding our analytics into their software as a way to extend their current offerings. As their customers demand more custom BI solutions to fit unique data sets, our analytics provide a more affordable approach to meet that need. Customers now have an option to implement custom BI without incurring the massive overhead that you would typically find in a one-size-fits-all solution.

If you’re a student, I’d recommend you invest time and course work in the area of analytics regardless of the discipline you choose to study. The term analytics is really just a fancy term for math and statistics. I’ve taken math and statistics courses as part of a science curriculum and as part of a business curriculum. Being able to make optimal decisions by objectively analyzing data is a skill that will help you in business, science, engineering, or any area.

Ajay – You have been working behind the scenes quietly building math libraries that power many partners. Could you name a few success stories so far.

Alicia – One of the most interesting things about working at Visual Numerics is our customers. They create fascinating analytic applications using mathematic and statistical functions from our libraries. A few examples:

  • Total, which you probably know as one of the world's supermajor oil companies, uses our math optimization routines in an application that automatically controls the blending of components in the production of gasoline, diesel and heavy fuels. By making the best use of components, Total minimizes its refining costs while maximizing revenue.

  • The Physics Department at the University of Kansas uses nonlinear equation solvers from our libraries to develop more efficient particle beam simulations. By simulating the behavior of particle beams in particle accelerators, scientists can better design particle accelerators, like the LHC or Large Hadron Collider, for high-energy research.

  • A final example that I think is interesting, given the current economic situation, is from one of our financial customers, RiskMetrics Group. RiskMetrics uses functions from our libraries to do financial stress testing that allows portfolio fund managers to simulate economic events, like the price of oil spiking 10% or markets diving 20%. They use this information to predict impacts on their portfolios and make better decisions for their clients.
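Blending optimization of the kind Total describes is classically posed as a linear program: minimize component cost subject to quality and demand constraints. Here is a minimal, hypothetical sketch with SciPy rather than the IMSL routines the interview refers to; the component costs, octane ratings, and demand figure are all invented for illustration:

```python
# Toy fuel-blending LP: choose barrels of two components to produce
# at least 100 barrels of gasoline with blend octane >= 91, at
# minimum cost. All numbers are hypothetical.
from scipy.optimize import linprog

costs = [60.0, 80.0]    # $/barrel of component A, component B
octane = [88.0, 95.0]   # octane rating of each component

# Variables x = [barrels_A, barrels_B]; minimize costs @ x subject to
#   x_A + x_B >= 100                        (demand)
#   88*x_A + 95*x_B >= 91*(x_A + x_B)       (blend octane spec)
# linprog expects A_ub @ x <= b_ub, so both constraints are negated.
A_ub = [
    [-1.0, -1.0],                           # -(xA + xB) <= -100
    [91.0 - octane[0], 91.0 - octane[1]],   # 3*xA - 4*xB <= 0
]
b_ub = [-100.0, 0.0]

res = linprog(costs, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.x, res.fun)   # optimal barrels of A, B and total cost
```

The optimizer uses just enough of the expensive high-octane component to meet the spec, which is exactly the economic behavior the Total application automates at refinery scale.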

Ajay – What have been the key moments in Visual Numerics' path so far?

Alicia – Our company has been in business for over 38 years, rooted in the fundamentals of mathematics and statistics. It started off as IMSL, offering IMSL Numerical Libraries as a high performance computing tool for numerical analysis. Before visualization was fashionable, we saw visualization as an important part of the data analysis process. As a result, the company merged with Precision Visuals, makers of PV-WAVE (our visual data analysis product) in the 1990s to become what is now known as Visual Numerics.

Looking back at recent history, a major event for Visual Numerics was definitely when SAP AG licensed the libraries at the end of 2007. For several years leading up to 2007, we’d seen increased interest in our libraries from independent software vendors (ISVs). More and more ISVs with broad product offerings were looking to provide their customers with analytic capabilities, so we had invested considerably in making the libraries more attractive to this type of customer. Having SAP, one of the largest and most respected ISVs in the world, license our products gave us confidence that we could be a valued OEM partner to this type of customer.

Ajay – What are the key problems you face in your day-to-day job as a Visual Numerics employee? How do you have fun when not building math libraries?

Alicia – In marketing, our job is to help potential users of our libraries understand what it is we offer so that they can determine if what we offer is of value to them. Often the hardest challenge we face is simply finding that person. Since our libraries are embeddable, they’ve historically been used by programmers. So we’ve spent a lot of time at developer conferences and sponsoring developer websites, journals and academic programs.

One product update this year is that we've made the libraries available from Python, a dynamic scripting language. Making IMSL Library functions available from Python basically means that someone who is not a trained programmer can now use the math and stats capabilities in the IMSL Libraries just like a C, Java, .Net or Fortran developer. It's an exciting development, though it brings with it the challenge of letting a whole new set of potential users know about the capabilities of the libraries. It's a fun challenge, though.

On the more fun side of things, you may be interested to know that our expertise in math and statistics led us to some Hollywood fame. At one point we were selected to review scripts for the crime-busting drama NUMB3RS. NUMB3RS, which airs on CBS in the US, features an FBI Special Agent who recruits his brilliant mathematician brother to use the science of mathematics, with its complex equations, to solve the trickiest crimes in Los Angeles. So yes, the math behind the show is real, and it is exciting indeed to see how math can be applied in all aspects of our lives, including ferreting out criminals on TV!

Ajay – What is the story ahead? How do you think Visual Numerics can help demand forecasting and BI say BYE to the recession?

Alicia – We're seeing more success stories from customers using analytics and data to make good decisions, and I think the more organizations leverage analytics, the faster we'll emerge from this economic slump.

As an example, we have a partner, nCode International, who makes software to help manufacturers collect and analyze test data and use the analysis to make design decisions. Using it, automobile manufacturers can, for example, analyze real-world driving pattern data for different geographic areas (e.g., emerging markets like China and India versus established markets like the USA and Europe) and design the perfect vehicle for specific markets.

So the analytic successes are out there and we know that organizations have multitudes of data. Certainly every organization that we work with has more data today than ever before. For analytics to help us say Bye to the recession, I think we need to continue to promote our successes, make analytic tools available to more users, and get users across multiple disciplines and industries using analytics to make the best possible decisions for their organizations.

Personal Biography:

As Director of Marketing for Visual Numerics, Alicia is an authority on how organizations are using advanced analytics to improve performance. Alicia brings over 15 years of experience working with scientists and customers in the planning and development of new technology products and developing go to market plans. She has a B.A. in Mathematics from Skidmore College and an M.B.A. from the University of Chicago Booth School of Business.

Interview Françoise Soulie Fogelman, KXEN

This week KXEN launched its social network analysis tool, gaining a unique edge as the first to launch social network tools for analytics. Having worked with KXEN as an analyst on scoring models, I am aware of the remarkable innovations they bring to their premium products. In an exclusive interview, KXEN's Vice President for Strategic Business Development, Françoise Soulie Fogelman, agreed to shed some light on this remarkable new development in statistical software.

 

Ajay – Françoise, how does the Social Network Analysis module help model building for marketing professionals?

Françoise – KXEN's Social Network Analysis module (KSN) helps build models that take into account interactions between customers. This is done in three stages:

  • The data describing interactions is used to build a social network structure (in fact, several social network structures are usually built in one pass through the data). You can explore your network to better understand the behavior of a given customer and what is happening around him.
  • From each social network structure, KSN automatically builds a set of attributes for each node: the number of neighbors, the average value of a given customer attribute among the neighbors, and so on. In fact, you can compute statistics on anything you have loaded into the system as customer node decoration. At this stage you will usually generate a few tens of social attributes per social network structure.
  • You then join these social attributes to the existing customer attributes. After that, you build your model as usual.
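The three stages above map naturally onto standard graph processing. A minimal sketch in plain Python follows; KSN itself is proprietary, and the customer IDs, call records and "spend" values here are invented purely to illustrate deriving per-node social attributes and joining them back to customer data:

```python
# Stage 1: build an undirected graph from interaction data (calls).
# Stage 2: derive social attributes per node (neighbor count, mean
#          neighbor spend).
# Stage 3: join those attributes onto the customer records.
from collections import defaultdict

calls = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
spend = {"a": 120.0, "b": 80.0, "c": 200.0, "d": 50.0}

adj = defaultdict(set)
for u, v in calls:                  # stage 1: adjacency structure
    adj[u].add(v)
    adj[v].add(u)

social = {}
for node, neighbors in adj.items():  # stage 2: social attributes
    social[node] = {
        "degree": len(neighbors),
        "mean_neighbor_spend": sum(spend[n] for n in neighbors) / len(neighbors),
    }

# Stage 3: join social attributes to the existing customer attributes.
customers = {cid: {"spend": s, **social[cid]} for cid, s in spend.items()}
print(customers["a"])
```

Once joined, columns like `degree` and `mean_neighbor_spend` enter the modeling data set like any other customer attribute, which is exactly the "build your model as usual" step Françoise describes.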

Ajay – But how does the KSN module work, and which mathematical technique is it based on (or is it just the addition of extra variables)? Are there any proprietary patents that KXEN has filed in this field (for automated modeling as well as social network analysis)?

Françoise – The KSN module uses graph theory to extract social attributes. KXEN has not filed a patent in relation to KSN.

Ajay – There are many modeling software but very few which involve social network analysis though many companies have expressed interest in this. What are the present rivals to KSN module specifically in software and who do you think the future rivals will be?

Françoise – There are many software tools, but when it comes to the ability to handle very large graphs, not many are left. We consider that our only real competitor today is SAS, which has an offering for Social Network Analysis, but that product is specifically targeted at fraud in banking and insurance. There are also companies positioned in Telco, usually offering a consulting service built around an internal product. We think our solution is unique in its ability to handle very large volumes (we're talking here about more than 40 million nodes and 300 million links) and to address all industry domains. As usual, we offer an exploratory tool, giving customers the ability to produce by themselves as many models as they want.

Ajay – Who would be the typical customer or potential clients for KSN module? In which domains would this module be not so relevant? Are there any specific case studies that you can point out?

Françoise- This is a first version, so we do not really know yet who the typical customer will be and cannot point yet to case studies. However, Telco operators have expressed a very strong interest and we already have a Telco customer with whom we’ve worked on marketing projects. So our first case studies will most certainly come from Telco. We are working on some research projects in the retail space. We think that banks (for fraud), social sites, blogs sites and forums will be our next customers. The sector where I do not see (yet?) a potential is manufacturing industries.

Ajay – How would privacy concerns of customers be addressed with the kind of social network analysis that KSN can now offer to marketers.

Françoise – KXEN offers a tool to build models and is not concerned with the problem of collecting, storing and exploiting data: that is the responsibility of KXEN's customers. Depending upon the country, there are various laws protecting the storage and use of data, and those will naturally apply to building and analyzing social networks. However, in the case of Social Network Analysis, the issue of “ethical” use will be more sensitive.

Ajay – What kind of hardware solutions go best with KXEN's software? Which other BI vendors' offerings do yours best complement?

Françoise – KXEN software in general, and KSN in particular, runs on any platform. When using KSN to build decent-sized graphs (with tens of millions of nodes and hundreds of millions of links, for example), a 64-bit architecture is required. A recent survey of KXEN customers shows that the BI suites used by our customers are mostly MicroStrategy and Business Objects (SAP). We also like very much to mention Advizor Solutions, which offers data visualization software already embedding KXEN technology.

Ajay – Do you think text mining as well as the Data Fusion approach can work for online web analytics, search engines or ad targeting?

Françoise – Of course, our data fusion approach can be very well suited for online web analytics and ad targeting (we have a number of partners that are either already using KXEN for this purpose or developing applications in these domains using KXEN technology). We would be more cautious about search engines per se.

Ajay – Are there any plans for offering KXEN products as a service (like Salesforce.com) instead of the server-based approach?

Françoise – We do not yet have plans to offer KXEN products as a service but, again, we have partners such as Kognitio that offer analytics platforms embedding KXEN.

 

Brief Biography-

 

Françoise Soulie Fogelman is responsible for leading KXEN business development, identifying new business opportunities for KXEN and working with Product development, Sales and Marketing to help promote KXEN’s offer. She is also in charge of managing KXEN’s University Program.

Ms Soulie Fogelman has over 30 years of experience in data mining and CRM, from both an academic and a business perspective. Prior to KXEN, she directed the first French research team on Neural Networks at Paris 11 University, where she was a Professor of Computer Science. She then co-founded Mimetics, a start-up that develops and sells development environments, optical character recognition (OCR) products and services using neural network technology, and became its Chief Scientific Officer. After that she started the Data Mining and CRM group at Atos Origin and, most recently, she created and managed the CRM agency for Business & Decision, a French IS company specializing in Business Intelligence and CRM.

Ms Soulie Fogelman holds a master's degree in mathematics from Ecole Normale Superieure and a PhD in Computer Science from the University of Grenoble. She has advised over 20 PhD students on data mining, has authored more than 100 scientific papers and books, and has been an invited speaker at many academic and business events.

 

(Ajay – So it seems like an interesting piece of software, and with the marketing avenues for social networking growing, and analytics modelers exploring the last bit of data for incremental gains, this is an area where we can be sure of new developments soon. I wonder what the response from other analytics vendors, including open source developers, will be, as this does seem a promising area for statistical modelling as well as analysis. What do you think? Can I search all data from Twitter, Facebook, search results on Indeed.com and LinkedIn, and add it to your credit profile to create a better propensity model? 🙂 Will the credit or marketing behavior scores of your friends affect your propensity, and thus the telecom ads you see while surfing?)

Interview Jon Peck SPSS


 

I was in the middle of interviewing people, as well as helping the good people in my new role as a community evangelist at Smart Data Collective, when I got a LinkedIn request from Jon Peck to join the SDC group.

SPSS Inc. is a leading worldwide provider of predictive analytics software and solutions. Founded in 1968, today SPSS has more than 250,000 customers worldwide, served by more than 1,200 employees in 60 countries. Jon is a legendary SPSS figure and a great teacher in this field. I asked him for an interview and he readily agreed.

Jon Peck is a Principal Software Engineer and Technical Advisor at SPSS. He has been with SPSS since 1983, and in this interview he draws on that breadth of perspective and experience to talk about analytics and SPSS.

Ajay – Describe your career journey from college to today. What advice would you give to young students seeking to be hedge fund managers rather than scientists? What are the basic things that a science education can help students with, in your opinion?

Jon– After graduating from college with a B.A. in math, I earned a Ph.D. in Economics, specializing in econometrics, and taught at a top American university for 13 years in the Economics and Statistics Departments and the School of Organization and Management.  Working in an academic environment all that time was a great opportunity to grow intellectually.  I was increasingly drawn to computing and eventually decided to join a statistical software company.  There were only two substantial ones at the time.  After a lot of thought, I joined SPSS as it seemed to be the more interesting place and one where I would be able to work in a wider variety of areas.  That was over 25 years ago!  Now I have some opportunities to teach and speak again as well as working in development, which I enjoy a lot.

I still believe in getting a broad liberal arts education along with as much quantitative training as possible.  Being able to work in very different areas has been a big asset for me.  Most people will have multiple careers, so preparing broadly is the most important career thing you can do.  As for hedge fund jobs – if there are any left, I’d say not to be starry-eyed about the money.  If you don’t choose a career that really interests you, you won’t be very successful anyway. Do what you love – subject to earning a living.

Math and scientific reasoning skills are preparation for working in many areas as well as being helpful in making the many decisions with quantitative aspects in life.  Math, especially, provides a foundation useful in many areas.  The recently announced program in the UK to improve general understanding of probability illustrates some practical value.

Ajay- What are SPSS’s contributions to open source software? What, if you can disclose them, are the plans for further increasing that involvement?

Jon-  I wish I could talk about SPSS future plans, but I can’t.  However, the company is committed to continuing its efforts in Python and R.  By opening up the SPSS technology with these open source technologies, we are able to expand what we and our users can do.  At the same time, we can make R more attractive through nicer output and simpler syntax, taking away much of the pain.  One of the things I love about this approach is how quickly and easily new things can be produced and distributed this way compared to the traditional development cycle.  I wrote about productivity and Python recently on my blog at insideout.spss.com.

Ajay – How happy is the SPSS developer community with Python? Are there any other languages that you are considering for the future?

Jon- Many in the SPSS user community were more used to packaged procedures than to programming (except in the area of data transformations).  So Python, first, and then R were a shock.  But the benefits are so large that we have had an excellent response to both the Python and R technologies.  Some have mastered the technology, been very successful, and made contributions back to the SPSS community.  Others are consumers of this technology, especially through our custom dialogs and extension commands that eliminate the need to learn Python or R in order to use programs in these languages.  Python is an outstanding language.  It is easy to get started with it, but it has very sophisticated features.  It has fewer dark corners than any other language I know.  While there are a few other more popular languages, Python’s popularity has been steadily growing, especially in the scientific and statistical communities.  But we already have support for three high-level languages, and if there is enough demand, we’ll do more.

Some of our partners prefer to use the lower-level C language interfaces we offer.  That’s fine, too.  We’re not Python zealots (well, maybe, I am).  Python, as a scripting language, isn’t as fast as a compiled language.  For many purposes this does not matter, and Python itself is written in C.  I recently wrote a Python module for TURF analysis.  The computations are simple but combinatorially explosive, so I was worried that it would be too slow to be useful.   It turned out to be pretty fast because of the way I could use some of Python’s built-in data structures and algorithms.  And the popular NumPy and SciPy scientific and numerical libraries are written in C.
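(Ajay – For readers unfamiliar with TURF: a brute-force Total Unduplicated Reach and Frequency computation is just set unions over combinations, which is exactly where Python’s built-in sets and itertools shine. The sketch below is not Jon’s SPSS module, only a minimal illustration; the function name and the flavor survey data are invented for the example.)

```python
from itertools import combinations

def turf(reach_sets, k):
    """Brute-force TURF: find the k items whose combined reach
    (unduplicated union of respondents) is largest.
    reach_sets maps item name -> set of respondent ids reached."""
    best_combo, best_reach = None, -1
    for combo in combinations(reach_sets, k):
        reached = set().union(*(reach_sets[item] for item in combo))
        if len(reached) > best_reach:
            best_combo, best_reach = combo, len(reached)
    return best_combo, best_reach

# Hypothetical survey data: which respondents each flavor reaches.
flavors = {
    "vanilla":   {1, 2, 3, 4},
    "chocolate": {3, 4, 5},
    "mango":     {5, 6},
    "pistachio": {1, 6, 7},
}
combo, reach = turf(flavors, 2)
print(combo, reach)  # → ('vanilla', 'mango') 6
```

The inner loop is explosive (n choose k combinations), but because the set unions themselves run in C inside the interpreter, pure Python stays usable for realistic item counts, which matches Jon’s experience above.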

Users who would not think of themselves as developers sometimes find that a small Python effort can automate manual work with big time and accuracy improvements.  I got a note recently from a user who said, "I got it to work, and this is FANTASTIC! It will save me a lot of time in my survey analysis work."

Ajay- What are the areas where SPSS is not a good fit? What areas suit SPSS software the most compared to other solutions?

Jon- SPSS Statistics, the product,  is not a database.  Our strength is in applying analytical methods to data for model building, prediction, and insight.  Although SPSS Statistics is used in a wide variety of areas, we focus first on people data and think of that first when planning and designing new features.  SPSS Statistics and other SPSS products all work well with databases, and we have solutions for deploying analytics into production systems, but we’re not going to do your payroll.  One thing that was a surprise to me a few years ago is that we have a significant number of users who use SPSS Statistics as a basic reporting product but don’t do any inferential statistics.  They find that they can do customized reporting – often using the Custom Tables module – very quickly.  With Version 17, they can also do fancier and dynamic output formatting without resorting to script writing or manual editing, which is proving very attractive.

Ajay- Are there any plans for SPSS to adopt a Software as a Service model? Any plans to use advances in remote and cloud computing for SPSS?

Jon- We are certainly looking at cloud computing.  The biggest challenge is being able to put things in the cloud that will be robust and reliable.

Ajay- What are SPSS’s Asia plans? Which country has the maximum penetration of SPSS in terms of usage?

Jon- SPSS, the company, has long been strong in Japan, and Taiwan and Korea are also strong markets.  China is increasingly important, of course.  We have a large data center in Singapore.  Although India has a long, strong history in statistical methodology, it is a much less well-developed market for us.  We have a presence there, but I don’t know the numbers. (Ajay – SPSS was one of my first experiences with statistical software, when I came across it at my business school in 2001. In India SPSS has been very active with academic licensing, and it introduced us to the nice and easy menu-driven features of SPSS.)

Biography – Jon earned his Ph.D. from Yale University and taught econometrics and statistics there for 13 years before joining SPSS.

Jon joined the SPSS company in 1983 and worked on many aspects of the very first SPSS DOS product, including writing the first C code that SPSS ever shipped. Among the features he has designed are OMS (the Output Management System), the Visual Bander, Define Variable Properties, ALTER TYPE, Unicode support, and the Date and Time Wizard. Jon is the author of many of the modules on Developer Central. He is an active cyclist and hiker.

Jon Peck blogs on  SPSS Inside-Out.

Interview – BI Dashboards: dMine, Sanjay Patel

If you have ever been frustrated trying to track business metrics in your own or a client’s organization, wrestling with legacy applications that don’t talk to each other or with good solutions that cost more than the benefit they bring, a young man from India has a solution for you. With implementation times of 1-6 weeks and costs as low as 10,000 USD for an enterprise-wide rollout, dMine promises to shake things up. Here is an interview with the co-founder of this startup.

 

Ajay- Describe your career journey. What advice would you give to new entrepreneurs in this recession?

Sanjay- Armed with an M.Tech from BITS Pilani and an MBA from Jamnalal Bajaj Institute of Management, Mumbai, in 2000 I teamed up with Praveen Wicliff to start our own product company focused on delivering critical enterprise solutions; the company has now grown to a strong team of 100 innovative minds. My current mission is to make Icicle a strong leader in business-driven software products across all segments, with the best of delivery capabilities.
Recession is a trying time for most people, but it is also a phase that brings out the best in everyone, as many of the most innovative solutions are floated during such periods. My only suggestion would be to move from an emotional connect with customers to direct tangible benefits, stay focused on cash flow, aim high, set your goals and targets, and never give up on any of these. Even in the darkest moments, find the faith to keep going.

Ajay- One more dashboard solution. How is dMine different from its competitors? What are the principal competitors?

Sanjay- At first glance dMine may look like just another dashboard solution, but what makes dMine different from all other competing products is its intuitiveness and user-friendly features, which help even business users use the product most effectively.


dMine is positioned as a product for business users, not for the IT team. Unlike other dashboard or BI products, dMine can create dashboards in just 3 easy steps:

1- The IT team connects to data sources and creates business views

2- Users create dashboards and charts with dMine’s intuitive interface

3- Users share dashboards and schedule emails in PDF or PPT

The potent combination of best-in-class graphical and analytical reports, easier representation and interpretation of key business data, and integration of data from multiple systems onto a single chart or dashboard for real-time analysis, all with minimal IT overhead, is a unique proposition from us. See dMine in action at www.dminebi.com and you will know the difference.

Ajay- What is the area where dMine would not be suited for dashboards? Suppose I have data of 200,000 rows x 40 columns – would dMine work for me?

Sanjay- dMine is positioned as a pure dashboard product; it does not implement a complete BI stack, which requires working on transactional data to create cubes and universes.

We look beyond data warehouses and datamarts and emphasize summarized data to deliver key business performance metrics, with a strong focus on data visualization.

The idea is to target the business executives who view these dashboards; they are interested not in transactional data but in overall performance, hence the summarized data.

The summarized data can come from any RDBMS, flat files, spreadsheets, analytics output, cubes, or universes. We support some 16 database vendors in the market, from as small as MS Access to as big as DB2.
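(Ajay – The distinction Sanjay draws between transactional and summarized data can be illustrated with a small sketch: instead of pointing a dashboard at raw transaction rows, you pre-aggregate them into one row of metrics per business entity. This is only an illustrative sketch using an in-memory SQLite table; the table, columns, and figures are invented and this is not dMine’s actual mechanism.)

```python
import sqlite3

# Build an in-memory transactional table (hypothetical sales rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (station TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Mumbai", "2009-01", 120.0), ("Mumbai", "2009-02", 150.0),
     ("Delhi",  "2009-01", 200.0), ("Delhi",  "2009-02", 180.0)],
)

# Summarize: one row per station with total and average revenue,
# the kind of pre-aggregated metric a dashboard product would
# consume instead of the raw transactions.
summary = conn.execute(
    "SELECT station, SUM(revenue), AVG(revenue) "
    "FROM sales GROUP BY station ORDER BY station"
).fetchall()
for station, total, avg in summary:
    print(station, total, avg)
```

A dashboard reading the four-row `summary` style table scales independently of how many millions of transactions sit underneath, which is why the 200,000 x 40 question above matters less once the data is pre-summarized.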

Ajay- What is the pricing strategy of dMine? Any other products or complements that you are thinking about? Name some customer case studies or big wins.

Sanjay- dMine’s pricing strategy is very simple, designed keeping in mind that the product can be used by customers in the SME/SMB segment or even at the enterprise level.

Currently we offer dMine in two forms:

  • on-premise, and
  • as a hosted service.

In the on-premise version, dMine has a product license fee and per-user license fees, issued separately.

At additional cost you can have loads of add-on goodies catering to various customer needs. All the costs mentioned above are one-time.

The hosted version is on a monthly subscription model, with the cost based on parameters like bandwidth usage, server configuration, disk space, etc. Very soon we will be enabled on the Amazon cloud service.

Typically, for an on-premise version, a smaller implementation at the corporate level, with dashboard access for only a few top executives, would cost anywhere between 10,000 and 16,000 USD, plus implementation cost (billed on actuals) and applicable taxes.

For a larger implementation, for example in the BFSI segment, where user licenses are rolled out to all branch managers in addition to the top executives, the total can reach a few hundred licenses. The implementation period is usually as short as a week and never more than 6 weeks.

Some of the customers using dMine are marketing analytics companies that use the product to deliver the final reports of their marketing and customer analyses to their end customers.


Case Study – Summary

We recently implemented dMine at a leading FM Radio channel to monitor the performance across its radio stations spread all over India.

The client, a major player in the media and communication space, has to constantly monitor all its stations across their entire operations: revenues from sponsors, peak and non-peak time inventory and sales, market share and channel ranking, P&L, forecasts, and other critical information.

Previously, the client used multiple applications to support these business functions, capturing data in different databases and sources. MIS reports were created manually by extracting data from multiple sources into spreadsheets, which were distributed to management with a turnaround time of about 15 days.

The dMine solution collates information from multiple data sources, such as ERP and sales systems, along with many spreadsheets. More than 70 metrics (KPIs) and analytics are defined for the client, and management now has access to this information whenever required.

All these metrics are identified as critical and are categorized under 5 dashboards – Organizational KPI, Financial Dashboard, Sales Dashboard, Market Share & Metrics Dashboard, and Operational Dashboard. The metrics are parameterized, and drill-downs allow management to get to the source of an issue rapidly.

The implementation of the dMine business dashboard product helped the client effectively monitor business operations, KPIs, and organizational performance. Making actionable, real-time information available on demand to decision makers and operational managers has also helped in taking timely, critical business decisions, all while minimizing IT overhead cost. The short implementation timelines also allow users to see the benefits quickly and achieve ROI within 1 to 3 months. The detailed case study is available for reference on our website, www.dminebi.com.

Ajay- Do you read or write blogs? What do you think about the Web 2.0 paradigm for social and community marketing?

Sanjay – I do read and write lots of blogs and am myself a member of quite a few groups that share interests in the virtual community. Web 2.0 provides a platform for many-to-many communication and, in its social sense, is based on the principles of collaboration and the sharing of information and content, putting social interaction at the heart of it all.

A recent study by Fox Interactive Media reveals that 40% of social network users rely on social media outlets to learn more about brands and products. Whether you’re a freelancer promoting your own brand or part of a company, social media marketing is an essential component of an integrated campaign. If you are looking to start up a new business, launch new products and services, or even expand your presence, you cannot miss out on an eMarketing process focused on a three-pronged strategy: social networking sites, your own website, and the blogosphere. These will help empower your brand and positively convey your message.

Ajay – Sanjay Patel is an experienced entrepreneur with Icicle Technologies, and the dMine dashboard is currently winning rave reviews (see http://www.dminebi.com/ibm-nominates-icicle-as-isv-on-ibm-smart-business-platform/). Here’s wishing Mr Patel luck with the summarized data dashboard, which could be a game changer, at www.dminebi.com.


Interview with Anne Milley, SAS – Part II

Anne Milley is director of product marketing at SAS Institute. In part 2 of the interview, Anne talks about immigration in technology, open source, how she misses coding, and software as a service, especially SAS Institute’s offerings. She also previews SAS’s involvement with R and mentions cloud computing.


Ajay – Labor arbitrage outsourcing versus virtual teams located globally: what is SAS Institute’s position, and your opinion, on this? What do you feel about the recent debate on H-1B visas and job cuts? How many jobs, if any, is SAS planning to cut in 2009-2010?

Anne – SAS is a global company, with customers in more than 100 countries around the world.  We hire employees in these countries to help us better serve our global customers.  Our workforce decisions are based on our business needs.  We also employ virtual teams–the feedback and insights from our global workforce help us improve and develop new products to meet the evolving needs of our customers.  (As someone who works from her home office in Connecticut, I am a fan of virtual teaming!)  We see these approaches as complementary.

The issue of the H-1B visa is a different discussion entirely.  H-1B visas, although capped, permit US employers to bring foreign employees in “specialty occupations” into this country.   The better question, though, is what is necessitating the need for H-1B visas.  We would submit that the reason the U.S. has to look outside its borders for highly qualified technical workers is because we are not producing a sufficient number of workers with the right skill sets to meet U.S. demand.  In turn, that means that our educational system is not producing students interested or qualified to pursue the STEM (science, technology, engineering or mathematics) professions (either at a K-12 or post-secondary level), or developing the workforce improvement programs that may allow workers to pursue these “specialty occupations.”  Further, any discussion about H-1B visas (or any other type of visa) should include a more comprehensive review of our nation’s immigration policies—are they working, are they not working, how or why are they, are we able to limit illegal immigration and if not, why not, etc.

I am not aware of any planned job cuts at SAS.  In fact, I am aware of a few groups which are actively hiring.

Ajay- What open source software has SAS Institute worked with in the past, and which does it continue to support financially as well as technologically? Any exciting product releases in 2009-2010 that you can tell us about?

Anne- Open source software provides many options and benefits.  We see many (SAS included) embracing open source for different things.  Our software runs on Linux and we use some open-source tools in development. There are different aspects of open source software in developing SAS software:

-Development with open source tools such as Eclipse, Ant, NAnt, JUnit, etc. to build, test, and package our software

-Using open source software in our products; examples include Apache/Jakarta products such as the Apache Web Server.

-Developing open source software, making changes to an open source codebase, and optionally contributing that source back to the open source project, to adapt an open source project for use in a SAS product or for internal use. Example: Eclipse.

And we plan to do more with open source in the future.  The first step of SAS integrating with R will be shown at SAS Global Forum coming up in DC later this month.  Other announcements for new offerings are also planned at this event. 

Ajay- What do you feel about adopting Software as a Service for any of SAS Institute’s products? Any new initiatives from SAS on the cloud computing front, especially in terms of helping customers cut down on hardware costs?

Anne- SAS Solutions OnDemand, the division which oversees the infrastructure and support of all our hosted offerings, is expanding in this rapidly growing market.  SAS Solutions OnDemand Drug Development was our first SaaS offering announced in January.  Additional news on new hosted offerings will be announced at SAS Global Forum later this month.  SAS doesn’t currently offer any external cloud computing options, but we’re actively looking at this area.

Ajay – Which software do you personally find best to write code in, and why? Do you miss writing code, and if so, why?

Anne- In my current role, I have limited opportunity to write code.  At times, I do miss the logical thought process coding forces you to adopt (to do the job as elegantly as possible).  I had the opportunity to do a long-term assignment at a major financial services company in the UK last year and did get to use some SAS and JMP, including a little JSL (JMP scripting language).  There’s nothing like real-world, noisy, messy data to make you thankful for the power of writing code!  Even though I don’t write code on a regular basis, I am happy to see continued investment in the languages SAS provides—among the most recent, the addition of an algebraic optimization modeling language in our SAS/OR module contained within the SAS language as “PROC OPTMODEL.”

I have great respect for people who invest in learning (or even getting exposure to) more than one language and who appreciate the strengths of different languages for certain tasks and applications.

Ajay- It is great to see passionate people at work on both the open source and the packaged software sides – and even better for them to collaborate once in a while. Most of our work is based on the scientists who came before us (especially in math theory).

Ultimately we are all just students of science anyway.

SAS Global Forum –http://support.sas.com/events/sasglobalforum/2009/

The annual event of SAS language practitioners. The SAS language consists of data steps and proc steps for input and output, simplifying syntax for users.

SAS Institute – A leader in analytics software since the 1970s, it grew out of North Carolina State University and provides jobs to thousands of people. One of the world’s largest privately held software companies, it is admired for its huge investments in research and development and criticized for the premium price of its packaged software solutions. It is a recent entrant among vendors willing to support the R language for corporate users.