R for Predictive Modeling:Workshop

A view of the Oakland-San Francisco Bay Bridge...
Image via Wikipedia

A workshop on using R for Predictive Modeling, by the Director, Non Clinical Stats, Pfizer. Interesting Bay Area Event- part of next edition of Predictive Analytics World

Sunday, March 13, 2011 in San Francisco

R for Predictive Modeling:
A Hands-On Introduction

Intended Audience: Practitioners who wish to learn how to execute on predictive analytics by way of the R language; anyone who wants “to turn ideas into software, quickly and faithfully.”

Knowledge Level: Either hands-on experience with predictive modeling (without R) or hands-on familiarity with any programming language (other than R) is sufficient background and preparation to participate in this workshop.


Workshop Description

This one-day session provides a hands-on introduction to R, the well-known open-source platform for data analysis. Real examples are employed in order to methodically expose attendees to best practices driving R and its rich set of predictive modeling packages, providing hands-on experience and know-how. R is compared to other data analysis platforms, and common pitfalls in using R are addressed.

The instructor, a leading R developer and the creator of CARET, a core R package that streamlines the process for creating predictive models, will guide attendees on hands-on execution with R, covering:

  • A working knowledge of the R system
  • The strengths and limitations of the R language
  • Preparing data with R, including splitting, resampling and variable creation
  • Developing predictive models with R, including decision trees, support vector machines and ensemble methods
  • Visualization: Exploratory Data Analysis (EDA), and tools that persuade
  • Evaluating predictive models, including viewing lift curves, variable importance and avoiding overfitting

Hardware: Bring Your Own Laptop
Each workshop participant is required to bring their own laptop running Windows or OS X. The software used during this training program, R, is free and readily available for download.

Attendees receive an electronic copy of the course materials and related R code at the conclusion of the workshop.


Schedule

  • Workshop starts at 9:00am
  • Morning Coffee Break at 10:30am – 11:00am
  • Lunch provided at 12:30 – 1:15pm
  • Afternoon Coffee Break at 2:30pm – 3:00pm
  • End of the Workshop: 4:30pm

Instructor

Max Kuhn, Director, Nonclinical Statistics, Pfizer

Max Kuhn is a Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He has been apply models in the pharmaceutical industries for over 15 years.

He is a leading R developer and the author of several R packages including the CARET package that provides a simple and consistent interface to over 100 predictive models available in R.

Mr. Kuhn has taught courses on modeling within Pfizer and externally, including a class for the India Ministry of Information Technology.

 

http://www.predictiveanalyticsworld.com/sanfrancisco/2011/r_for_predictive_modeling.php

 

Predictive Analytics World March2011 SF

USGS Satellite photo of the San Francisco Bay ...
Image via Wikipedia

Message from PAWCON-

 

Predictive Analytics World, Mar 14-15 2011, San Francisco, CA

More info: pawcon.com/sanfrancisco

Agenda at-a-glance: pawcon.com/sanfrancisco/2011/agenda_overview.php

PAW’s San Francisco 2011 program is the richest and most diverse yet, including over 30 sessions across two tracks – an “All Audiences” and an “Expert/Practitioner” track — so you can witness how predictive analytics is applied at Bank of America, Bank of the West, Best Buy, CA State Automobile Association, Cerebellum Capital, Chessmetrics, Fidelity, Gaia Interactive, GE Capital, Google, HealthMedia, Hewlett Packard, ICICI Bank (India), MetLife, Monster.com, Orbitz, PayPal/eBay, Richmond, VA Police Dept, U. of Melbourne, Yahoo!, YMCA, and a major N. American telecom, plus insights from projects for Anheiser-Busch, the SSA, and Netflix.

PAW’s agenda covers hot topics and advanced methods such as uplift modeling (net lift), ensemble models, social data (6 sessions on this), search marketing, crowdsourcing, blackbox trading, fraud detection, risk management, survey analysis, and other innovative applications that benefit organizations in new and creative ways.

Predictive Analytics World is the only conference of its kind, delivering vendor-neutral sessions across verticals such as banking, financial services, e-commerce, education, government, healthcare, high technology, insurance, non-profits, publishing, social gaming, retail and telecommunications

And PAW covers the gamut of commercial applications of predictive analytics, including response modeling, customer retention with churn modeling, product recommendations, fraud detection, online marketing optimization, human resource decision-making, law enforcement, sales forecasting, and credit scoring.

WORKSHOPS. PAW also features pre- and post-conference workshops that complement the core conference program. Workshop agendas include advanced predictive modeling methods, hands-on training and enterprise decision management.

More info: pawcon.com/sanfrancisco

Agenda at-a-glance: pawcon.com/sanfrancisco/2011/agenda_overview.php

Be sure to register by Dec 7 for the Super Early Bird rate (save $400):
pawcon.com/sanfrancisco/register.php

If you’d like our informative event updates, sign up at:
pawcon.com/signup-us.php

PAWCON -This week in London

Watch out for the twitter hash news on PAWCON and the exciting agenda lined up. If your in the City- you may want to just drop in

http://www.predictiveanalyticsworld.com/london/2010/agenda.php#day1-7

Disclaimer- PAWCON has been a blog partner with Decisionstats (since the first PAWCON ). It is vendor neutral and features open source as well proprietary software, as well case studies from academia and Industry for a balanced view.

 

Little birdie told me some exciting product enhancements may be in the works including a not yet announced R plugin 😉 and the latest SAS product using embedded analytics and Dr Elder’s full day data mining workshop.

Citation-

http://www.predictiveanalyticsworld.com/london/2010/agenda.php#day1-7

Monday November 15, 2010
All conference sessions take place in Edward 5-7

8:00am-9:00am

Registration, Coffee and Danish
Room: Albert Suites


9:00am-9:50am

Keynote
Five Ways Predictive Analytics Cuts Enterprise Risk

All business is an exercise in risk management. All organizations would benefit from measuring, tracking and computing risk as a core process, much like insurance companies do.

Predictive analytics does the trick, one customer at a time. This technology is a data-driven means to compute the risk each customer will defect, not respond to an expensive mailer, consume a retention discount even if she were not going to leave in the first place, not be targeted for a telephone solicitation that would have landed a sale, commit fraud, or become a “loss customer” such as a bad debtor or an insurance policy-holder with high claims.

In this keynote session, Dr. Eric Siegel will reveal:

  • Five ways predictive analytics evolves your enterprise to reduce risk
  • Hidden sources of risk across operational functions
  • What every business should learn from insurance companies
  • How advancements have reversed the very meaning of fraud
  • Why “man + machine” teams are greater than the sum of their parts for
  • enterprise decision support

 

Speaker: Eric Siegel, Ph.D., Program Chair, Predictive Analytics World

Top of this page ] [ Agenda overview ]


IBM9:50am-10:10am

Platinum Sponsor Presentation
The Analytical Revolution

The algorithms at the heart of predictive analytics have been around for years – in some cases for decades. But now, as we see predictive analytics move to the mainstream and become a competitive necessity for organisations in all industries, the most crucial challenges are to ensure that results can be delivered to where they can make a direct impact on outcomes and business performance, and that the application of analytics can be scaled to the most demanding enterprise requirements.

This session will look at the obstacles to successfully applying analysis at the enterprise level, and how today’s approaches and technologies can enable the true “industrialisation” of predictive analytics.

Speaker: Colin Shearer, WW Industry Solutions Leader, IBM UK Ltd

Top of this page ] [ Agenda overview ]


Deloitte10:10am-10:20am

Gold Sponsor Presentation
How Predictive Analytics is Driving Business Value

Organisations are increasingly relying on analytics to make key business decisions. Today, technology advances and the increasing need to realise competitive advantage in the market place are driving predictive analytics from the domain of marketers and tactical one-off exercises to the point where analytics are being embedded within core business processes.

During this session, Richard will share some of the focus areas where Deloitte is driving business transformation through predictive analytics, including Workforce, Brand Equity and Reputational Risk, Customer Insight and Network Analytics.

Speaker: Richard Fayers, Senior Manager, Deloitte Analytical Insight

Top of this page ] [ Agenda overview ]


10:20am-10:45am

Break / Exhibits
Room: Albert Suites


10:45am-11:35am
Healthcare
Case Study: Life Line Screening
Taking CRM Global Through Predictive Analytics

While Life Line is successfully executing a US CRM roadmap, they are also beginning this same evolution abroad. They are beginning in the UK where Merkle procured data and built a response model that is pulling responses over 30% higher than competitors. This presentation will give an overview of the US CRM roadmap, and then focus on the beginning of their strategy abroad, focusing on the data procurement they could not get anywhere else but through Merkle and the successful modeling and analytics for the UK.

Speaker: Ozgur Dogan, VP, Quantitative Solutions Group, Merkle Inc.

Speaker: Trish Mathe, Life Line Screening

Top of this page ] [ Agenda overview ]


11:35am-12:25pm
Open Source Analytics; Healthcare
Case Study: A large health care organization
The Rise of Open Source Analytics: Lowering Costs While Improving Patient Care

Rapidminer and R were the number 1 and 2 in this years annual KDNuggets data mining tool usage poll, followed by Knime on place 4 and Weka on place 6. So what’s going on here? Are these open source tools really that good or is their popularity strongly correlated with lower acquisition costs alone? This session answers these questions based on a real world case for a large health care organization and explains the risks & benefits of using open source technology. The final part of the session explains how these tools stack up against their traditional, proprietary counterparts.

Speaker: Jos van Dongen, Associate & Principal, DeltIQ Group

Top of this page ] [ Agenda overview ]


12:25pm-1:25pm

Lunch / Exhibits
Room: Albert Suites


1:25pm-2:15pm
Keynote
Thought Leader:
Case Study: Yahoo! and other large on-line e-businesses
Search Marketing and Predictive Analytics: SEM, SEO and On-line Marketing Case Studies

Search Engine Marketing is a $15B industry in the U.S. growing to double that number over the next 3 years. Worldwide the SEM market was over $50B in 2010. Not only is this a fast growing area of marketing, but it is one that has significant implications for brand and direct marketing and is undergoing rapid change with emerging channels such as mobile and social. What is unique about this area of marketing is a singularly heavy dependence on analytics:

 

  • Large numbers of variables and options
  • Real-time auctions/bids and a need to adjust strategies in real-time
  • Difficult optimization problems on allocating spend across a huge number of keywords
  • Fast-changing competitive terrain and heavy competition on the obvious channels
  • Complicated interactions between various channels and a large choice of search keyword expansion possibilities
  • Profitability and ROI analysis that are complex and often challenging

 

The size of the industry, its growing importance in marketing, its upcoming role in Mobile Advertising, and its uniquely heavy reliance on analytics makes it particularly interesting as an area for predictive analytics applications. In this session, not only will hear about some of the latest strategies and techniques to optimize search, you will hear case studies that illustrate the important role of analytics from industry practitioners.

Speaker: Usama Fayyad, , Ph.D., CEO, Open Insights

Top of this page ] [ Agenda overview ]


SAS2:15pm-2:35pm

Platinum Sponsor Presentation
Creating a Model Factory Using in-Database Analytics

With the ever-increasing number of analytical models required to make fact-based decisions, as well as increasing audit compliance regulations, it is more important than ever that these models can be created, monitored, retuned and deployed as quickly and automatically as possible. This paper, using a case study from a major financial organisation, will show how organisations can build a model factory efficiently using the latest SAS technology that utilizes the power of in-database processing.

Speaker: John Spooner, Analytics Specialist, SAS (UK)

Top of this page ] [ Agenda overview ]


2:35pm-2:45pm

Session Break
Room: Albert Suites


2:45pm-3:35pm

Retail
Case Study: SABMiller
Predictive Analytics & Global Marketing Strategy

Over the last few years SABMiller plc, the second largest brewing company in the world operating in 70 countries, has been systematically segmenting its markets in different countries globally in order optimize their portfolio strategy & align it to their long term country specific growth strategy. This presentation talks about the overall methodology followed and the challenges that had to be overcome both from a technical as well as from a change management stand point in order to successfully implement a standard analytics approach to diverse markets and diverse business positions in a highly global setting.

The session explains how country specific growth strategies were converted to objective variables and consumption occasion segments were created that differentiated the market effectively by their growth potential. In addition to this the presentation will also provide a discussion on issues like:

  • The dilemmas of static vs. dynamic solutions and standardization vs. adaptable solutions
  • Challenges in acceptability, local capability development, overcoming implementation inertia, cost effectiveness, etc
  • The role that business partners at SAB and analytics service partners at AbsolutData together play in providing impactful and actionable solutions

 

Speaker: Anne Stephens, SABMiller plc

Speaker: Titir Pal, AbsolutData

Top of this page ] [ Agenda overview ]


3:35pm-4:25pm

Retail
Case Study: Overtoom Belgium
Increasing Marketing Relevance Through Personalized Targeting

 

Since many years, Overtoom Belgium – a leading B2B retailer and division of the French Manutan group – focuses on an extensive use of CRM. In this presentation, we demonstrate how Overtoom has integrated Predictive Analytics to optimize customer relationships. In this process, they employ analytics to develop answers to the key question: “which product should we offer to which customer via which channel”. We show how Overtoom gained a 10% revenue increase by replacing the existing segmentation scheme with accurate predictive response models. Additionally, we illustrate how Overtoom succeeds to deliver more relevant communications by offering personalized promotional content to every single customer, and how these personalized offers positively impact Overtoom’s conversion rates.

Speaker: Dr. Geert Verstraeten, Python Predictions

Top of this page ] [ Agenda overview ]


4:25pm-4:50pm

Break / Exhibits
Room: Albert Suites


4:50pm-5:40pm
Uplift Modelling:
Case Study: Lloyds TSB General Insurance & US Bank
Uplift Modelling: You Should Not Only Measure But Model Incremental Response

Most marketing analysts understand that measuring the impact of a marketing campaign requires a valid control group so that uplift (incremental response) can be reported. However, it is much less widely understood that the targeting models used almost everywhere do not attempt to optimize that incremental measure. That requires an uplift model.

This session will explain why a switch to uplift modelling is needed, illustrate what can and does go wrong when they are not used and the hugely positive impact they can have when used effectively. It will also discuss a range of approaches to building and assessing uplift models, from simple basic adjustments to existing modelling processes through to full-blown uplift modelling.

The talk will use Lloyds TSB General Insurance & US Bank as a case study and also illustrate real-world results from other companies and sectors.

 

Speaker: Nicholas Radcliffe, Founder and Director, Stochastic Solutions

Top of this page ] [ Agenda overview ]


5:40pm-6:30pm

Consumer services
Case Study: Canadian Automobile Association and other B2C examples
The Diminishing Marginal Returns of Variable Creation in Predictive Analytics Solutions

 

Variable Creation is the key to success in any predictive analytics exercise. Many different approaches are adopted during this process, yet there are diminishing marginal returns as the number of variables increase. Our organization conducted a case study on four existing clients to explore this so-called diminishing impact of variable creation on predictive analytics solutions. Existing predictive analytics solutions were built using our traditional variable creation process. Yet, presuming that we could exponentially increase the number of variables, we wanted to determine if this added significant benefit to the existing solution.

Speaker: Richard Boire, BoireFillerGroup

Top of this page ] [ Agenda overview ]


6:30pm-7:30pm

Reception / Exhibits
Room: Albert Suites


Tuesday November 16, 2010
All conference sessions take place in Edward 5-7

8:00am-9:00am

Registration, Coffee and Danish
Room: Albert Suites


9:00am-9:55am
Keynote
Multiple Case Studies: Anheuser-Busch, Disney, HP, HSBC, Pfizer, and others
The High ROI of Data Mining for Innovative Organizations

Data mining and advanced analytics can enhance your bottom line in three basic ways, by 1) streamlining a process, 2) eliminating the bad, or 3) highlighting the good. In rare situations, a fourth way – creating something new – is possible. But modern organizations are so effective at their core tasks that data mining usually results in an iterative, rather than transformative, improvement. Still, the impact can be dramatic.

Dr. Elder will share the story (problem, solution, and effect) of nine projects conducted over the last decade for some of America’s most innovative agencies and corporations:

    Streamline:

  • Cross-selling for HSBC
  • Image recognition for Anheuser-Busch
  • Biometric identification for Lumidigm (for Disney)
  • Optimal decisioning for Peregrine Systems (now part of Hewlett-Packard)
  • Quick decisions for the Social Security Administration
    Eliminate Bad:

  • Tax fraud detection for the IRS
  • Warranty Fraud detection for Hewlett-Packard
    Highlight Good:

  • Sector trading for WestWind Foundation
  • Drug efficacy discovery for Pharmacia & UpJohn (now Pfizer)

Moderator: Eric Siegel, Program Chair, Predictive Analytics World

Speaker: John Elder, Ph.D., Elder Research, Inc.

Also see Dr. Elder’s full-day workshop

 

Top of this page ] [ Agenda overview ]


9:55am-10:30am

Break / Exhibits
Room: Albert Suites


10:30am-11:20am
Telecommunications
Case Study: Leading Telecommunications Operator
Predictive Analytics and Efficient Fact-based Marketing

The presentation describes what are the major topics and issues when you introduce predictive analytics and how to build a Fact-Based marketing environment. The introduced tools and methodologies proved to be highly efficient in terms of improving the overall direct marketing activity and customer contact operations for the involved companies. Generally, the introduced approaches have great potential for organizations with large customer bases like Mobile Operators, Internet Giants, Media Companies, or Retail Chains.

Main Introduced Solutions:-Automated Serial Production of Predictive Models for Campaign Targeting-Automated Campaign Measurements and Tracking Solutions-Precise Product Added Value Evaluation.

Speaker: Tamer Keshi, Ph.D., Long-term contractor, T-Mobile

Speaker: Beata Kovacs, International Head of CRM Solutions, Deutsche Telekom

Top of this page ] [ Agenda overview ]


11:20am-11:25am

Session Changeover


11:25am-12:15pm
Thought Leader
Nine Laws of Data Mining

Data mining is the predictive core of predictive analytics, a business process that finds useful patterns in data through the use of business knowledge. The industry standard CRISP-DM methodology describes the process, but does not explain why the process takes the form that it does. I present nine “laws of data mining”, useful maxims for data miners, with explanations that reveal the reasons behind the surface properties of the data mining process. The nine laws have implications for predictive analytics applications: how and why it works so well, which ambitions could succeed, and which must fail.

 

Speaker: Tom Khabaza, khabaza.com

 

Top of this page ] [ Agenda overview ]


12:15pm-1:30pm

Lunch / Exhibits
Room: Albert Suites


1:30pm-2:25pm
Expert Panel: Kaboom! Predictive Analytics Hits the Mainstream

Predictive analytics has taken off, across industry sectors and across applications in marketing, fraud detection, credit scoring and beyond. Where exactly are we in the process of crossing the chasm toward pervasive deployment, and how can we ensure progress keeps up the pace and stays on target?

This expert panel will address:

  • How much of predictive analytics’ potential has been fully realized?
  • Where are the outstanding opportunities with greatest potential?
  • What are the greatest challenges faced by the industry in achieving wide scale adoption?
  • How are these challenges best overcome?

 

Panelist: John Elder, Ph.D., Elder Research, Inc.

Panelist: Colin Shearer, WW Industry Solutions Leader, IBM UK Ltd

Panelist: Udo Sglavo, Global Analytic Solutions Manager, SAS

Panel moderator: Eric Siegel, Ph.D., Program Chair, Predictive Analytics World


2:25pm-2:30pm

Session Changeover


2:30pm-3:20pm
Crowdsourcing Data Mining
Case Study: University of Melbourne, Chessmetrics
Prediction Competitions: Far More Than Just a Bit of Fun

Data modelling competitions allow companies and researchers to post a problem and have it scrutinised by the world’s best data scientists. There are an infinite number of techniques that can be applied to any modelling task but it is impossible to know at the outset which will be most effective. By exposing the problem to a wide audience, competitions are a cost effective way to reach the frontier of what is possible from a given dataset. The power of competitions is neatly illustrated by the results of a recent bioinformatics competition hosted by Kaggle. It required participants to pick markers in HIV’s genetic sequence that coincide with changes in the severity of infection. Within a week and a half, the best entry had already outdone the best methods in the scientific literature. This presentation will cover how competitions typically work, some case studies and the types of business modelling challenges that the Kaggle platform can address.

Speaker: Anthony Goldbloom, Kaggle Pty Ltd

Top of this page ] [ Agenda overview ]


3:20pm-3:50pm

Breaks /Exhibits
Room: Albert Suites


3:50pm-4:40pm
Human Resources; e-Commerce
Case Study: Naukri.com, Jeevansathi.com
Increasing Marketing ROI and Efficiency of Candidate-Search with Predictive Analytics

InfoEdge, India’s largest and most profitable online firm with a bouquet of internet properties has been Google’s biggest customer in India. Our team used predictive modeling to double our profits across multiple fronts. For Naukri.com, India’s number 1 job portal, predictive models target jobseekers most relevant to the recruiter. Analytical insights provided a deeper understanding of recruiter behaviour and informed a redesign of this product’s recruiter search functionality. This session will describe how we did it, and also reveal how Jeevansathi.com, India’s 2nd-largest matrimony portal, targets the acquisition of consumers in the market for marriage.

 

Speaker: Suvomoy Sarkar, Chief Analytics Officer, HT Media & Info Edge India (parent company of the two companies above)

 

Top of this page ] [ Agenda overview ]


4:40pm-5:00pm
Closing Remarks

Speaker: Eric Siegel, Ph.D., Program Chair, Predictive Analytics World

Top of this page ] [ Agenda overview ]


Wednesday November 17, 2010

Full-day Workshop
The Best and the Worst of Predictive Analytics:
Predictive Modeling Methods and Common Data Mining Mistakes

Click here for the detailed workshop description

  • Workshop starts at 9:00am
  • First AM Break from 10:00 – 10:15
  • Second AM Break from 11:15 – 11:30
  • Lunch from 12:30 – 1:15pm
  • First PM Break: 2:00 – 2:15
  • Second PM Break: 3:15 – 3:30
  • Workshop ends at 4:30pm

Speaker: John Elder, Ph.D., CEO and Founder, Elder Research, Inc.

 

Interview Dean Abbott Abbott Analytics

Here is an interview with noted Analytics Consultant and trainer Dean Abbott. Dean is scheduled to take a workshop on Predictive Analytics at PAW (Predictive Analytics World Conference)  Oct 18 , 2010 in Washington D.C

Ajay-  Describe your upcoming hands on workshop at Predictive Analytics World and how it can help people learn more predictive modeling.

Refer- http://www.predictiveanalyticsworld.com/dc/2010/handson_predictive_analytics.php

Dean- The hands-on workshop is geared toward individuals who know something about predictive analytics but would like to experience the process. It will help people in two regards. First, by going through the data assessment, preparation, modeling and model assessment stages in one day, the attendees will see how predictive analytics works in reality, including some of the pain associated with false starts and mistakes. At the same time, they will experience success with building reasonable models to solve a problem in a single day. I have found that for many, having to actually build the predictive analytics solution if an eye-opener. Seeing demonstrations show the capabilities of a tool, but greater value for an end-user is the development of intuition of what to do at each each stage of the process that makes the theory of predictive analytics real.

Second, they will gain experience using a top-tier predictive analytics software tool, Enterprise Miner (EM). This is especially helpful for those who are considering purchasing EM, but also for those who have used open source tools and have never experienced the additional power and efficiencies that come with a tool that is well thought out from a business solutions standpoint (as opposed to an algorithm workbench).

Ajay-  You are an instructor with software ranging from SPSS, S Plus, SAS Enterprise Miner, Statistica and CART. What features of each software do you like best and are more suited for application in data cases.

Dean- I’ll add Tibco Spotfire Miner, Polyanalyst and Unica’s Predictive Insight to the list of tools I’ve taught “hands-on” courses around, and there are at least a half dozen more I demonstrate in lecture courses (JMP, Matlab, Wizwhy, R, Ggobi, RapidMiner, Orange, Weka, RandomForests and TreeNet to name a few). The development of software is a fascinating undertaking, and each tools has its own strengths and weaknesses.

I personally gravitate toward tools with data flow / icon interface because I think more that way, and I’ve tired of learning more programming languages.

Since the predictive analytics algorithms are roughly the same (backdrop is backdrop no matter which tool you use), the key differentiators are

(1) how data can be loaded in and how tightly integrated can the tool be with the database,

(2) how well big data can be handled,

(3) how extensive are the data manipulation options,

(4) how flexible are the model reporting options, and

(5) how can you get the models and/or predictions out.

There are vast differences in the tools on these matters, so when I recommend tools for customers, I usually interview them quite extensively to understand better how they use data and how the models will be integrated into their business practice.

A final consideration is related to the efficiency of using the tool: how much automation can one introduce so that user-interaction is minimized once the analytics process has been defined. While I don’t like new programming languages, scripting and programming often helps here, though some tools have a way to run the visual programming data diagram itself without converting it to code.

Ajay- What are your views on the increasing trend of consolidation and mergers and acquisitions in the predictive analytics space. Does this increase the need for vendor neutral analysts and consultants as well as conferences.

Dean- When companies buy a predictive analytics software package, it’s a mixed bag. SPSS purchasing of Clementine was ultimately good for the predictive analytics, though it took several years for SPSS to figure out what they wanted to do with it. Darwin ultimately disappeared after being purchased by Oracle, but the newer Oracle data mining tool, ODM, integrates better with the database than Darwin did or even would have been able to.

The biggest trend and pressure for the commercial vendors is the improvements in the Open Source and GNU tools. These are becoming more viable for enterprise-level customers with big data, though from what I’ve seen, they haven’t caught up with the big commercial players yet. There is great value in bringing both commercial and open source tools to the attention of end-users in the context of solutions (rather than sales) in a conference setting, which is I think an advantage that Predictive Analytics World has.

As a vendor-neutral consultant, flux is always a good thing because I have to be proficient in a variety of tools, and it is the breadth that brings value for customers entering into the predictive analytics space. But it is very difficult to keep up with the rapidly-changing market and that is something I am weighing myself: how many tools should I keep in my active toolbox.

Ajay-  Describe your career and how you came into the Predictive Analytics space. What are your views on various MS Analytics offered by Universities.

Dean- After getting a masters degree in Applied Mathematics, my first job was at a small aerospace engineering company in Charlottesville, VA called Barron Associates, Inc. (BAI); it is still in existence and doing quite well! I was working on optimal guidance algorithms for some developmental missile systems, and statistical learning was a key part of the process, so I but my teeth on pattern recognition techniques there, and frankly, that was the most interesting part of the job. In fact, most of us agreed that this was the most interesting part: John Elder (Elder Research) was the first employee at BAI, and was there at that time. Gerry Montgomery and Paul Hess were there as well and left to form a data mining company called AbTech and are still in analytics space.

After working at BAI, I had short stints at Martin Marietta Corp. and PAR Government Systems were I worked on analytics solutions in DoD, primarily radar and sonar applications. It was while at Elder Research in the 90s that began working in the commercial space more in financial and risk modeling, and then in 1999 I began working as an independent consultant.

One thing I love about this field is that the same techniques can be applied broadly, and therefore I can work on CRM, web analytics, tax and financial risk, credit scoring, survey analysis, and many more application, and cross-fertilize ideas from one domain into other domains.

Regarding MS degrees, let me first write that I am very encouraged that data mining and predictive analytics are being taught in specific class and programs rather than as just an add-on to an advanced statistics or business class. That stated, I have mixed feelings about analytics offerings at Universities.

I find that most provide a good theoretical foundation in the algorithms, but are weak in describing the entire process in a business context. For those building predictive models, the model-building stage nearly always takes much less time than getting the data ready for modeling and reporting results. These are cross-discipline tasks, requiring some understanding of the database world and the business world for us to define the target variable(s) properly and clean up the data so that the predictive analytics algorithms to work well.

The programs that have a practicum of some kind are the most useful, in my opinion. There are some certificate programs out there that have more of a business-oriented framework, and the NC State program builds an internship into the degree itself. These are positive steps in the field that I’m sure will continue as predictive analytics graduates become more in demand.

Biography-

DEAN ABBOTT is President of Abbott Analytics in San Diego, California. Mr. Abbott has over 21 years of experience applying advanced data mining, data preparation, and data visualization methods in real-world data intensive problems, including fraud detection, response modeling, survey analysis, planned giving, predictive toxicology, signal process, and missile guidance. In addition, he has developed and evaluated algorithms for use in commercial data mining and pattern recognition products, including polynomial networks, neural networks, radial basis functions, and clustering algorithms, and has consulted with data mining software companies to provide critiques and assessments of their current features and future enhancements.

Mr. Abbott is a seasoned instructor, having taught a wide range of data mining tutorials and seminars for a decade to audiences of up to 400, including DAMA, KDD, AAAI, and IEEE conferences. He is the instructor of well-regarded data mining courses, explaining concepts in language readily understood by a wide range of audiences, including analytics novices, data analysts, statisticians, and business professionals. Mr. Abbott also has taught both applied and hands-on data mining courses for major software vendors, including Clementine (SPSS, an IBM Company), Affinium Model (Unica Corporation), Statistica (StatSoft, Inc.), S-Plus and Insightful Miner (Insightful Corporation), Enterprise Miner (SAS), Tibco Spitfire Miner (Tibco), and CART (Salford Systems).

More PAWS

Dr Eric Siegel  (interviewed here at https://decisionstats.wordpress.com/2009/07/14/interview_eric-siege/ )

continues his series of excellent analytical conferences-

Oct 19-20 – WASHINGTON DC: PAW Conference & Workshops (pawcon.com/dc)

Oct 28-29 – SAN FRANCISCO: Workshop (businessprediction.com)

Nov 15-16 – LONDON: PAW Conference & Workshop (pawcon.com/london)

March 14-15, 2011 – SAN FRANCISCO: PAW Conference & Workshops

* Register by Sep 30 for PAW London Early-Bird – Save £200
http://pawcon.com/london/register.php

* For the Oct 28-29 workshop, see http://businessprediction.com

———————–

INFORMATION ABOUT THE PAW CONFERENCES:

Predictive Analytics World ( http://pawcon.com ) is the business-focused event for predictive analytics professionals, managers and commercial practitioners, covering today’s commercial deployment of predictive analytics, across industries and across software vendors.

PAW delivers the best case studies, expertise, keynotes, sessions, workshops, exposition, expert panel, live demos, networking coffee breaks, reception, birds-of-a-feather lunches, brand-name enterprise leaders, and industry heavyweights in the business.

Case study presentations cover campaign targeting, churn modeling, next-best-offer, selecting marketing channels, global analytics deployment, email marketing, HR candidate search, and other innovative applications. The Conference agendas cover hot topics such as social data, text mining, search marketing, risk management, uplift (incremental lift) modeling, survey analysis, consumer privacy, sales force optimization and other innovative applications that benefit organizations in new and creative ways.

PAW delivers two rich conference programs in Oct./Nov. with very little content overlap featuring a wealth of speakers with front-line experience. See which one is best for you:

PAW’s DC 2010 (Oct 19-20) program includes over 25 sessions across two tracks – an “All Audiences” and an “Expert/Practitioner” track — so you can witness how predictive analytics is applied at 1-800-FLOWERS, CIBC, Corporate Executive Board, Forrester, LifeLine, Macy’s, MetLife, Miles Kimball, Monster, Paychex, PayPal (eBay), SunTrust, Target, UPMC Health Plan, Xerox, YMCA, and Yahoo!, plus special examples from the U.S. government agencies DoD, DHS, and SSA.

Sign up for event updates in the US http://pawcon.com/signup-us.php
View the agenda at-a-glance: http://pawcon.com/dc/2010/agenda_overview.php
For more: http://pawcon.com/dc
Register: http://pawcon.com/dc/register.php

PAW London 2010 (Nov 15-16) will feature over 20 speakers from 10 countries with case studies from leading enterprises in e-commerce, finance, healthcare, retail, and telecom such as Canadian Automobile Association, Chessmetrics, e-Dialog, Hamburger Sparkasse, Jeevansathi.com (India’s 2nd-largest matrimony portal), Life Line Screening, Lloyds TSB, Naukri.com (India’s number 1 job portal), Overtoom, SABMiller, Univ. of Melbourne, and US Bank, plus special examples from Anheuser-Busch, Disney, HP, HSBC, Pfizer, U.S. SSA, WestWind Foundation and others.

Sign up for event updates in the UK http://pawcon.com/signup-uk.php
View the agenda at-a-glance: http://pawcon.com/london/2010/agenda_overview.php
For more: http://pawcon.com/london
Register: http://pawcon.com/london/register.php

——————————-

PAW San Francisco Save-the-Date and Call-for-Speakers:

March 14-15, 2011
San Francisco Marriott Marquis
San Francisco, CA

PAW call-for-speakers information and submission form: (Due Oct 8)
http://www.predictiveanalyticsworld.com/submit.php

If you wish to receive periodic call-for-speakers notifications regarding Predictive Analytics World, email chair@predictiveanalyticsworld.com with the subject line “call-for-speakers notifications”.

Predictive Analytics World
http://www.predictiveanalyticsworld.com
Washington DC – London – San Francisco