KDnuggets Survey on R


From http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

A new poll/survey on actual usage of R in Data Mining

R has been steadily growing in popularity among data miners and analytic professionals.

In the KDnuggets 2010 Data Mining / Analytic Tools Poll, R was used by 30% of respondents.
In the 2010 Rexer Analytics Data Miner Survey, R was the most popular tool, used by 43% of the data miners.

Another aspect of a tool's usefulness is how much it helps with the entire data mining process: data preparation and cleaning, modeling, evaluation, visualization and presentation (excluding deployment).

The new KDnuggets poll asks:
What part of your analytics / data mining work in the past 12 months was done in R?

http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

 

Heritage Health Prize: Data Mining Contest for 3 Million USD


If the Netflix Prize was about 1 million USD for better online video choices, here is a chance to earn serious money, write great code, and save lives!

From http://www.heritagehealthprize.com/

Heritage Health Prize
Launching April 4


More than 71 million individuals in the United States are admitted to
hospitals each year, according to the latest survey from the American
Hospital Association. Studies have concluded that in 2006 well over
$30 billion was spent on unnecessary hospital admissions. Each of
these unnecessary admissions took away one hospital bed from someone
else who needed it more.

Prize Goal & Participation

The goal of the prize is to develop a predictive algorithm that can identify patients who will be admitted to the hospital within the next year, using historical claims data.

Official registration will open in 2011, after the launch of the prize. At that time, pre-registered teams will be notified to officially register for the competition. Teams must consent to be bound by final competition rules.

Registered teams will develop and test their algorithms. The winning algorithm will be able to predict patients at risk for an unplanned hospital admission with a high rate of accuracy. The first team to reach the accuracy threshold will have their algorithms confirmed by a judging panel. If confirmed, a winner will be declared.

The competition is expected to run for approximately two years. Registration will be open throughout the competition.

Data Sets

Registered teams will be granted access to two separate datasets of de-identified patient claims data for developing and testing algorithms: a training dataset and a quiz/test dataset. The datasets will be comprised of de-identified patient data. The datasets will include:

  • Outpatient encounter data
  • Hospitalization encounter data
  • Medication dispensing claims data, including medications
  • Outpatient laboratory data, including test outcome values

The data for each de-identified patient will be organized into two sections: “Historical Data” and “Admission Data.” Historical Data will represent three years of past claims data. This section of the dataset will be used to predict if that patient is going to be admitted during the Admission Data period. Admission Data represents previous claims data and will contain whether or not a hospital admission occurred for that patient; it will be a binary flag.

The training dataset includes several thousand anonymized patients and will be made available, securely and in full, to any registered team for the purpose of developing effective screening algorithms.

The quiz/test dataset is a smaller set of anonymized patients. Teams will only receive the Historical Data section of these datasets and the two datasets will be mixed together so that teams will not be aware of which de-identified patients are in which set. Teams will make predictions based on these data sets and submit their predictions to HPN through the official Heritage Health Prize web site. HPN will use the Quiz Dataset for the initial assessment of the Team’s algorithms. HPN will evaluate and report back scores to the teams through the prize website’s leader board.

Scores from the final Test Dataset will not be made available to teams until the accuracy thresholds are passed. The test dataset will be used in the final judging and results will be kept hidden. These scores are used to preserve the integrity of scoring and to help validate the predictive algorithms.

Teams can begin developing and testing their algorithms as soon as they are registered and ready. Teams will log onto the official Heritage Health Prize website and submit their predictions online. Comparisons will be run automatically and team accuracy scores will be posted on the leader board. This score will be based on only a portion of the predictions submitted (the Quiz Dataset); the additional results will be kept back (the Test Dataset).
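The quiz/test split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contest text does not say which metric is used, so plain accuracy on the binary admission flag is assumed here, and the patient IDs, the function names `accuracy` and `score_submission`, and the toy data are all hypothetical.

```python
from typing import Dict, Set


def accuracy(predictions: Dict[str, int], truth: Dict[str, int],
             patient_ids: Set[str]) -> float:
    """Fraction of the given patients whose predicted flag matches the true flag."""
    if not patient_ids:
        raise ValueError("no patients to score")
    correct = sum(1 for pid in patient_ids if predictions[pid] == truth[pid])
    return correct / len(patient_ids)


def score_submission(predictions: Dict[str, int], truth: Dict[str, int],
                     quiz_ids: Set[str]) -> Dict[str, float]:
    """Score one submission: the quiz part is published, the test part is held back."""
    test_ids = set(truth) - quiz_ids  # everything not in the quiz set is the test set
    return {
        "leaderboard": accuracy(predictions, truth, quiz_ids),
        "held_back": accuracy(predictions, truth, test_ids),
    }


# Hypothetical toy data: 1 = admitted, 0 = not admitted.
truth = {"p1": 1, "p2": 0, "p3": 1, "p4": 0}
quiz_ids = {"p1", "p2"}  # known only to the organizer, not to the teams
predictions = {"p1": 1, "p2": 1, "p3": 1, "p4": 0}

scores = score_submission(predictions, truth, quiz_ids)
print("Leaderboard (quiz) score:", scores["leaderboard"])
```

Because teams never learn which patients are in the quiz set, they cannot tune their predictions to the leaderboard alone; the held-back score is what the final judging would rely on.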


Once a team successfully scores above the accuracy thresholds on the online testing (quiz dataset), final judging will occur. There will be three parts to this judging. First, the judges will confirm that the potential winning team’s algorithm accurately predicts patient admissions in the Test Dataset (again, above the thresholds for accuracy).

Next, the judging panel will confirm that the algorithm does not identify patients or use external data sources to derive its predictions. Lastly, the panel will confirm that the team’s algorithm is authentic and derives its predictive power from the datasets, not from hand-coding results to improve scores. If the algorithm meets these three criteria, it will be declared the winner.

Failure to meet any one of these three parts will disqualify the team and the contest will continue. The judges reserve the right to award second and third place prizes if deemed applicable.

 

HIGHLIGHTS from REXER Survey: R gives best satisfaction


A Summary report from Rexer Analytics Annual Survey

 

HIGHLIGHTS from the 4th Annual Data Miner Survey (2010):

 

•   FIELDS & GOALS: Data miners work in a diverse set of fields.  CRM / Marketing has been the #1 field in each of the past four years.  Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals are also the goals identified by the most data miners surveyed.

 

•   ALGORITHMS: Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners.  However, a wide variety of algorithms are being used.  This year, for the first time, the survey asked about Ensemble Models, and 22% of data miners report using them.
A third of data miners currently use text mining and another third plan to in the future.

 

•   MODELS: About one-third of data miners typically build final models with 10 or fewer variables, while about 28% generally construct models with more than 45 variables.

 

•   TOOLS: After a steady rise across the past few years, the open source data mining software R overtook other tools to become the tool used by more data miners (43%) than any other.  STATISTICA, which has also been climbing in the rankings, is selected as the primary data mining tool by the most data miners (18%).  Data miners report using an average of 4.6 software tools overall.  STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.

 

•   TECHNOLOGY: Data Mining most often occurs on a desktop or laptop computer, and frequently the data is stored locally.  Model scoring typically happens using the same software used to develop models.  STATISTICA users are more likely than other tool users to deploy models using PMML.

 

•   CHALLENGES: As in previous years, dirty data, explaining data mining to others, and difficult access to data are the top challenges data miners face.  This year data miners also shared best practices for overcoming these challenges.  The best practices are available online.

 

•   FUTURE: Data miners are optimistic about continued growth in the number of projects they will be conducting, and growth in data mining adoption is the number one “future trend” identified.  There is room to improve:  only 13% of data miners rate their company’s analytic capabilities as “excellent” and only 8% rate their data quality as “very strong”.

 

Please contact us if you have any questions about the attached report or this annual research program.  The 5th Annual Data Miner Survey will be launching next month.  We will email you an invitation to participate.

 

Information about Rexer Analytics is available at www.RexerAnalytics.com. Rexer Analytics continues its impressive journey; see http://www.rexeranalytics.com/Clients.html

My only thought: since most data miners use multiple tools, including free tools as well as paid software, perhaps a pie chart of market share by revenue and volume would be handy.

Some ideas on comparing diverse data mining projects by data size or complexity would also help.

 

Assumptions on Guns


While sitting in Delhi, India, I sometimes notice that there is one big newsworthy gun-related incident in the United States every six months (the latest being the Gabrielle Giffords incident), and then there is the mythical NRA (which seems just as powerful as the equally mythical Jewish American or Cuban American lobbies). I once trained to fire guns (.22 and SLR rifles, actually), I come from a gun-friendly culture (namely Punjabi, North Indian), and my dad sometimes carried a gun as a police officer during his 30-plus years of service, yet I don't really like guns (except when they are in a movie). My 3-year-old son likes guns a lot (for some peculiar genetic reason, even though we are careful not to show him any violent TV or movies at all).

So to settle the whole guns-are-good, guns-are-bad thing, I turned to the one resource: the Internet.

Here are some findings:

1) A lot of hard statistical data on guns is biased by the perspective of the writer. It reminds me of the old saying: lies, damned lies, and statistics.

2) There is not much hard data in the form of universal research that can be quoted. Unlike, say, the finding that cigarettes cause lung cancer, there is no broad research that is definitive in this regard.

3) American, European and Asian attitudes on guns actually seem to be a function of historical availability, historic crime rates and cultural propensity for guns.

Switzerland and the United States are two extreme outlier examples in the statistics on whether guns cause violence.

4) A lot of old and outdated data is quoted selectively.

It seems you can fudge data about guns in the following ways:

1) Use relative per capita numbers vis-à-vis aggregate numbers

2) Compare and contrast gun numbers with crime numbers selectively

3) Remove the drill-down by type of firearm, such as handguns, rifles, automatic and semi-automatic
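The first of these ways is easy to see with a little arithmetic. The sketch below converts an aggregate count into a rate per 100,000 people, the same unit used by the FBI murder rate table further down. It plugs in two figures quoted later in this post: the 9,369 firearm murders listed for the United States and the population of 307 million. Those two numbers come from different years (a 2002 UN survey and a 2009 population figure), so the result is only a rough illustration, and the function name `per_100k` is made up for this example.

```python
def per_100k(count: int, population: int) -> float:
    """Convert an aggregate count into a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# Figures quoted in this post; they come from different years, so treat
# the result as a rough illustration rather than an official rate.
US_FIREARM_MURDERS = 9_369     # NationMaster table, UN survey (2002)
US_POPULATION = 307_000_000    # justfacts.com, as of 2009

us_rate = per_100k(US_FIREARM_MURDERS, US_POPULATION)
print(f"Aggregate: {US_FIREARM_MURDERS:,} | per 100,000 people: {us_rate:.2f}")
```

The aggregate figure puts the United States near the top of the ranking, while the same count works out to roughly 3 per 100,000 people. Which of the two numbers a writer quotes can change the impression completely.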

Maybe I am being simplistic, but I found it easier to list credible data sources on guns than to summarize all assumptions on guns. Are guns good or bad? I don't know; it depends. Any research you can quote is welcome.

Data Sources on Guns and Firearms and Crime:

1) http://www.justfacts.com/guncontrol.asp

Ownership

* As of 2009, the United States has a population of 307 million people.[5]

* Based on production data from firearm manufacturers,[6] there are roughly 300 million firearms owned by civilians in the United States as of 2010. Of these, about 100 million are handguns.[7]

* Based upon surveys, the following are estimates of private firearm ownership in the U.S. as of 2010:

             Households With a Gun   Adults Owning a Gun   Adults Owning a Handgun
Percentage   40-45%                  30-34%                17-19%
Number       47-53 million           70-80 million         40-45 million

[8]

* A 2005 nationwide Gallup poll of 1,012 adults found the following levels of firearm ownership:

Category Percentage Owning a Firearm

Households 42%
Individuals 30%
Male 47%
Female 13%
White 33%
Nonwhite 18%
Republican 41%
Independent 27%
Democrat 23%

[9]

* In the same poll, gun owners stated they own firearms for the following reasons:

Protection Against Crime 67%
Target Shooting 66%
Hunting 41%

2) NationMaster.com

http://www.nationmaster.com/graph/cri_mur_wit_fir-crime-murders-with-firearms


Showing latest available data.

Rank Countries Amount
# 1 South Africa: 31,918
# 2 Colombia: 21,898
# 3 Thailand: 20,032
# 4 United States: 9,369
# 5 Philippines: 7,708
# 6 Mexico: 2,606
# 7 Slovakia: 2,356
# 8 El Salvador: 1,441
# 9 Zimbabwe: 598
# 10 Peru: 442
# 11 Germany: 269
# 12 Czech Republic: 181
# 13 Ukraine: 173
# 14 Canada: 144
# 15 Albania: 135
# 16 Costa Rica: 131
# 17 Azerbaijan: 120
# 18 Poland: 111
# 19 Uruguay: 109
# 20 Spain: 97
# 21 Portugal: 90
# 22 Croatia: 76
# 23 Switzerland: 68
# 24 Bulgaria: 63
# 25 Australia: 59
# 26 Sweden: 58
# 27 Bolivia: 52
# 28 Japan: 47
# 29 Slovenia: 39
= 30 Hungary: 38
= 30 Belarus: 38
# 32 Latvia: 28
# 33 Burma: 27
# 34 Macedonia, The Former Yugoslav Republic of: 26
# 35 Austria: 25
# 36 Estonia: 21
# 37 Moldova: 20
# 38 Lithuania: 16
= 39 United Kingdom: 14
= 39 Denmark: 14
# 41 Ireland: 12
# 42 New Zealand: 10
# 43 Chile: 9
# 44 Cyprus: 4
# 45 Morocco: 1
= 46 Iceland: 0
= 46 Luxembourg: 0
= 46 Oman: 0
Total: 100,693
Weighted average: 2,097.8
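
The "weighted average" at the bottom of the table looks like a plain mean across the listed countries rather than anything weighted by population. A quick check, using only the amounts copied from the table above (the variable names are made up for this example):

```python
# Amounts copied from the table above, in rank order (48 entries, ties included).
amounts = [
    31918, 21898, 20032, 9369, 7708, 2606, 2356, 1441, 598, 442,
    269, 181, 173, 144, 135, 131, 120, 111, 109, 97,
    90, 76, 68, 63, 59, 58, 52, 47, 39, 38,
    38, 28, 27, 26, 25, 21, 20, 16, 14, 14,
    12, 10, 9, 4, 1, 0, 0, 0,
]

total = sum(amounts)
simple_mean = total / len(amounts)  # unweighted: every country counts equally

print(f"Countries: {len(amounts)}")
print(f"Total: {total:,}")
print(f"Simple mean: {simple_mean:,.1f}")
```

The sum reproduces the listed total of 100,693, and the simple mean over the 48 entries rounds to 2,097.8, so the figure does not appear to account for population at all.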

DEFINITION: Total recorded intentional homicides committed with a firearm. Crime statistics are often better indicators of prevalence of law enforcement and willingness to report crime, than actual prevalence.

SOURCE: The Eighth United Nations Survey on Crime Trends and the Operations of Criminal Justice Systems (2002) (United Nations Office on Drugs and Crime, Centre for International Crime Prevention)

3) Bureau of Justice Statistics

See

http://bjs.ojp.usdoj.gov/dataonline/Search/Homicide/State/RunHomTrendsInOneVar.cfm

or the brand new website (with data till 2009), on which I cannot get gun crime figures but can get totals

http://www.ucrdatatool.gov/

Estimated  murder rate *
Year United States-Total

1960 5.1
1961 4.8
1962 4.6
1963 4.6
1964 4.9
1965 5.1
1966 5.6
1967 6.2
1968 6.9
1969 7.3
1970 7.9
1971 8.6
1972 9.0
1973 9.4
1974 9.8
1975 9.6
1976 8.7
1977 8.8
1978 9.0
1979 9.8
1980 10.2
1981 9.8
1982 9.1
1983 8.3
1984 7.9
1985 8.0
1986 8.6
1987 8.3
1988 8.5
1989 8.7
1990 9.4
1991 9.8
1992 9.3
1993 9.5
1994 9.0
1995 8.2
1996 7.4
1997 6.8
1998 6.3
1999 5.7
2000 5.5
2001 5.6
2002 5.6
2003 5.7
2004 5.5
2005 5.6
2006 5.7
2007 5.6
2008 5.4
2009 5.0
Notes: National or state offense totals are based on data from all reporting agencies and estimates for unreported areas.
* Rates are the number of reported offenses per 100,000 population
  • United States-Total –
    • The 168 murder and nonnegligent homicides that occurred as a result of the bombing of the Alfred P. Murrah Federal Building in Oklahoma City in 1995 are included in the national estimate.
    • The 2,823 murder and nonnegligent homicides that occurred as a result of the events of September 11, 2001, are not included in the national estimates.

     

  • Sources: 


    FBI, Uniform Crime Reports as prepared by the National Archive of Criminal Justice Data


    4) The United Nations statistics of 2002 were too old in my opinion.
    Wikipedia seems too broad-based to qualify as a research article but is easily accessible: http://en.wikipedia.org/wiki/Gun_violence_in_the_United_States
    To actually buy a gun or see guns available for purchase in the United States, see
    http://www.usautoweapons.com/

    PAW Videos

    A message from Predictive Analytics World on newly available videos. It has many free videos as well, so you can check them out.

    Predictive Analytics World March 2011 in San Francisco

    Access PAW DC Session Videos Now

    Predictive Analytics World is pleased to announce on-demand access to the videos of PAW Washington DC, October 2010, including over 30 sessions and keynotes that you may view at your convenience. Access this leading predictive analytics content online now:

    View the PAW DC session videos online

    Register by January 18th and receive $150 off the full 2-day conference program videos (enter code PAW150 at checkout)

    Trial videos – view the following for no charge:

    Select individual conference sessions, or recognize savings by registering for access to one or two full days of sessions. These on-demand videos deliver PAW DC right to your desk, covering hot topics and advanced methods such as:

    Social data 

    Text mining

    Search marketing

    Risk management

    Survey analysis

    Consumer privacy

    Sales force optimization

    Response & cross-sell

    Recommender systems

    Featuring experts such as:
    Usama Fayyad, Ph.D.
    CEO, Open Insights; Former Chief Data Officer, Yahoo!

    Andrew Pole
    Sr Mgr, Media/DB Mktng
    Target
    View Keynote for Free

    John F. Elder, Ph.D.
    CEO and Founder
    Elder Research

    Bruno Aziza
    Director, Worldwide Strategy Lead, BI
    Microsoft

    Eric Siegel, Ph.D.
    Conference Chair
    Predictive Analytics World

    PAW DC videos feature over 25 speakers with case studies from leading enterprises such as: CIBC, CEB, Forrester, Macy’s, MetLife, Microsoft, Miles Kimball, Monster.com, Oracle, Paychex, SunTrust, Target, UPMC, Xerox, Yahoo!, YMCA, and more.

    How video access works:

    View Slides on the Left See & Hear Speaker in the Right Window

    Sign up by January 18 for immediate video access and $150 discount


    San Francisco
    March 14-15, 2011
    Washington DC
    October, 2011
    London
    November, 2011

     

    Session Gallery: Day 1 of 2

    Viewing (17) Sessions of (31)

     

    Keynote: Five Ways Predictive Analytics Cuts Enterprise Risk  

    Eric Siegel, Ph.D., Program Chair, Predictive Analytics World

    All business is an exercise in risk management. All organizations would benefit from measuring, tracking and computing risk as a core process, much like insurance companies do.

    Predictive analytics does the trick, one customer at a time. This technology is a data-driven means to compute the risk each customer will defect, not respond to an expensive mailer, consume a retention discount even if she were not going to leave in the first place, not be targeted for a telephone solicitation that would have landed a sale, commit fraud, or become a “loss customer” such as a bad debtor or an insurance policy-holder with high claims.

    In this keynote session, Dr. Eric Siegel reveals:

    – Five ways predictive analytics evolves your enterprise to reduce risk

    – Hidden sources of risk across operational functions

    – What every business should learn from insurance companies

    – How advancements have reversed the very meaning of fraud

    – Why “man + machine” teams are greater than the sum of their parts for enterprise decision support

    Length – 00:45:57

    Price: $195

     

     

    Platinum Sponsor Presentation: Analytics – The Beauty of Diversity 

    Anne H. Milley, Senior Director of Analytic Strategy, Worldwide Product Marketing, SAS

    Analytics contributes to, and draws from, multiple disciplines. The unifying theme of “making the world a better place” is bred from diversity. For instance, the same methods used in econometrics might be used in market research, psychometrics and other disciplines. In a similar way, diverse paradigms are needed to best solve problems, reveal opportunities and make better decisions. This is why we evolve capabilities to formulate and solve a wide range of problems through multiple integrated languages and interfaces. Extending that, we have provided integration with other languages so that users can draw on the disciplines and paradigms needed to best practice their craft.

    Length – 20:11

    Free viewing enabled – no charge

     

    Gold Sponsor Presentation: Predictive Analytics Accelerate Insight for Financial Services 

    Finbarr Deely, Director of Business Development, ParAccel

    Financial services organizations face immense hurdles in maintaining profitability and building competitive advantage. Financial services organizations must perform “what-if” scenario analysis, identify risks, and detect fraud patterns. The advanced analytic complexity required often makes such analysis slow and painful, if not impossible. This presentation outlines the analytic challenges facing these organizations and provides a clear path to providing the accelerated insight needed to perform in today’s complex business environment to reduce risk, stop fraud and increase profits.

    – The value of predictive analytics in Accelerating Insight

    – Financial Services Analytic Case Studies

    – Brief Overview of ParAccel Analytic Database

    Length – 09:06

    Free viewing enabled – no charge

     

    TOPIC: BUSINESS VALUE
    Case Study: Monster.com
    Creating Global Competitive Power with Predictive Analytics 

    Jean Paul Isson, Vice President, Global BI & Predictive Analytics, Monster Worldwide

    Using Predictive analytics to gain a deeper understanding of customer behaviours, increase marketing ROI and drive growth

    – Creating global competitive power with business intelligence: Making the right decisions – at the right time

    – Avoiding common change management challenges in sales, marketing, customer service, and products

    – Developing a BI vision – and implementing it: successful business intelligence implementation models

    – Using predictive analytics as a business driver to stay on top of the competition

    – Following the Monster Worldwide global BI evolution: How Monster used BI to go from good to great

    Length – 51:17

    Price: $195

     

     

    TOPIC: SURVEY ANALYSIS
    Case Study: YMCA
    Turning Member Satisfaction Surveys into an Actionable Narrative 

    Dean Abbott, President, Abbott Analytics

    Employees are a key constituency at the Y and previous analysis has shown that their attitudes have a direct bearing on Member Satisfaction. This session will describe a successful approach for the analysis of YMCA employee surveys. Decision trees are built and examined in depth to identify key questions in describing key employee satisfaction metrics, including several interesting groupings of employee attitudes. Our approach will be contrasted with other factor analysis and regression-based approaches to survey analysis that we used initially. The predictive models described are currently in use and resulted in both greater understanding of employee attitudes, and a revised “short-form” survey with fewer key questions identified by the decision trees as the most important predictors.

    Length – 50:19

    Price: $195

     

     

    TOPIC: INDUSTRY TRENDS
    2010 Data Miner Survey Results: Highlights
     

    Karl Rexer, Ph.D., Rexer Analytics

    Do you want to know the views, actions, and opinions of the data mining community? Each year, Rexer Analytics conducts a global survey of data miners to find out. This year at PAW we unveil the results of our 4th Annual Data Miner Survey. This session will present the research highlights, such as:

    – Analytic goals & key challenges

    – Impact of the economy

    – Regional differences

    – Text mining trends

    Length – 15:20

    Price: $195

     

     

    Multiple Case Studies: U.S. DoD, U.S. DHS, SSA
    Text Mining: Lessons Learned 

    John F. Elder, Chief Scientist, Elder Research, Inc.

    Text Mining is the “Wild West” of data mining and predictive analytics – the potential for gain is huge, the capability claims are often tall tales, and the “land rush” for leadership is very much a race.

    In solving unstructured (text) analysis challenges, we found that principles from inductive modeling – learning relationships from labeled cases – have great power to enhance text mining. Dr. Elder highlights key technical breakthroughs discovered while working on projects for leading government agencies, including:

    – Prioritizing searches for the Dept. of Homeland Security

    – Quick decisions for Social Security Admin. disability

    – Document discovery for the Dept. of Defense

    – Disease discovery for the Dept. of Homeland Security

    – Risk profiling for the Dept. of Defense

    Length – 48:58

    Price: $195

     

     

    Keynote: How Target Gets the Most out of Its Guest Data to Improve Marketing ROI 

    Andrew Pole, Senior Manager, Media and Database Marketing, Target

    In this session, you’ll learn how Target leverages its own internal guest data to optimize its direct marketing – with the ultimate goal of enhancing our guests’ shopping experience and driving in-store and online performance. You will hear about what guest data is available at Target, how and where we collect it, and how it is used to improve the performance and relevance of direct marketing vehicles. Furthermore, we will discuss Target’s development and usage of guest segmentation, response modeling, and optimization as means to suppress poor performers from mailings, determine relevant product categories and services for online targeted content, and optimally assign receipt marketing offers to our guests when offer quantities are limited.

    Length – 47:49

    Free viewing enabled – no charge

     

    Platinum Sponsor Presentation: Driving Analytics Into Decision Making  

    Jason Verlen, Director, SPSS Product Strategy & Management, IBM Software Group

    Organizations looking to dramatically improve their business outcomes are turning to decision management, a convergence of technology and business processes that is used to streamline and predict the outcome of daily decision-making. IBM SPSS Decision Management technology provides the critical link between analytical insight and recommended actions. In this session you’ll learn how Decision Management software integrates analytics with business rules and business applications for front-line systems such as call center applications, insurance claim processing, and websites. See how you can improve every customer interaction, minimize operational risk, reduce fraud and optimize results.

    Length – 17:29

    Free viewing enabled – no charge

     

    TOPIC: DATA INFRASTRUCTURE AND INTEGRATION
    Case Study: Macy’s
    The world is not flat (even though modeling software has to think it is) 

    Paul Coleman, Director of Marketing Statistics, Macy’s Inc.

    Software for statistical modeling generally uses flat files, where each record represents a unique case with all its variables. In contrast, most large databases are relational, where data are distributed among various normalized tables for efficient storage. Variable creation and model scoring engines are necessary to bridge data mining and storage needs. Development datasets taken from a sampled history require snapshot management. Scoring datasets are taken from the present timeframe and the entire available universe. Organizations with significant data must decide when to store or calculate necessary data and understand the consequences for their modeling program.

    Length – 34:54

    Price: $195

     

     

    TOPIC: CUSTOMER VALUE
    Case Study: SunTrust
    When One Model Will Not Solve the Problem – Using Multiple Models to Create One Solution 

    Dudley Gwaltney, Group Vice President, Analytical Modeling, SunTrust Bank

    In 2007, SunTrust Bank developed a series of models to identify clients likely to have large changes in deposit balances. The models include three basic binary and two linear regression models.

    Based on the models, 15% of SunTrust clients were targeted as those most likely to have large balance changes. These clients accounted for 65% of the absolute balance change and 60% of the large balance change clients. The targeted clients are grouped into a portfolio and assigned to individual SunTrust Retail Branch. Since 2008, the portfolio generated a 2.6% increase in balances over control.

    Using the SunTrust example, this presentation will focus on:

    – Identifying situations requiring multiple models

    – Determining what types of models are needed

    – Combining the individual component models into one output

    Length – 48:22

    Price: $195

     

     

    TOPIC: RESPONSE & CROSS-SELL
    Case Study: Paychex
    Staying One Step Ahead of the Competition – Development of a Predictive 401(k) Marketing and Sales Campaign 

    Jason Fox, Information Systems and Portfolio Manager, Paychex

    In-depth case study of Paychex, Inc. utilizing predictive modeling to turn the tides on competitive pressures within their own client base. Paychex, a leading provider of payroll and human resource solutions, will guide you through the development of a Predictive 401(k) Marketing and Sales model. Through the use of sophisticated data mining techniques and regression analysis the model derives the probability a client will add retirement services products with Paychex or with a competitor. Session will include roadblocks that could have ended development and ROI analysis. Speaker: Frank Fiorille, Director of Enterprise Risk Management, Paychex Speaker: Jason Fox, Risk Management Analyst, Paychex

    Length – 26:29

    Price: $195

     

     

    TOPIC: SEGMENTATION
    Practitioner: Canadian Imperial Bank of Commerce
    Segmentation Do’s and Don’ts 

    Daymond Ling, Senior Director, Modelling & Analytics, Canadian Imperial Bank of Commerce

    The concept of Segmentation is well accepted in business and has withstood the test of time. Even with the advent of new artificial intelligence and machine learning methods, this old war horse still has its place and is alive and well. Like all analytical methods, when used correctly it can lead to enhanced market positioning and competitive advantage, while improper application can have severe negative consequences.

    This session will explore the elements of success and the worst practices that lead to failure. The relationship between segmentation and predictive modeling will also be discussed to clarify when it is appropriate to use one versus the other, and how to use them together synergistically.

    Length – 45:57

    Price: $195

     

     

    TOPIC: SOCIAL DATA
    Thought Leadership
    Social Network Analysis: Killer Application for Cloud Analytics
     

    James Kobielus, Senior Analyst, Forrester Research

    Social networks such as Twitter and Facebook are a potential goldmine of insights on what is truly going through customers' minds. Every company wants to know whether, how, how often, and by whom they're being mentioned across the billowing new cloud of social media. Just as important, every company wants to influence those discussions in their favor, target new business, and harvest maximum revenue potential. In this session, Forrester analyst James Kobielus identifies fruitful applications of social network analysis in customer service, sales, marketing, and brand management. He presents a roadmap for enterprises to build on their inline analytics initiatives and leverage high-performance data warehousing (DW) clouds and appliances in order to analyze shifting patterns of customer sentiment, influence, and propensity. Drawing on Forrester’s ongoing research in advanced analytics and customer relationship management, Kobielus will discuss industry trends, commercial modeling tools, and emerging best practices in social network analysis, which represents a game-changing new discipline in predictive analytics.

    Length – 48:16

    Price: $195

     

     

    TOPIC: HEALTHCARE – INTERNATIONAL TARGETING
    Case Study: Life Line Screening
    Taking CRM Global Through Predictive Analytics 

    Ozgur Dogan,
    VP, Quantitative Solutions Group, Merkle Inc

    Trish Mathe,
    Director of Database Marketing, Life Line Screening

    While Life Line is successfully executing a US CRM roadmap, they are also beginning the same evolution abroad, starting in the UK, where Merkle procured data and built a response model that is pulling responses over 30% higher than competitors. This presentation will give an overview of the US CRM roadmap and then focus on the beginning of their strategy abroad: the data procurement they could not get anywhere but through Merkle, and the successful modeling and analytics for the UK.

    Length – 40:12

    Price: $195

     

     

    TOPIC: SURVEY ANALYSIS
    Case Study: Forrester
    Making Survey Insights Addressable and Scalable – The Case Study of Forrester’s Technographics Benchmark Survey 

    Nethra Sambamoorthi, Team Leader, Consumer Dynamics & Analytics, Global Consulting, Acxiom Corporation

    Marketers use surveys to create enterprise-wide strategic insights to: (1) develop segmentation schemes, (2) summarize consumer behaviors and attitudes for the whole US population, and (3) use multiple surveys to draw unified views about their target audience. However, these insights are not directly addressable and scalable to the whole consumer universe, which is very important when applying the power of survey intelligence to the one-to-one consumer marketing problems marketers routinely face. Acxiom partnered with Forrester Research to create addressable and scalable applications of Forrester’s Technographics Survey and applied them successfully to a number of industries and applications.

    Length – 39:23

    Price: $195

     

     

    TOPIC: HEALTHCARE
    Case Study: UPMC Health Plan
    A Predictive Model for Hospital Readmissions 

    Scott Zasadil, Senior Scientist, UPMC Health Plan

    Hospital readmissions are a significant component of our nation’s healthcare costs. Predicting who is likely to be readmitted is a challenging problem. Using a set of 123,951 hospital discharges spanning nearly three years, we developed a model that predicts an individual’s 30-day readmission should they incur a hospital admission. The model uses an ensemble of boosted decision trees and prior medical claims and captures 64% of all 30-day readmits with a true positive rate of over 27%. Moreover, many of the ‘false’ positives are simply delayed true positives. 53% of the predicted 30-day readmissions are readmitted within 180 days.

    Length – 54:18

    Price: $195

    PAWCON Bay Area March

    The biggest Predictive Analytics Conference comes back to the SF Bay in March next year.

    From

    http://www.predictiveanalyticsworld.com/sanfrancisco/2011/

    Predictive Analytics World March 2011 in San Francisco is packed with the top predictive analytics experts, practitioners, authors and business thought leaders, including keynote speakers:


    Sugato Basu, Ph.D.
    Senior Research Scientist
    Google
    Lessons Learned in Predictive Modeling for Ad Targeting

    Eric Siegel, Ph.D.
    Conference Chair
    Predictive Analytics World
    Five Ways Predictive Analytics Cuts Enterprise Risk




    Plus special plenary sessions from industry heavy-weights:


    Andreas S. Weigend, Ph.D.
    weigend.com
    Former Chief Scientist, Amazon.com
    The State of the Social Data Revolution

    John F. Elder, Ph.D.
    CEO and Founder
    Elder Research
    Data Mining Lessons Learned




    Predictive Analytics World focuses on concrete examples of deployed predictive analytics. Hear from the horse’s mouth precisely how Fortune 500 analytics competitors and other top practitioners deploy predictive modeling, and what kind of business impact it delivers.

    PAW SF 2011 will feature speakers with case studies from leading enterprises.

    PAW’s March agenda covers hot topics and advanced methods such as uplift (net lift) modeling, ensemble models, social data, search marketing, crowdsourcing, blackbox trading, fraud detection, risk management, survey analysis and other innovative applications that benefit organizations in new and creative ways.

    Join PAW and access the best keynotes, sessions, workshops, exposition, expert panel, live demos, networking coffee breaks, reception, birds-of-a-feather lunches, brand-name enterprise leaders, and industry heavyweights in the business.

     

    Predictive Analytics World March 2011 SF

    Message from PAWCON-

     

    Predictive Analytics World, Mar 14-15 2011, San Francisco, CA

    More info: pawcon.com/sanfrancisco

    Agenda at-a-glance: pawcon.com/sanfrancisco/2011/agenda_overview.php

    PAW’s San Francisco 2011 program is the richest and most diverse yet, including over 30 sessions across two tracks, an “All Audiences” track and an “Expert/Practitioner” track, so you can witness how predictive analytics is applied at Bank of America, Bank of the West, Best Buy, CA State Automobile Association, Cerebellum Capital, Chessmetrics, Fidelity, Gaia Interactive, GE Capital, Google, HealthMedia, Hewlett Packard, ICICI Bank (India), MetLife, Monster.com, Orbitz, PayPal/eBay, Richmond, VA Police Dept, U. of Melbourne, Yahoo!, YMCA, and a major N. American telecom, plus insights from projects for Anheuser-Busch, the SSA, and Netflix.

    PAW’s agenda covers hot topics and advanced methods such as uplift modeling (net lift), ensemble models, social data (6 sessions on this), search marketing, crowdsourcing, blackbox trading, fraud detection, risk management, survey analysis, and other innovative applications that benefit organizations in new and creative ways.

    Predictive Analytics World is the only conference of its kind, delivering vendor-neutral sessions across verticals such as banking, financial services, e-commerce, education, government, healthcare, high technology, insurance, non-profits, publishing, social gaming, retail and telecommunications.

    And PAW covers the gamut of commercial applications of predictive analytics, including response modeling, customer retention with churn modeling, product recommendations, fraud detection, online marketing optimization, human resource decision-making, law enforcement, sales forecasting, and credit scoring.

    WORKSHOPS. PAW also features pre- and post-conference workshops that complement the core conference program. Workshop agendas include advanced predictive modeling methods, hands-on training and enterprise decision management.

    Be sure to register by Dec 7 for the Super Early Bird rate (save $400):
    pawcon.com/sanfrancisco/register.php

    If you’d like our informative event updates, sign up at:
    pawcon.com/signup-us.php