The Analytics 2011 Conference Series combines the power of SAS’s M2010 Data Mining Conference and F2010 Business Forecasting Conference into one conference covering the latest trends and techniques in the field of analytics. The Analytics 2011 Conference Series brings the brightest minds in the field of analytics together with hundreds of analytics practitioners. Join us as these leading conferences change names and locations. At Analytics 2011, you’ll learn through a series of case studies, technical presentations and hands-on training. If you are in the field of analytics, this is one conference you can’t afford to miss.
October 24-25, 2011
Grande Lakes Resort
Analytics 2011 topic areas include:
- Data Mining
- Text Analytics
- Fraud Detection
- Data Visualization
- Predictive Modeling
- Data Optimization
- NEW! Operations Research
- NEW! Credit Scoring
Registration is now open!
Be sure to register before July 15 and save $400 on conference fees.
Don’t forget to check out the list of pre- and post-conference training.
Plus, there’s a contest for students to attend for free:
Here’s your chance to be recognized by the analytics community for your work in the field. Poster presentations provide an excellent opportunity for analytics practitioners to present their projects in a one-to-one setting and receive professional feedback from leaders in the field of analytics.
The Analytics 2011 Poster session is open to all analytics practitioners – from corporate or academic fields.
Posters will be located inside the Exhibit Hall and accessible throughout the conference. We request that authors make themselves available during dedicated Exhibit Hall hours to speak with attendees and answer questions.
View the list of posters presented at M2010.
Poster submission guidelines:
- To participate, attendees must submit a poster abstract through the submission form
- The abstract must include a description of how you have used analytics to improve your processes and/or analyze your work
- You must define your problem/research goal and show the application of analytics methodology
- You must be able to document the steps and show your results.
- The content of the poster must be either a class assignment (non-research), a research project, or a business application
- Poster abstracts must be 250 words or less
- No abstracts will be accepted after August 19, 2011.
- For students participating in the contest, final posters must be received by September 2, 2011, for judging. Posters will be judged by a committee, and applicants will be notified by September 16, 2011.
- SAS will provide a display board and a header denoting the Poster Title and Author. SAS will print the header and poster for display at the conference.
- Poster presentations that are accepted from academia (faculty and full-time students) will allow the primary presenter to attend the conference for free. You must be currently enrolled in or employed by an accredited university or college to be eligible for the free conference registration. After being informed of the poster’s acceptance, simply note your participation as a poster presenter and your college affiliation in the “additional comments” field of your registration. (You will be required to fax a letter with your department head’s signature as verification of your affiliation.)
- Poster presentations that are accepted from the business world will allow the primary presenter to attend the conference at the early bird rate ($400 off the regular fees). After being informed of the poster’s acceptance, simply note your participation as a poster presenter in the “additional comments” field of your registration.
Further instructions and specifications for poster presentations will be provided when your abstract is accepted.
Questions? Contact us.
This Call for Posters is extended to attendees to submit original research. A limited number of abstracts will be accepted, so don’t wait!
These are the abstracts for the session speakers at Analytics 2011. This information is updated frequently so be sure to check back often.
As the nation’s second largest university, the University of Central Florida is committed to providing a world-class education to a growing student body that is quickly approaching 60,000 students. Even as the university grows, the current economic climate in the state of Florida has brought drastic budget reductions for all state universities. It is essential for university leadership to constantly monitor all areas of operational and strategic importance, particularly when maneuvering in such a rapidly changing, yet fiscally constrained, environment. This presentation will describe an approach that the University of Central Florida uses to evaluate university performance through benchmarking analyses. The SAS/STAT procedure PROC CLUSTER is used to establish a list of university benchmarking peers in an unbiased fashion. Several key performance indicators are evaluated on a regular basis by comparing university metrics to those of its institutional peers. On the front end, creative visualizations deliver these volumes of data to university leadership in an easy-to-read format. On the back end, a series of SAS formatted tables forms a database to track and monitor key performance indicators on an ongoing basis. Performing salary studies, determining post-graduation success, establishing performance goals, and evaluating operational efficiencies are all examples of how UCF uses benchmarking to remain competitive in this constrained environment while moving closer to its strategic goals.
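To make the peer-finding idea concrete, here is a minimal, hypothetical sketch of what clustering institutions into benchmarking peers looks like. The institution names and metrics below are invented, and this uses a simple single-linkage agglomerative clustering in plain Python as a stand-in for what PROC CLUSTER does; it is not UCF’s actual code.

```python
# Illustrative sketch: find benchmarking peers by agglomerative clustering,
# analogous in spirit to SAS PROC CLUSTER. All names and numbers are made up.
import math

schools = {
    "UCF":     (58000, 41.0),   # (enrollment, % graduate students) -- hypothetical
    "State A": (55000, 39.0),
    "State B": (52000, 44.0),
    "Small C": (9000, 12.0),
    "Small D": (11000, 15.0),
}

def standardize(data):
    """z-score each metric so no single scale dominates the distances."""
    names = list(data)
    cols = list(zip(*data.values()))
    zcols = []
    for col in cols:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        zcols.append([(x - mean) / sd for x in col])
    return {n: tuple(z[i] for z in zcols) for i, n in enumerate(names)}

def cluster(data, k):
    """Single-linkage agglomerative clustering down to k clusters."""
    clusters = [{n} for n in data]
    def dist(a, b):
        return min(math.dist(data[x], data[y]) for x in a for y in b)
    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] |= clusters.pop(j)
    return clusters

z = standardize(schools)
peers = next(c for c in cluster(z, 2) if "UCF" in c) - {"UCF"}
print(sorted(peers))  # the large institutions cluster together as UCF's peers
```

The point of standardizing first is that enrollment (tens of thousands) would otherwise swamp percentage-scale metrics in the distance calculation.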
A common assumption in any analytics or data mining application is that customer behavior is independent and identically distributed, often referred to as the iid assumption. However, in many real-life settings this assumption is simply not valid! Social network effects between customers, both implicit and explicit, create collective, correlated behavior that needs to be appropriately analyzed and modeled. In this talk, we will start by outlining the architecture of a social network learning environment, consisting of a local model (e.g., a logistic regression model), a relational learner (e.g., a relational neighbor classifier), and a collective inferencing procedure (e.g., Gibbs sampling). The ideas and concepts will be illustrated using two real-life case studies about churn detection in the Telco sector (postpaid segment) with social networks using call detail record (CDR) data obtained from two major European Telco providers. It will be empirically shown how social network effects can be efficiently modeled using these huge data sets, generating both additional lift and profit compared to a flat logistic regression model. The methodology presented easily generalizes to other contexts, e.g., customer acquisition, risk management and fraud detection, where social networks also play a crucial role!
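As a rough illustration of the relational-learner component mentioned in the abstract, the sketch below implements a weighted relational neighbor score on a toy call graph: a customer’s churn score is the fraction of neighbor call volume going to already-churned neighbors. The names, edges, and weights are invented, and this omits the local model and collective inference steps the talk describes.

```python
# Hypothetical sketch of a (weighted) relational neighbor classifier on
# CDR-style data. All customers, edges, and call volumes are made up.

# edges: (customer_a, customer_b, call_minutes)
edges = [("ann", "bob", 120), ("ann", "cai", 10),
         ("bob", "cai", 80), ("cai", "dee", 200)]
churned = {"bob"}          # known churners ("labeled" nodes)

def relational_neighbor_score(node):
    """P(churn | neighbors) ~ weight of churned neighbors / total neighbor weight."""
    total = churn_w = 0.0
    for a, b, w in edges:
        if node in (a, b):
            other = b if node == a else a
            total += w
            if other in churned:
                churn_w += w
    return churn_w / total if total else 0.0

print(relational_neighbor_score("ann"))  # 120/130: mostly talks to a churner
print(relational_neighbor_score("dee"))  # 0.0: no churned neighbors
```

In a full social network learning setup, these relational scores would be combined with a local model’s predictions and iterated via a collective inference procedure such as Gibbs sampling.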
As collaborative and social networking tools are mainstreamed throughout the organization, there is a need to manage their content consistent with that of other enterprise applications. This means having access to some form of semantic technology within these applications. While today’s collaborative and social networking tools may come with some built in semantic technologies, they are not as robust as the capabilities offered by tools such as the SAS Categorization Suite. This talk addresses the architectural and workflow issues involved in integrating the Categorization Suite into SharePoint, Social Networking applications and enterprise content management systems. This talk places greater emphasis on the integration and linkages than on the design and use of the individual semantic technologies.
Consisting of 4 Theme Parks, 2 Water Parks, 24 Resorts, and Downtown Disney, Walt Disney World Resort is the world’s largest vacation destination. Visiting Guests are struck by how they are immersed into the storytelling at the parks’ attractions as well as the legendary Guest service offered by Disney Cast Members. What isn’t as apparent are the sophisticated analytical models at work behind the scenes to provide that seamless Guest experience. Pete will provide an overview of Disney Parks, as well as dive into some analytical case studies that highlight the complex thinking required to operate this world-class destination.
Segmenting customers based on behavioral or attitudinal attributes is the mainstream of customer segmentation; the idea of combining these different customer segmentations has been reported in the literature since the late 1990s. However, there has not been a lot of practical usage of this technique reported in industry. Several potential applications in image and scene segmentation analysis have been mentioned, but combining segmentations from clustering or other techniques has far-reaching potential in areas such as advertising and media, new insights on customers or prospects, market research, medical diagnostics, and more. One example of combining segments that will be discussed is a marketing application in which an attitudinal survey segmentation is combined with a behavioral segmentation on a set of customers; the resulting ensemble combines attributes of both while revealing interesting insights that are not in the original segmentations.
The ability to combine groups of segments actually stems from a Bayesian methodology for combining information from different sources to form a new insight not found in the uncombined information sources alone. The algorithm to perform these combinations, however, can take a Bayesian approach or a more traditional approach such as k-means clustering. Until recently, ensemble segmentation (or ensemble clustering) has been just a cool-sounding algorithm; this article and presentation will show you two possible ensemble segmentation algorithms in SAS Enterprise Miner™ using only a point-and-click interface that business and technical analysts can both perform and apply. Possible business applications for these ensemble segmentations will be discussed as well.
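One very simple way to combine two segmentations, sketched below with made-up data, is to cross the attitudinal and behavioral labels and fold rare cells into an "other" bucket. This is only an illustration of the ensemble idea, not the Enterprise Miner point-and-click flow or the Bayesian algorithm the talk covers.

```python
# Hypothetical ensemble-segmentation sketch: cross two independent
# segmentations, merging cells too small to act on. Assignments are invented.
from collections import Counter

# customer -> (attitudinal segment, behavioral segment)
customers = {
    "c1": ("price_sensitive", "heavy_user"),
    "c2": ("price_sensitive", "heavy_user"),
    "c3": ("brand_loyal",     "light_user"),
    "c4": ("brand_loyal",     "heavy_user"),
    "c5": ("price_sensitive", "light_user"),
    "c6": ("price_sensitive", "heavy_user"),
}

def ensemble_segments(assignments, min_size=2):
    """Cross the two labelings; cells smaller than min_size fold into 'other'."""
    counts = Counter(assignments.values())
    return {cust: (f"{a}+{b}" if counts[(a, b)] >= min_size else "other")
            for cust, (a, b) in assignments.items()}

final = ensemble_segments(customers)
print(final)  # crossed segments carry attributes of both inputs
```

The crossed segments are exactly where the "insights not in the original segmentations" can surface: a price-sensitive heavy user calls for a different treatment than either input segment alone suggests.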
This presentation will focus on strategies and initiatives undertaken by the Tribune Company to automatically index and tag news content using text analytics applications and approaches. The goal of the automated indexing program is to provide value-added services to the Tribune community of editors and producers, advertising partners, and other 3rd party collaborators with the ultimate goal of providing its audience of readers with state-of-the-technology news delivery.
The Tribune Company’s automated indexing program began in late 2007 to support a variety of business requirements and help lay the foundation for actively participating in the development of Web 3.0 (the Semantic Web). With the influx of news sources coupled with user-generated content, the Tribune, like many other news organizations, continuously works to manage its news content to meet numerous expectations while keeping news content overload at a minimum. The text analytics approaches the Tribune has implemented allow the company to explore its vast body of continuously updated content to find and present content relevant to its end users.
Based on direct experience, this presentation will share the impact of text analytics applications, methodologies, and lessons learned as the Tribune Company manages and deploys its content for the future.
A lot of literature has been written about credit risk models, but most focuses on the application of data mining to develop credit scorecards, either for the acquisition of new customers or to measure the behavior of existing customers. Beyond these models, there are innumerable applications of analytics that can be carried out along the credit cycle.
In this presentation we are going to present analytics innovations that have been applied through the different stages of the credit life cycle. For this purpose we have divided the credit cycle into three main parts: pre-origination, origination and post-origination. At all stages we use methodologies such as neural networks, survival analysis, market basket analysis, association rules, CART, CHAID, particle swarm optimization, genetic algorithm optimization, clustering and more.
The pre-origination stage refers to the exploration of potential customers. At this stage, analytics are applied to develop propensity models and to segment potential customers in order to deliver value offers finely differentiated by profile. The origination stage refers to the process of granting a product to a customer. Besides the well-known origination scorecards, there are other tools such as income models, profitability models, customer profiling models built before developing the origination scorecard, and optimization to calculate credit lines and make multiple offers to the client. Finally, the post-origination stage includes everything that has to do with managing existing customers. Here we use analytics tools such as a profitability index to increase or decrease credit limits, attrition models, cross-selling models, fraud models, collection models, recovery models, and behavior models by product and by client, among others.
Load forecasting plays an important role in the electric power industry. As in many other industries that need demand forecasts, forecasting the electricity demand of special days, such as holidays and their surrounding days, has been a challenging problem. Since the holiday effect often varies from one region to another, a specific model developed for one utility may not be suitable for another. This presentation will introduce a regression-based approach to modeling the holiday effect on electricity demand, followed by comparative studies of 8 utility systems.
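The core of a regression-based holiday model can be sketched in a few lines (with invented numbers, and not the presenter's actual model): regress load on a holiday dummy, so the fitted coefficient directly measures the demand shift on special days. With a single dummy regressor, ordinary least squares reduces to comparing group means.

```python
# Sketch of the holiday-dummy regression idea. Loads are hypothetical units.
loads      = [100, 104, 98, 72, 101, 70, 103]   # daily system load
is_holiday = [0,   0,   0,  1,  0,   1,  0]

# With one 0/1 regressor, the OLS fit load = base + effect * is_holiday
# is exactly the two group means:
hol  = [y for y, h in zip(loads, is_holiday) if h]
work = [y for y, h in zip(loads, is_holiday) if not h]
base = sum(work) / len(work)
holiday_effect = sum(hol) / len(hol) - base   # negative: demand drops on holidays

print(f"base load ~ {base:.1f}, holiday effect ~ {holiday_effect:.1f}")
```

A real model would add further regressors (weather, day of week, surrounding days), which is presumably where the region-to-region differences the abstract mentions show up.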
Currently, the utility industry is undergoing the “Smart Grid Revolution,” installing meters that can provide reads ranging from every 15 minutes to hourly for every customer on the system. This creates gigabytes of data for utilities to navigate. This presentation will explore methods and tools available to cull out the pertinent information for different utility analyses and decision making. The presentation will conclude by showing how this newly acquired data can be applied to electric load forecasts.
At the core of insurance and banking businesses lies the need to manage risks due to both expected and unexpected loss events. Risk management is required not only to comply with regulations but also to maintain a solvent and profitable business. A model-based approach to risk management consists of building predictive models for the losses that are expected to occur in a given time period. The widely used method is to use past loss data to build separate models for the frequency and severity (magnitude) of losses. These models are then combined to estimate the distribution for the aggregate loss. For certain situations, an alternative approach based on the Tweedie distribution can be useful.
This talk will demonstrate the capabilities of SAS/ETS software in different predictive tasks for various types of loss data. Topics include using the SEVERITY procedure for severity modeling, using the COUNTREG procedure for frequency modeling, using a simulation method for aggregate loss modeling, and using scale regression with Tweedie distribution for pure premium modeling.
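A stripped-down Monte Carlo version of the frequency/severity approach described above can be sketched as follows: draw a Poisson loss count per period, draw a severity for each loss, and aggregate. This stands in loosely for the PROC COUNTREG / PROC SEVERITY / simulation workflow; the distributions and parameters are purely illustrative.

```python
# Illustrative frequency/severity aggregate-loss simulation.
# Parameters are made up; real models would be fit to past loss data.
import math
import random

random.seed(7)

FREQ_MEAN = 2.0                 # expected losses per period (Poisson frequency)
SEV_MU, SEV_SIGMA = 8.0, 1.2    # lognormal severity parameters

def poisson(lam):
    """Knuth's algorithm for a Poisson draw using only the stdlib."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def aggregate_loss():
    n = poisson(FREQ_MEAN)
    return sum(random.lognormvariate(SEV_MU, SEV_SIGMA) for _ in range(n))

sims = [aggregate_loss() for _ in range(20000)]
mean_loss = sum(sims) / len(sims)
var_99 = sorted(sims)[int(0.99 * len(sims))]   # 99th-percentile aggregate loss
print(f"mean aggregate loss ~ {mean_loss:,.0f}; 99% quantile ~ {var_99:,.0f}")
```

The simulated aggregate distribution is what feeds capital-style quantities such as high quantiles, which is exactly why separate frequency and severity models are worth the trouble compared to modeling totals directly.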
Business and IT leaders responsible for analytics want their projects to be successful and to deliver promised business value faster and cheaper. For many, these objectives are elusive. Among other things, leadership success depends on the ability to skillfully navigate the complex technology landscape and to pick the right implementation strategy. There are few tested ways to simplify these decisions, deliver analytics faster, control costs and accelerate the time-to-value. This presentation centers on a comprehensive technology and implementation framework, with several case studies of real-life implementations at clients.
The benefits of integrating management adjustments into a mathematically driven baseline forecast have, subject to qualifying parameters, long been established. This is particularly the case when the application of these adjustments is both judicious and systematic. Theil’s correction – a simple coefficient of forecast error by manager against actual sales – is perhaps the best known, and does indeed lend an increased level of precision to hybrid forecasts; however it, and every permutation thereof, suffers from the same impediment as time-series forecasting itself: it ignores the causality of those inputs.
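One common reading of the correction mentioned above (hedged — the speaker may define the coefficient differently) is to regress actual sales on the manager's judgmental forecasts and use the fitted line to debias future forecasts. The data below are invented.

```python
# Hypothetical sketch of a Theil-style debiasing of judgmental forecasts:
# fit actual = alpha + beta * forecast, then correct new forecasts.
forecasts = [100, 120, 90, 110, 130]   # manager's adjusted forecasts
actuals   = [ 92, 108, 85, 100, 115]   # what actually sold

n = len(forecasts)
mf = sum(forecasts) / n
ma = sum(actuals) / n
beta = (sum((f - mf) * (a - ma) for f, a in zip(forecasts, actuals))
        / sum((f - mf) ** 2 for f in forecasts))
alpha = ma - beta * mf

def corrected(forecast):
    """Debias a new judgmental forecast using the fitted relationship."""
    return alpha + beta * forecast

print(f"alpha={alpha:.1f}, beta={beta:.2f}, corrected(125)={corrected(125):.1f}")
```

Here the manager consistently over-forecasts (beta below 1), so the correction pulls a new forecast of 125 down toward what history says will actually sell — which is precisely the kind of systematic, non-causal adjustment the abstract critiques.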
In credit risk, three risk parameters, namely Probability of Default (PD), Exposure at Default (EAD), and Loss Given Default (LGD), are key components of Expected Losses (EL) calculation, which is essential in estimating Economic Capital, Basel Accords Regulatory Capital Requirement, and Risk Adjusted Return on Capital.
While there are various means to model these three risk parameters, the approach that we’d like to discuss here is a practice widely used in the consumer risk arena, with a focus on information about the individual account. Within this modeling framework, both account-level risk characteristics (e.g., updated credit score) and systematic risk factors (e.g., housing price index and unemployment rate) are used to develop a suite of statistical models for each risk parameter of interest. Compared with alternative methodologies, our preferred practice allows granular risk profiling at the account level without losing the dynamic view of economic trends.
In our presentation, the modeling methodology employed to estimate each risk parameter will be demonstrated from the practitioner’s point of view, including Through-The-Cycle (TTC) PD estimation with panel data, EAD estimation for revolving exposures through EADF / LEQ / CCF, and LGD estimation by a regression model on the Recovery Rate (RR). In addition, related practices in back testing and stress testing will also be covered briefly.
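The three parameters combine into expected loss as EL = PD × EAD × LGD, summed over accounts. A tiny worked example (with made-up account-level numbers, not the presenters' models):

```python
# Expected loss from the three risk parameters: EL = PD * EAD * LGD per account.
accounts = [
    # (PD,  EAD in $, LGD) -- hypothetical accounts
    (0.02, 10_000, 0.45),
    (0.10,  5_000, 0.60),
    (0.01, 20_000, 0.35),
]

expected_loss = sum(pd * ead * lgd for pd, ead, lgd in accounts)
print(f"portfolio expected loss: ${expected_loss:,.0f}")  # 90 + 300 + 70 = 460
```

Computing EL at this account-level granularity, with PD/EAD/LGD each driven by its own model, is what lets the profile stay sensitive to both individual credit behavior and the macro factors described above.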
There is no shortage of algorithms for determining clusters or hidden patterns in numerical data. This is in part due to the fact that there is no single “best” method that works well in all situations. Nevertheless, virtually all popular clustering techniques treat the data as being static at the time of analysis. But just as in nature, motion is a key to revealing hidden patterns. The mottled pattern of a motionless fawn completely camouflages it against the forest background, but as soon as the fawn moves, its spots, speckles, and stripes move in unison against the static background, and this immediately exposes the fawn to probing eyes. The purpose of this presentation is to explain how to implement this natural phenomenon for data analysis. In particular, it is described how to put static numerical data into “motion” for the purpose of revealing and analyzing hidden patterns or clusters of information.
Forecasting and Demand Management Organizations are often given the goal of improving forecast accuracy. Accurate forecasts can reduce a company’s costs and increase customer service by providing the company with a better ability to build the right products in the right quantities in the right places at the right times. Forecasting professionals today have a variety of sophisticated techniques at their disposal that can search historical demand data for baselines, patterns, and trends and project this out into the future. However, even with these tools and techniques, many organizations find themselves unable to achieve the level of forecast accuracy that they desire.
This presentation will introduce the concept that not all demand streams are equally forecastable. The forecastability of a demand stream is limited by its volatility.
In this presentation we will learn:
- How to use a “Comet Chart” to understand the relationship between forecast accuracy and demand volatility.
- How to identify the factors contributing to demand stream volatility.
- How Forecast Value Added analysis can be used to identify forecast improvement opportunities.
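A minimal sketch of the Forecast Value Added idea from the bullets above: compare each forecasting step's error against a naive (same-as-last-period) benchmark, with positive FVA meaning the step actually improved on the naive baseline. All numbers are invented.

```python
# Hypothetical Forecast Value Added (FVA) calculation using MAPE.
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percentage points."""
    return sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals) * 100

actuals       = [100, 110, 105, 120]
naive         = [ 95, 100, 110, 105]   # last period's actual carried forward
stat_forecast = [ 98, 108, 104, 116]   # statistical model's forecast

fva = mape(actuals, naive) - mape(actuals, stat_forecast)
print(f"FVA of the statistical model vs. naive: {fva:.1f} points of MAPE")
```

When FVA is near zero or negative for a highly volatile demand stream, that stream may simply not be forecastable beyond the naive baseline — which is the presentation's point about volatility limiting forecastability.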
This presentation will provide background on the application of various levels of analytics at Wyndham Exchange and Rentals. It will also explain in detail one complex business problem, “Optimizing Vacation Exchange,” whose solution has recently been implemented. The analytical solution to this business problem spans various forecasting, optimization and heuristic techniques. The talk will also address the system solution that generates about 1.4 billion forecasts and over 6,000 optimizations daily.
Prediction markets leverage the wisdom of crowds, aggregating information held by many people to predict future events. They often produce forecasts that are better than experts, surveys, or analytical approaches. In recent years, corporations have begun experimenting with these artificial stock markets to project sales, determine research investments, and anticipate program timing. Getting a prediction market up and running requires practical advice on:
- phrasing questions (including questions that cannot be externally verified)
- short-term versus continuous markets
- maintaining confidentiality
- investigating market manipulation
Beyond predictions free from traditional bureaucratic limits, corporate prediction markets offer valuable secondary benefits including:
- trader commentary
- insights from changing stock prices
- knowledge dissemination to/among employees
Ford Motor Company’s experience with over 1,300 employee traders has also produced some surprises including the influence of corporate culture, biases against short selling, unexpectedly popular questions, and true motivators of participation.
Most prospect scoring models used in higher education recruiting seek to answer one question: how closely does this prospective candidate resemble other candidates who were successfully recruited into a program? The answer to this single question, generally reached well before a prospective student starts an application, determines the level of engagement the recruiting department will have with that prospect. The analysis conducted in this new scoring model instead seeks to identify the level of engagement needed to attract a candidate to a program by channeling them into the right communication mix.
In a recent project, SAS Text Analytics was selected by a Telecom service provider to develop a number of capabilities. This talk presents the results of that project – what worked well and what did not. The talk will cover how SAS Enterprise Content Categorization was selected over IBM and other vendors through an extended Proof of Concept that also formed the basis of two applications.
The first application analyzed customer support reps’ hastily written notes to determine the customer’s motivation for the call, what special problems had been dealt with, and what steps were required to resolve the issue. This application used ECC’s unique categorization and entity extraction functionality. The second application was to mine social media both for sentiment about the range of products and services that the Telecom offered and to look for potential problems and potential solutions to those problems.
Statisticians tend to look at model fit measures such as AIC or R-squared that are statistically meaningful, but mean nothing to the majority of stakeholders that must make or accept decisions based on the models. If statisticians focus on metrics that are meaningful to the stakeholders, instead of general statistical measures, then they can spend time in dialogue instead of dissertation. And they might even learn that they should be building different models than those that they have been building. This session provides examples from the hospitality industry of the development of meaningful metrics, how they change the models that we develop, and how these metrics improve our communication ability.
Lack of simple adherence to prescribed medications results in up to $300 billion of direct costs to the US healthcare system. While the problem is complex, and will require multiple players to solve, this talk will focus on some simple, analytics-driven insights and solutions that Prescription Benefit Managers (PBMs) and group health plans can use to mitigate the problem within insured populations. We will cover some of the challenges in overcoming non-adherence, including a surprising gap between the perception and reality of individual non-adherence. We will also cover some patent-pending work that Express Scripts (one of the nation’s largest PBMs) has done around predicting non-adherence risk and tailoring interventions to improve adherence.
Forecasters often deal with data accumulated at different time intervals (for example, monthly data and daily data). A common practice is to generate the forecasts at the two time intervals independently so as to choose the best model for each series. That practice can result in forecasts that do not agree.
This paper shows how the SAS High-Performance Forecasting HPFTEMPRECON procedure uses the lower frequency forecast as a benchmark to adjust the higher-frequency forecast to take the best advantage of both forecasts.
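The benchmarking idea can be illustrated with the simplest possible reconciliation scheme (this is a hedged sketch, not the HPFTEMPRECON procedure's actual algorithm): scale the higher-frequency forecasts so they sum to the lower-frequency benchmark forecast. The numbers below are invented.

```python
# Simplified temporal reconciliation: proportionally adjust daily forecasts
# to agree with the monthly benchmark forecast. Values are illustrative.
daily_forecast   = [10.0, 12.0, 11.0, 9.0, 8.0]   # higher-frequency model
monthly_forecast = 55.0                           # lower-frequency benchmark

scale = monthly_forecast / sum(daily_forecast)
reconciled = [d * scale for d in daily_forecast]

print(reconciled)
print(sum(reconciled))  # now agrees with the monthly benchmark
```

This keeps the daily model's shape (the day-to-day pattern) while borrowing the monthly model's level, which is the sense in which reconciliation takes the best advantage of both forecasts.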
Driven by the need to optimize marketing investments around the world, this analytical journey utilizes a string of intuitive methods to design an end-to-end road map for making high-quality investment decisions in a truly global environment. With increased globalization and economic instability, there is a strong call for analytics to control for these challenges and ensure quality decisions can be made. The journey starts with overcoming data difficulties, maximizing insights from limited but globally available data points. These insights are used to create a customer behavioral segmentation, key to enabling cross-department communication. It continues with tools that enhance the effectiveness of communicating the insights and directly aid in creating the experimental design of marketing treatments. After design execution, the segmentation is upgraded to more unsupervised behavioral clustering, used innovatively in the powerful SAS forecasting engine that is the cornerstone of the assessment of results. Once marketing design, execution and assessment are done, the ever-circular journey completes with an approach to using this robust set of information to optimize investment decisions across the globe. This is the journey of American Express Traveler Cheques analytics, and how it helped bring together a largely fragmented global organization as one force driving strong results.
So far so good, but the website uses an awful parrot/lime green color. Now if we could only have color analytics too 😉