SAS and Hadoop

Awesomely informative post in sascom magazine (whose editor I have interviewed before, here: http://www.decisionstats.com/interview-alison-bolen-sas-com/).

Great piece by Michael Ames, SAS Data Integration Product Manager.

http://www.sas.com/news/sascom/hadoop-tips.html

 

Also see SAS’s big data thingys here at

http://www.sas.com/software/high-performance-analytics/in-memory-analytics/index.html

Solutions and Capabilities Using SAS® In-Memory Analytics

  • High-Performance Analytics – Get near-real-time insights with appliance-ready analytics software designed to tackle big data and complex problems.
  • High-Performance Risk – Faster, better risk management decisions based on the most up-to-date views of your overall risk exposure.
  • High-Performance Liquidity Risk Management – Take quick, decisive actions to secure adequate funding, especially in times of volatility.
  • High-Performance Stress Testing – Make faster, more precise decisions to protect the health of the firm.
  • Visual Analytics – Explore big data using in-memory capabilities to better understand all of your data, discover new patterns and publish reports to the Web and iPad®.

(Ajay- I liked the Visual Analytics piece especially for Big Data )


 

Predictive analytics in the cloud: Angoss

I interviewed Angoss in depth here at http://www.decisionstats.com/interview-eberhard-miethke-and-dr-mamdouh-refaat-angoss-software/

Well, they just announced predictive analytics in the cloud.

 

http://www.angoss.com/predictive-analytics-solutions/cloud-solutions/

Solutions

Overview

KnowledgeCLOUD™ solutions deliver predictive analytics in the Cloud to help businesses gain competitive advantage in the areas of sales, marketing and risk management by unlocking the predictive power of their customer data.

KnowledgeCLOUD clients experience rapid time to value and reduced IT investment, and enjoy the benefits of Angoss’ industry leading predictive analytics – without the need for highly specialized human capital and technology.

KnowledgeCLOUD solutions serve clients in the asset management, insurance, banking, high tech, healthcare and retail industries. Industry solutions consist of a choice of analytical modules:

KnowledgeCLOUD for Sales/Marketing

KnowledgeCLOUD solutions are delivered via KnowledgeHUB™, a secure, scalable cloud-based analytical platform together with supporting deployment processes and professional services that deliver predictive analytics to clients in a hosted environment. Angoss industry leading predictive analytics technology is employed for the development of models and deployment of solutions.

Angoss’ deep analytics and domain expertise guarantees effectiveness – all solutions are back-tested for accuracy against historical data prior to deployment. Best practices are shared throughout the service to optimize your processes and success. Finely tuned client engagement and professional services ensure effective change management and program adoption throughout your organization.

For businesses looking to gain a competitive edge and put their data to work, Angoss is the ideal partner.

—-

Hmm. Analytics in the cloud. Reduce hardware costs. Reduce software costs. Increase profit margins.

Hmmmmm

My favorite professor in North Carolina, who calls the cloud “time sharing”: are you listening, Professor?

Analytics 2011 Conference

From http://www.sas.com/events/analytics/us/

The Analytics 2011 Conference Series combines the power of SAS’s M2010 Data Mining Conference and F2010 Business Forecasting Conference into one conference covering the latest trends and techniques in the field of analytics. Analytics 2011 Conference Series brings the brightest minds in the field of analytics together with hundreds of analytics practitioners. Join us as these leading conferences change names and locations. At Analytics 2011, you’ll learn through a series of case studies, technical presentations and hands-on training. If you are in the field of analytics, this is one conference you can’t afford to miss.

Conference Details

October 24-25, 2011
Grande Lakes Resort
Orlando, FL

Analytics 2011 topic areas include:

PAW Videos

A message from Predictive Analytics World on newly available videos. Many of the videos are free, so you can check them out.

Predictive Analytics World March 2011 in San Francisco

Access PAW DC Session Videos Now

Predictive Analytics World is pleased to announce on-demand access to the videos of PAW Washington DC, October 2010, including over 30 sessions and keynotes that you may view at your convenience. Access this leading predictive analytics content online now:

View the PAW DC session videos online

Register by January 18th and receive $150 off the full 2-day conference program videos (enter code PAW150 at checkout)

Trial videos – view the following for no charge:

Select individual conference sessions, or recognize savings by registering for access to one or two full days of sessions. These on-demand videos deliver PAW DC right to your desk, covering hot topics and advanced methods such as:

Social data 

Text mining

Search marketing

Risk management

Survey analysis

Consumer privacy

Sales force optimization

Response & cross-sell

Recommender systems

Featuring experts such as:
Usama Fayyad, Ph.D.
CEO, Open Insights
Former Chief Data Officer, Yahoo!

Andrew Pole
Sr Mgr, Media/DB Mktng
Target
View Keynote for Free

John F. Elder, Ph.D.
CEO and Founder
Elder Research

Bruno Aziza
Director, Worldwide Strategy Lead, BI
Microsoft

Eric Siegel, Ph.D.
Conference Chair
Predictive Analytics World

PAW DC videos feature over 25 speakers with case studies from leading enterprises such as: CIBC, CEB, Forrester, Macy’s, MetLife, Microsoft, Miles Kimball, Monster.com, Oracle, Paychex, SunTrust, Target, UPMC, Xerox, Yahoo!, YMCA, and more.

How video access works:

View slides on the left; see and hear the speaker in the right window.

Sign up by January 18 for immediate video access and $150 discount


San Francisco
March 14-15, 2011
Washington DC
October, 2011
London
November, 2011

 

Session Gallery: Day 1 of 2


 

Keynote: Five Ways Predictive Analytics Cuts Enterprise Risk  

Eric Siegel, Ph.D., Program Chair, Predictive Analytics World

All business is an exercise in risk management. All organizations would benefit from measuring, tracking and computing risk as a core process, much like insurance companies do.

Predictive analytics does the trick, one customer at a time. This technology is a data-driven means to compute the risk each customer will defect, not respond to an expensive mailer, consume a retention discount even if she were not going to leave in the first place, not be targeted for a telephone solicitation that would have landed a sale, commit fraud, or become a “loss customer” such as a bad debtor or an insurance policy-holder with high claims.

In this keynote session, Dr. Eric Siegel reveals:

– Five ways predictive analytics evolves your enterprise to reduce risk

– Hidden sources of risk across operational functions

– What every business should learn from insurance companies

– How advancements have reversed the very meaning of fraud

– Why “man + machine” teams are greater than the sum of their parts for enterprise decision support

Length – 00:45:57

Price: $195

 

 

Platinum Sponsor Presentation: Analytics – The Beauty of Diversity 

Anne H. Milley, Senior Director of Analytic Strategy, Worldwide Product Marketing, SAS

Analytics contributes to, and draws from, multiple disciplines. The unifying theme of “making the world a better place” is bred from diversity. For instance, the same methods used in econometrics might be used in market research, psychometrics and other disciplines. In a similar way, diverse paradigms are needed to best solve problems, reveal opportunities and make better decisions. This is why we evolve capabilities to formulate and solve a wide range of problems through multiple integrated languages and interfaces. Extending that, we have provided integration with other languages so that users can draw on the disciplines and paradigms needed to best practice their craft.

Length – 20:11

Free viewing enabled – no charge

 

Gold Sponsor Presentation: Predictive Analytics Accelerate Insight for Financial Services 

Finbarr Deely, Director of Business Development, ParAccel

Financial services organizations face immense hurdles in maintaining profitability and building competitive advantage. Financial services organizations must perform “what-if” scenario analysis, identify risks, and detect fraud patterns. The advanced analytic complexity required often makes such analysis slow and painful, if not impossible. This presentation outlines the analytic challenges facing these organizations and provides a clear path to providing the accelerated insight needed to perform in today’s complex business environment to reduce risk, stop fraud and increase profits.

– The value of predictive analytics in Accelerating Insight

– Financial Services Analytic Case Studies

– Brief Overview of ParAccel Analytic Database

Length – 09:06

Free viewing enabled – no charge

 

TOPIC: BUSINESS VALUE
Case Study: Monster.com
Creating Global Competitive Power with Predictive Analytics 

Jean Paul Isson, Vice President, Global BI & Predictive Analytics, Monster Worldwide

Using Predictive analytics to gain a deeper understanding of customer behaviours, increase marketing ROI and drive growth

– Creating global competitive power with business intelligence: Making the right decisions – at the right time

– Avoiding common change management challenges in sales, marketing, customer service, and products

– Developing a BI vision – and implementing it: successful business intelligence implementation models

– Using predictive analytics as a business driver to stay on top of the competition

– Following the Monster Worldwide global BI evolution: How Monster used BI to go from good to great

Length – 51:17

Price: $195

 

 

TOPIC: SURVEY ANALYSIS
Case Study: YMCA
Turning Member Satisfaction Surveys into an Actionable Narrative 

Dean Abbott, President, Abbott Analytics

Employees are a key constituency at the Y and previous analysis has shown that their attitudes have a direct bearing on Member Satisfaction. This session will describe a successful approach for the analysis of YMCA employee surveys. Decision trees are built and examined in depth to identify key questions in describing key employee satisfaction metrics, including several interesting groupings of employee attitudes. Our approach will be contrasted with other factor analysis and regression-based approaches to survey analysis that we used initially. The predictive models described are currently in use and resulted in both greater understanding of employee attitudes, and a revised “short-form” survey with fewer key questions identified by the decision trees as the most important predictors.

Length – 50:19

Price: $195

 

 

TOPIC: INDUSTRY TRENDS
2010 Data Miner Survey Results: Highlights
 

Karl Rexer, Ph.D., Rexer Analytics

Do you want to know the views, actions, and opinions of the data mining community? Each year, Rexer Analytics conducts a global survey of data miners to find out. This year at PAW we unveil the results of our 4th Annual Data Miner Survey. This session will present the research highlights, such as:

– Analytic goals & key challenges

– Impact of the economy

– Regional differences

– Text mining trends

Length – 15:20

Price: $195

 

 

Multiple Case Studies: U.S. DoD, U.S. DHS, SSA
Text Mining: Lessons Learned 

John F. Elder, Chief Scientist, Elder Research, Inc.

Text Mining is the “Wild West” of data mining and predictive analytics – the potential for gain is huge, the capability claims are often tall tales, and the “land rush” for leadership is very much a race.

In solving unstructured (text) analysis challenges, we found that principles from inductive modeling – learning relationships from labeled cases – have great power to enhance text mining. Dr. Elder highlights key technical breakthroughs discovered while working on projects for leading government agencies, including:

– Prioritizing searches for the Dept. of Homeland Security

– Quick decisions for Social Security Admin. disability

– Document discovery for the Dept. of Defense

– Disease discovery for the Dept. of Homeland Security

– Risk profiling for the Dept. of Defense

Length – 48:58

Price: $195

 

 

Keynote: How Target Gets the Most out of Its Guest Data to Improve Marketing ROI 

Andrew Pole, Senior Manager, Media and Database Marketing, Target

In this session, you’ll learn how Target leverages its own internal guest data to optimize its direct marketing – with the ultimate goal of enhancing our guests’ shopping experience and driving in-store and online performance. You will hear about what guest data is available at Target, how and where we collect it, and how it is used to improve the performance and relevance of direct marketing vehicles. Furthermore, we will discuss Target’s development and usage of guest segmentation, response modeling, and optimization as means to suppress poor performers from mailings, determine relevant product categories and services for online targeted content, and optimally assign receipt marketing offers to our guests when offer quantities are limited.

Length – 47:49

Free viewing enabled – no charge

 

Platinum Sponsor Presentation: Driving Analytics Into Decision Making  

Jason Verlen, Director, SPSS Product Strategy & Management, IBM Software Group

Organizations looking to dramatically improve their business outcomes are turning to decision management, a convergence of technology and business processes that is used to streamline and predict the outcome of daily decision-making. IBM SPSS Decision Management technology provides the critical link between analytical insight and recommended actions. In this session you’ll learn how Decision Management software integrates analytics with business rules and business applications for front-line systems such as call center applications, insurance claim processing, and websites. See how you can improve every customer interaction, minimize operational risk, reduce fraud and optimize results.

Length – 17:29

Free viewing enabled – no charge

 

TOPIC: DATA INFRASTRUCTURE AND INTEGRATION
Case Study: Macy’s
The world is not flat (even though modeling software has to think it is) 

Paul Coleman, Director of Marketing Statistics, Macy’s Inc.

Software for statistical modeling generally uses flat files, where each record represents a unique case with all its variables. In contrast, most large databases are relational, where data are distributed among various normalized tables for efficient storage. Variable creation and model scoring engines are necessary to bridge data mining and storage needs. Development datasets taken from a sampled history require snapshot management. Scoring datasets are taken from the present timeframe and the entire available universe. Organizations with significant data must decide when to store or calculate necessary data and understand the consequences for their modeling program.

Length – 34:54

Price: $195

 

 

TOPIC: CUSTOMER VALUE
Case Study: SunTrust
When One Model Will Not Solve the Problem – Using Multiple Models to Create One Solution 

Dudley Gwaltney, Group Vice President, Analytical Modeling, SunTrust Bank

In 2007, SunTrust Bank developed a series of models to identify clients likely to have large changes in deposit balances. The models include three basic binary and two linear regression models.

Based on the models, 15% of SunTrust clients were targeted as those most likely to have large balance changes. These clients accounted for 65% of the absolute balance change and 60% of the large balance change clients. The targeted clients are grouped into a portfolio and assigned to individual SunTrust retail branches. Since 2008, the portfolio generated a 2.6% increase in balances over control.

Using the SunTrust example, this presentation will focus on:

– Identifying situations requiring multiple models

– Determining what types of models are needed

– Combining the individual component models into one output

Length – 48:22

Price: $195

 

 

TOPIC: RESPONSE & CROSS-SELL
Case Study: Paychex
Staying One Step Ahead of the Competition – Development of a Predictive 401(k) Marketing and Sales Campaign 

Jason Fox, Information Systems and Portfolio Manager, Paychex

In-depth case study of Paychex, Inc. utilizing predictive modeling to turn the tides on competitive pressures within their own client base. Paychex, a leading provider of payroll and human resource solutions, will guide you through the development of a Predictive 401(k) Marketing and Sales model. Through the use of sophisticated data mining techniques and regression analysis the model derives the probability a client will add retirement services products with Paychex or with a competitor. Session will include roadblocks that could have ended development and ROI analysis.

Speaker: Frank Fiorille, Director of Enterprise Risk Management, Paychex

Speaker: Jason Fox, Risk Management Analyst, Paychex

Length – 26:29

Price: $195

 

 

TOPIC: SEGMENTATION
Practitioner: Canadian Imperial Bank of Commerce
Segmentation Do’s and Don’ts 

Daymond Ling, Senior Director, Modelling & Analytics, Canadian Imperial Bank of Commerce

The concept of Segmentation is well accepted in business and has withstood the test of time. Even with the advent of new artificial intelligence and machine learning methods, this old war horse still has its place and is alive and well. Like all analytical methods, when used correctly it can lead to enhanced market positioning and competitive advantage, while improper application can have severe negative consequences.

This session will explore the elements of success and the worst practices that lead to failure. The relationship between segmentation and predictive modeling will also be discussed to clarify when it is appropriate to use one versus the other, and how to use them together synergistically.

Length – 45:57

Price: $195

 

 

TOPIC: SOCIAL DATA
Thought Leadership
Social Network Analysis: Killer Application for Cloud Analytics
 

James Kobielus, Senior Analyst, Forrester Research

Social networks such as Twitter and Facebook are a potential goldmine of insights on what is truly going through customers’ minds. Every company wants to know whether, how, how often, and by whom they’re being mentioned across the billowing new cloud of social media. Just as important, every company wants to influence those discussions in their favor, target new business, and harvest maximum revenue potential. In this session, Forrester analyst James Kobielus identifies fruitful applications of social network analysis in customer service, sales, marketing, and brand management. He presents a roadmap for enterprises to leverage their inline analytics initiatives and leverage high-performance data warehousing (DW) clouds and appliances in order to analyze shifting patterns of customer sentiment, influence, and propensity. Leveraging Forrester’s ongoing research in advanced analytics and customer relationship management, Kobielus will discuss industry trends, commercial modeling tools, and emerging best practices in social network analysis, which represents a game-changing new discipline in predictive analytics.

Length – 48:16

Price: $195

 

 

TOPIC: HEALTHCARE – INTERNATIONAL TARGETING
Case Study: Life Line Screening
Taking CRM Global Through Predictive Analytics 

Ozgur Dogan,
VP, Quantitative Solutions Group, Merkle Inc

Trish Mathe,
Director of Database Marketing, Life Line Screening

While Life Line is successfully executing a US CRM roadmap, they are also beginning this same evolution abroad. They are beginning in the UK where Merkle procured data and built a response model that is pulling responses over 30% higher than competitors. This presentation will give an overview of the US CRM roadmap, and then focus on the beginning of their strategy abroad, focusing on the data procurement they could not get anywhere else but through Merkle and the successful modeling and analytics for the UK.

Length – 40:12

Price: $195

 

 

TOPIC: SURVEY ANALYSIS
Case Study: Forrester
Making Survey Insights Addressable and Scalable – The Case Study of Forrester’s Technographics Benchmark Survey 

Nethra Sambamoorthi, Team Leader, Consumer Dynamics & Analytics, Global Consulting, Acxiom Corporation

Marketers use surveys to create enterprise wide applicable strategic insights to: (1) develop segmentation schemes, (2) summarize consumer behaviors and attitudes for the whole US population, and (3) use multiple surveys to draw unified views about their target audience. However, these insights are not directly addressable and scalable to the whole consumer universe which is very important when applying the power of survey intelligence to the one to one consumer marketing problems marketers routinely face. Acxiom partnered with Forrester Research, creating addressable and scalable applications of Forrester’s Technographics Survey and applied it successfully to a number of industries and applications.

Length – 39:23

Price: $195

 

 

TOPIC: HEALTHCARE
Case Study: UPMC Health Plan
A Predictive Model for Hospital Readmissions 

Scott Zasadil, Senior Scientist, UPMC Health Plan

Hospital readmissions are a significant component of our nation’s healthcare costs. Predicting who is likely to be readmitted is a challenging problem. Using a set of 123,951 hospital discharges spanning nearly three years, we developed a model that predicts an individual’s 30-day readmission should they incur a hospital admission. The model uses an ensemble of boosted decision trees and prior medical claims and captures 64% of all 30-day readmits with a true positive rate of over 27%. Moreover, many of the ‘false’ positives are simply delayed true positives. 53% of the predicted 30-day readmissions are readmitted within 180 days.

Length – 54:18

Price: $195

Interview: Carole Jesse, Experienced Analytics Professional

An interview with Carole Jesse, an experienced analytics professional in SAS, JMP, analytics and risk management.

Ajay- Describe your career in science from school to now.

Carole- Truthfully, my career in science started in 7th grade. Hey, I know this is further back in time than you intended the question to go!  However, something significant happened that year that pretty much set me on the path that I am still on today.  I discovered Algebra.  Up to that point in time, I was an average student in ‘arithmetic’. Algebra introduced LETTERS into the mix with numbers, in the simplest of ways that we have all seen: ‘Solve for x in the equation x+2=5’.  That was something I could get behind, AND I excelled at it immediately. Without mathematical excellence, efforts in learning science can fall apart.  Mathematics is everywhere!

I spent the rest of my secondary education consuming all the math and science that I could get. By the time I entered college I had already been exposed to pre-calculus and physics and was actually surprised by those in my college Freshman courses who had not seen anti-derivatives, memorized the quotient rule, or worked an inclined plane friction problem before.

My goal as an undergraduate was to become a Veterinarian.  The beauty of a pre-Vet curriculum is that it is pretty much like pre-Med, rigorous and broad in the sciences.  In my first two years of undergraduate work, I was exposed to more Chemistry, more Mathematics, more Physics, along with things like Genetics, Biology, even the Plant and Animal Sciences.  Although I did not stick with my pursuit of Veterinary Medicine, it laid a solid foundation that has served me very well in the strangest of places.

I consider myself a Mathematician/Statistician due to my academic degrees in those areas, first a BS in Mathematics/Physics at the University of Wisconsin followed by a MS in Statistics at Montana State University. In between the BS and MS I also dabbled briefly in Electrical Engineering at the University of Minnesota.

Since academia, it is my breadth in ALL sciences which has allowed me to be very fluid in straddling diverse industries: from High Volume Manufacturing of Consumer Products, to Nuclear Energy, to Semiconductor Manufacturing/Packaging, to Financial Services, to Health Care. I succeed at business problem solving in these industries by applying my Statistical Methods knowledge, coupled with business acumen and peripheral understanding of the technologies used. I have worked closely with scientists and engineers, and could enter THEIR world speaking THEIR language, which was an aid in getting to these solutions quickly.

I cannot place enough emphasis on the importance of exposure to a broad range of sciences, and as early as possible, for anyone who wants to be involved in Advanced Analytics and Business Intelligence. As a manager, I look closely at candidates for these diverse sorts of backgrounds.

Ajay- I find the number of computer scientists and analysts to be overwhelmingly male despite this being a lucrative profession. Do you think that BI and Analytics are male dominated?  How can the trend be re-shaped?

Carole- Welcome to my world!  All kidding aside, yes that has been my observation as well. While I am not versed in the specifics of actual gender statistics in Computer Science and Advanced Analytics versus other fields, based on my years in and around these fields, there does appear to be a bias.

This is not due to a lack of capability or interest in these fields on the part of women. I believe it is more due to the long history of cultural norms and negative social messages that perhaps push women away from these fields.  The messages can be subtle, but if you pay close attention, you will see them.  Being one of 10 females in an undergraduate engineering class of 150 students has a message right there.  Even though these 10 women were able to make entry to the class, the pressure of being a minority, whether gender based or otherwise, can be a powerful influencer in remaining there.

In my own experience, I have encountered frequent judgments where I was made to feel that being “good at math” was an unacceptable trait for a woman to have.  It is important to note that these judgments have been delivered equally by men AND women. So I think until both genders develop higher expectations of women in the hard science areas, the trends will continue.  It has been decades since my 7th grade introduction to algebra, but it appears the negative social messages regarding girls in math and science are still present today. Otherwise there would be no need (i.e. no market) for books like Danica McKellar’s “Math Doesn’t Suck,” and the follow-up “Kiss My Math,” both aimed at battling these negative messages at the middle school level.

As to how I have battled these cultural expectations, I developed a thick skin. I have also learned to expect excellence from myself even when a teacher, or a peer, or a boss may have had lower expectations for me than for a male counterpart. Sort of a John Mayer “Who Says” type of attitude.  Who says I can’t do Math and Science? Watch me.

Ajay- How would you explain Risk Management using software to a class of graduate students in mathematics and statistics?

Carole- There are many areas of Risk Management.  My specific experience has been on the Credit Risk Management and Fraud Risk Management sides in a couple of industries.  For credit risk in financial services, typically there is a specific department whose role is to quantify and predict credit risk.  Not just for the current portfolio, but for new products as well.  Various methodologies are utilized, ranging from summarization of portfolio characteristics that have a known relationship to default to using historical data to build out predictive models for production implementation.

Key skills needed here are a good understanding of the business, solid statistical methods knowledge, and computing skills.  As far as the computing/software skills needed, there are three main categories: 1) query and preparation of data, 2) model building and validation, and 3) model implementation.  The actual tools will likely differ across these categories.

For example, 1) might be tackled with SAS®, Business Objects, or straight SQL;

2) requires a true modeling package or coding language like SAS®, SPSS, R, etc; and lastly

3) is the trickiest, as implementation can have many system limitations, but SAS® or C++ are often seen at implementation.

Ajay- Describe some of your most challenging and most exciting projects over the years.

Carole- I have been very fortunate to have many challenges and good projects in every role I have been in, but as I look back today, some things that stand out the most were in ‘high tech’.  By virtue of being high tech, there is no fear of technology, and it is fast-paced and ever evolving to the next generation of product.

I spent seven years in the Semiconductor industry during the 90’s at Micron Technology, Intel, and Motorola. At the beginning of that window, we left the 486 processor world, and during that window we spanned the realm of Pentium processors. Moore’s Law dominated all of this. To stay competitive, all of these companies embraced statistical methods to help speed up development time.

At one point, I supported a group of about 10 R&D engineers in the Design and Analysis of their process improvement and simplification experiments.  This afforded me exposure to much of the leading edge research the team was working on.

I recall one project with the goal of optimizing capacitance via surface roughness of the capacitor structures.  In addition to all the science involved at the manufacturing step, what made this so interesting was the difficulty in measuring capacitance at the point in the process where film roughness was introduced. All we had were surface images after this step.  The semiconductor wafers had to pass through several more process steps to get to the point where capacitance could actually be measured. All of this provided challenges around the design of the experiment and the data handling and analysis.

By working closely with both the process engineer and the process technician I was able to gather the image files off the image tool that were taken from the experimental runs. I used SAS® (yes, another shameless plug for my favorite software) to process the images using Fast Fourier Transforms. Subsequently, the transformed data was correlated to the capacitance in the analysis of the experimental results.  Finding the sweet spot for capacitance, as driven by surface roughness, provided a huge leap for this process technology team.
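The general idea of turning surface measurements into a frequency-domain roughness number and correlating it with capacitance can be sketched as follows. This is a hedged illustration in Python, not the original SAS® analysis: it works on a one-dimensional height profile rather than a full image, the profiles and capacitance values are synthetic, and `band_roughness`, `synthetic_profile`, and the cutoff of 4 are hypothetical names and choices.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def band_roughness(profile, cutoff=4):
    """RMS height carried by spatial frequencies at or above bin `cutoff`."""
    n = len(profile)
    spectrum = fft(profile)
    # For real input, bins k and n-k mirror each other, so count each once and double it.
    power = 2.0 * sum(abs(spectrum[k]) ** 2 for k in range(cutoff, n // 2))
    power += abs(spectrum[n // 2]) ** 2  # the Nyquist bin has no mirror
    return math.sqrt(power) / n

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def synthetic_profile(amplitude, n=64):
    """Hypothetical scan line: slow waviness (bin 1) plus fine texture (bin 12)."""
    return [
        math.sin(2 * math.pi * t / n) + amplitude * math.sin(2 * math.pi * 12 * t / n)
        for t in range(n)
    ]

# One synthetic profile per experimental run, with made-up capacitance readings.
amplitudes = [0.1, 0.2, 0.3, 0.4, 0.5]
roughness = [band_roughness(synthetic_profile(a)) for a in amplitudes]
capacitance = [10.0 + 4.0 * a for a in amplitudes]
r = pearson(roughness, capacitance)
```

The cutoff discards the slow waviness so that only the fine texture contributes to the roughness number; with real images the cutoff and the two-dimensional transform would need to be chosen from the process physics.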

The challenges of today are much different than they were in the 90s.  In more recent years, I have been working with transactional data related to financial services or health care claims.  The challenges manifest themselves in the sheer volume of the data. In the last decade in particular, most industries have been able to put the infrastructure in place to gather and store massive amounts of data related to their businesses.  The challenge of turning this data into meaningful, actionable information has been just as exciting as using Fast Fourier Transforms on image processing to optimize capacitance!

Currently I am working with an Oracle database where one table in the schema has 250 million records and a couple hundred fields.  I refer to this as a “Pushing Tera” situation, since this one table is close to a Terabyte in size. As far as storing the data, that is not a big deal, but working with data this large or larger is the challenge.

Different skill sets are needed here beyond those of just an analyst, data miner, or statistician.  These VLDB situations have morphed me into a bit of an IT person.

  • How do you efficiently query such large databases? An inefficient SQL query will not be a bother in a situation where the database is small. But when the database is large, SQL efficiency is key. Many skills needed for industry are not necessarily taught in academia, but rather get picked up along the way, like Unix and SQL. I now write efficient SQL code, but many poorly written jobs gave their lives so that I could learn these efficiencies!
  • Eventually I will need to organize this data into an application-specific format and put data security controls around the process.  Again, is this Advanced Analytics?  Not really, it is more of an MIS role. The newness in these challenges keeps me excited about my work.
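One common source of the SQL inefficiency mentioned in the first bullet is a filter on a column with no index, which forces a scan of every row. The sketch below shows the effect with Python's built-in `sqlite3` module. It is an assumption-laden stand-in: SQLite replaces Oracle, and the `claims` table, its columns, and the index name are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, member_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO claims (member_id, amount) VALUES (?, ?)",
    [(i % 1000, float(i % 97)) for i in range(50000)],
)

sql = "SELECT SUM(amount) FROM claims WHERE member_id = ?"

# Without an index, the filter has to visit every row of the table.
plan_before = query_plan(conn, sql, (42,))

# With an index on the filter column, the engine can jump to the matching rows.
conn.execute("CREATE INDEX idx_claims_member ON claims (member_id)")
plan_after = query_plan(conn, sql, (42,))

total = conn.execute(sql, (42,)).fetchone()[0]
print(plan_before)
print(plan_after)
```

On 50,000 rows the difference is hard to notice; on a table of 250 million records, the same change in plan can separate a query that returns promptly from one that ties up the database. Oracle exposes comparable information through its own execution-plan tooling.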

Ajay- How important do you think work-life balance is for people in this profession? What do you do to chill out?

Carole- I don’t think the work-life balance is any more or less important to the decision science professionals than it is to any other profession really.  I have friends in many other professions like Law, Nursing, Financial Planning, etc. with the same work-life balance struggles.

We live in a busy culture that includes more and more demands placed on us professionally.  Let’s face it, most of us are care-takers to someone besides ourselves.  It might be a spouse, or a child, or a dog, or even an elderly parent. Therefore, a total focus on work is bound to upset the work-life balance for most of us.

My biggest struggle comes in the form of balancing the two sides of my brain.  That may sound weird, but one thing you have to agree with is that all of this is pretty Left Brained:  mathematics, statistics, business intelligence, computing, etc.

To balance this out, and tap into my Right Brain, I like to dabble in the arts to some extent.  Don’t get me wrong, I am not an artist!  But that doesn’t mean I can’t draw on creativity in the artistic sense. For example, this past summer I took a course on Adobe Photoshop and Illustrator at Minneapolis College of Art and Design. This provided the best of both worlds, combining software and art! In addition to learning how to remove Cindy Crawford’s mole (yes, we did this), there were some very useful projects.  One of my course projects was creating my customized Twitter background. An endeavor like this provides me a ‘chilling out’ factor from the normal work world. I know of many other Left Brain leaners that do similar things, like playing a musical instrument, or painting, etc. This is another reason why I took up digital photography: more visual arts.

Volunteer work has a balancing effect too. I try to give back to the community when I can. Swinging a hammer at Habitat for Humanity, or doing record keeping for an Animal Rescue organization, are things I have participated in.

And if none of this works, I enjoy cooking for my family and friends, and plying them with wine!

Ajay- What are your views on:

Carole- A) Data Quality

I’d have to say I am for Data Quality! Who isn’t? But the reality is that data is dirty.  That “Pushing Tera” Oracle table I mentioned earlier, well, it turns out it has some issues.  And it is incumbent upon me to determine the quality of that data before attempting to do anything analytical with it.  One place in industry where value enhancement is needed:  database administrators with business knowledge.  It seems that more often than not, even if there was a business-savvy DBA, they may have moved on, leaving the consumers of that data (that would be me) to fend for themselves. There is some debate over which philosopher said “Know thyself.”  Today’s job challenge is to “Know thy data” or perhaps “Value those that know thy data.”

B) Predictive Analytics for Fraud Monitoring

There is a huge market for analytics in fraud detection and prevention.  But it is not for the faint of heart. Insiders, at least in Mortgage and Health Care, are the typical perpetrators of lucrative fraud. These insiders know how the industry processes work and they exploit this.  As soon as one loophole is discovered and patched, fraudsters are looking for another loophole to exploit.  This makes the task of predictive analytics different for Fraud than for other areas where underlying patterns are probably more stable.  Any methodology used here must have “turn on a dime” features built in, if possible.  With economic conditions as they are, fraud detection/monitoring will remain an important and challenging field.

Biography

Carole Jesse has been applying statistical methods and advanced analytics in a variety of industries for the last 20 years.  Her career spans High Volume Manufacturing of Consumer Products, Nuclear Energy, Semiconductor Manufacturing/Packaging, Financial Services, and Health Care.  Applications have ranged from Design and Analysis of Experiments to Credit Risk Prediction to Fraud Pattern Recognition.  Carole holds a B.S. in Mathematics from the University of Wisconsin and an M.S. in Statistics from Montana State University, as well as several professional certifications.  All the opinions expressed here are her own, and not those of her employers: past, present, or future.  (Although her dog Angie may have had some influence.)  Ms. Jesse currently lives and works in Minneapolis, Minnesota.

You can find Carole on Twitter as @CaroleJesse and at LinkedIn http://www.linkedin.com/in/CaroleJesse



1) Describe your career in science from school to now.

Truthfully, my career in science started in 7th grade. Hey, I know this is further back in time than you intended the question to go!  However, something significant happened that year that pretty much set me on the path that I am still on today.  I discovered Algebra.  Up to that point in time, I was an average student in ‘arithmetic’. Algebra introduced LETTERS into the mix with numbers, in the simplest of ways that we have all seen: ‘Solve for x in the equation x+2=5’.  That was something I could get behind, AND I excelled at it immediately. Without mathematical excellence, efforts in learning science can fall apart.  Mathematics is everywhere!

I spent the rest of my secondary education consuming all the math and science that I could get. By the time I entered college I had already been exposed to pre-calculus and physics and was actually surprised by those in my college Freshman courses who had not seen anti-derivatives, memorized the quotient rule, or worked an inclined plane friction problem before.

My goal as an undergraduate was to become a Veterinarian.  The beauty of a pre-Vet curriculum is that it is pretty much like pre-Med, rigorous and broad in the sciences.  In my first two years of undergraduate work, I was exposed to more Chemistry, more Mathematics, more Physics, along with things like Genetics, Biology, even the Plant and Animal Sciences.  Although I did not stick with my pursuit of Veterinary Medicine, it laid a solid foundation that has served me very well in the strangest of places.

I consider myself a Mathematician/Statistician due to my academic degrees in those areas, first a BS in Mathematics/Physics at the University of Wisconsin followed by a MS in Statistics at Montana State University. In between the BS and MS I also dabbled briefly in Electrical Engineering at the University of Minnesota.

Since academia, it is my breadth in ALL sciences which has allowed me to be very fluid in straddling diverse industries: from High Volume Manufacturing of Consumer Products, to Nuclear Energy, to Semiconductor Manufacturing/Packaging, to Financial Services, to Health Care. I succeed at business problem solving in these industries by applying my Statistical Methods knowledge, coupled with business acumen and peripheral understanding of the technologies used. I have worked closely with scientists and engineers, and could enter THEIR world speaking THEIR language, which was an aid in getting to these solutions quickly.

I cannot place enough emphasis on the importance of exposure to a broad range of sciences, and as early as possible, for anyone who wants to be involved in Advanced Analytics and Business Intelligence. As a manager, I look closely at candidates for these diverse sorts of backgrounds.

2) I find the number of computer scientists and analysts to be overwhelmingly male despite this being a lucrative profession. Do you think that BI and Analytics are male dominated?  How can the trend be re-shaped?

Welcome to my world!  All kidding aside, yes that has been my observation as well. While I am not versed in the specifics of actual gender statistics in Computer Science and Advanced Analytics versus other fields, based on my years in and around these fields, there does appear to be a bias.

This is not due to a lack of capability or interest in these fields on the part of women. I believe it is more due to the long history of cultural norms and negative social messages that perhaps push women away from these fields.  The messages can be subtle, but if you pay close attention, you will see them.  Being one of 10 females in an undergraduate engineering class of 150 students has a message right there.  Even though these 10 women were able to make entry to the class, the pressure of being a minority, whether gender based or otherwise, can be a powerful influencer in remaining there.

In my own experience, I have encountered frequent judgments where I was made to feel “good at math” was an unacceptable trait for a woman to have.  It is important to note that these judgments have been delivered equally by men AND women. So I think until both genders develop higher expectations of women in the hard science areas, the trends will continue.  It has been decades since my 7th grade introduction to algebra, but it appears the negative social messages regarding girls in math and science are still present today. Otherwise there would be no need (i.e. no market) for books like Danica McKellar’s “Math Doesn’t Suck,” and the follow-up “Kiss My Math,” both aimed at battling these negative messages at the middle school level.

As to how I have battled these cultural expectations, I developed a thick skin. I have also learned to expect excellence from myself even when a teacher, or a peer, or a boss may have had lower expectations for me than for a male counterpart. Sort of a John Mayer “Who Says” type of attitude.  Who says I can’t do Math and Science? Watch me.

3) How would you explain Risk Management using software to a class of graduate students in mathematics and statistics?

There are many areas of Risk Management.  My specific experience has been on the Credit Risk Management and Fraud Risk Management sides in a couple of industries.  For credit risk in financial services, typically there is a specific department whose role is to quantify and predict credit risk.  Not just for the current portfolio, but for new products as well.  Various methodologies are utilized, ranging from summarization of portfolio characteristics that have a known relationship to default to using historical data to build out predictive models for production implementation.  Key skills needed here are a good understanding of the business, solid statistical methods knowledge, and computing skills.  As far as the computing/software skills needed, there are three main categories: 1) query and preparation of data, 2) model building and validation, and 3) model implementation.  The actual tools will likely differ across these categories.  For example, 1) might be tackled with SAS®, Business Objects, or straight SQL; 2) requires a true modeling package or coding language like SAS®, SPSS, R, etc.; and lastly 3) is the trickiest, as implementation can have many system limitations, but SAS® or C++ are often seen at implementation.

