Who will forecast for the forecasters?

An interesting blog post appeared at http://www.information-management.com/blogs/business_intelligence_bi_statistics-10016491-1.html, basically laying out the competitive landscape for analytics companies.

“One safe bet is that IBM, with newly-acquired SPSS and Cognos, is gearing up to take on SAS in the high-end enterprise analytics market that features very large data and operational analytics with significant capacity challenges. In this segment, IBM can leverage its hardware, database and consulting strengths to become a formidable SAS competitor.

and

A number of start-up companies promoting competitive SAS language tools at a fraction of SAS prices may begin chipping away at many SAS annuity customers. As I wrote in last week’s blog, WPS from World Programming Systems is an outstanding SAS compiler that can replace expensive SAS licenses in many cases – especially those primarily used for data step programs. Similarly, another competitor, Carolina, from Dulles Research, LLC, converts Base SAS program code to Java, which can then be deployed in a Java run-time environment. Large SAS customer Discover Card is currently evaluating Carolina as a replacement for some of its SAS applications.

Steve Miller’s blog can also be found at miller.openbi.com.”

I think all of these companies have hired people smart enough that many of their efforts will cancel each other out, in true game-theory fashion.

I also find it extremely hypocritical for commercial vendors not to incentivize R algorithm developers, treating the 2,000-plus packages as essentially freeware.

If used for academics and research, R package creators expect and demand no money. But if used commercially, shouldn’t the leading analytical vendors like SAS, SPSS, and even the newly cash-infused REvolution create some kind of royalty-sharing agreement?

If iTunes can sell songs for 99 cents per download and really help the music industry move to the next generation, how much would commercial vendors agree to share for their solutions, which ARE DEPENDENT on popular R packages like plier or even Dr. Frank Harrell’s Hmisc?

Unless you think Britney Spears has a better right to intellectual property than venerable professors and academics.

Even a monthly $10,000 prize for the best R package created (with commercial-use rights for the sponsoring company) could help speed up the R software movement, just like the Netflix Prize.

More importantly, it could free up valuable resources for companies to concentrate on customer solutions like data quality, systems integration, and the computational shift to clouds, which even today is sadly lacking in the whole analytical ecosystem.

One interesting paradigm I see is this: whoever masters the new computational requirements of large amounts of unstructured data (not just row-and-column numeric data, but text-sentiment-analysis-like data), and can integrate this into a complete customer solution in an easy-to-understand, data-visualization-enabled system, will lead the next decade – whether that is a specific package, platform or company.

(Q: if the ’90s were the Nineties, will the next decade be the teen years?)

Teradata and SAS innovate together

I missed posting this one – it’s the big news of Teradata and SAS coming closer to create a much-needed business analytics center.

SAS and Teradata will establish a Business Analytics Innovation Center.

In conjunction with Elder Research Inc., the Center will offer leading-edge analytic thinking and implementation.

Core news facts:

With this centralized “think-tank,” customers can discuss analytic best practices with domain subject matter experts and quickly test or implement innovative models focused on uncovering unique insights for optimizing business operations.

The Business Analytics Innovation Center will combine the strengths of SAS, Teradata and Elder Research (the world’s leading analytical consulting firm for data mining and predictive analytics) to help customers across all industries grow revenue, reduce risk and improve operations.

The Center will incorporate unmatched thought leadership with a visionary lab for pilot programs, analytic workshops, and proofs-of-concept for prospective customers across a myriad of industries, including financial services, retail, government, health care and life sciences, and insurance.

Also see- http://www.sas.com/partners/directory/teradata/index.html


Press Release

CARY, NC (Oct. 26, 2009) – Recognizing the growing need and challenges businesses face driving operational analytics across enterprises, SAS and Teradata are planning to establish a centralized “think tank” where customers can discuss analytic best practices with domain and subject-matter experts, and quickly test or implement innovative models that uncover unique insights for optimizing business operations. The Business Analytics Innovation Center will combine the strengths of SAS, the leader in business analytics <http://www.sas.com/businessanalytics/> software and services, Teradata Corporation <http://www.teradata.com> (NYSE: TDC <http://www.nyse.com/about/listed/lcddata.html?ticker=TDC> ), the world’s largest company solely focused on data warehousing <http://www.teradata.com/t/enterprise-data-warehousing/> and enterprise analytics <http://www.teradata.com/t/business-needs/data-mining-and-analytics/> , and Elder Research Inc <http://www.datamininglab.com/> . (ERI), the world’s leading analytical consulting firm for data mining and predictive analytics, to help customers across all industries grow revenue, reduce risk and improve operations.

“For decades, SAS, Teradata and ERI have provided state-of-the-art analytical solutions for their respective customers,” said Jim Davis, SAS Senior Vice President and Chief Marketing Officer. “The Business Analytics Innovation Center combines our individual strengths for a robust approach to enterprise analytics. By acting faster and asking questions they previously couldn’t ask or didn’t think of, participating clients will be able to improve their operations and strengthen competitive advantage.”

The Center will incorporate unmatched thought leadership with a visionary lab for pilot programs, analytic workshops, and proofs of concept for prospective customers across a myriad of industries, including financial services, retail, government, health care and life sciences, and insurance.

The joint offerings that will comprise the Center are expected to yield powerful results. For example, a top global consumer electronics firm used resources from ERI that will become part of the Business Analytics Innovation Center to discover and prevent fraud committed by clients and partners. Within the first year, the firm recovered more than $20 million.

They continue to rely on ERI for new analytical programs, rewarding their anti-fraud division’s effectiveness with expanded budget, staff and charter.

(Note from Ajay: read my earlier article on case management solutions by SAS, announced in a recent press release at the Business Leadership Series, the event following the Data Mining 2009 conference.)

Screenshot: terrific web and social media campaign by www.teradata.com (the only BI company I know of that has an iPhone application AND a Facebook application).

Running Stats Software on Clouds

If you have a small beat-up laptop and want to rent, say, heavy hardware and expensive software, a technology demonstration is underway at www.analysis.utk.edu.

We basically use a Citrix server to run R, SAS and JMP (among others) on the Web through the browser.

Note: if you just run this on a normal Web server with lots of hardware packed behind it, you can start giving cloud computing solutions to your clients for free – eliminating the OS (Windows, unless you like it) and hardware (read: HP etc.), thus reducing total cost of ownership for your final customers.

Are Revolution / SAS / SPSS listening?

See Screenshot-

(Note: this can be of terrific use for, say, software companies wanting to license out to new markets in Asia, as they can also analyze usage data and share the efficiencies with newer users. By adopting a Software as a Service model – optimizing the revenue stream for cannibalization effects – they can also gain an advantage over more established players, something R, or SPSS/IBM with its hardware and broader sales and distribution network, could also do.)

Note: these are my personal views only and don’t represent the University’s views. For more on the University of Tennessee’s technology initiatives, please go to http://oit.utk.edu/index.php, where Scott Studham is using his expertise to revolutionize the way education costs can be lowered using technology.


Audio Interview: Anne Milley, Part 1

Here is an interview I did at M2009 with Anne Milley, senior SAS strategist. This is an audio interview, so listen on:


or, if bandwidth is too slow, you can download the interview here:

https://decisionstats.com/wp-content/uploads/2009/11/memo1.m4a


Faye Meredith of SAS was kind enough to read the questions aloud.

This was a nice follow-up to the March interview: https://decisionstats.wordpress.com/2009/03/04/interview-anne-milley-sas-part-1/

Biography

Anne H. Milley is senior director of technology product marketing in SAS’ worldwide marketing organization, where she oversees the marketing of SAS technologies.  Her ties to SAS began with her thesis on bank failure prediction models and the term structure of interest rates, completed at The Federal Home Loan Bank of Dallas, where she became a manager in the credit group.  She continued her use of SAS at 7-Eleven, Inc. as a senior business consultant, performing sales analysis and designing and conducting tests to aid strategic decision-making, e.g., price sensitivity studies and advertising and promotion analysis.

Milley has authored various papers, articles and an award-winning report for the 1999 KDD Contest: http://www-cse.ucsd.edu/users/elkan/kdresults.html.  She co-chaired the SAS Data Mining Technology Conferences, M2001 and M2002 as well as SAS’ inaugural forecasting conference, F2006.  She has served on web mining committees for KDD and SIAM and on the Scientific Advisory Committee for Data Mining 2002.  In 2008 she completed a 5-month working sabbatical at a major financial services company in the United Kingdom.

Disclaimer: travel and stay expenses to Data Mining 2009 were paid for me by SAS Institute, as per FTC regulations for bloggers. All opinions expressed are personal and do not represent any organization.

SAS on Fraud

As part of SAS Institute’s ongoing leadership series in Las Vegas, here is a press release:

http://www.sas.com/news/preleases/casemanagementPBLSVegas09.html

 

SAS supplies super sleuthing powers with new case management solution

The Premier Business Leadership Series, LAS VEGAS  (Oct. 28, 2009)  –  Organizations in areas such as banking, insurance, government and healthcare need to stay one step ahead of criminals and better manage investigations of fraud and other financial crimes. New software from SAS, the leader in business analytics software and services, arms investigators with the analytical power to streamline processes and investigations, helping to reduce costs and improve fraud prevention. SAS® Enterprise Case Management is the latest component to the recently announced SAS Fraud Framework.

A 2008 survey by the Association of Certified Fraud Examiners (ACFE) revealed that 90 percent of ACFE member companies expect fraud to continue growing, with internal fraud increasing most rapidly. A TowerGroup report[1] from August 2009 found “case management can contribute a return on investment beyond fraud and risk mitigation by reducing operating costs and even uncovering new business opportunities. Case management offers an effective and automated means to leverage data used in fraud investigation for marketing and proactive customer outreach to improve retention.”

“Leading case management technologies will make it easier to see, aggregate, and integrate operational and other firmwide risks while helping to mitigate the cost and complexity of compliance,” said Rodney Nelsestuen, Senior Research Director at TowerGroup and author of the August 2009 report.

Case management, the backbone of documenting investigations, provides information for financial reporting, such as fraud losses, and is the primary resource for filing regulatory reports to government agencies. In addition, the disposition of cases under investigation is a critical element to enhance future monitoring and overall operational efficiencies of the organization.

Other advantages of SAS Enterprise Case Management include integrating information from siloed monitoring systems used by other lines of business, products, or channels. The system offers a triage queue where users can review work items sent to the system automatically prior to appending to an existing case, creating a new case, or closing the work item. SAS Enterprise Case Management pre-populates forms and automatically prepares batch reports to help the user meet e-filing standards for regulatory submission. The system includes interactive dashboards that enable management to analyze operational performance of investigative functions and recognize trends to provide improved oversight and governance of risk management practices.
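The triage flow described above – review a work item, then either append it to an existing case, open a new case, or close the item – can be sketched in a few lines. This is purely an illustrative sketch in Python; the item fields, risk scores, and the 0.5 threshold are my own assumptions, not SAS Enterprise Case Management’s actual design or API.

```python
from collections import deque

# Hypothetical work items sent to the triage queue automatically
triage = deque([
    {"id": 101, "account": "A", "score": 0.9},
    {"id": 102, "account": "B", "score": 0.2},
    {"id": 103, "account": "A", "score": 0.7},
])
cases = {}   # account -> list of work-item ids grouped into one case
closed = []  # work items closed without a case

while triage:
    item = triage.popleft()
    if item["score"] < 0.5:
        closed.append(item["id"])                  # low risk: close the work item
    elif item["account"] in cases:
        cases[item["account"]].append(item["id"])  # append to an existing case
    else:
        cases[item["account"]] = [item["id"]]      # open a new case

print(cases, closed)
```

Here the two high-score items on account A end up consolidated into one case – the cross-silo aggregation the press release emphasizes.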

Today’s announcement came at The Premier Business Leadership Series event in Las Vegas, a business conference presented by SAS that brings together more than 600 attendees from the public and private sectors to share ideas on critical business issues.

1. TowerGroup. The Evolution of Case Management: Converging Fraud, Risk, and Opportunity Management, Rodney Nelsestuen, August 31, 2009.

About SAS

Back to Recent SAS Press Releases

SAS is the leader in business analytics software and services, and the largest independent vendor in the business intelligence market. Through innovative solutions delivered within an integrated framework, SAS helps customers at more than 45,000 sites improve performance and deliver value by making better decisions faster. Since 1976 SAS has been giving customers around the world The Power to Know® .

SAS Data Mining 2009 Las Vegas

I am going to Las Vegas as a guest of SAS Institute for the Data Mining 2009 conference. (Note: FTC regulations on bloggers come into effect in December, but my current policies are on the ADVERTISE page, unchanged for some months now.)

SAS Institute, the big heavyweight of analytics, showcases events at both the SAS Global Forum and the Data Mining 2009 conference, which has a virtual who’s-who of partners. This includes my friends at Aster Data and Shawn Rogers of the Beye Network, in addition to Anne Milley, Senior Product Director. Anne is a frequent speaker for SAS Institute and has shrugged off the beginning-of-the-year New York Times spat with R and open source. True to their word, SAS did go ahead and launch SAS/IML with an interface to R – mindful of the GPL as well as open-source sentiments.

While SPSS does have a data mining product, there is considerable discussion on its help list today about what direction IBM will allow the product to evolve in.

Charlie Berger of Oracle Data Mining also announced at Oracle World that he is going to launch a GUI-based data mining product for free (or probably on a Software as a Service model). Thanks to Karl Rexer of Rexer Analytics for this tip.

While this is my first trip to Las Vegas (a change from cold Tennessee weather), I hope to read new material on data mining, including sessions on blog and text mining and statistical usage of the same. Data mining continues to be an enduring passion for me, even though I may need a divine miracle to get my PhD funded on that topic.

Also, I may have some tweets at #M2009 for you, plus some video interviews and photos. OK, watch this space.

P.S. We lost to Alabama, #2 in the country, by two points because two punts were blocked by hand – as close as it gets.

Next week I hope to watch the South Carolina match in Orange Country.


Interview: Carole Jesse, Experienced Analytics Professional

An interview with Carole Jesse, an experienced analytics professional in SAS, JMP, and risk management.


Ajay- Describe your career in science from school to now.

Carole- Truthfully, my career in science started in 7th grade. Hey, I know this is further back in time than you intended the question to go!  However, something significant happened that year that pretty much set me on the path that I am still on today.  I discovered Algebra.  Up to that point in time, I was an average student in ‘arithmetic’. Algebra introduced LETTERS into the mix with numbers, in the simplest of ways that we have all seen: ‘Solve for x in the equation x+2=5’.  That was something I could get behind, AND I excelled at it immediately. Without mathematical excellence, efforts in learning science can fall apart.  Mathematics is everywhere!

I spent the rest of my secondary education consuming all the math and science that I could get. By the time I entered college I had already been exposed to pre-calculus and physics and was actually surprised by those in my college Freshman courses who had not seen anti-derivatives, memorized the quotient rule, or worked an inclined plane friction problem before.

My goal as an undergraduate was to become a Veterinarian.  The beauty of a pre-Vet curriculum is that it is pretty much like pre-Med, rigorous and broad in the sciences.  In my first two years of undergraduate work, I was exposed to more Chemistry, more Mathematics, more Physics, along with things like Genetics, Biology, even the Plant and Animal Sciences.  Although I did not stick with my pursuit of Veterinary Medicine, it laid a solid foundation that has served me very well in the strangest of places.

I consider myself a Mathematician/Statistician due to my academic degrees in those areas, first a BS in Mathematics/Physics at the University of Wisconsin followed by a MS in Statistics at Montana State University. In between the BS and MS I also dabbled briefly in Electrical Engineering at the University of Minnesota.

Since academia, it is my breadth in ALL sciences which has allowed me to be very fluid in straddling diverse industries: from High Volume Manufacturing of Consumer Products, to Nuclear Energy, to Semiconductor Manufacturing/Packaging, to Financial Services, to Health Care. I succeed at business problem solving in these industries by applying my Statistical Methods knowledge, coupled with business acumen and peripheral understanding of the technologies used. I have worked closely with scientists and engineers, and could enter THEIR world speaking THEIR language, which was an aid in getting to these solutions quickly.

I cannot place enough emphasis on the importance of exposure to a broad range of sciences, as early as possible, for anyone who wants to be involved in Advanced Analytics and Business Intelligence. As a manager, I look closely at candidates for these diverse sorts of backgrounds.

Ajay- I find the number of computer scientists and analysts to be overwhelmingly male despite this being a lucrative profession. Do you think that BI and Analytics are male dominated?  How can the trend be re-shaped?

Carole- Welcome to my world!  All kidding aside, yes that has been my observation as well. While I am not versed in the specifics of actual gender statistics in Computer Science and Advanced Analytics versus other fields, based on my years in and around these fields, there does appear to be a bias.

This is not due to a lack of capability or interest in these fields on the part of women. I believe it is more due to the long history of cultural norms and negative social messages that perhaps push women away from these fields.  The messages can be subtle, but if you pay close attention, you will see them.  Being one of 10 females in an undergraduate engineering class of 150 students has a message right there.  Even though these 10 women were able to make entry to the class, the pressure of being a minority, whether gender based or otherwise, can be a powerful influencer in remaining there.

In my own experience, I have encountered frequent judgments where I was made to feel “good at math” was an unacceptable trait for a woman to have.  It is important to note that these judgments have been delivered equally by men AND women. So I think until both genders develop higher expectations of women in the hard science areas, the trends will continue.  It has been decades since my 7th grade introduction to algebra, but it appears the negative social messages regarding girls in math and science are still present today. Otherwise there would be no need (i.e. no market) for books like Danica McKellar’s “Math Doesn’t Suck,” and the follow-up “Kiss My Math,” both aimed at battling these negative messages at the middle school level.

As to how I have battled these cultural expectations, I developed a thick skin. I have also learned to expect excellence from myself even when a teacher, or a peer, or a boss may have had lower expectations for me than for a male counterpart. Sort of a John Mayer “Who Says” type of attitude.  Who says I can’t do Math and Science. Watch me.

Ajay- How would you explain Risk Management using software to a class of graduate students in mathematics and statistics?

Carole- There are many areas of Risk Management.  My specific experience has been on the Credit Risk Management and Fraud Risk Management sides in a couple of industries.  For credit risk in financial services, typically there is a specific department whose role is to quantify and predict credit risk.  Not just for the current portfolio, but for new products as well.  Various methodologies are utilized, ranging from summarization of portfolio characteristics that have a known relationship to default to using historical data to build out predictive models for production implementation.

Key skills needed here are a good understanding of the business, solid statistical methods knowledge, and computing skills.  As far as the computing/software skills needed, there are three main categories: 1) query and preparation of data, 2) model building and validation, and 3) model implementation.  The actual tools will likely differ across these categories.

For example, 1) might be tackled with SAS®, Business Objects, or straight SQL;

2) requires a true modeling package or coding language like SAS®, SPSS, R, etc; and lastly

3) is the trickiest, as implementation can have many system limitations, but SAS® or C++ are often seen at implementation.
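As a toy illustration of those three stages, here is a sketch in Python, with SQLite standing in for the query tools and a deliberately tiny hand-weighted scorecard standing in for a real modeling package like SAS, SPSS or R. The table, the weights, and the 0.5 cutoff are all invented for illustration, not any real credit model.

```python
import sqlite3

# --- Stage 1: query and preparation of data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts "
             "(id INTEGER, utilization REAL, late_payments INTEGER, defaulted INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?, ?)", [
    (1, 0.20, 0, 0), (2, 0.95, 3, 1), (3, 0.50, 1, 0), (4, 0.88, 2, 1),
])
rows = conn.execute("SELECT utilization, late_payments, defaulted FROM accounts").fetchall()

# --- Stage 2: model building and validation (a toy scorecard; in practice
# --- the weights would come from fitting a model to historical defaults)
def score(utilization, late_payments):
    return 0.6 * utilization + 0.2 * min(late_payments, 3) / 3

# --- Stage 3: model implementation -- apply the score and flag high-risk accounts
flagged = [(u, lp) for (u, lp, d) in rows if score(u, lp) > 0.5]
print(flagged)
```

The point is the separation of concerns: the query layer, the model, and the production scoring step are distinct, and each may live in a different tool.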

Ajay- Describe some of your most challenging and most exciting projects over the years.

Carole- I have been very fortunate to have many challenges and good projects in every role I have been in, but as I look back today, some things that stand out the most were in ‘high tech’.  By virtue of being high tech, there is no fear of technology, and it is fast-paced and ever evolving to the next generation of product.

I spent seven years in the semiconductor industry during the ’90s at Micron Technology, Intel, and Motorola. At the beginning of that window we left the 486 processor world, and during that window we spanned the realm of Pentium processors. Moore’s Law dominated all of this. To stay competitive, all of these companies embraced statistical methods to help speed up development time.

At one point, I supported a group of about 10 R&D engineers in the Design and Analysis of their process improvement and simplification experiments.  This afforded me exposure to much of the leading edge research the team was working on.

I recall one project with the goal of optimizing capacitance via surface roughness of the capacitor structures.  In addition to all the science involved at the manufacturing step, what made this so interesting was the difficulty in measuring capacitance at the point in the process where film roughness was introduced. All we had were surface images after this step.  The semiconductor wafers had to pass through several more process steps to get to the point where capacitance could actually be measured. All of this provided challenges around the design of the experiment and the data handling and analysis.

By working closely with both the process engineer and the process technician I was able to gather the image files off the image tool that were taken from the experimental runs. I used SAS® (yes, another shameless plug for my favorite software) to process the images using Fast Fourier Transforms. Subsequently, the transformed data was correlated to the capacitance in the analysis of the experimental results.  Finding the sweet spot for capacitance, as driven by surface roughness, provided a huge leap for this process technology team.
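To give a flavor of the technique Carole describes – without claiming anything about the actual proprietary analysis – here is a minimal sketch: a textbook radix-2 FFT applied to two synthetic 1-D “surface profiles,” using the energy in the upper part of the spectrum as a crude roughness proxy. The profiles and the band chosen are invented for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; length of x must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def roughness(profile):
    """Energy in the upper half of the positive-frequency band."""
    spectrum = fft(profile)
    n = len(spectrum)
    return sum(abs(c) ** 2 for c in spectrum[n // 4: n // 2])

# Two synthetic 1-D "surface profiles": a smooth wave vs. one with
# a high-frequency ripple superimposed
n = 64
smooth = [math.sin(2 * math.pi * i / n) for i in range(n)]
jagged = [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(2 * math.pi * 20 * i / n)
          for i in range(n)]
print(roughness(smooth), roughness(jagged))
```

The jagged profile carries far more high-frequency spectral energy, which is the kind of transformed feature that could then be correlated against a downstream response like capacitance.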

The challenges of today are much different than they were in the 90s.  In more recent years, I have been working with transactional data related to financial services or health care claims.  The challenges manifest themselves in the sheer volume of the data. In the last decade in particular, most industries have been able to put the infrastructures in place to gather and store massive amounts of data related to their businesses.  The challenge of turning this data into meaningful, actionable information has been just as exciting as using Fast Fourier Transforms on image processing to optimize capacitance!

Currently I am working with an Oracle database where one table in the schema has 250 million records and a couple hundred fields.  I refer to this as a “Pushing Tera” situation, since this one table is close to a Terabyte in size. As far as storing the data, that is not a big deal, but working with data this large or larger is the challenge.

Different skill sets are needed here beyond those of just an analyst, data miner, or statistician.  These VLDB situations have morphed me into a bit of an IT person.

  • How do you efficiently query such large databases? An inefficient SQL query will not be a bother in a situation where the database is small. But when the database is large, SQL efficiency is key. Many skills needed for industry are not necessarily taught in academia, but rather get picked up along the way, like Unix and SQL. I now write efficient SQL code, but many poorly written jobs gave their lives so that I could learn these efficiencies!
  • Eventually I will need to organize this data into an application specific format and put data security controls around the process.  Again, is this Advanced Analytics?  Not really, it is more of an MIS role. The newness in these challenges keeps me excited about my work.
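The SQL-efficiency point in the first bullet can be shown on a toy table: filtering client-side forces every row across the wire, while pushing the predicate (and an index) down to the database returns only what is needed. SQLite stands in for Oracle here, and the table and columns are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, member_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO claims VALUES (?, ?, ?)",
                 [(i, i % 1000, float(i % 97)) for i in range(10000)])

# Inefficient habit: pull every row to the client, filter in application code
all_rows = conn.execute("SELECT id, member_id, amount FROM claims").fetchall()
big_client_side = [r for r in all_rows if r[1] == 42 and r[2] > 50]

# Efficient habit: index the filter column and push the predicate into the
# database, so only the matching rows cross the wire
conn.execute("CREATE INDEX idx_member ON claims(member_id)")
big_db_side = conn.execute(
    "SELECT id, member_id, amount FROM claims WHERE member_id = ? AND amount > ?",
    (42, 50.0)).fetchall()

print(len(big_db_side), sorted(big_client_side) == sorted(big_db_side))
```

On a 10,000-row table the difference is invisible; on a 250-million-row, near-terabyte table it is the difference between a query that finishes and one that doesn’t.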

Ajay- How important do you think work life balance is for people in this profession? What do you do to chill out?

Carole- I don’t think the work-life balance is any more or less important to the decision science professionals than it is to any other profession really.  I have friends in many other professions like Law, Nursing, Financial Planning, etc. with the same work-life balance struggles.

We live in a busy culture that includes more and more demands placed on us professionally.  Let’s face it, most of us are care-takers to someone besides ourselves.  It might be a spouse, or a child, or a dog, or even an elderly parent. Therefore, a total focus on work is bound to upset the work-life balance for most of us.

My biggest struggle comes in the form of balancing the two sides of my brain.  That may sound weird, but one thing you have to agree with is that all of this is pretty Left Brained:  mathematics, statistics, business intelligence, computing, etc.

To balance this out, and tap into my Right Brain, I like to dabble in the arts to some extent.  Don’t get me wrong, I am not an artist!  But that doesn’t mean I can’t draw on creativity in the artistic sense. For example, this past summer I took a course on Adobe Photoshop and Illustrator at Minneapolis College of Art and Design. This provided the best of both worlds, combining software and art! In addition to learning how to remove Cindy Crawford’s mole (yes, we did this), there were some very useful projects.  One of my course projects was creating my customized Twitter background. An endeavor like this provides me a ‘chilling out’ factor from the normal work world. I know of many other Left Brain leaners that do similar things, like playing a musical instrument, or painting, etc. This is another reason why I took up digital photography: more visual arts.

Volunteer work has a balancing effect too. I try to give back to the community when I can. Swinging a hammer at Habitat for Humanity, or doing record keeping for an Animal Rescue organization, are things I have participated in.

And if none of this works, I enjoy cooking for my family and friends, and plying them with wine!

Ajay- What are your views on:

Carole- A) Data Quality

I’d have to say I am for data quality! Who isn’t? But the reality is that data is dirty.  That “Pushing Tera” Oracle table I mentioned earlier – well, it turns out it has some issues.  And it is incumbent upon me to determine the quality of that data before attempting to do anything analytical with it.  One place in industry where value enhancements are needed: database administrators with business knowledge.  It seems that more times than not, even if there was a business-savvy DBA, they may have moved on, leaving the consumers of that data (that would be me) to fend for themselves. There is some debate over which philosopher said “Know thyself.”  Today’s job challenge is to “know thy data,” or perhaps “value those that know thy data.”
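“Know thy data” can start with something as simple as a profiling pass before any modeling. A minimal sketch, with a few toy rows standing in for the “Pushing Tera” table: count the nulls and distinct values in each column so the dirt is visible up front.

```python
# Toy rows standing in for a large claims table (columns invented)
rows = [
    {"member_id": 1, "state": "MN", "amount": 120.0},
    {"member_id": 2, "state": None, "amount": 80.0},
    {"member_id": 2, "state": "MN", "amount": None},
    {"member_id": 3, "state": "WI", "amount": 80.0},
]

def profile(rows):
    """Per-column null count and distinct-value count."""
    report = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        non_null = [v for v in values if v is not None]
        report[col] = {
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
        }
    return report

print(profile(rows))
```

On a real terabyte-scale table this counting would of course be pushed into SQL, but the principle is the same: measure the dirt before you analyze.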

B) Predictive Analytics for Fraud Monitoring

There is a huge market for analytics in fraud detection and prevention.  But it is not for the faint of heart. Insiders, at least in Mortgage and Health Care, are the typical perpetrators of lucrative fraud. These insiders know how the industry processes work and they exploit this.  As soon as one loophole is discovered and patched, fraudsters are looking for another loophole to exploit.  This makes the task of predictive analytics different for fraud than for other areas where underlying patterns are probably more stable.  Any methodology used here must have “turn on a dime” features built in, if possible.  With economic conditions as they are, fraud detection and monitoring will remain an important and challenging field.

Biography

Carole Jesse has been applying statistical methods and advanced analytics in a variety of industries for the last 20 years.  Her career spans High Volume Manufacturing of Consumer Products, Nuclear Energy, Semiconductor Manufacturing/Packaging, Financial Services, and Health Care.  Applications have ranged from Design and Analysis of Experiments to Credit Risk Prediction to Fraud Pattern Recognition.  Carole holds a B.S. in Mathematics from the University of Wisconsin and a M.S. in Statistics from Montana State University, as well as several professional certifications.  All the opinions expressed here are her own, and not those of her employers: past, present, or future.  (Although her dog Angie may have had some influence.)  Ms. Jesse currently lives and works in Minneapolis, Minnesota.

You can find Carole on Twitter as @CaroleJesse and at LinkedIn http://www.linkedin.com/in/CaroleJesse



1) Describe your career in science from school to now.

Truthfully, my career in science started in 7th grade. Hey, I know this is further back in time than you intended the question to go!  However, something significant happened that year that pretty much set me on the path that I am still on today.  I discovered Algebra.  Up to that point in time, I was an average student in ‘arithmetic’. Algebra introduced LETTERS into the mix with numbers, in the simplest of ways that we have all seen: ‘Solve for x in the equation x+2=5’.  That was something I could get behind, AND I excelled at it immediately. Without mathematical excellence, efforts in learning science can fall apart.  Mathematics is everywhere!

I spent the rest of my secondary education consuming all the math and science that I could get. By the time I entered college I had already been exposed to pre-calculus and physics and was actually surprised by those in my college Freshman courses who had not seen anti-derivatives, memorized the quotient rule, or worked an inclined plane friction problem before.

My goal as an undergraduate was to become a Veterinarian.  The beauty of a pre-Vet curriculum is that it is pretty much like pre-Med, rigorous and broad in the sciences.  In my first two years of undergraduate work, I was exposed to more Chemistry, more Mathematics, more Physics, along with things like Genetics, Biology, even the Plant and Animal Sciences.  Although I did not stick with my pursuit of Veterinary Medicine, it laid a solid foundation that has served me very well in the strangest of places.

I consider myself a Mathematician/Statistician due to my academic degrees in those areas, first a BS in Mathematics/Physics at the University of Wisconsin followed by an MS in Statistics at Montana State University. In between the BS and MS I also dabbled briefly in Electrical Engineering at the University of Minnesota.

Since academia, it is my breadth in ALL sciences which has allowed me to be very fluid in straddling diverse industries: from High Volume Manufacturing of Consumer Products, to Nuclear Energy, to Semiconductor Manufacturing/Packaging, to Financial Services, to Health Care. I succeed at business problem solving in these industries by applying my Statistical Methods knowledge, coupled with business acumen and peripheral understanding of the technologies used. I have worked closely with scientists and engineers, and could enter THEIR world speaking THEIR language, which was an aid in getting to these solutions quickly.

I cannot place enough emphasis on the importance of exposure to a broad range of sciences, and as early as possible, for anyone who wants to be involved in Advanced Analytics and Business Intelligence. As a manager, I look closely at candidates for these diverse sorts of backgrounds.

2) I find the number of computer scientists and analysts to be overwhelmingly male despite this being a lucrative profession. Do you think that BI and Analytics are male dominated?  How can the trend be re-shaped?

Welcome to my world!  All kidding aside, yes that has been my observation as well. While I am not versed in the specifics of actual gender statistics in Computer Science and Advanced Analytics versus other fields, based on my years in and around these fields, there does appear to be a bias.

This is not due to a lack of capability or interest in these fields on the part of women. I believe it is more due to the long history of cultural norms and negative social messages that perhaps push women away from these fields.  The messages can be subtle, but if you pay close attention, you will see them.  Being one of 10 females in an undergraduate engineering class of 150 students has a message right there.  Even though these 10 women were able to make entry to the class, the pressure of being a minority, whether gender based or otherwise, can be a powerful influencer in remaining there.

In my own experience, I have encountered frequent judgments where I was made to feel that being “good at math” was an unacceptable trait for a woman to have.  It is important to note that these judgments have been delivered equally by men AND women. So I think until both genders develop higher expectations of women in the hard science areas, the trends will continue.  It has been decades since my 7th grade introduction to algebra, but it appears the negative social messages regarding girls in math and science are still present today. Otherwise there would be no need (i.e. no market) for books like Danica McKellar’s “Math Doesn’t Suck,” and the follow-up “Kiss My Math,” both aimed at battling these negative messages at the middle school level.

As to how I have battled these cultural expectations, I developed a thick skin. I have also learned to expect excellence from myself even when a teacher, or a peer, or a boss may have had lower expectations for me than for a male counterpart. Sort of a John Mayer “Who Says” type of attitude.  Who says I can’t do Math and Science?  Watch me.

3) How would you explain Risk Management using software to a class of graduate students in mathematics and statistics?

There are many areas of Risk Management.  My specific experience has been on the Credit Risk Management and Fraud Risk Management sides in a couple of industries.  For credit risk in financial services, there is typically a specific department whose role is to quantify and predict credit risk, not just for the current portfolio, but for new products as well.  Various methodologies are utilized, ranging from summarization of portfolio characteristics with a known relationship to default, to using historical data to build out predictive models for production implementation.  Key skills needed here are a good understanding of the business, solid statistical methods knowledge, and computing skills.  As far as the computing/software skills needed, there are three main categories: 1) query and preparation of data, 2) model building and validation, and 3) model implementation.  The actual tools will likely differ across these categories.  For example, 1) might be tackled with SAS®, Business Objects, or straight SQL; 2) requires a true modeling package or coding language like SAS®, SPSS, R, etc.; and lastly 3) is the trickiest, as implementation can have many system limitations, but SAS® or C++ are often seen at implementation.
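To make category 2) concrete for the students, here is a toy sketch of the model-building step: fitting a logistic regression to predict default from portfolio characteristics. This is not any model Carole describes — the feature names (utilization, delinquency count) and the synthetic data are my own illustrative assumptions, and a real shop would use SAS®, SPSS, or R rather than hand-rolled gradient descent — but it shows the shape of the task.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit a logistic regression by plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * gwj / n for wj, gwj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(w, b, xi):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0

# Hypothetical mini-portfolio: utilization and past delinquencies drive default.
random.seed(42)
X, y = [], []
for _ in range(400):
    util = random.random()           # credit utilization, 0..1
    delinq = random.randint(0, 5)    # count of past delinquencies
    score = 3 * util + 0.8 * delinq - 2.5
    y.append(1 if sigmoid(score) > random.random() else 0)
    X.append([util, delinq])

w, b = fit_logistic(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

In practice the validation step would score a holdout sample rather than the training data, and the implementation step would translate the fitted coefficients into whatever system the production environment supports.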

4) Describe some of your most challenging and most exciting projects over the years.

I have been very fortunate to have many challenges and good projects in every role I have been in, but as I look back today, some things that stand out the most were in ‘high tech’.  By virtue of being high tech, there is no fear of technology, and it is fast-paced and ever evolving to the next generation of product.

I spent seven years in the Semiconductor industry during the ’90s at Micron Technology, Intel, and Motorola. At the beginning of that window, we left the 486 processor world, and during that window we spanned the realm of Pentium processors. Moore’s Law dominated all of this. To stay competitive, all of these companies embraced statistical methods to help speed up development time.

At one point, I supported a group of about 10 R&D engineers in the Design and Analysis of their process improvement and simplification experiments.  This afforded me exposure to much of the leading edge research the team was working on.

I recall one project with the goal of optimizing capacitance via surface roughness of the capacitor structures.  In addition to all the science involved at the manufacturing step, what made this so interesting was the difficulty in measuring capacitance at the point in the process where film roughness was introduced. All we had were surface images after this step.  The semiconductor wafers had to pass through several more process steps to get to the point where capacitance could actually be measured. All of this provided challenges around the design of the experiment and the data handling and analysis.

By working closely with both the process engineer and the process technician I was able to gather the image files off the image tool that were taken from the experimental runs. I used SAS® (yes, another shameless plug for my favorite software) to process the images using Fast Fourier Transforms. Subsequently, the transformed data was correlated to the capacitance in the analysis of the experimental results.  Finding the sweet spot for capacitance, as driven by surface roughness, provided a huge leap for this process technology team.
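The idea of turning surface images into a roughness measure via Fourier transforms can be sketched in miniature. The actual work used SAS® on 2-D image files; the toy below is a 1-D stand-in of my own construction, using a naive DFT on a synthetic height profile and taking the share of high-frequency spectral power as a crude roughness proxy.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform (fine for short profiles)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def high_freq_power(profile, cutoff):
    """Share of one-sided spectral power at or above `cutoff` -- a roughness proxy."""
    spectrum = dft(profile)
    n = len(spectrum)
    power = [abs(c) ** 2 for c in spectrum[1:n // 2]]  # drop DC, keep one side
    total = sum(power)
    return sum(p for k, p in enumerate(power, start=1) if k >= cutoff) / total

# Two synthetic height profiles: a smooth wave, and the same wave plus fine texture.
n = 64
smooth = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
rough = [s + 0.5 * math.sin(2 * math.pi * 20 * t / n)
         for t, s in zip(range(n), smooth)]

smooth_r = high_freq_power(smooth, cutoff=10)
rough_r = high_freq_power(rough, cutoff=10)
```

The rough profile concentrates noticeably more power above the cutoff frequency than the smooth one, which is the signal the experiment could then correlate against downstream capacitance measurements.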

The challenges of today are much different than they were in the 90s.  In the more recent years, I have been working with transactional data related to financial services or health care claims.  The challenges manifest themselves in the sheer volume of the data. In the last decade in particular most industries have been able to put the infrastructures in place to gather and store massive amounts of data related to their businesses.  The challenge of turning this data into meaningful actionable information has been equally exciting as using Fast Fourier Transforms on image processing to optimize capacitance!

Currently I am working with an Oracle database where one table in the schema has 250 million records and a couple hundred fields.  I refer to this as a “Pushing Tera” situation, since this one table is close to a Terabyte in size. As far as storing the data, that is not a big deal, but working with data this large or larger is the challenge.

Different skill sets are needed here beyond those of just an analyst, data miner, or statistician.  These VLDB situations have morphed me into a bit of an IT person.

  • How do you efficiently query such large databases? An inefficient SQL query will not be a bother in a situation where the database is small. But when the database is large, SQL efficiency is key. Many skills needed for industry are not necessarily taught in academia, but rather get picked up along the way, like Unix and SQL. I now write efficient SQL code, but many poorly written jobs gave their lives so that I could learn these efficiencies!
  • Eventually I will need to organize this data into an application specific format and put data security controls around the process.  Again, is this Advanced Analytics?  Not really, it is more of an MIS role. The newness in these challenges keeps me excited about my work.
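The first bullet — push the filtering to the database instead of dragging rows into the client — can be illustrated on a small scale. This is my own sketch against an in-memory SQLite table (the real system is Oracle, and the table and column names here are hypothetical); the lesson about indexed, server-side aggregation carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, "
            "member_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO claims (member_id, amount) VALUES (?, ?)",
                [(i % 100, float(i)) for i in range(10_000)])
cur.execute("CREATE INDEX idx_claims_member ON claims(member_id)")
conn.commit()

# Inefficient pattern: drag every row across the wire, filter in the client.
all_rows = cur.execute("SELECT member_id, amount FROM claims").fetchall()
client_total = sum(a for m, a in all_rows if m == 7)

# Efficient pattern: let the (indexed) database do the filtering and aggregation,
# returning one number instead of every row.
(db_total,) = cur.execute(
    "SELECT SUM(amount) FROM claims WHERE member_id = ?", (7,)).fetchone()
```

On 10,000 rows both approaches finish instantly and agree; at 250 million rows, the difference between shipping one aggregate and shipping the whole table is the difference between a query that runs and one that doesn't.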

5)  How important do you think work life balance is for people in this profession? What do you do to chill out?

I don’t think work-life balance is any more or less important to decision science professionals than it is to any other profession, really.  I have friends in many other professions like Law, Nursing, Financial Planning, etc. with the same work-life balance struggles.

We live in a busy culture that includes more and more demands placed on us professionally.  Let’s face it, most of us are care-takers to someone besides ourselves.  It might be a spouse, or a child, or a dog, or even an elderly parent. Therefore, a total focus on work is bound to upset the work-life balance for most of us.

My biggest struggle comes in the form of balancing the two sides of my brain.  That may sound weird, but one thing you have to agree with is that all of this is pretty Left Brained:  mathematics, statistics, business intelligence, computing, etc.

To balance this out, and tap into my Right Brain, I like to dabble in the arts to some extent.  Don’t get me wrong, I am not an artist!  But that doesn’t mean I can’t draw on creativity in the artistic sense. For example, this past summer I took a course on Adobe Photoshop and Illustrator at Minneapolis College of Art and Design. This provided the best of both worlds, combining software and art! In addition to learning how to remove Cindy Crawford’s mole (yes, we did this), there were some very useful projects.  One of my course projects was creating my customized Twitter background. An endeavor like this provides me a ‘chilling out’ factor from the normal work world. I know of many other Left Brain leaners that do similar things, like playing a musical instrument, or painting, etc. This is another reason why I took up digital photography: more visual arts.

Volunteer work has a balancing effect too. I try to give back to the community when I can. Swinging a hammer at Habitat for Humanity, or doing record keeping for an Animal Rescue organization, are things I have participated in.

And if none of this works, I enjoy cooking for my family and friends, and plying them with wine!

6) What are your views on:

A)  Data Quality

I’d have to say I am for Data Quality! Who isn’t? But the reality is that data is dirty.  That “Pushing Tera” Oracle table I mentioned earlier, well it turns out it has some issues.  And it is incumbent upon me to determine the quality of that data before attempting to do anything analytical with it.  One place in industry where value enhancements are needed:  database administrators with business knowledge.  It seems that, more times than not, even if there was a business-savvy DBA, they may have moved on, leaving the consumers of that data (that would be me) to fend for themselves. There is some debate over which philosopher said “Know thyself.”  Today’s job challenge is to “Know thy data,” or perhaps “Value those that know thy data.”
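“Know thy data” usually starts with a profiling pass before any modeling. Here is a minimal sketch of that first look — the record layout and field names are hypothetical, not the actual Oracle schema — counting missing, distinct, and out-of-range values per field.

```python
# Hypothetical mini-extract from a claims table; None marks a missing value.
rows = [
    {"claim_id": 1, "amount": 125.0, "state": "MN"},
    {"claim_id": 2, "amount": None,  "state": "MN"},
    {"claim_id": 3, "amount": -40.0, "state": "mn"},   # negative amount, bad casing
    {"claim_id": 4, "amount": 310.0, "state": None},
]

def profile(rows):
    """Count missing and distinct values per field before any modeling."""
    report = {}
    for field in rows[0]:
        values = [r[field] for r in rows]
        report[field] = {
            "missing": sum(v is None for v in values),
            "distinct": len({v for v in values if v is not None}),
        }
    # Domain check: claim amounts should never be negative.
    report["amount"]["negative"] = sum(
        1 for r in rows if r["amount"] is not None and r["amount"] < 0)
    return report

report = profile(rows)
```

Even this tiny extract surfaces three distinct quality problems (a missing amount, a missing state, and a negative amount, plus the “MN” vs. “mn” casing drift in the distinct counts) — exactly the kind of thing to catch before the data feeds anything analytical.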

B) Predictive Analytics for Fraud Monitoring

There is a huge market for analytics in fraud detection and prevention.  But it is not for the faint of heart. Insiders, at least in Mortgage and Health Care, are the typical perpetrators of lucrative fraud. These insiders know how the industry processes work, and they exploit this.  As soon as one loophole is discovered and patched, fraudsters are looking for another loophole to exploit.  This makes the task of predictive analytics different for Fraud than other areas where underlying patterns are probably more stable.  Any methodology used here must have “turn on a dime” features built in, if possible.  With economic conditions as they are, fraud detection/monitoring will remain an important and challenging field.