Interview Sarah Burnett, BI Analyst, Ovum Group

Here is an interview with the terrific Sarah Burnett, well-known BI analyst at Ovum Group.


Ajay- Describe your career in science. How do you think science careers can be made more popular with young students?


Sarah- Other than a little time in electronics engineering, I have spent all my career in the computer industry. Science degrees give you the kind of training and credentials that you need for a good start in the world of work. There are many jobs in the applied sciences domain, e.g. electronics and computer science, but science graduates get into a number of other fields, e.g. the financial markets. We are having a crisis in science education here in the UK where a number of good universities have had to close some of their science departments. This is primarily due to the lack of students. The trouble is that science is not considered to be “cool” amongst the would-be students. I believe the way to tackle this is by making science more interesting at school to inspire young people to take it up in tertiary education. Without science and innovation we will be on a slippery slope to economic decline. TV and the media in general do not help with their constant portrayal of scientists as geeks. Perhaps we should find a couple of good-looking scientists to promote in the media to compete for the attention of young people against celebrities. Do you know any Brad Pitt look-alike scientists?

Ajay- I feel the Business Intelligence world is overwhelmingly male in terms of statistics. Do you agree? What makes BI and data mining a not so attractive career traditionally for women?

Sarah- I agree. I think it reflects the general trends in the take-up of science and mathematics by girls at university and also the number of women in management positions. Whatever we do to increase the number of women in those areas is likely to have an effect on the number of women in BI. My view is that we need to make subjects such as statistics more appealing and relevant to girls at school. I believe we can do that by teaching analysis of trends in some weird and wonderful topics with some funny facts thrown in to make it fun. I also think that women tend to be less confident than men in trying technical or scientific subjects. Again, I do not think that the culture of celebrity helps. It puts pressure on girls for the wrong reasons and discourages them from pursuing some worthwhile careers.

Ajay- What are your views on Government spending being measured by Business Intelligence tools the same way as corporate spending is measured by them? Do you think that some of the stimulus package to shore up failing banks was big enough to educate every high school graduate through college (in both the UK and the US)?

Sarah- I am a proponent of BI in the public sector and have written articles and blogged about it. There are many examples of Government bodies getting into financial or capacity difficulties due to the lack of good data and actionable information. BI can be applied at departmental level to improve departmental outcomes and services. BI at multi-agency level can help make the customer/citizen’s journey a positive experience through the maze of public services and help with efficiency targets. The trouble is bringing data together from multiple agencies within legal frameworks such as the Data Protection Act in the UK, and doing so without losing the trust of voters, who need to know that the data is not being shared amongst Government bodies for some sinister reason.

Ajay- If you cannot measure it, you cannot manage it. Do you think measuring the carbon footprint of organizations is the first step to managing environmental fallout? How can IT help make the world greener?

Sarah- Yes, it is a start but, like any other performance measure, it has to be linked to strategic objectives, and the findings have to be acted upon in order to reduce the environmental fallout and move towards achieving those objectives. IT can play a number of different parts in this: automate the necessary processes, capture data and then enable the measuring, monitoring and reporting part of the initiative. IT of course has to put its own house in order to become a sustainable service that helps with the overall green objectives.

Ajay- An increasing amount of personal data about consumers is now available on the web, just as financial records are available from credit bureaus. How long do you think it will take for software to catch up in text mining for propensity to buy, or risk behaviour, by adding this social media data to traditional data sources?

Sarah- I do not believe it will take long at all. Text analytics solutions can be trained and put to work analysing the sort of information that you are interested in. Of course, where personal information is concerned, legal requirements have to be complied with.

Ajay- What are your views on social network analysis and how it can help BI measurement and predictive analytics?

Sarah- It is an interesting and developing area that can augment BI. From an analysis point of view, I think it is important to validate the source and the accuracy of information and to give it some kind of relevance and quality mark. The information must then be treated in accordance with its mark so that pointless hype can be eliminated. In some cases, though, hype might matter, e.g. we need to understand if it is likely to affect our business. For that, we need to predict when hype or word of mouth can rapidly spread through a social network. The network itself can provide some information revealing its influencer/follower relationships so that hype can be predicted by what the influencers are saying.

Ajay- What are the biggest, most common mistakes you see in the implementation of information technology strategy?


Sarah- My answer to that question could take several pages and I speak from bitter experience. For now let’s just mention a few examples: trying to do things too fast, not gaining end-user buy-in or input, lack of communication, failing to appreciate the extent of skills requirements and supposedly cutting costs by avoiding consultants, not enough hand-holding of end-users on roll-out, and so on.

Ajay- What do you do to relax when you are not working or writing? How important is work-life balance for analysts?

Sarah- Pilates keeps me sane during the week. Other times I enjoy playing sports and doing leisure activities with my family and friends. Work/life balance is very important to me – so much so that I took a break from corporate life during a peak in my career to raise a family and have never regretted it. Many years ago I read a poem by Nadine Stair called “Afterwords: If I could live it over….”. The best line says “If I had to live my life over again I would eat more ice cream”. Joking apart, I do not want to be old and regretting all the things that I have not done.

Sarah Burnett is a Senior Analyst with Ovum. An experienced analyst and consultant, Sarah has worked in a variety of IT roles over the last twenty years including software development, programme management and project management. She provides analysis and thought leadership on Business Intelligence software and market; she also provides expert analysis of public sector IT developments and trends. Sarah is a regular contributor to the company’s monthly journals providing articles as well as writing a monthly column on public sector IT. She is a regular speaker at conferences and provides personalised advice and consultancy to Ovum clients.

Biography-

Since joining the Ovum-Butler organisation three years ago, Sarah has co-authored a number of in-depth reports and recently led the research and production of the Business Intelligence (Corporate Performance Management) report.

Sarah holds a BSc in Physics and Electronics as well as an MSc in Applied Optics. She has Prince2 practitioner qualifications and is a member of the British Computer Society. Sarah can be found on-line on Twitter -sarahburnett- and on her own blog, Sarah Burnett’s Web Musings.

Interview Eric A. King, President, The Modeling Agency

Here is an interview with Eric King, President, The Modeling Agency.


Ajay- Describe your career journey. What interested you in science? How do you think we can help more young people get interested in science?

Eric- I was a classic underachiever in school. I was bright, but was generally disinterested in academics, and focused on… well, other things at the time. However, I had always excelled in math and science, and actually paid attention in those classes.

I was a high school junior when my school acquired its first computers: Apple IIs. There were no formal computer courses, so instead of study hall, I would go to the lab and tinker. Sure, I would join a few other geeks (well before it was cool to be such) for a few primitive games, but would spend the majority of my time reading about the Basic programming language and coding graphic designs, math formulas and simple games.

I loved it so much that I had decided to pursue computer science as a college major before my senior year and it went into my yearbook entry. Fortunately, my relatively high SAT scores offset my poor high school GPA and squeaked me into the University of Pittsburgh’s trial-by-fire summer program. It was the first time I really felt I had to perform (or else) and had to work hard to overcome poor study habits — but rose to the occasion with room to spare.

I’m glad I did not realize at the time that Pitt was #9 in the nation for computer science. I did have a hint though when I realized the extremely high attrition rate. In the end, our freshman class of 240 graduated 36. I did make it through the freshman year that trimmed the first half of the original group, but was a casualty my sophomore year when I fell short of a passing grade in a core CS course that was only offered annually. I repeated it the following year and graduated with extra credits – to include a directed study in table tennis (no kidding).

I loved the programming assignments but loathed the tests. After slogging through the program and graduating, I took a three month break. I figured it would be my last opportunity to be free of responsibility for that period of time possibly until retirement – and so far, I’m right.

Then, my cousin who graduated with me told me about a neural computing software tools company in Pittsburgh, called NeuralWare. I was always intrigued by “artificial intelligence”, but they were seeking a technical support representative. I realized my junior year that I did not want to code or remain on the technical side for a living, but go into business development, project management, business management and entrepreneurship. Yet, after having survived the majority of the attrition, I did want to complete my technical degree, then seek the business angle.

A short while later, NeuralWare contacted me again to start up their sales operation (a role previously fulfilled by a co-founder). This was the start I was seeking: cut my teeth in business for highly technical products. I participated in numerous training sessions for neural computing and related technologies and loved it. The notion that the computer could leverage mathematics that emulated the basic learning function of the brain, or treat a formula like a gene – split it, mutate it, test and progress toward the most fitting solution was beyond exciting to me. So much so, that I’ve not left the technology in the 19 years since.

Drawing others to science, I believe, is more a matter of nature over nurture. I am the father of twin boys who couldn’t have greater differences in interests, personalities and talents. In that spirit, I believe that science should be made readily available, involve both theory and practice, and be presented in a manner that motivates those who are drawn to science to excel. But I don’t believe science can be effectively pushed to those whose inherent interests and passion lie elsewhere (reference the character Neil Perry in Dead Poets Society).

Ajay- Describe the path that The Modeling Agency has traveled. What is your vision for its future?

Eric- The Modeling Agency (TMA) was established as a highly structured formal network of senior-level consultants in January of 2000. TMA’s initial vision (and sustained slogan) was to “provide guidance and results to those who are data-rich, yet information-poor.” I still have not encountered an organization that holds a larger bench of senior-level data mining consultants and trainers. And to be senior-level, TMA consultants must be far more than technically steeped in data mining. TMA’s senior consulting staff are business consultants first – not rushing to analyze data, but assessing an organization’s environment and designing a fitting solution to resources that support stated objectives.

There are three primary divisions to TMA: training, consulting and solutions. Each division is part of an overarching business and technology maturation process. For example, training generates technology advocates for data mining that encourages consulting engagements which at times lead to productizable vertical market services that create solutions which allow other organizations to capitalize on the risk that pioneering organizations had undertaken, and springboard on the return realized by implementations within their vertical – which leads to new discoveries and innovations that feed back to training.

Beyond further developing the brand of TMA’s quickly emerging niche (described later), our future vision involves developing two specific types of vendor partnerships to allow TMA to redirect the substantial margins enjoyed by its clients through the application of predictive modeling into a residual stream of income to accelerate the growth of TMA itself. While this operation is confidential, we will be pleased to tell our future clients that we do indeed apply our services for the benefit of our own business.

Ajay- Describe the challenges and opportunities in modeling created by recent innovations, e.g. social network analysis software and the increasing amount of customer text data available on social media.

Eric- Please allow me to shift the focus of this question slightly. So many organizations are still making their way down the Business Intelligence chain to applying predictive modeling on standard operational data that social network analysis and customer text analytics remain more of a research endeavor, in my opinion. As a practical applications company, TMA focuses its experience in pragmatically applying its business problem solving creativity on operational and transactional data enriched by demographic and psychographic attributes. I feel that the areas of social media and social network analysis are not yet mature enough to be formalized as established practice on TMA’s menu of service offerings.

Having said that, the greatest challenges in predictive modeling are no longer in applying the methodological tactics, but rather in the comprehensive assessment, strategic problem design, project definition, results interpretation and ROI calculation. Popular data mining software is now highly effective at automating the tactical model building process – many packages running numerous methods in parallel and selecting the best performer.

So, the challenges that remain today are in tackling the tails of the process as mentioned above. This is where TMA’s expertise is focused and where our niche is quickly emerging: guiding organizations to establish their own internal predictive analytics operation.

Ajay- In the growing game of consolidation among business intelligence, data mining and analytics vendors, which vendors have you worked with and what are their relative merits?

Eric- TMA has established formal partnerships with several popular data mining tool vendors and services companies. Despite these alliances, TMA remains vendor neutral and method agnostic for clients that approach TMA directly. Having said that, I will make a general statement that there is notable merit for the organizations that recognize that they must ensure their clients’ success in the full implementation cycle of data mining – not just provide a great tool that addresses the center.

In fact, it was one of TMA’s earliest partners who saw the value in teaming with TMA to support the ends of the data mining process (assessment, business understanding, project definition and design, results interpretation, implementation) while their solution addressed the middle (data preparation and modeling). They recognized that as great as their tool was, it was still hitting the shelf soon after the sale. They realized that their clients were building very good models that answered the wrong questions, or were uninterpretable and incapable of implementation.

TMA soon recognized that these excellent tools combined with TMA’s strategic data mining mentorship and counsel provided the capability for organizations to essentially establish their own internal predictive analytic practice with existing business practitioners – not requiring senior statisticians or PhDs. This has become a popular and fast growing service, for which TMA’s large bench of senior-level data mining consultants is perfectly suited to fulfill.

And the best candidates for this service are those organizations who have attempted pilots or projects but fell short of their objectives. And while the acquisition of SPSS (who licenses a reputable predictive analytics tool, “PASW”) by IBM (the gold standard for IT and BI services and solutions) may be the closest competition that TMA may encounter, TMA enjoys a substantial head start and foothold with its numerous formal alliances, vendor neutrality and sizable client list specific to predictive modeling. TMA is quickly becoming the standard to turn to for progressive organizations that realize internalizing predictive analytics is not just a matter of when rather than whether, but also within their grasp with TMA’s guidance and the right tool(s).

Ajay- What do people at The Modeling Agency do for fun?

Eric- Our interests are as diverse as we are geographically dispersed. One of our senior consultants is a talented and fairly established tango dancer. He’s always willing to travel for assignments, as he’s anxious to tap into that city’s tango circuit. Another consultant is an avid runner, entering marathons and charity races. One common thread that most of us share is our dedication to parenting. We all love trips and time with our children. In fact, I’m writing this on a return trip from Disney World on the Auto Train with my 5-year-old twin boys – a trip I know I’ll recall fondly through my remaining years.

Bio

Eric A. King is President and Founder of The Modeling Agency (TMA), a US-based company started in January 2000 that provides training, consulting, solutions and a popular introductory webinar in predictive modeling “for those who are data-rich, yet information-poor.” King holds a BS in computer science from the University of Pittsburgh and has over 19 years of experience specifically in data mining, business development and project management. Prior to TMA, King worked for NeuralWare, a neural network tools company, and American Heuristics Corporation, an artificial intelligence consulting firm. He may be reached at eric@the-modeling-agency.com or (281) 667-4200 x210.

R releases new version R 2.9.2

What is new in 2.9.2 (technical details, not marketing spit and shine), and what didn’t work in 2.9.1 (shockingly, bugs are fixed openly!!)

NEW FEATURES

    o   install.packages(NULL) now lists packages only once even if they
        occur in more than one repository (as the latest compatible
        version of those available will always be downloaded).

    o   approxfun() and approx() now accept a 'rule' of length two, for
        easy specification of different interpolation rules on left and
        right.

        They no longer segfault for invalid zero-length specification
        of 'yleft', 'yright', or 'f'.

    o   seq_along(x) is now equivalent to seq_len(length(x)) even where
        length() has an S3/S4 method; previously it (intentionally)
        always used the default method for length().

    o   PCRE has been updated to version 7.9 (for bug fixes).

    o   agrep() uses 64-bit ints where available on 32-bit platforms
        and so may do a better job with complex matches.
        (E.g. PR#13789, which failed only on 32-bit systems.)
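
As a quick illustration of two of the items above (my own minimal sketch, not part of the official NEWS file; the sample data and variable names are made up), here is how the two-element 'rule' in approx() and the seq_along()/seq_len() equivalence look in practice:

```r
# Made-up sample points to interpolate between
x <- c(1, 2, 3)
y <- c(10, 20, 30)

# rule = c(1, 2): rule 1 on the left (NA outside the data range),
# rule 2 on the right (use the value at the closest data extreme)
res <- approx(x, y, xout = c(0, 1.5, 4), rule = c(1, 2))
print(res$y)

# For a plain vector, seq_along(x) matches seq_len(length(x))
idx <- seq_along(x)
print(identical(idx, seq_len(length(x))))
```

With rule = c(1, 2), the point left of the data comes back as NA while the point right of the data takes the last y value, which is handy when you only trust extrapolation on one side.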

DEPRECATED & DEFUNCT

    o   R CMD Rd2txt is deprecated, and will be removed in 2.10.0.
        (It is just a wrapper for R CMD Rdconv -t txt.)

    o   tools::Rd_parse() is deprecated and will be removed in 2.10.0
        (which will use only Rd version 2).

BUG FIXES

    o   parse_Rd() still did not handle source reference encodings
        properly.

    o   The C utility function PrintValue no longer attempts to print
        attributes for CHARSXPs as those attributes are used
        internally for the CHARSXP cache.  This fixes a segfault when
        calling it on a CHARSXP from C code.

    o   PDF graphics output was producing two instances of anything
        drawn with the symbol font face. (Report from Baptiste Auguie.)

    o   length(x) <- newval and grep() could cause memory corruption.
        (PR#13837)

    o   If model.matrix() was given too large a model, it could crash
        R. (PR#13838, fix found by Olaf Mersmann.)

    o   gzcon() (used by load()) would re-open an open connection,
        leaking a file descriptor each time. (PR#13841)

    o   The checks for inconsistent inheritance reported by setClass()
        now detect inconsistent superclasses and give better warning
        messages.

    o   print.anova() failed to recognize the column labelled
        P(>|Chi|) from a Poisson/binomial GLM anova as a p-value
        column in order to format it appropriately (and as a
        consequence it gave no significance stars).

    o   A missing PROTECT caused rare segfaults during calls to
        load().  (PR#13880, fix found by Bill Dunlap.)

    o   gsub() in a non-UTF-8 locale with a marked UTF-8 input
        could in rare circumstances overrun a buffer and so segfault.

    o   R CMD Rdconv --version was not working correctly.

    o   Missing PROTECTs in nlm() caused "random" errors. (PR#13381 by
        Adam D.I. Kramer, analysis and suggested fix by Bill Dunlap.)

    o   Some extreme cases of pbeta(log.p = TRUE) are more accurate
        (finite values < -700 rather than -Inf).  (PR#13786)

        pbeta() now reports on more cases where the asymptotic
        expansions lose accuracy (the underlying TOMS708 C code was
        ignoring some of these, including the PR#13786 example).

    o   new.env(hash = TRUE, size = NA) now works the way it has been
        documented to for a long time.

    o   tcltk::tk_choose.files(multi = TRUE) produces better-formatted
        output with filenames containing spaces.  (PR#13875)

    o   R CMD check --use-valgrind did not run valgrind on the package
        tests.

    o   The tclvalue() and the print() and as.xxx methods for class
        "tclObj" crashed R with an invalid object -- seen with an
        object saved from an earlier session.

    o   R CMD BATCH garbled options -d <debugger> (useful for
        valgrind, although --debugger=valgrind always worked)

    o   INSTALL with LazyData and Encoding declared in DESCRIPTION
        might have left options("encoding") set for the rest of the
        package installation.

And from www.r-project.org the remaining updated news
  • R version 2.9.2 has been released on 2009-08-24. The source code will first become available in this directory, and eventually via all of CRAN. Binaries will arrive in due course (see download instructions above).
  • The first issue of The R Journal is now available
  • The R Foundation has been awarded four slots for R projects in the Google Summer of Code 2009.
  • DSC 2009, The 6th workshop on Directions in Statistical Computing, has been held at the Center for Health and Society, University of Copenhagen, Denmark, July 13-14, 2009.
  • useR! 2009, the R user conference, has been held at Agrocampus Rennes, France, July 8-10, 2009.
  • useR! 2010, the R user conference, will be held at NIST, Gaithersburg, Maryland, USA, July 21-23, 2010.
  • We have started to collect information about local UseR Groups in the R Wiki.

Citation – http://www.r-project.org

Book Review (short): Data Driven: Profiting from Your Most Important Business Asset, by Tom Redman

Once in a while comes a book that squeezes a lot of common sense into easy-to-execute paradigms, adds some flavour with anecdotes and adds penetrating insights as the topping. Data Driven by Tom Redman is such a book, and it may rightly be called the successor to the now epic Davenport tome, Competing on Analytics.

The book is divided into 3 parts.

1) Data Quality – Including opportunity costs of bad data management.

2)  Putting Data and Information to work

3) Creating a Management system for Data and Information.

At 218 pages, not including the appendix, this is one easy read for someone who needs to refresh their mental batteries with data hygiene perspectives. With terrific wisdom and easy-to-communicate language and paradigms, this would surely mark another important chapter in bringing data quality to the forefront rather than the back burner of Business Intelligence and Business Analytics. All the trillion dollar algorithms and software in the world are useless without data quality. Read this book and it will show you how to use the most important, valuable and under-used asset: data.

How they stack up: IDC on Business Analytics

So here is Intelligent Enterprise on the latest IDC rankings of Business Intelligence and Business Analytics vendors. If you ever wondered how big the big boys were, read it at the link below.

Citation:

http://www.intelligententerprise.com/info_centers/ent_dev/showArticle.jhtml;jsessionid=QL4IYMWB1MSIHQE1GHPSKHWATMY32JVN?articleID=219401120

In 2008, Oracle led the overall market, followed in order by SAP, IBM, SAS and Microsoft, the report said. Rounding out the top 10 were Teradata, Fair Isaac, Informatica, Infor and MicroStrategy, respectively.

and

IDC divides the business analytics software market into four primary segments: analytic applications, business intelligence tools, data warehousing platform software and spatial information analytics tools.

and

Fourth-place SAS’ broad portfolio spans all business analytics market segments and is exclusively dedicated to this market. “The company leads in the advanced analytics tools segment and is within the top two vendors in two other market segments,” IDC said.

It’s a brilliant analysis and survey. IDC and Intelligent Enterprise- thanks a tonne for letting us know.

OT: How would you fix the economy

The Business Section asked readers for ideas on “How Would You Fix the Economy?”

I think this guy nailed it!

Dear Mr. President:

Please find below my suggestion for fixing America’s economy.

Instead of giving billions of dollars to companies that will squander the money on lavish parties and unearned bonuses, use the following plan.

You can call it the Patriotic Retirement Plan:

There are about 40 million people over 50 in the work force.

Pay them $1 million apiece severance for early retirement with the following stipulations:

1) They MUST retire. Forty million job openings – Unemployment fixed.

2) They MUST buy a new American CAR. Forty million cars ordered – Auto Industry fixed.

3) They MUST either buy …

Decisionstats| Miscellaneous Part 5

If you think that adding a separate category for poetry and humorous articles is too much, well, it seems the most popular articles came from this section. The poem on Michael Jackson continues to be the all-time number 1 in terms of page views (I had hoped one of the interviews would be number 1), and the breakthrough article on Not using R is even quoted in Australia in a university course on data mining. lol!

1) http://www.decisionstats.com/2009/06/26/tribute-to-michael-jackson/

Poem on MJ. Tribute. May he R.I.P.

2) Top Ten Reasons R language is bad for you. Satire and tongue firmly …