Business Intelligence and The Heisenberg Principle

The Heisenberg uncertainty principle states that for certain pairs of properties, precision in knowing one (say, the position of a particle) has to be traded off against precision in knowing the other (its momentum). I was drawn to applying this idea during an email exchange with Edith Ohri, who is a leading data mining person in Israel and has her own customized GT solution. Edith said that it seems impossible to have data that is both accurate (data quality) and easy to view across organizations (data transparency). More often than not, the metrics we measure are the ones we are forced to measure because of data adequacy and data quality issues.

Now there exists a tradeoff in the price of perfect information in managerial economics, but is it really true that the Business Intelligence we deploy is, more often than not, constrained by simple things like input data and historic database tables? And that, more often than not, data quality is the critical constraint that determines the speed and efficacy of deployment?

I personally find that much more of the time in database projects goes into data measurement, aggregation, massaging outliers, and missing value assumptions than into the “high value” activities like insight generation and business issue resolution.
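To make those preparation steps concrete, here is a minimal sketch in Python of two of them: a missing value assumption and outlier massaging. Everything in it is an illustrative assumption rather than a recommended method: the sample figures are hypothetical, the median fill is just one possible missing-value rule, and the 1.5 x IQR fences are a common convention, not the only choice.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None with the median of the observed values.

    Median imputation is an assumption made for this sketch; a real
    project would pick a rule that fits the business meaning of the gap.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def cap_outliers(values, k=1.5):
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical daily sales figures with one gap and one suspicious spike.
raw = [100, 102, None, 98, 101, 5000, 99]
filled = impute_missing(raw)    # the gap is filled with the median
cleaned = cap_outliers(filled)  # the spike is pulled back to the upper fence
```

Even this toy version forces two judgment calls (how to fill the gap, where to draw the fences) before any analysis starts, which is where the time tends to go.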

Is it really true? Is the analysis the easy part, and the data the tough part?

What do you think about the uncertainty inherent in data quality and data transparency?

Author: Ajay Ohri

http://about.me/ajayohri

5 thoughts on “Business Intelligence and The Heisenberg Principle”

Kesavan Hariharasubramanian says:

March 27, 2009 at 6:12 am

Technically speaking, the activities mentioned, such as data measurement, aggregation, massaging outliers, etc., do take a good amount of time, but this is warranted if we need to deliver a successful BI implementation. Analysis becomes easier if time is well spent on data identification followed by data “triaging”. By “triaging”, I mean “categorization and prioritization”. But I guess that, more than the data itself, the toughest part of a BI implementation is to gather momentum for the BI initiative, especially given the differing demands and priorities of the various stakeholders in an organization and the dominant notion that any IT initiative is an overhead.

Jacklyn Kearns says:

March 26, 2009 at 10:28 pm

To make it really simple, it’s the same as a snapshot versus a moving picture; or in Eastern terms, you never stand in the same river twice.
A space and time issue…

Edith Ohri says:

March 20, 2009 at 4:06 pm

I feel responsible for being unclear on the Heisenberg thing. That subject was brought up during a LinkedIn discussion about information transparency – http://www.linkedin.com/groupAnswers?viewQuestionAndAnswers&discussionID=2042135&gid=1851195. My stand was that one cannot have both data and meta-data with accuracy. Data qualities are never fully known in advance, and if we extend their content to cover all potential future developments, then we are bound to miss the meta-data that is still unknown at that stage. Either we assume everything at the beginning (suppose we can do that) but then miss some of the not-yet-defined meta-data, or we have complete meta-data but miss part of the information that will evolve later. And if the system has both complete data and complete meta-data, it does not stop there: what about the meta-data of the meta-data then?!
The point I tried to make is that complete data quality is a flimsy theoretical ideal, and a very costly one…

I cannot agree with Sandip’s note that easy analysis is unrelated to data quality, for the simple reason that low-quality data carries a greater risk of inconclusive results and requires additional work to substantiate the findings.

I agree with Sharon’s comment on quality versus accuracy and inadequate system planning… As an analyst, I find data quality is always less than desired. Yet, for my model (GT), unfiltered data input is better: it is more authentic and contains extreme-state information, which is many times crucial for the analytics.
The way I see it, a data mining tool must be able to cope with low-quality input. It should be able to deal with “field” quality data; otherwise it cannot analyze data from the web, for example.
I think, anyway, that when one is given free access to a gold mine, it does not matter if the mine is situated in a remote “swarming crocodiles” place.

Sharon Neuenfeldt says:

March 19, 2009 at 5:16 pm

Data quality and analytic accuracy can seem mutually exclusive, but they don’t have to be. The problems begin when data entry systems are built quickly and sloppily by people who do not understand how the data will be used. The other key function often ignored is building accuracy filters into data entry systems. Usually this is considered a “nice to have” instead of a “need to have”. Too often system architects assume everyone will always enter good data, which is naive. You need to build in as many controls as you can given time and money, then build processes to get rid of the junk as soon as possible, before it causes your data analysis to fail. If you’re stuck with a system that was built for speed instead of accuracy and there are no data stewardship controls in place, then yes, you will have problems. The cost of manipulating the data to do good analysis should be added as a risk of cutting corners on data acquisition.

Sandip says:

March 19, 2009 at 9:06 am

I am not quite sure how there is a trade-off between an easy analysis and the quality of the data. In fact, whether an analysis is easy or not would depend on the methodology and the kind of data that methodology demands. The accuracy or quality of the data is something different from this. I can, however, see a direct trade-off between the quality of the data and the cost of the analysis. As the quality of data increases, the cost of the analysis would definitely go up in order to maintain the high quality of data.
