Interview Charlie Berger Oracle Data Mining

Here is an interview with Charlie Berger, Oracle Data Mining Product Management. Oracle is a company much respected for its ability to handle and manage data, and with its recent acquisition of Sun, it now has considerable software and financial muscle to take the world of data mining to the next generation.

Ajay- Describe your career in data mining so far from college, jobs, assignments and projects. How would you convince high school students to take up science careers?

Charlie- In my family, we were all encouraged to pursue science and technical fields. My Dad was a Mechanical Engineer and all my siblings are in scientific and medical fields. Early on, I had narrowed my career choices to engineering or medicine; the question when I left for college was which kind. My freshman engineering program exposed students to 6 weeks of the curriculum for each of the engineering disciplines. I found myself drawn to the field of Operations Research and Industrial Engineering. I liked the applied math and problem solving aspects. While not everyone has an aptitude or an interest in Math or the Sciences, if you do, it can be a fascinating field.

Ajay- Please tell us some technical stuff about Oracle Data Mining and Oracle Data Miner products. How do they compare with other products notably from SAS and SPSS? What is unique in Oracle’s suite of data mining products- and some market share numbers to back these please?

Charlie- Oracle doesn’t share product level revenue numbers. I can say that Oracle is changing the analytics industry. Ten years ago, when Oracle acquired the assets of Thinking Machines, we shared a vision that over time, as the volumes of data expand, at some point, you reach a point where you have to ask whether it makes more sense to “move the data to the algorithms” or to “move the algorithms to the data”. Obviously, you can see the direction that Oracle pursued. Now after 10 years of investing in in-database analytics, we have 50+ statistical techniques and 12 machine learning algorithms running natively inside the kernel of the Oracle Database. Essentially, we have transformed the database to become an analytical database. Today, you now see the traditional statistical software vendors announcing partnering initiatives for in-database processing or in the case of IBM, acquiring SPSS. Oracle pioneered the concept of using a relational database to not only store data, but to analyze it too. Moving forward, I think that we are close to the tipping point where in-database analytics are accepted as the winning IT architecture.

This trend towards moving the analytics to where the data are stored makes a lot of sense for many reasons. First, you don’t have to move the data. You don’t have to have copies of the data in external analytical sandboxes where it is open to security risks and, over time, becomes more aged and irrelevant.
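
As a rough illustration of “moving the algorithms to the data”, the sketch below uses Python’s built-in sqlite3 module as a stand-in for any SQL database. It is not Oracle Data Mining, and the table name, column names and sample rows are hypothetical. The aggregation runs inside the database engine, and only one summary row per offer is returned to the client.

```python
import sqlite3


def summarize_offers(conn):
    """Return {offer: (row_count, mean_amount)}, computed by the database."""
    rows = conn.execute(
        "SELECT offer, COUNT(*), AVG(amount) "
        "FROM purchases GROUP BY offer ORDER BY offer"
    ).fetchall()
    return {offer: (count, mean) for offer, count, mean in rows}


# Hypothetical in-memory table standing in for a large purchases table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (offer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("A", 20.0), ("A", 30.0), ("B", 40.0), ("B", 50.0), ("B", 60.0)],
)

summary = summarize_offers(conn)
print(summary)
```

The raw purchase rows never leave the database here; the client only sees the per-offer counts and averages.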

I know of one major e-tailer who constantly experiments by randomly showing web visitors either offer “A” or a new experimental offer “B”. They would export massive amounts of data to SAS afterwards to perform simple statistical analyses. First, they would calculate the median purchase amounts for the duration of the experiment for customers who were shown each offer. Then, they would perform a t-test hypothesis test to determine whether a statistically valid monetary advantage could be gained. If offer “B” were outperforming offer “A”, the e-tailer would …
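
The comparison described above can be sketched with a two-sample t-test. This is only an illustration in Python’s standard library, not the e-tailer’s actual SAS or Oracle workflow, and the purchase amounts are made up. It uses Welch’s variant of the test, which compares the mean purchase amounts of the two groups without assuming equal variances; turning the statistic into a p-value would need a t-distribution from a statistics library.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for B versus A."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variance divided by sample size, per group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical purchase amounts for visitors shown offer A and offer B.
offer_a = [20, 22, 19, 24, 25]
offer_b = [28, 30, 27, 33, 32]

t_stat, dof = welch_t(offer_a, offer_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A large positive t relative to the critical value for the given degrees of freedom would suggest that offer “B” produces higher average purchases than offer “A”.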

Interview Terri Rylander Advanced Marketing Collateral BI

Here is an interview with the fabulous Terri Rylander, innovative and creative Business Intelligence marketing consultant and the principal of Advanced Marketing Collateral. As the BI marketing wars heat up, cost pressures to optimize marketing ROI and emerging marketing channels will lead to a trend in which BI vendors choose the best resources, not just the in-house resources. Marketing communicators remain the unsung heroes of Business Intelligence, with all the glamour and focus on the techies, who surprisingly are now building more and more similar algorithms. Design in user interfaces and creativity in marketing could be a new tool in marketing Business Intelligence.

Ajay- Explain briefly what it is you do in the business intelligence space.
Terri-
I am a freelance writer creating marketing material for BI vendors. I create case studies, white papers, brochures, articles, web content and short copy like e-mails and postcards. Because I’m still a techie girl at heart, I also design and develop websites including WordPress customization which is quite popular now.

Ajay- How did you come to specialize in marketing for BI?
Terri-
In the late ‘90s I went to work as a web developer for a large telecommunication company. In just a few years, I began managing the development group. Then I was asked to come to a young wireless company and manage their reporting group. After building out a solid enterprise reporting system (that still stands today), I went after a more holistic approach to reporting and analysis and created one of the first BICCs (business intelligence competency centers) in the country. Our group managed the oversight of the entire BI program including strategy, training, data quality, end user support, and BI communications. After some shifting in the winds, I knew it was time to move on.
In looking for that next “thing” I did a fair amount of soul searching. I knew I wanted a career that was both flexible and portable. I thought, “Why not use my experience in BI to create the very things I had consumed as a customer?” That’s when Advanced Marketing Collateral was born.

Ajay- How do you see the BI market place changing?
Terri-
I guess that’s part of why I’ve been drawn to the BI field all these years. It just continues to change and improve. Just when you think they’ve done it all, a totally new concept emerges. However, I don’t envy the vendors. The competition is always at your heels. It seems like all the vendors have a solution for every industry or business line (though not so much for the emerging corporate sustainability area). Vendors have to continually experiment with new features and directions. Some will succeed and others will fail. It’s increasingly important and now even easier to communicate with both potential and existing customers, letting them know where you’re going and why, getting feedback, and just getting them to know more about the personality of your company. This can create such a strong bond.

Ajay- So how has that changed the way vendors should market their BI products?
Terri-
It used to be you took out ads in the various BI publications, published and sent 12 page white papers, put up a website with technical descriptions for your products, and sent your sales force off to do “dog-and-pony” shows at prospective customer sites. Most of that is still valid, but …

What's my website traffic, Dude?

Some website traffic numbers for potential server cost sharers. The server is now slow despite caching enabled and server RAM ramped up, way beyond my student budget. Please bear with me and do continue to visit it.

Thanks to you, I have managed to ramp up to over 20,000 visits a month (the Aug figure is incomplete, covering only up to the 28th). This feels quite good considering that I decided to move to full-time blogging just 3 months ago from the earlier consulting-plus-blog mode. I am now going back to school as a student, and the blog is expected to get better in quality (and lower in quantity) as I hope to get some homework done after finding some money to buy textbooks.

Month      Unique visitors   Number of visits   Pages    Hits     Bandwidth
May 2009   1527              3775               20903    42640    1000.02 MB
Jun 2009   4097              9082               55946    139918   6.39 GB
Jul 2009   17364             24532              126613   293489   9.18 GB
Aug 2009   11788             21909              114317   255289   9.25 GB

Interview Dylan Jones DataQualityPro.com

Here is an interview with Dylan Jones, the founder/editor of Dataqualitypro.com, the site to go to for anything related to Data Quality discussions. Dylan is a great, charming person and in this interview talks candidly about his views.

Ajay: Describe your career in science and in business intelligence. How would you convince young students to take more maths and science courses for scientific careers?

Dylan: My main education for the profession was a degree in Information Technology and Software Development. No surprises what my first job entailed – software development for an IT company!

That role took me straight into the trials and tribulations of business intelligence and data quality. After a couple of years I went freelance and have pretty much worked for myself ever since. There has been a constant thread of data quality, business intelligence and data migration throughout my career which culminated in me setting up the more recent social media initiatives to try and pull professionals together in this space.

In all honesty, I’m probably the worst person to give career advice, Ajay, as I’m a hopeless dreamer. I’ve never really structured my career. I fell into data quality early on and it has led me to work in some wonderful places and with some great people, largely by accident and fate.

I have a simple philosophy, do what you love doing. I’m incredibly lucky to wake up every day with an absolute passion for what I do. In the past, whenever I have found myself working in a situation that I find soul destroying (and in our profession that can happen regularly) I move on to something new.

So, my advice for people starting out would be to first question what makes them happy in life. Don’t simply follow the herd. The internet has totally transformed the rules of the game in terms of finding an outlet for your skills so follow your heart, not conventional wisdom.

That said, I think there are some core skills that will always provide a springboard. Maths is obviously one of those skills that can open many doors but I would also advise people to learn about marketing, sales and other business fundamentals. From a business intelligence perspective it really adds an attractive dimension to your skills if you can link technical ability with a deeper understanding of how businesses operate.

Ajay- You are a top expert and publisher on BI topics. Tell us something about

a) http://www.datamigrationpro.com/

b) http://www.dataqualitypro.com/

c) Involvement with the DataFlux community of experts

d) Your latest venture http://www.dqvote.com

Dylan- Data Migration Pro was my first foray into the social media space. I realised that very few people were talking about the challenges and techniques of data migration. On average, large organisations implement around 4 migration projects a year and most end in failure. A lot of this is due to a lack of awareness. Having worked for so long in this space I felt it was time to create a social media site to bring the wider community together. So we now have forums, regular articles, tools and techniques on the site with about 1400 members worldwide plus lots of plans in the pipeline for 2010.

Data Quality Pro followed on from the success of Data Migration Pro and our speed of growth really demonstrates how important data quality is right now. Again, awareness of the basic techniques and best-practices is key. I think many organisations are really starting to recognise the importance of better data quality management practices so a lot of our focus is on giving people practical advice and tools to get started. We are a community publishing platform, I do write regularly but we’ve always had a significant community contribution from expert practitioners and authors.

I didn’t just want to take a corporate viewpoint with these communities. As a result they are very much focused on the individual. That is why we post so many features on how to promote your skills, search for work, gain personal skills and generally get ahead in the profession. Data Quality Pro has just under 2,000 members and about 6,000 regular visitors a month so it demonstrates just how many people are really committed to learning about this discipline as it impacts practically every part of the business. I also think it is an excellent career choice as so many projects are dependent on good quality data there will always be demand.

The DataFlux community of experts is a great resource that I’ve actually admired for some time. I am a big fan of Jill Dyche who used to write on the community and of course there is a great line-up on there now with experts like David Loshin, Joyce Norris-Montanari and Mike Ferguson so I was delighted to be invited to participate. DataFlux have sponsored our sites from the very beginning and without their support we wouldn’t have grown to our current size. So although I’m vendor independent, it’s great to be sharing my thoughts and ideas with people who visit their site.

DQVote.com is a relatively new initiative. I noticed that there was some great data quality content being linked through platforms like Twitter but it would essentially become hard to find after several days. Also, there was no way for the community to vote on what content they found especially useful. DQVote.com allows people to promote their own content but also to vote and share other useful data quality articles, blogs, presentations, videos, tutorials – anything that adds value to the data quality community. It is also a great springboard for emerging data quality bloggers and publishers of useful content.

Ajay- Do you think BI projects can be more successful if we reward data entry people, or at least pay more for better quality data rather than ask them to fill in database tables as fast as they can? Especially in offshore call centres.

Dylan- Data entry is a pet frustration of mine. I regularly visit companies who are investing hundreds of thousands of pounds in data quality technology and consultants but nothing in grass-roots education and cultural change. They would rather create cleansing factories than resolve the issues at source.

So, yes I completely agree, the reward system has to change. I personally suffer from this all the time – call centre staff record incorrect or incomplete information about my service or account and it leads to billing errors, service problems, annoyance and eventually lost business. Call centre staff are not to blame, they are simply rewarded on the volume of customer service calls they can make, they are not encouraged to enter good quality data. The fault ultimately lies with the corporations that use these services and I don’t think offshore or onshore makes a difference. I’ve witnessed terrible data quality in-house also. The key is to have service level agreements on what quality of data is acceptable. I also think a reward structure as opposed to a penalty structure can be a much more progressive way of improving the quality of call-centre data.

Ajay- What are the top 5 things that summarize your views on Business Intelligence? Assume you are speaking to a class of freshmen statisticians.

Dylan- Business intelligence is wholly dependent on data quality. Accessibility, timeliness, accuracy, completeness, duplication – data quality dimensions like these can dramatically change the value of business intelligence to the organisation. Take nothing for granted with data, assume nothing. I have never, ever, assessed a dataset in a large business that did not have some serious data defects that were impacting decision making.

As statisticians, they therefore possess the tools to help organisations discover and measure these defects. They can find ways to continuously improve and ensure that future decisions are based on reliable data.
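
As a minimal sketch of what measuring such defects might look like, the Python below computes two of the dimensions mentioned above, completeness and duplication, over a handful of records. The record layout, field names and sample values are hypothetical, and real profiling tools do far more than this.

```python
def completeness(records, field):
    """Fraction of records that have a non-empty value for `field`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def duplicate_rate(records, key_fields):
    """Fraction of records whose normalized key repeats an earlier record."""
    if not records:
        return 0.0
    seen = set()
    duplicates = 0
    for r in records:
        # Normalize case and surrounding whitespace before comparing keys.
        key = tuple(str(r.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(records)


# Hypothetical customer records with typical defects.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "ann lee ", "email": "ann@example.com", "phone": ""},
    {"name": "Bob Roy", "email": "bob@example.com", "phone": None},
    {"name": "Cy Dunn", "email": "", "phone": "555-0199"},
]

phone_completeness = completeness(customers, "phone")
email_completeness = completeness(customers, "email")
dup_rate = duplicate_rate(customers, ["name", "email"])
print(phone_completeness, email_completeness, dup_rate)
```

Tracking numbers like these over time is one simple way to check whether data quality is actually improving.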

I would also add that business intelligence is not just about technology, it is about interpreting data to determine trends that will enable a company to improve their competitive advantage. Statistics are important but freshmen must also understand how organisations really create value for their customers.

My advice is to therefore step away from the tools and learn how the business operates on the ground. Really listen to workers and customers as they can bring the data to life. You will be able to create far more accurate dashboards and reports of where the issues and opportunities lie within a business if you immerse yourself with the people who create the data and the senior management who depend on the quality of your business intelligence platforms.

Ajay- Which software have you personally coded or implemented? Which one did you like the best and why?

Dylan- I’ve used most of the BI and DQ tools out there, all have strengths and weaknesses so it is very subjective. I have my favourites but I try to remain vendor neutral so I’ll have to gracefully decline on this one Ajay!

However, I did build a data profiling and data quality assessment tool several years ago. To be honest, that is the tool I like best because it had a range of features I still haven’t seen implemented so far in any other tools. If I ever get the chance, and if no other vendor comes up with the same concept, I may yet take it to market. For now though, two young kids, two communities and a 12 hour day mean it is something of a pipe dream.

Ajay- What does Dylan Jones do when not helping improve the data quality of the world?

Dylan- I’ve recently had another baby boy so kids take up most of whatever free time I have left. When we do get a break though I like to head to my home town and just hang out on the beach or go up into the mountains. I love travelling and as I effectively work completely online now, we’re really trying to figure out a way of combining travel and work.

Biography-

Dylan Jones is the founder and editor of Data Quality Pro and Data Migration Pro, the leading online expert community resources. Since the early nineties he has been helping large organisations tackle major information management challenges. He now devotes his time to fostering greater awareness, community and education in the fields of data quality and data migration via the use of social media channels. Dylan can be contacted via his profile page at http://www.dataqualitypro.com/data-quality-dylan-jones/ or at http://www.twitter.com/dataqualitypro

SAP caught stealing patents: Pays $139 Million

Curt Monash was right. SAP does have questionable business ethics. It has been caught stealing ideas from as many as 5 patents and has been told to pay $139 million.

How many more patents does SAP have in its closet (wink wink)? By funding blogs and blog communities, how much time is SAP trying to buy, while raising prices arbitrarily for locked-in customers and using the one-time gain to buy companies with better decision management pedigrees?

Don’t believe me, huh? Here is PC World, or just google/bing for “SAP patent lawsuit”:

http://www.pcworld.com/businesscenter/article/170899/versata_wins_139m_damages_in_sap_patent_lawsuit.html

Software HIStory: Bass Institute Part 1

or How SAS Institute needs to take competition from WPS, (sas language compiler) in an alliance with IBM, and from R (open source predictive analytics with tremendous academic support) and financial pressure from Microsoft and SAP more seriously.

On the weekend, I ran into Jeff Bass, owner of BASS Institute. BASS Institute provided a SAS-like compiler in the 1980s, was very light compared to the then clunky SAS (which used multiple floppies), and sold many copies. It ran out of money when the shift happened to PCs and SAS Institute managed to reach that first.

Today the shift is happening to cloud computing, and though SAS has invested $70 million in it, it still continues to SUPPORT Microsoft by NOT supporting or even offering financial incentives for customers to use Ubuntu Linux server and Ubuntu Linux desktop. For academic students it charges $25 per Windows license, thus helping sell many more copies of Windows Vista. Why does it not give the Ubuntu Linux version free to students? Why does SAS Institute continue to give the online doc free to people who use its language, and undercut it? More importantly, why does SAS charge LESS money for excellent software in the BI space? It is one of the best and cheapest BI software and the most expensive desktop software. Why does SAS Institute not support Hadoop and Map/Reduce database systems instead of focusing on Oracle and Teradata relationships and feelings?

Anyways, back to Jeff Bass- This is part 1 of the interview.

Ajay- Jeff, tell us all about the BASS Institute?

Jeff-

The BASS system has been off the market for about 20 years and is an example of old, command line, DOS based software that has been far surpassed by modern products – including SAS for the PC platform. It was fun providing a “SAS like” language for people on PCs – running MS DOS – but I scrapped the product when PC SAS became a reasonably useable product and PCs got enough memory and hard disk space.
 
BASS was a SAS “work alike”…it would run many (but certainly not all) SAS programs with few modifications.  It required a DOS PC with 640K of RAM and a hard disk with 1MB of available space.  We used to demo it on a Toshiba laptop with NO hard disk and only a floppy drive.  It was a true compiler that parsed the data / proc step input code and generated 8086 assembly language that went through mild optimization, and then executed.
 
I no longer have the source code…it was saved to an ancient Irwin RS-232 tape drive onto tapes that no longer exist…it is fun how technology has moved on in 20 years!  The BASS system was written in Microsoft Pascal and the code for the compiler was similar to the code that would be generated by the Unix YACC “compiler compiler” when fed the syntax of the SAS data step language.  BASS included the “DATA Step” and the most basic PROCS, like MEANS, FREQ, REG, TTEST, PRINT, SORT and others.  Parts of the system were written in 8086 assembler (I have to smile when I remember that).  If I was to recreate it today, I would probably use YACC and have it produce R source code…but that is an idea I am never likely to spend any time on.
 
We sold quite a few copies of the software and BASS Institute, Incorporated was a going concern until PC SAS became debugged and reliable.  Then there was no point in continuing it.  But I think it would be fun for someone to write a modern open source version of a SAS compiler (the data step and basic procs were developed in the public domain at NC State University before Sall and Goodnight took the company private, so as long as no copyrighted code was used in any way, an open source compiler would probably be legal).
 
I still use SAS (my company has an enterprise license), but only very rarely. I use R more often and am a big fan of free software (sometimes called open source software, but I like the free software foundation’s distinction at fsf.org). I appreciated your recommendation of the book “R for SAS and SPSS Users” on your website. I bought it for my Kindle immediately upon reading about it on your website.

I no longer work in the software world; I’m a reimbursement and health policy director for the biotech firm Amgen, where I have worked since 1990 or so… I also serve on the boards of a couple of non-profit organizations in the health care field.

Ajay- Any comments on WPS?

Jeff- I’m glad WPS is out there.  I think alternatives help keep the SAS folks aware that they have to care about competition, at least a little 😉

(Note from Ajay-

You can see more on WPS at http://www.teamwpc.co.uk/home

and on SAS at http://www.sas.com/ )


Goodbye Teddy

A bear of a man, with an appetite of a whale
The lion of the Senate, succeeds while all fail.

Slowly succeeding steadily, with his head and heart
The youngest knight of Camelot, went out the last

No child left behind, and no sick person too,
Goodbye Teddy- We the still uninsured will miss you