IBM Buys Netezza

IBM just bought Netezza, maker of the TwinFin appliance for handling big data.

http://dealbook.blogs.nytimes.com/2010/09/20/i-b-m-to-buy-analytics-firm-for-1-7-billion/?hpw

The deal values Netezza at $27 a share, a 9.8 percent premium to its closing price on Friday.

Since Netezza was an existing SAS partner, the deal will probably affect that partnership most, if it affects anything at all, given the earlier IBM-SPSS acquisition. Netezza was also one of the foremost BI companies in both using and expounding R.

See- Using Netezza and R http://www.biecek.pl/WZUR2009/LukaszBartnik2009c.pdf

and http://www.netezza.com/userconference/pce.html#rmftfic

Below is an example from a paper on using R on Netezza:

> library(nzr)
> nzconnect("user", "password", "host", "database")
> library(rpart)
> data(kyphosis)
# this creates a table out of kyphosis data.frame
# and sends its data to TwinFin
> invisible(as.nz.data.frame(kyphosis))
> nzQuery("SELECT * FROM kyphosis")
KYPHOSIS AGE NUMBER START
1 absent 71 3 5
2 absent 158 3 14
3 present 128 4 5
[ cut ]
# now create a nz.data.frame
> k <- nz.data.frame("kyphosis")
> as.data.frame(k)
KYPHOSIS AGE NUMBER START
1 absent 71 3 5
2 absent 158 3 14
3 present 128 4 5
[ cut ]
> nzQuery("SELECT COUNT(*) FROM kyphosis")
COUNT
1 81

KXEN Update

Update from a very good data mining software company, KXEN –

  1. Longtime Chairman and founder Roger Haddad is retiring but will stay on as a Board Member. See his interview with Decisionstats here https://decisionstats.wordpress.com/2009/01/05/interview-roger-haddad-founder-of-kxen-automated-modeling-software/ (note: images are hidden due to the migration from .com to .wordpress.com)
  2. The new members of the leadership team are:
John Ball
Chief Executive Officer

John Ball brings 20 years of experience in enterprise software, deep expertise in business intelligence and CRM applications, and a proven track record of success driving rapid growth at highly innovative companies.

Prior to joining KXEN, Mr. Ball served in several executive roles at salesforce.com, the leading provider of SaaS applications. Most recently, John served as VP & General Manager, Analytics and Reporting Products, where he spearheaded salesforce.com’s foray into CRM analytics and business intelligence. John also served as VP & General Manager, Service and Support Applications at salesforce.com, where he successfully grew the business to become the second largest and fastest growing product line at salesforce.com. Before salesforce.com, Ball was founder and CEO of Netonomy, the leading provider of customer self-service solutions for the telecommunications industry. Ball also held a number of executive roles at Business Objects, including General Manager, Web Products, where he delivered to market the first 3 versions of WebIntelligence. Ball has a master’s degree in electrical engineering from Georgia Tech and a master’s degree in electric

I hope John at least helps build a KXEN Force.com application: there are only 2 data mining apps on the AppExchange. Also on the wish list: more social media presence, a Web SaaS/Amazon API for KXEN, greater presence at American/Asian conferences, and a solution for SMEs (which cannot afford the premium pricing of the flagship solution). An alliance with bigger BI vendors like Oracle, SAP or IBM for selling its great social network analysis would help too.

Bill Russell as Non-executive Chairman

Bill Russell joins as Non-executive Chairman of the Board, effective July 16, 2010. Russell has 30 years of operational experience in enterprise software, with a special focus on business intelligence, analytics, and databases. Russell held a number of senior-level positions in his more than 20 years at Hewlett-Packard, including Vice President and General Manager of the multi-billion dollar Enterprise Systems Group. He has served as Non-executive Chairman of the Board for Sylantro Systems Corporation, webMethods Inc., and Network Physics, Inc. and has served as a board director for Cognos Inc. In addition to KXEN, Russell currently serves on the boards of Saba, PROS Holdings Inc., Global 360, ParAccel Inc., and B.T. Mancini Company.

Xavier Haffreingue as senior vice president, worldwide professional services and solutions.
He has almost 20 years of international enterprise software experience gained in the CRM, BI, Web and database sectors. Haffreingue joins KXEN from software provider Axway where he was VP global support operations. Prior to Axway, he held various leadership roles in the software industry, including VP self service solutions at Comverse Technologies and VP professional services and support at Netonomy, where he successfully delivered multi-million dollar projects across Europe, Asia-Pacific and Africa. Before that he was with Business Objects and Sybase, where he ran support and services in southern Europe managing over 2,500 customers in more than 20 countries.

David Guercio  as senior vice president, Americas field operations. Guercio brings to the role more than 25 years experience of building and managing high-achieving sales teams in the data mining, business intelligence and CRM markets. Guercio comes to KXEN from product lifecycle management vendor Centric Software, where he was EVP sales and client services. Prior to Centric, he was SVP worldwide sales and client services at Inxight Software, where he was also Chairman and CEO of the company’s Federal Systems Group, a subsidiary of Inxight that saw success in the US Federal Government intelligence market. The success in sales growth and penetration into the federal government led to the acquisition of Inxight by Business Objects in 2007, where Guercio then led the Inxight sales organization until Business Objects was acquired by SAP. Guercio was also a key member of the management team and a co-founder at Neovista, an early pioneer in data mining and predictive analytics. Additionally, he held the positions of director of sales and VP of professional services at Metaphor Computer Systems, one of the first data extraction solutions companies, which was acquired by IBM. During his career, Guercio also held executive positions at Resonate and SiGen.

  3. Venture capital funding for expansion:

KXEN has closed $8 million in series D funding to further accelerate its growth and international expansion. The round was led by NextStage and included participation from existing investors XAnge Capital, Sofinnova Ventures, Saints Capital and Motorola Ventures.

This was done after John Ball had joined as CEO.

  4. Continued kudos from analysts and customers for its technical excellence:

KXEN was named a leader in predictive analytics and data mining by Forrester Research (1) and was rated highest for commercial deployments of social network analytics by Frost & Sullivan (2)

KXEN also became an alliance partner of Accenture, which is a prominent SAS partner as well.

In-Database Optimization

In KXEN V5.1, a new data manipulation module (ADM) is provided in conjunction with scoring to optimize database workloads and provide full in-database model deployment. Some leading data mining vendors are only now beginning to offer this kind of functionality, and then with only one or two selected databases, giving KXEN a more than five-year head start. Some other vendors are only offering generic SQL generation, not optimized for each database, and do not provide the wealth of possible outputs for their scoring equations: For example, real operational applications require not only to generate scores, but decision probabilities, error bars, individual input contributions – used to derive reasons of decision and more, which are available in KXEN in-database scoring modules.

Since 2005, KXEN has leveraged databases as the data manipulation engine for analytical dataset generation. In 2008, the ADM (Analytical Data Management) module delivered a major enhancement by providing a very easy to use data manipulation environment with unmatched productivity and efficiency. ADM works as a generator of optimized database-specific SQL code and comes with an integrated layer for the management of meta-data for analytics.
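To make the idea of in-database scoring a little more concrete, here is a minimal sketch in R. It is my own illustration and not KXEN's ADM or its generated SQL: it fits a small logistic regression on R's built-in mtcars data and turns the coefficients into a generic SQL scoring expression. The helper name model_to_sql and the table name cars are hypothetical, and the SQL assumes the database has an EXP() function and columns named like the model's variables.

```r
# Illustrative only: fit a small logistic regression on R's built-in mtcars data.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Translate the fitted coefficients into a SQL scoring expression.
# Hypothetical helper: assumes the table's column names match the model's
# variable names and that the database provides an EXP() function.
model_to_sql <- function(model, table) {
  b <- coef(model)
  # One "(coefficient * column)" term per predictor (skipping the intercept).
  terms <- sprintf("(%.15g * %s)", b[-1], names(b)[-1])
  # Intercept plus all predictor terms form the linear predictor.
  linear <- paste(c(sprintf("%.15g", b[1]), terms), collapse = " + ")
  # The logistic function turns the linear predictor into a probability.
  sprintf("SELECT 1.0 / (1.0 + EXP(-(%s))) AS score FROM %s", linear, table)
}

# "cars" is a hypothetical table holding the same columns as mtcars.
sql <- model_to_sql(fit, "cars")
cat(sql, "\n")
```

The point of the sketch is that only the scoring equation travels to the database, not the data. A generic query like this is exactly the "generic SQL generation" the text contrasts with database-specific, optimized code, and it produces only a score, none of the extra outputs (error bars, individual input contributions) mentioned above.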

KXEN Modeling Factory- (similar to SAS’s recent product Rapid Predictive Modeler http://www.sas.com/resources/product-brief/rapid-predictive-modeler-brief.pdf and http://jtonedm.com/2010/09/02/first-look-rapid-predictive-modeler/)

KXEN Modeling Factory (KMF) has been designed to automate the development and maintenance of predictive analytics-intensive systems, especially systems that include large numbers of models, vast amounts of data or require frequent model refreshes. Information about each project and model is monitored and disseminated to ensure complete management and oversight and to facilitate continual improvement in business performance.

Main Functions

Schedule: creation of the Analytic Data Set (ADS), setup of how and when to score, setup of when and how to perform model retraining and refreshes …

Report: monitor model execution over time, track changes in model quality over time, see how useful one variable is by considering its multiple instances in models …

Notification: rather than having to wade through pages of event logs, KMF Department allows users to manage by exception through notifications.

Other products from KXEN have been covered here before https://decisionstats.wordpress.com/tag/kxen/ , including Structural Risk Minimization- https://decisionstats.wordpress.com/2009/04/27/kxen-automated-regression-modeling/

That’s all for the KXEN update. All the best to the new management team, and a splendid job done by Roger Haddad in creating what is France’s and Europe’s best known data mining company.

Note- Source – http://www.kxen.com


Open Source Business Intelligence: Pentaho and Jaspersoft

Here are two products that are widely used for Business Intelligence. They are open source and both have a free preview.

Jaspersoft: for the Enterprise version click on the screenshot, while for the free community version you can go to

http://jasperforge.org/projects/jasperserver

Interestingly (though not surprisingly), Revolution Analytics is teaming up with Jaspersoft to use R for reporting along with the Jaspersoft BI stack.

FREE WEBINAR WEDNESDAY, SEPTEMBER 22ND @9AM PACIFIC

DEPLOYING R: ADVANCED ANALYTICS ON DEMAND IN APPLICATIONS, IN DASHBOARDS, AND ON THE WEB

A JOINT WEBINAR FROM REVOLUTION ANALYTICS AND JASPERSOFT

Date: Wednesday, September 22, 2010
Time: 9:00am PDT (12:00pm EDT; 4:00pm GMT)
Presenters: David Smith, Vice President of Marketing, Revolution Analytics
Andrew Lampitt, Senior Director of Technology Alliances, Jaspersoft
Matthew Dahlman, Business Development Engineer, Jaspersoft

R is a popular and powerful system for creating custom data analysis, statistical models, and data visualizations. But how can you make the results of these R-based computations easily accessible to others? A PhD statistician could use R directly to run the forecasting model on the latest sales data, and email a report on request, but then the process is just going to have to be repeated again next month, even if the model hasn’t changed. Wouldn’t it be better to empower the Sales manager to run the model on demand from within the BI application she already uses—daily, even!—and free up the statistician to build newer, better models for others?

In this webinar, David Smith (VP of Marketing, Revolution Analytics) will introduce the new “RevoDeployR” Web Services framework for Revolution R Enterprise, which is designed to make it easy to integrate dynamic R-based computations into applications for business users. RevoDeployR empowers data analysts working in R to publish R scripts to a server-based installation of Revolution R Enterprise. Application developers can then use the RevoDeployR Web Services API to securely and scalably integrate the results of these scripts into any application, without needing to learn the R language. With RevoDeployR, authorized users of hosted or cloud-based interactive Web applications, desktop applications such as Microsoft Excel, and BI applications like Jaspersoft can all benefit from on-demand analytics and visualizations developed by expert R users.

To demonstrate the power of deploying R-based computations to business users, Andrew Lampitt will introduce Jaspersoft commercial open source business intelligence, the world’s most widely used BI software. In a live demonstration, Matt Dahlman will show how to supercharge the BI process by combining Jaspersoft and Revolution R Enterprise, giving business users on-demand access to advanced forecasts and visualizations developed by expert analysts.

Speaker Biographies:

David Smith is the Vice President of Marketing at Revolution Analytics, the leading commercial provider of software and support for the open source “R” statistical computing language. David is the co-author (with Bill Venables) of the official R manual An Introduction to R. He is also the editor of Revolutions (http://blog.revolutionanalytics.com), the leading blog focused on “R” language, and one of the originating developers of ESS: Emacs Speaks Statistics. You can follow David on Twitter as @revodavid.

Andrew Lampitt is Senior Director of Technology Alliances at Jaspersoft. Andrew is responsible for strategic initiatives and partnerships including cloud business intelligence, advanced analytics, and analytic databases. Prior to Jaspersoft, Andrew held other business positions with Sunopsis (Oracle), Business Objects (SAP), and Sybase (SAP). Andrew earned a BS in engineering from the University of Illinois at Urbana Champaign.

Matthew Dahlman is Jaspersoft’s Business Development Engineer, responsible for technical aspects of technology alliances and regional business development. Matt has held a wide range of technical positions including quality assurance, pre-sales, and technical evangelism with enterprise software companies including Sybase, Netonomy (Comverse), and Sunopsis (Oracle). Matt earned a BA in mathematics from Carleton College in Northfield, Minnesota.


The second widely used BI stack in open source is Pentaho.

You can download it to evaluate it, or read more, at

http://community.pentaho.com/

http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/

Aster Data hires Quentin Gallivan as CEO

Aster Data formally marked phase 2 of its rapid growth story by bringing in Quentin Gallivan as its new CEO (previously CEO of Postini, before it was sold to Google, and of PivotLink).

Founders (and Stanfordians) Mayank Bawa and Tasso Argyros stay on, as Chief Customer Officer and CTO respectively. It has a very déjà vu feel, like Eric Schmidt coming in as CEO of Google in the glory days past. Indeed the investment teams behind Google and Aster Data are quite similar, and so are the backgrounds of the founders.

Aster Data of course makes the leading MapReduce solution (MapReduce itself was created by Google) for providing BI infrastructure for big data, and has been rapidly expanding into new frontiers for big data.

Aster Data Appoints New Chief Executive Officer

Quentin Gallivan Joins Aster Data as CEO to Lead Company to Next Level of Growth

San Carlos, CA – September 9, 2010– Aster Data, a proven leader dedicated to providing the best data management and data processing platform for big data management and analytics, today announced the appointment of Quentin Gallivan as President and CEO. Gallivan brings more than 20 years of senior executive experience to the leading analytics and database company. With Aster Data achieving tremendous growth in the past year, Gallivan will take Aster Data to the next level, further accelerating its market leadership, sales, channel partnerships and international expansion.  Founding CEO Mayank Bawa, who grew the company from its inception based on the founders’ research at Stanford University, and whose passion for helping customers uniquely unlock the value of their data, will take on the role of Chief Customer Officer.  Bawa, in his new role, will lead the Company’s organization devoted to ensuring the success, longevity and innovation of its fast-growing customer base. Together, Gallivan and Bawa, along with co-founder and Chief Technology Officer, Tasso Argyros, will deliver on the Company’s mission to help customers discover more value from their data, achieve deep insights through rich analytics and do more with their massive data volumes than has ever been possible.

Gallivan joins Aster Data with over 20 years of leadership experience in the high-tech industry and has held a variety of CEO and senior executive positions with leading technology companies. Before joining Aster Data, Gallivan served as CEO at PivotLink, the leading provider of business intelligence (BI) solutions delivered via Software as a Service (SaaS), where he rapidly grew the company to over 15,000 business users, from mid-sized companies to Fortune 1000 companies, across key industries including financial services, retail, CPG manufacturing and high technology. Prior to Pivotlink, Gallivan served as CEO of Postini where he scaled the company to 35,000 customers and over 10 million users until its eventual acquisition by Google in 2007.  Gallivan also served as executive vice president of worldwide sales and services at VeriSign where he was instrumental in growing the business from $20 million to $1.2 billion and was responsible for the design and execution of the global distribution strategy for the company’s security and services business. Gallivan also held a number of key executive and leadership positions at Netscape Communications and GE Information Services.

“We are delighted to have someone of Quentin’s caliber, who is a veteran of both emerging and established technology companies, lead Aster Data through our next stage of growth,” said Mayank Bawa, Chief Customer Officer and co-founder, Aster Data. “His significant experience around growing organizations and driving operational excellence will be invaluable as he takes Aster Data forward. I’m excited to shift my focus to customers and their success; to bring our innovations to our customers worldwide to help them unlock deep value from their growing data volumes.”

“I am very excited to be joining Aster Data and taking on the challenge of augmenting its already impressive level of growth and success.  Aster Data is very well respected and established in the marketplace, has an enviable solution for big data management that uniquely addresses both big data storage and data processing, an impressive client list and a very talented team,” said Quentin Gallivan, President and CEO, Aster Data. “My task will be to leverage these assets, help shape a new market and provide operational guidance and strategic direction to drive even greater value for shareholders, customers and employees alike.”

Business Analytics Analyst Relations /Ethics/White Papers

Curt Monash, whom I respect and have tried (unsuccessfully) to interview, points out some ethical dilemmas and gray areas in Analyst Relations in Business Intelligence here at http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/

If you don’t know what Analyst Relations are, well, analysts are like credit rating agencies for BI software. Read Curt and his landscaping of the field here (I am quoting a summary) at http://www.strategicmessaging.com/the-ethics-of-white-papers/2010/08/01/

Vendors typically pay for white papers for reasons such as these:

  1. They want to connect with sales prospects.
  2. They want general endorsement from the analyst.
  3. They specifically want endorsement from the analyst for their marketing claims.
  4. They want the analyst to do a better job of explaining something than they think they could do themselves.
  5. They want to give the analyst some money to enhance the relationship.

Merv Adrian (I interviewed Merv here at http://www.dudeofdata.com/?p=2505) has responded well here at http://www.enterpriseirregulars.com/23040/white-paper-sponsorship-and-labeling/

None of the sites I checked clearly identify the work as having been sponsored in any way I found obvious in my (admittedly) quick scan. So this is an issue, but it’s not confined to Oracle.

My 2 cents (not being so well paid 😉 ) are:

I think Curt was calling out Oracle (which didn’t respond) and not Merv (whose subsequent blog post does much to clarify).

As a comparatively new/younger blogger in this field, I applaud both Curt for trying to bell the cat (or point out what everyone in AR winks at) and Merv for standing by him.

In the long run, it would strengthen analyst relations as a channel if analysts separated payment for content from bias in that content. Credit rating agencies forgot to do so in BFSI, and see what happened.

Customers invest millions of dollars in BI systems trusting marketing collateral/white papers/webinars/tests etc. Perhaps it’s time for an industry association for analysts so that individual analysts don’t knuckle down under vendor pressure.

It is easier for someone of Curt’s or Merv’s stature to declare their editing policy and disclosures before they write a white paper. It is much harder for everyone else who is not so well established.

White papers can cost as much as $25,000 to produce, and I know people in Business Analytics (as opposed to Business Intelligence) who slog for cents per hour cranking out books on R and SAS, webinars and trainings, but there are almost no white papers in BA. Are there any independent analytics analysts who are not biased towards R or SAS or SPSS etc.? I am not sure, but this looks like a good line to pursue 😉 provided ethical checks and balances are established.

Personally I know of many so-called analytics communities that go all out to please their sponsors, so bias in writing does exist (you can’t praise SAS on an R blogging forum or R users meet, and you can’t write on WPS at SAS Community.org).

At the same time, someone once told me: it is tough to make a living as a writer, and that choice between easy money and credible writing needs to be respected.

Most sponsored white papers I read are pure advertisements, directed at CEOs rather than the techie community at large.

Almost every BI vendor claims to have the fastest database with 5X speed, and technical benchmarking is something they could do too.

Gadget sites benchmark products, but you cannot benchmark BI or even BA products the same way, because many licensing terms forbid it.

That is probably the reason billions are spent on BI while the positive claims remain in doubt (except to the sellers). Similarly in analytics, many vendors would have difficulty justifying their claims or prices if they were subjected to a side-by-side comparison. Unfortunately the resulting confusion lets shoddy technology come out stronger thanks to more aggressive marketing.

Towards better analytical software

Here are some thoughts on using existing statistical software for better analytics and/or business intelligence (reporting)-

1) User Interface Design Matters- Most stats software has a legacy approach to user interface design. The graphical user interfaces need to be more business friendly and user friendly. For example, you can call a button T Test, or you can call it Compare > Means of Samples (with a highlight called T Test). You can call a button Chi Square Test, or call it Compare > Counts Data. Also, excessive reliance on drop-down menus ignores the next-generation advances in operating systems, namely touchscreens instead of mouse point and click.

Given that base statistical procedures are the same across software packages, a more thoughtfully designed (or revamped) user interface can give a package an edge over legacy designs.

2) Branding of Software Matters- One notable whine against SAS Institute products is their premium price. But that software is actually inexpensive if you look at other reporting software. What separates a Cognos from a Crystal Reports from a SAS BI is often branding (and user interface design). Branding events play a role here, and social media is often the least expensive branding and marketing channel. The same goes for WPS and Revolution Analytics.

3) Alliances matter- The alliances of parent companies are reflected in the sales of bundled software. For a complete solution, you need a database plus reporting plus analytical software. If you are not making all three, you need to partner and cross-sell. Technically this means that software (whether DB, reporting or analytics) needs to talk to as many different kinds of other software and formats as possible. This is why ODBC in R is important, and alliances are just as important for small companies like Revolution Analytics, WPS and Netezza as for bigger companies like IBM SPSS, SAS Institute or SAP. Tie-ins with Hadoop (like R and the Netezza appliance) or between Teradata and SAS also help create better usage.

4) Cloud Computing Interfaces could be the edge- Maybe cloud computing is all hot air. Even so, prudent business planning demands that any software maker in analytics or business intelligence have an extremely easy-to-load interface (whether a dedicated on-demand website or an Amazon EC2 image). Easier interfaces win, and with the cloud still in its early stages they can help create an early lead. For R software makers this is critical, since R handles larger data sets poorly on a PC compared with its counterparts. On the cloud that disadvantage vanishes. An easy-to-understand cloud interface framework is here (it’s 2 years old but should still be okay) http://knol.google.com/k/data-mining-through-cloud-computing#

5) Platforms matter- Software should either natively embrace all possible platforms or bundle in the middleware itself.

Here is a case study: SAS stopped supporting Apple OS after Base SAS 7. Today Apple OS is strong (3.47 million Macs sold during the most recent quarter) and the only way to use SAS on a Mac is to do either

http://goo.gl/QAs2

or install Ubuntu on the Mac ( https://help.ubuntu.com/community/MacBook ) and do this

http://ubuntuforums.org/showthread.php?t=1494027

Why does this matter? Well, SAS is free to academics and students from this year, but the Mac is a preferred computer there. WPS, on the other hand, can be run straight away on the Mac (though they have curiously not been able to provide academic or discounted student copies 😉 ) as per

http://goo.gl/aVKu

Does this create a disadvantage based on platform? Yes. However, JMP continues to be supported on the Mac. This is also noteworthy given the upcoming Chromium OS by Google and the Windows Azure platform for cloud computing.

Certifications in Analytics and Business Intelligence

I sometimes get a chat message on Twitter/Facebook asking for help on some specific data issue. More often than not it is something like: how do I get started in BI/BA/data stuff? So here is a list of certifications which I think are quite nice as starting points or even CV multipliers.

1) Google’s Certifications

http://www.google.com/intl/en/adwords/professionals/

2) SAS Certifications

Quite well established and easily one of the best structured certification programs in the industry.

http://support.sas.com/certify/index.html

3) SPSS

The SPSS certification began last year, and it provides a valuable skill set for both your practice and your resume. It is also useful to have a second skill set apart from SAS in terms of statistical software.

http://www.spss.com/certification/

At this point I would like you to pause and think about whether the above certifications are useful or cost-effective for you, as they are broadly general qualifications in statistical platforms and in applying them to web analytics (a key area for business analytics).

Here are some more specialized certifications:

1) Microsoft SQL Server

http://www.microsoft.com/learning/en/us/certification/cert-sql-server.aspx

2) TDWI Certification

http://tdwi.org/pages/certification/index.aspx

3) IBM

Not sure how updated these are so caveat emptor!

http://www.redbooks.ibm.com/abstracts/sg245747.html

If you are knowledgeable about IBM’s Business Intelligence solutions and the fundamental concepts of DB2 Universal Database, and you are capable of performing the intermediate and advanced skills required to design, develop, and support Business Intelligence applications

Also IBM Cognos Certifications

http://www-01.ibm.com/software/data/education/cognos-cert.html

4) MicroStrategy

http://www.microstrategy.com/education/Certification/

5) Oracle

Includes the all-new Sun certifications as well.

http://certification.oracle.com/

and http://blogs.oracle.com/certification/

6) SAP Certifications

http://www.sap.com/services/education/certification/index.epx

7) Cloudera’s Hadoop Certification

http://www.cloudera.com/developers/learn-hadoop/hadoop-certification/

These are some Business Intelligence and Business Analytics related certifications that I assembled in a list. Many other programs were either too software development specific or did not have a certification for general usage (like many R trainings or company tool specific trainings). Please feel free to add in any suggestions.