Software Lawsuits: Ergo

The latest round of software lawsuits makes things more interesting, especially for Google. There are two notable developments:

1) Google’s pact with Verizon for an even more open Internet. From

http://googlepublicpolicy.blogspot.com/2010/08/joint-policy-proposal-for-open-internet.html

A provider that offers a broadband Internet access service complying with the above principles could offer any other additional or differentiated services. Such other services would have to be distinguishable in scope and purpose from broadband Internet access service, but could make use of or access Internet content, applications or services and could include traffic prioritization.

2) Oracle’s lawsuit against Google to enforce its Java intellectual property in Android (read here: http://news.cnet.com/8301-30685_3-20013549-264.html).

I once joked that nothing remains cool forever, not even Google (see https://decisionstats.wordpress.com/2008/08/05/11-ways-to-beat-up-google/ ), but I did not foresee the big G tying itself into knots on its own.

It is hard to sympathize with Google (or Oracle or Verizon), but this is the mess created when lawyers (with briefcases) can steal more value than a thousand engineers can create.

Interestingly, Google owns the IP for MapReduce, so could it someday sue the Hadoop community over royalty terms, as Oracle did with Java? Hmmm, an interesting revenue stream.

All in all, I would be happy to see zero tiers on the Internet (wireless or wired), and even to see Java developers make some money from writing code. Open source is not free source.

R Oracle Data Mining

Here is a new package called RODM (R-ODM), an interface for doing data mining on Oracle tables from R. You can read more here http://www.oracle.com/technetwork/database/options/odm/odm-r-integration-089013.html and here http://cran.fhcrc.org/web/packages/RODM/RODM.pdf . There is also a contest for creative use of R and ODM.

R Interface to Oracle Data Mining

The R Interface to Oracle Data Mining (R-ODM) allows R users to access the power of Oracle Data Mining’s in-database functions using the familiar R syntax. R-ODM provides a powerful environment for prototyping data analysis and data mining methodologies.

R-ODM is especially useful for:

  • Quick prototyping of vertical or domain-based applications where the Oracle Database supports the application
  • Scripting of “production” data mining methodologies
  • Customizing graphics of ODM data mining results (examples: classification, regression, anomaly detection)

The R-ODM interface allows R users to mine data using Oracle Data Mining from the R programming environment. It consists of a set of function wrappers written in source R language that pass data and parameters from the R environment to the Oracle RDBMS enterprise edition as standard user PL/SQL queries via an ODBC interface. The R-ODM interface code is a thin layer of logic and SQL that calls through an ODBC interface. R-ODM does not use or expose any Oracle product code as it is completely an external interface and not part of any Oracle product. R-ODM is similar to the example scripts (e.g., the PL/SQL demo code) that illustrates the use of Oracle Data Mining, for example, how to create Data Mining models, pass arguments, retrieve results etc.

R-ODM is packaged as a standard R source package and is distributed freely as part of the R environment’s Comprehensive R Archive Network (CRAN). For information about the R environment, R packages and CRAN, see www.r-project.org.
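To make the "thin layer of logic and SQL" idea concrete, here is a minimal Python sketch of the same pattern: a wrapper function builds a SQL statement, sends it through a database connection, and hands back only the result, so the computation happens inside the database. This is not R-ODM code and not its API. The function name summarize_column, the orders table and its amount column are all made up for illustration, and Python's built-in sqlite3 module stands in for an Oracle database reached over ODBC.

```python
import sqlite3


def summarize_column(conn, table, column):
    """Hypothetical wrapper: build a SQL query, run it in the database, return the result."""
    # Table and column names cannot be bound as query parameters,
    # so reject anything that is not a plain identifier before interpolating.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = (
        f"SELECT COUNT({column}), AVG({column}), MIN({column}), MAX({column}) "
        f"FROM {table}"
    )
    # The aggregation runs inside the database; only four numbers come back.
    count, mean, lowest, highest = conn.execute(sql).fetchone()
    return {"count": count, "mean": mean, "min": lowest, "max": highest}


# Stand-in data: an in-memory SQLite table instead of an Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (20.0,), (30.0,)])

summary = summarize_column(conn, "orders", "amount")
print(summary)  # {'count': 3, 'mean': 20.0, 'min': 10.0, 'max': 30.0}
```

The point of the pattern is that the caller never writes SQL and the data never leaves the database; presumably that is also why such a wrapper can stay thin.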

and

Present and win an Apple iPod Touch!
The BI, Warehousing and Analytics (BIWA) SIG is giving an Apple iPod Touch to the best new presenter. Be part of the TechCast series and get a chance to win!

Consider highlighting a creative use of R and ODM.

BIWA invites all Oracle professionals (experts, end users, managers, DBAs, developers, data analysts, ISVs, partners, etc.) to submit abstracts for 45 minute technical webcasts to our Oracle BIWA (IOUG SIG) Community in our Wednesday TechCast series. Note that the contest is limited to new presenters to encourage fresh participation by the BIWA community.

Also, here is an interview with the head of Oracle Data Mining, Charlie Berger: https://decisionstats.wordpress.com/2009/09/02/oracle/

Business Analytics Analyst Relations /Ethics/White Papers

Curt Monash, whom I respect and have tried to interview (unsuccessfully), points out ethical dilemmas and gray areas in Analyst Relations in Business Intelligence here at http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/

If you don’t know what Analyst Relations is, well, it’s like credit rating agencies for BI software. Read Curt and his landscaping of the field here (I am quoting a summary) at http://www.strategicmessaging.com/the-ethics-of-white-papers/2010/08/01/

Vendors typically pay for

  1. They want to connect with sales prospects.
  2. They want general endorsement from the analyst.
  3. They specifically want endorsement from the analyst for their marketing claims.
  4. They want the analyst to do a better job of explaining something than they think they could do themselves.
  5. They want to give the analyst some money to enhance the relationship.

Merv Adrian (I interviewed Merv here at http://www.dudeofdata.com/?p=2505) has responded well here at http://www.enterpriseirregulars.com/23040/white-paper-sponsorship-and-labeling/

None of the sites I checked clearly identify the work as having been sponsored in any way I found obvious in my (admittedly) quick scan. So this is an issue, but it’s not confined to Oracle.

My 2 cents (not being so well paid 😉) are:

I think Curt was calling out Oracle (which didn’t respond) and not Merv (whose subsequent blog post does much to clarify).

As a comparatively new and younger blogger in this field, I applaud both Curt for trying to bell the cat (or point out what everyone in AR winks at) and Merv for standing by him.

In the long run, it would strengthen analyst relations as a channel if analysts kept payment for content separate from bias in that content. The credit rating agencies in BFSI (banking, financial services and insurance) forgot to do so, and see what happened.

Customers invest millions of dollars in BI systems, trusting marketing collateral, white papers, webinars, tests and so on. Perhaps it’s time for an industry association for analysts so that individual analysts don’t knuckle under to vendor pressure.

It is easier for someone of Curt’s or Merv’s stature to declare an editing policy and disclosures before they write a white paper. It is much harder for everyone else who is not so well established.

White papers can cost as much as $25,000 to produce, and I know people in Business Analytics (as opposed to Business Intelligence) who slog for cents per hour cranking out books on R and SAS, webinars and trainings, yet there are almost no white papers in BA. Are there any independent analytics analysts who are not biased towards R or SAS or SPSS and so on? I am not sure, but this looks like a good line to pursue 😉, provided ethical checks and balances are established.

Personally, I know of many so-called analytics communities that go all out to please their sponsors, so bias in writing does exist (you can’t praise SAS on an R blogging forum or R Users Meet, and you can’t write about WPS at sasCommunity.org).

At the same time, someone once told me that it is tough to make a living as a writer, and that the choice between easy money and credible writing needs to be respected.

Most sponsored white papers I read are pure advertisements, directed at CEOs rather than the techie community at large.

Almost every BI vendor claims to have the fastest database, with 5X speed, and technical benchmarking is something they could do too.

Gadget sites benchmark products, but you cannot do the same for BI or even BA products, because many licensing terms forbid it.

That is probably the reason billions are spent on BI while the positive claims remain doubtful (except to the sellers). Similarly in analytics, many vendors would have difficulty justifying their claims or prices if they were subjected to a side-by-side comparison. Unfortunately, the resulting confusion lets shoddy technology come out stronger thanks to more aggressive marketing.

Certifications in Analytics and Business Intelligence

I sometimes get a chat message on Twitter/Facebook asking for help with some specific data issue. More often than not it is something like: how do I get started in BI/BA/data stuff? So here is a list of certifications which I think are quite nice as starting points or even CV multipliers.


1) Google’s Certifications

http://www.google.com/intl/en/adwords/professionals/

2) SAS Certifications

Quite well established and easily one of the best structured certification programs in the industry.

http://support.sas.com/certify/index.html

3) SPSS

The SPSS certification began last year and it helps provide a valuable skill set for both your practice as well as your resume. Also useful to have a second skill set apart from SAS in terms of statistical software.

http://www.spss.com/certification/

At this point I would like you to pause and think about whether the above certifications are useful or cost-effective for you, as they are broadly general qualifications in statistical platforms as well as in applying them to web analytics (a key area for business analytics).

Here are some more specialized certifications:

1) Microsoft SQL Server

http://www.microsoft.com/learning/en/us/certification/cert-sql-server.aspx

2) TDWI Certification

http://tdwi.org/pages/certification/index.aspx

3) IBM

Not sure how updated these are so caveat emptor!

http://www.redbooks.ibm.com/abstracts/sg245747.html

If you are knowledgeable about IBM’s Business Intelligence solutions and the fundamental concepts of DB2 Universal Database, and you are capable of performing the intermediate and advanced skills required to design, develop, and support Business Intelligence applications

Also IBM Cognos Certifications

http://www-01.ibm.com/software/data/education/cognos-cert.html

4) MicroStrategy

http://www.microstrategy.com/education/Certification/

5) Oracle

This includes the all-new Sun certifications as well.

http://certification.oracle.com/

and http://blogs.oracle.com/certification/

6) SAP Certifications

http://www.sap.com/services/education/certification/index.epx

7) Cloudera’s Hadoop Certification

http://www.cloudera.com/developers/learn-hadoop/hadoop-certification/

These are some Business Intelligence and Business Analytics related certifications that I assembled in a list. Many other programs were either too software development specific or did not have a certification for general usage (like many R trainings or company tool specific trainings). Please feel free to add in any suggestions.

The Top Statistical Softwares (GUI)

The list of top Statistical Softwares (GUI) is continued below. You can see the earlier post here

6. R Commander: While it was initially aimed at being a basic statistics GUI, the tremendous popularity of R Commander and its extensions in the form of plugins have helped make it one of the most widely used GUIs. In short, if you don’t know ANY R and still want to do basic descriptive stats and modeling, this will come in handy. It has an added script window for custom code for advanced users, and the e-plugins, such as those for the DoE (design of experiments) and QCC (quality control) packages, are a great way to extend it. I suspect the only thing holding it back is the reluctance of Dr Fox and the rest of R Core to fully embrace the GUI as a software medium. You can read his earlier interview here: https://decisionstats.wordpress.com/2009/09/14/interview-professor-john-fox-creator-r-commander/

Technically it is possible to convert just about any package to a GUI menu in R Commander using the e-plugins.

7. SAS GUIs

Enterprise (Guide)

SAS Enterprise Guide was the higher-end (and higher-priced) answer to the Enhanced Editor’s lack of menu-driven commands. It works, but many people I know like the text editor just as well.


Enterprise Miner is a separate product and works more like Red R or SPSS Modeler does. Again, EM is one of the major data mining (DM) packages out there, but the similarity in names is a bit confusing.

Even the Base SAS Enhanced Editor has some menus for importing data, querying and so on, but it is rarely mistaken for a GUI.

8. Oracle Data Miner and Knime

I like both ODM and Knime, but I find the lack of advertising or promotional support puzzling. Both would do well to combine technical excellence with some marketing. And since they are both free, you can check them out yourself here:

Oracle Data Mining

You can download it here (note: the Oracle web site itself is a bit aging 🙂):

http://www.oracle.com/technology/products/bi/odm/odminer.html

Knime is the open source GUI, which can be found here:

http://www.knime.org/introduction/features

9. RKWard

Another R GUI. It stands out for the comprehensive ways you can customize your code through menus rather than writing it all out or learning the syntax by rote.

You can find it at http://sourceforge.net/apps/mediawiki/rkward/index.php?title=Main_Page

I recommend this GUI over other GUIs, especially if you are new to R and do a lot of data visualization that needs custom graphics.

10. Red R and JGR/Deducer

Red R and JGR/Deducer are both up-and-coming GUIs for R. While Red R is the R counterpart to Enterprise Miner, Deducer is coming up with a new GUI for ggplot, the powerful graphics package in R.

Some GUIs excluded from this list are Statistica, MATLAB and EViews(?), because I don’t really work with them and thought it best to leave them to someone who knows them better.

I hope this list of GUIs helps you. Note that most of these packages can be learnt within a quick hour or two if you have basic software and data manipulation skills, so going through the GUI list is a fast way of adding value to your resume and knowledge base as well.