R Oracle Data Mining

RODM is a new R package that provides an interface for mining data held in Oracle tables from within R. You can read more at http://www.oracle.com/technetwork/database/options/odm/odm-r-integration-089013.html and in the package manual at http://cran.fhcrc.org/web/packages/RODM/RODM.pdf . There is also a contest for creative use of R and ODM.

R Interface to Oracle Data Mining

The R Interface to Oracle Data Mining (R-ODM) allows R users to access the power of Oracle Data Mining’s in-database functions using the familiar R syntax. R-ODM provides a powerful environment for prototyping data analysis and data mining methodologies.

R-ODM is especially useful for:

  • Quick prototyping of vertical or domain-based applications where the Oracle Database supports the application
  • Scripting of “production” data mining methodologies
  • Customizing graphics of ODM data mining results (examples: classification, regression, anomaly detection)

The R-ODM interface allows R users to mine data using Oracle Data Mining from the R programming environment. It consists of a set of function wrappers written in source R language that pass data and parameters from the R environment to the Oracle RDBMS enterprise edition as standard user PL/SQL queries via an ODBC interface. The R-ODM interface code is a thin layer of logic and SQL that calls through an ODBC interface. R-ODM does not use or expose any Oracle product code as it is completely an external interface and not part of any Oracle product. R-ODM is similar to the example scripts (e.g., the PL/SQL demo code) that illustrate the use of Oracle Data Mining, for example, how to create Data Mining models, pass arguments, retrieve results etc.
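To make the "thin layer of logic and SQL" idea concrete, here is a minimal sketch in Python (not R, and not RODM's actual code). It only builds the text of an anonymous PL/SQL block that calls Oracle's DBMS_DATA_MINING.CREATE_MODEL procedure; a real wrapper would then send that text through an ODBC connection. The function name build_create_model_call, the helper _ident, and the table and column names in the usage note are hypothetical, chosen just for illustration.

```python
import re

# Assumption: we only accept simple, unquoted Oracle identifiers
# (letter first, then letters, digits, _, $ or #, at most 30 characters).
_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_$#]{0,29}")


def _ident(name):
    """Validate a simple identifier and return it upper-cased."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"not a simple Oracle identifier: {name!r}")
    return name.upper()


def build_create_model_call(model_name, mining_function, data_table,
                            case_id_column, target_column=None,
                            settings_table=None):
    """Return an anonymous PL/SQL block calling DBMS_DATA_MINING.CREATE_MODEL.

    Hypothetical helper: it produces the statement text only and does not
    connect to a database.
    """
    args = [
        ("model_name", f"'{_ident(model_name)}'"),
        # Mining functions are package constants, e.g. CLASSIFICATION.
        ("mining_function", f"DBMS_DATA_MINING.{_ident(mining_function)}"),
        ("data_table_name", f"'{_ident(data_table)}'"),
        ("case_id_column_name", f"'{_ident(case_id_column)}'"),
    ]
    if target_column is not None:
        args.append(("target_column_name", f"'{_ident(target_column)}'"))
    if settings_table is not None:
        args.append(("settings_table_name", f"'{_ident(settings_table)}'"))
    body = ",\n".join(f"    {key} => {value}" for key, value in args)
    return f"BEGIN\n  DBMS_DATA_MINING.CREATE_MODEL(\n{body});\nEND;"


if __name__ == "__main__":
    # Hypothetical model, table and column names.
    print(build_create_model_call("nb_demo", "classification",
                                  "mining_data", "case_id",
                                  target_column="affinity_card"))
```

Running the script prints the PL/SQL block instead of executing it, which is a handy way to inspect what a wrapper of this kind would hand to the database. Validating identifiers before pasting them into SQL is a design choice of this sketch, not something the description above says R-ODM does.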

R-ODM is packaged as a standard R source package and is distributed freely as part of the R environment’s Comprehensive R Archive Network (CRAN). For information about the R environment, R packages and CRAN, see www.r-project.org.

and

Present and win an Apple iPod Touch!
The BI, Warehousing and Analytics (BIWA) SIG is giving an Apple iPod Touch to the best new presenter. Be part of the TechCast series and get a chance to win!

Consider highlighting a creative use of R and ODM.

BIWA invites all Oracle professionals (experts, end users, managers, DBAs, developers, data analysts, ISVs, partners, etc.) to submit abstracts for 45 minute technical webcasts to our Oracle BIWA (IOUG SIG) Community in our Wednesday TechCast series. Note that the contest is limited to new presenters to encourage fresh participation by the BIWA community.

Also, see an interview with Oracle Data Mining head Charlie Berger at https://decisionstats.wordpress.com/2009/09/02/oracle/

Business Analytics Analyst Relations /Ethics/White Papers

Curt Monash, whom I respect and have tried (unsuccessfully) to interview, points out ethical dilemmas and gray areas in Analyst Relations in Business Intelligence at http://www.dbms2.com/2010/07/30/advice-for-some-non-clients/

If you don’t know what Analyst Relations are, well, it’s like credit rating agencies for BI software. Read Curt and his landscaping of the field (I am quoting a summary) at http://www.strategicmessaging.com/the-ethics-of-white-papers/2010/08/01/

Vendors typically pay for white papers for one or more of these reasons:

  1. They want to connect with sales prospects.
  2. They want general endorsement from the analyst.
  3. They specifically want endorsement from the analyst for their marketing claims.
  4. They want the analyst to do a better job of explaining something than they think they could do themselves.
  5. They want to give the analyst some money to enhance the relationship.

Merv Adrian (I interviewed Merv here at http://www.dudeofdata.com/?p=2505) has responded well here at http://www.enterpriseirregulars.com/23040/white-paper-sponsorship-and-labeling/

None of the sites I checked clearly identify the work as having been sponsored in any way I found obvious in my (admittedly) quick scan. So this is an issue, but it’s not confined to Oracle.

My 2 cents (not being so well paid 😉) are:

I think Curt was calling out Oracle (which didn’t respond) and not Merv (whose subsequent blog post does much to clarify).

As a comparatively new and younger blogger in this field, I applaud both Curt for trying to bell the cat (or point out what everyone in AR winks at) and Merv for standing by him.

In the long run, it would strengthen analyst relations as a channel if analysts separated payment for content from bias. Credit rating agencies forgot to do so in BFSI, and see what happened.

Customers invest millions of dollars in BI systems trusting marketing collateral, white papers, webinars, tests etc. Perhaps it’s time for an industry association for analysts so that individual analysts don’t knuckle under to vendor pressure.

It is easier for someone of Curt’s or Merv’s stature to declare an editing policy and disclosures before they write a white paper. It is much harder for everyone else who is not so well established.

White papers can cost as much as $25,000 to produce, and I know people in Business Analytics (as opposed to Business Intelligence) who slog for cents per hour cranking out books on R and SAS, webinars and trainings, yet there are almost no white papers in BA. Are there any independent analytics analysts who are not biased toward R or SAS or SPSS etc.? I am not sure, but this looks like a good line to pursue 😉, provided ethical checks and balances are established.

Personally, I know of many so-called analytics communities that go all out to please their sponsors, so bias in writing does exist (you can’t praise SAS on an R blogging forum or R Users Meet, and you can’t write on WPS at SAS Community.org).

At the same time, someone once told me that it is tough to make a living as a writer, and that the choice between easy money and credible writing needs to be respected.

Most sponsored white papers I read are pure advertisements, directed at CEOs rather than the techie community at large.

Almost every BI vendor claims to have the fastest database with 5X speed, and benchmarking in technical terms is something they could do too.

Unlike gadget sites, which benchmark products, you cannot benchmark BI or even BA products, because many licensing terms prohibit doing so.

That is probably the reason billions are spent on BI while the positive claims remain doubtful (except to the sellers). Similarly in Analytics, many vendors would have difficulty justifying their claims or prices if they were subjected to a side-by-side comparison. Unfortunately, the resulting confusion lets shoddy technology come out stronger thanks to more aggressive marketing.

Hadley’s tutorials on R Visualization

Here is a set of extremely nice tutorials from Hadley Wickham (creator of ggplot), whom we interviewed at https://decisionstats.wordpress.com/2010/01/12/interview-hadley-wickham-r-project-data-visualization-guru/

Hadley is teaching two short courses at Vanderbilt University (read at http://gettinggeneticsdone.blogspot.com/2010/07/hadley-wickhams-ggplot2-data.html)

or download code and presentations at http://had.co.nz/vanderbilt-vis/

The presentations are very lucid and easy to understand, so even if you know very little R, or just want to learn visualization, you can refer to them.

Here is the first lesson by Prof Hadley on Basics in Visualization using R.

My latest creation

I have just teamed up to create my latest venture, called Kush Cognitives (Kush is my son). The firm is gonna make websites, build statistical analysis and provide social media services. It merges all my previous ventures and skills. After almost 3 years of working on and off with multiple people, this one is with a friend in the US.

Over the years (since 2007) I have made http://virtua-analytics.com (defunct), Swarajya Analytics Private Limited (www.swanplc.com – now sold) and now Kush Cognitives. I have gone through the models of proprietorship and corporation and now partnership.

Kush Cognitives is hosted at Decisionstats.com (as our flagship website) and we have shifted the blog to Decisionstats.Wordpress.com

We are aiming at startups and the small and medium segments first, but we retain capabilities for bigger clients as well. Less Bullshit and More Bang for your Buck.

So wish us luck, and if you need any social media advice, statistical analysis, or help with the technical side of creating websites, give us a buzz at http://decisionstats.com. This also includes customized training in R, SAS and other statistical software, taught from a practical, user-oriented angle. We are able to cater to both US and Indian clients.

regards

Ajay Ohri

Image Courtesy-michelangelo

R Excel: Updated

It was really nice to see the latest version of R Excel at http://rcom.univie.ac.at/ bundled together in an aptly named package called R and Friends.

The look and feel of the package as well as ease of installing are really professional. I also liked the commercial equivalent at http://www.statconn.com/

Older guardians and die-hards of the command line may feel that a GUI is like putting lipstick on a pig, but we respectfully demur.

What does R Excel do? Well, for one, it can put the R Commander interface INSIDE your Excel spreadsheet. That makes it easy to use and gives you a familiar interface even if you are a newbie to R (assuming you have done some Excel).

Download the latest version here

RAndFriends

This package will automatically install and configure

  • R 2.11.1
  • rscproxy 1.3-1
  • rcom 2.2-1

It will also download and install a suitable version of the statconnDCOM server and of RExcel during installation. Therefore you will need a working Internet connection during the installation process.
This version of RAndFriends was created 20100516.

Download RAndFriendsSetup2111V3.1-5-1

We also give you information on how to download all sources for R and the R packages included in RAndFriends.

Also read a paper on R and SAS interoperability (using the Hmisc package from Dr Harrell) at Holland Numerics

http://www.hollandnumerics.co.uk/pdf/SAS2R2SAS_paper.pdf

SAS Sentiment Analysis wins Award

From Business Wire: the new Sentiment Analysis product from SAS Institute (created by its acquisition, Teragram) wins an award. As per Wikipedia

http://en.wikipedia.org/wiki/Sentiment_analysis

Sentiment analysis or opinion mining refers to a broad (definitionally challenged) area of natural language processing, computational linguistics and text mining. Generally speaking, it aims to determine the attitude of a speaker or a writer with respect to some topic. The attitude may be their judgment or evaluation (see appraisal theory), their affective state (that is to say, the emotional state of the author when writing) or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).

The SAS product was developed by Teragram. Here is another sentiment analysis tool, from Stanford grad school, at http://twittersentiment.appspot.com/search?query=sas

See-

Sentiment analysis for sas

Image Citation-

http://threeminds.organic.com/2009/09/five_reasons_sentiment_analysi.html

Read an article on sentiment analysis here at http://www.nytimes.com/2009/08/24/technology/internet/24emotion.html

And the complete press release at http://goo.gl/iVzf

SAS Sentiment Analysis delivers insights on customer, competitor and organizational opinions to a degree never before possible via manual review of electronic text. As a result, SAS, the leader in business analytics software and services, has earned the prestigious Communications Solutions Product of the Year Award from Technology Marketing Corporation (TMC).

“SAS Sentiment Analysis has shown benefits for its customers and it provides ROI for the companies that use it,” said Rich Tehrani, CEO, TMC. “Congratulations to the entire team at SAS, a company distinguished by its dedication to software quality and superiority to address marketplace needs.”

Derive positive and negative opinions, evaluations and emotions

SAS Sentiment Analysis’ high-performance crawler locates and extracts sentiment from digital content sources, including mainstream websites, social media outlets, internal servers and incoming news feeds. SAS’ unique hybrid approach combines powerful statistical techniques with linguistics rules to improve accuracy to the detailed feature level. It summarizes the sentiment expressed in all available text collections – identifying trends and creating graphical reports that describe the expressed feelings of consumers, partners, employees and competitors in real time. Output from SAS Sentiment Analysis can be stored in document repositories, surfaced in corporate portals and used as input to additional SAS Text Analytics software or search engines to help decision makers evaluate trends, predict future outcomes, minimize risks and capitalize on opportunities.

“SAS has automated the time-consuming process of reading individual documents and manually extracting relevant information,” said Fiona McNeill, Global Analytics Product Marketing Manager at SAS. “Our integrated analytics framework helps organizations maximize the value of information to improve their effectiveness.”

SAS Sentiment Analysis is included in the SAS Text Analytics suite, which helps organizations discover insights from electronic text materials, associate them for delivery to the right person or place, and provide intelligence to select the best course of action. Whether answering complex search-and-retrieval questions, ensuring appropriate content is presented to internal or external constituencies, or predicting which activity or channel will produce the best effect on existing sentiments, SAS Text Analytics provides exceptional real-time processing speeds for large volumes of text.

SAS Text Analytics solutions are part of the SAS Business Analytics Framework, backed by the industry’s most comprehensive range of consulting, training and support services, ensuring customers maximum return from their IT investments.

Recognizing vision

The Communications Solutions Product of the Year Award recognizes vision, leadership and thoroughness. The most innovative products and services brought to the market from March 2008 through March 2009 were chosen as winners of this Product of the Year Award and are published on the INTERNET TELEPHONY and Customer Interaction Solutions websites.