SPSS gets Directions

A link to the Predictive Analytics conference by SPSS (the first since the Big Blue announcement): http://www.spss.com/spssdirections/na/index.htm

Should be interesting for existing clients and SPSS watchers.


Analyzing Monkeys

I once promised a reader, a long time back, that I would not get into politics, but something unexpected hit me like a big truck.

At what point do you decide your boss is a racist? How do you analyze the difference between jokes and racial insults?

Another interesting analysis

Citation Emerald

Red R- A new beginning

Check out an interesting new interface to R.

Note: I haven’t tested it yet but plan to do so shortly, as I am using Ubuntu 9 almost exclusively nowadays.

R fans who are not quite overjoyed with the wonderful beauty and charm of the traditional R GUI may want to give it a try.

Citation-

http://code.google.com/p/r-orange/

Note: This website does not assume responsibility for any software glitches, as R comes with no warranty, unlike other software that comes loaded with both a warranty and then bug-fix patches.


Losing a Million Bucks: Netflix Prize Interview

I (along with pseudo-geeks across the world) lost a potential million dollars when the following team won the Netflix Prize. In disgust, I just renewed my Netflix subscription and noticed a 10% increase in how much I liked them.

Jokes apart, here is an excerpt (perhaps one of the few ever) of an interview with the Netflix winners done by the great Eric Siegel, PhD.

Eric is conference chair of the Predictive Analytics Conference (a King Arthur’s round table conference of all the shining knights of the data analytics world).

Citation-http://www.predictiveanalyticsworld.com/layman-netflix-leader.php

[ES] With no relevant background in statistics — let alone product recommendations specifically — what capabilities or background did make your success possible? Do you consider yourselves mathematicians, or at least strong with math?

[MC] I am certainly not a mathematician – I have engineering level skill. I consider Martin Piotte to have an exceptional mathematical mind (he participated successfully in international math contests when he was a student) even though he never formally studied in that field. In the end, the mathematics used in this contest seem very complex, but are really rather simple. Compared to what most people think, this was more of an engineering contest than a mathematical contest [See Martin’s response below for elaboration on this central point. -Ed]. Also, I think that having a perhaps less in-depth but wider array of skills and knowledge helped us.

[ES] You’ve said, when first getting started, you learned many core strategies/techniques from the Netflix Prize discussion board. Did you do much reading or research elsewhere to ramp up?

[MC] Having started late in the competition, the forum was a good starting point as many avenues had already been explored and links had been posted to many interesting papers. In the end though, reading and getting a good understanding of the actual research papers was a very important step. The forum was also a place where people proposed new (sometimes far fetched) ideas; these ideas often inspired us to come up with our own creative innovations.

PAWS is a great place to meet, greet and do business, and though it is just 5 hours away, I have too much homework to do and grade at the University of Tennessee (for now) to attend.

Here is a very interesting poll that they are running. It is good to see conferences take feedback in such a transparent manner.

paws poll

A comment on OffShoring

A reader left a comment on offshoring; I am re-posting it in its entirety.

When you use the phrase “labor shortage” or “skills shortage” you’re speaking in a sentence fragment.  What you actually mean to say is:  “There is a labor shortage at the salary level I’m willing to pay.”  That statement is the correct phrase; the complete sentence and the intellectually honest statement.

Employers speak about shortages as though they represent some absolute, readily identifiable lack of desirable services. Price is rarely accorded its proper importance in their discussion.

If you start raising wages and improving working conditions, and continue doing so, you’ll solve your shortage and will have people lining up around the block to work for you even if you need to have huge piles of steaming manure hand-scooped on a blazing summer afternoon.

Re:  Shortage caused by employees retiring out of the workforce:  With the majority of retirement accounts down about 50% or more, most people entering retirement age are working well into their sunset years.  So, you won’t be getting a worker shortage anytime soon due to retirees exiting the workforce.

Okay, fine.  Some specialized jobs require training and/or certification, again, the solution is higher wages and improved benefits. People will self-fund their re-education so that they can enter the industry in a work-ready state.  The attractive wages, working conditions and career prospects of technology during the 1980’s and 1990’s was a prime example of people’s willingness to self-fund their own career re-education.

There is never enough of any good or service to satisfy all wants or desires. A buyer, or employer, must give up something to get something. They must pay the market price and forego whatever else he could have for the same price. The forces of supply and demand determine these prices — and the price of a skilled workman is no exception. The buyer can take it or leave it. However, those who choose to leave it (because of lack of funds or personal preference) must not cry shortage. The good is available at the market price. All goods and services are scarce, but scarcity and shortages are by no means synonymous. Scarcity is a regrettable and unavoidable fact.

Shortages are purely a function of price. The only way in which a shortage has existed, or ever will exist, is in cases where the “going price” has been held below the market-clearing price.
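
The reader's argument can be put into a few lines of arithmetic. What follows is only a minimal sketch, assuming straight-line demand and supply curves for labor; every number and name in it (`demand_at_zero`, `supply_slope` and so on) is made up for illustration and comes from neither the comment nor any real labor market. It shows that the "shortage" is positive only while the offered wage sits below the market-clearing wage.

```python
# Hypothetical linear labor market: all parameters are illustrative assumptions.

def clearing_wage(demand_at_zero=1000.0, demand_slope=8.0,
                  supply_at_zero=100.0, supply_slope=10.0):
    """Wage at which workers demanded equals workers supplied."""
    return (demand_at_zero - supply_at_zero) / (demand_slope + supply_slope)


def shortage(wage, demand_at_zero=1000.0, demand_slope=8.0,
             supply_at_zero=100.0, supply_slope=10.0):
    """Unfilled positions at a given wage (never negative)."""
    demanded = max(0.0, demand_at_zero - demand_slope * wage)
    supplied = max(0.0, supply_at_zero + supply_slope * wage)
    return max(0.0, demanded - supplied)


if __name__ == "__main__":
    w_star = clearing_wage()
    for wage in (w_star - 10, w_star, w_star + 10):
        print(f"wage={wage:.2f} shortage={shortage(wage):.0f}")
```

With these made-up curves, the shortage disappears at the clearing wage and stays at zero above it, which is the commenter's point in miniature.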

How to use Oracle for Data Mining

Oracle for Data Mining!!!! That’s right, I am talking about the same database company that made waves by acquiring Sun (and the beloved Java) and has been stealing market share left and right.

Here is some techie-specific help: if you know SQL (or even PROC SQL), you can learn Oracle Data Mining in less than an hour, well enough to clear that job shortlist.

Check out the attached sample code examples.  They are designed to run on the ODM demo data, but you could change that easily.  They are posted on OTN here

Sample Code Demonstrating Oracle 11.1 Data Mining (230KB)
These files include sample programs in PL/SQL and Java illustrating each of the algorithms supported by Oracle Data Mining 11.1. There are examples of automatic data preparation and data transformations appropriate for each algorithm. Several programs illustrate the text transformation and text mining process.

Oracle Data Mining PL/SQL Sample Programs

The PL/SQL sample programs illustrate each algorithm supported by Oracle Data Mining as well as text transformation and text mining using NMF and SVM classification. Transformations that prepare the data for mining are included in the programs. Execute the PL/SQL sample programs.

Mining Function | Algorithm | Sample Program
Anomaly Detection | One-Class Support Vector Machine | dmsvodem.sql
Association Rules | Apriori | dmardemo.sql
Attribute Importance | Minimum Description Length | dmaidemo.sql
Classification | Adaptive Bayes Network (deprecated) | dmabdemo.sql
Classification | Decision Tree | dmdtdemo.sql
Classification | Decision Tree (cross validation) | dmdtxvlddemo.sql
Classification | Logistic Regression | dmglcdem.sql
Classification | Naive Bayes | dmnbdemo.sql
Classification | Support Vector Machine | dmsvcdem.sql
Clustering | k-Means | dmkmdemo.sql
Clustering | O-Cluster | dmocdemo.sql
Feature Extraction | Non-Negative Matrix Factorization | dmnmdemo.sql
Regression | Linear Regression | dmglrdem.sql
Regression | Support Vector Machine | dmsvrdem.sql
Text Mining | Text transformation using Oracle Text | dmtxtfe.sql
Text Mining | Non-Negative Matrix Factorization | dmtxtnmf.sql
Text Mining | Support Vector Machine (Classification) | dmtxtsvm.sql

And a particularly cute and nifty example of fraud (as in fraud detection 😉):

-- start clean: drop any earlier settings table and model
drop table CLAIMS_SET;
exec dbms_data_mining.drop_model('CLAIMSMODEL');
-- settings table: pick the SVM algorithm and turn on automatic data preparation
create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
insert into CLAIMS_SET values ('PREP_AUTO','ON');
commit;
-- build the model on table CLAIMS, with POLICYNUMBER as the case id and no target column
begin
  dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION',
    'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
end;
/
-- accuracy (per-class and overall)
col actual format a6
select actual, round(corr*100/total,2) percent, corr, total-corr incorr, total from
  (select actual, sum(decode(actual,predicted,1,0)) corr, count(*) total from
    (select CLAIMS actual, prediction(CLAIMSMODEL using *) predicted
     from CLAIMS_APPLY)
   group by rollup(actual));
-- top 5 most suspicious claims where the number of previous claims is 2 or more:
select * from
  (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
          rank() over (order by prob_fraud desc) rnk from
    (select POLICYNUMBER, prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
     from CLAIMS_APPLY
     where PASTNUMBEROFCLAIMS in ('2 to 4', 'more than 4')))
where rnk <= 5
order by percent_fraud desc;
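
For readers who think more easily in a scripting language than in nested SQL, here is a rough Python sketch of what the two reporting queries compute. It is not Oracle code and does not call Oracle Data Mining; the function names, the dictionary keys (`policy`, `past_claims`, `prob_fraud`) and the sample rows are all assumptions made up for illustration. The first function mimics the `group by rollup(actual)` accuracy table (per class, plus an overall row stored under the key `None`); the second mimics the `rank() ... where rnk <= 5` filter, including the way `RANK()` lets tied rows share a rank.

```python
from collections import defaultdict


def accuracy_by_class(rows):
    """rows: iterable of (actual, predicted) labels (labels assumed not None).

    Returns {label: (percent, correct, incorrect, total)}; the overall
    row, like the ROLLUP total, is stored under the key None.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for actual, predicted in rows:
        hit = 1 if actual == predicted else 0
        for key in (actual, None):  # count once per class, once overall
            total[key] += 1
            correct[key] += hit
    return {
        key: (round(correct[key] * 100 / total[key], 2),
              correct[key], total[key] - correct[key], total[key])
        for key in total
    }


def top_suspicious(claims, n=5, buckets=("2 to 4", "more than 4")):
    """Keep claims in the given past-claims buckets, rank by probability
    (highest first, ties share a rank as in SQL RANK()), return rank <= n."""
    eligible = [c for c in claims if c["past_claims"] in buckets]
    eligible.sort(key=lambda c: c["prob_fraud"], reverse=True)
    result = []
    rank = 0
    for position, claim in enumerate(eligible, start=1):
        # a new rank starts only when the probability changes
        if position == 1 or claim["prob_fraud"] != eligible[position - 2]["prob_fraud"]:
            rank = position
        if rank > n:
            break
        result.append((claim["policy"], round(claim["prob_fraud"] * 100, 2), rank))
    return result


if __name__ == "__main__":
    sample = [("1", "1"), ("1", "0"), ("0", "0"), ("0", "0")]  # made-up labels
    print(accuracy_by_class(sample))
```

Because ties share a rank, the top-5 list can contain more than five rows, which is also how the SQL version behaves.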

Coming up: a series of tutorials on learning these skills while just sitting at home.

Hat Tip: Karl Rexer, Rexer Analytics, and Charlie Berger, Oracle.

Adobe gulps Omniture

Another analytics takeover. Adobe, needing to do something exciting and cash-generating, made a smart play with a 50% premium for Omniture. Given the amount of web traffic that Adobe is embedded in (documents, graphics and especially video), adding analytics can only mean better growth prospects for both, considering the pressure they are likely to face soon from competing products (MS Silverlight for Adobe; Yahoo Index Tools and Google Analytics for Omniture).

From the Press Release (note the cute diagram)

Adobe to Acquire Omniture

On Sept. 15, 2009, Adobe Systems Incorporated (Nasdaq:ADBE) and Omniture, Inc. (Nasdaq:OMTR) announced the two companies have entered into a definitive agreement for Adobe to acquire Omniture in a transaction valued at approximately $1.8 billion on a fully diluted equity-value basis. Under the terms of the agreement, Adobe will commence a tender offer to acquire all of the outstanding common stock of Omniture for $21.50 per share in cash.

Adobe’s acquisition of Omniture furthers its mission to revolutionize the way the world engages with ideas and information. By combining Adobe’s content creation tools and ubiquitous clients with Omniture’s Web analytics, measurement and optimization technologies, Adobe will be well positioned to deliver solutions that can transform the future of engaging experiences and e-commerce across all digital content, platforms and devices.

Adobe and Omniture

The combination of the two companies will increase the value Adobe delivers to customers. For designers, developers and online marketers, an integrated workflow, with optimization capabilities embedded in the creation tools, will streamline the creation and delivery of relevant content and applications. This optimization will enable advertisers and advertising agencies, publishers, and e-tailers to realize greater ROI from their digital media investments and improve their end users’ experience.

And the official fact sheet

ADOBE

  1. FOUNDED: 1982
  2. PRESIDENT & CEO: Shantanu Narayen
  3. MARKET CAP: $18.19 billion (as of 9/11/09)
  4. FY 08 REVENUE: US $3.58 billion (FYE Nov. 28, 2008)

Omniture

  1. CO-FOUNDER & CEO: Josh James
  2. FOUNDED: 1996
  3. MARKET CAP: $1.29 billion (as of 9/11/09)
  4. FY 08 REVENUE: US $295.6 million (FYE Dec. 31, 2008)
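
A quick back-of-the-envelope check on the numbers above, as a small sketch. It uses only the figures quoted in the press release and fact sheet; the variable names are mine. It is also rough: the $1.8 billion deal value is on a fully diluted basis while the market cap is as of 9/11/09, so the two are not strictly comparable and the result should be read as an approximation, not as the official premium.

```python
# Figures quoted in the press release and fact sheet (US dollars).
DEAL_VALUE = 1.8e9              # approximate, fully diluted equity value
OMNITURE_MARKET_CAP = 1.29e9    # as of 9/11/09
OMNITURE_FY08_REVENUE = 295.6e6


def premium_over_market_cap(deal_value, market_cap):
    """Percentage by which the deal value exceeds the market cap."""
    return (deal_value / market_cap - 1) * 100


def revenue_multiple(deal_value, revenue):
    """Deal value expressed as a multiple of annual revenue."""
    return deal_value / revenue


if __name__ == "__main__":
    premium = premium_over_market_cap(DEAL_VALUE, OMNITURE_MARKET_CAP)
    multiple = revenue_multiple(DEAL_VALUE, OMNITURE_FY08_REVENUE)
    print(f"Premium over 9/11/09 market cap: {premium:.1f}%")
    print(f"Deal value as a multiple of FY08 revenue: {multiple:.1f}x")
```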

From-

http://www.adobe.com/aboutadobe/invrelations/adobeandomniture.html