Oracle for data mining! That's right, I am talking about the same database company that made waves by acquiring Sun (and the beloved Java) and has been stealing market share left and right.
Here is some techie-specific help: if you know SQL (or even PROC SQL), you can learn Oracle Data Mining in less than an hour, which is good enough to clear that job shortlist.
Check out the sample code examples. They are designed to run on the ODM demo data, but you could change that easily. They are posted on OTN here:
Sample Code Demonstrating Oracle 11.1 Data Mining (230KB)
These files include sample programs in PL/SQL and Java illustrating each of the algorithms supported by Oracle Data Mining 11.1. There are examples of automatic data preparation and data transformations appropriate for each algorithm. Several programs illustrate the text transformation and text mining process.
Oracle Data Mining PL/SQL Sample Programs
The PL/SQL sample programs illustrate each algorithm supported by Oracle Data Mining, as well as text transformation and text mining using NMF and SVM classification. Transformations that prepare the data for mining are included in the programs. Execute the PL/SQL sample programs.
Mining Function | Algorithm
Anomaly Detection | One-Class Support Vector Machine
Association Rules | Apriori
Attribute Importance | Minimum Description Length
Classification | Adaptive Bayes Network (deprecated)
Classification | Decision Tree
Classification | Decision Tree (cross validation)
Classification | Logistic Regression
Classification | Naive Bayes
Classification | Support Vector Machine
Feature Extraction | Non-Negative Matrix Factorization
Regression | Linear Regression
Regression | Support Vector Machine
Text Mining | Text transformation using Oracle Text
Text Mining | Non-Negative Matrix Factorization
Text Mining | Support Vector Machine (Classification)
Here is a particularly cute and nifty example of fraud (as in fraud detection ;) ):
    drop table CLAIMS_SET;
    exec dbms_data_mining.drop_model('CLAIMSMODEL');
    create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
    insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
    insert into CLAIMS_SET values ('PREP_AUTO','ON');
    commit;

    -- a null target with the SVM algorithm builds a one-class (anomaly detection) model
    begin
      dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION',
        'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
    end;
    /

    -- accuracy (per-class and overall)
    col actual format a6
    select actual, round(corr*100/total,2) percent, corr, total-corr incorr, total
    from (select actual, sum(decode(actual,predicted,1,0)) corr, count(*) total
          from (select CLAIMS actual, prediction(CLAIMSMODEL using *) predicted
                from CLAIMS_APPLY)
          group by rollup(actual));

    -- top 5 most suspicious claims where the number of previous claims is 2 or more:
    select * from
      (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
              rank() over (order by prob_fraud desc) rnk
       from (select POLICYNUMBER,
                    prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
             from CLAIMS_APPLY
             where PASTNUMBEROFCLAIMS in ('2 to 4', 'more than 4')))
    where rnk <= 5
    order by percent_fraud desc;
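If the two queries above look dense, here is a rough Python sketch of what they compute, outside the database. It is only an illustration under stated assumptions: the function names (`accuracy_report`, `top_suspicious`) and the sample data are made up, and nothing here calls Oracle Data Mining. The first function mimics the `decode` plus `group by rollup(actual)` accuracy summary; the second mimics the `rank() ... rnk <= 5` filter.

```python
from collections import defaultdict


def accuracy_report(pairs):
    """Per-class and overall accuracy from (actual, predicted) label pairs.

    Returns rows of (actual, percent, corr, incorr, total). The last row has
    actual=None and stands in for the grand-total row that ROLLUP produces.
    """
    corr = defaultdict(int)
    total = defaultdict(int)
    for actual, predicted in pairs:
        total[actual] += 1
        # Same idea as decode(actual, predicted, 1, 0): count 1 on a match
        corr[actual] += 1 if actual == predicted else 0
    rows = [
        (a, round(corr[a] * 100 / total[a], 2), corr[a], total[a] - corr[a], total[a])
        for a in sorted(total)
    ]
    all_corr, all_total = sum(corr.values()), sum(total.values())
    rows.append((None, round(all_corr * 100 / all_total, 2), all_corr,
                 all_total - all_corr, all_total))
    return rows


def top_suspicious(scored, n=5):
    """Rows whose rank by probability (highest first) is at most n.

    scored: iterable of (policy_number, prob_fraud).
    Like SQL rank(), tied probabilities share a rank, so ties at the
    cutoff are all kept.
    """
    rows = sorted(scored, key=lambda r: r[1], reverse=True)
    out = []
    for policy, prob in rows:
        rnk = 1 + sum(1 for _, p in rows if p > prob)
        if rnk <= n:
            out.append((policy, round(prob * 100, 2), rnk))
    return out


# Made-up (actual, predicted) pairs, for illustration only
sample_pairs = [(0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (1, 1)]
# Made-up (policy number, probability) scores, for illustration only
sample_scores = [(101, 0.91), (102, 0.40), (103, 0.91), (104, 0.75)]

if __name__ == "__main__":
    for row in accuracy_report(sample_pairs):
        print(row)
    for row in top_suspicious(sample_scores, n=2):
        print(row)
```

In the database version, all of this happens in a single pass over CLAIMS_APPLY, which is the appeal of scoring with SQL functions in the first place.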
Coming up: a series of tutorials on learning these skills while just sitting at home.
Hat tip: Karl Rexer, Rexer Analytics, and Charlie Berger, Oracle.