Oracle for Data Mining! That's right, I am talking about the same database company that made waves by acquiring Sun (and the beloved Java) and has been stealing market share left and right.
Here is some techie-specific help: if you know SQL (or even PROC SQL), you can learn Oracle Data Mining in less than an hour, which is good enough to clear that job shortlist.
Check out the attached sample code examples. They are designed to run on the ODM demo data, but you could change that easily. They are posted on OTN here:
Sample Code Demonstrating Oracle 11.1 Data Mining (230KB)
These files include sample programs in PL/SQL and Java illustrating each of the algorithms supported by Oracle Data Mining 11.1. There are examples of automatic data preparation and data transformations appropriate for each algorithm. Several programs illustrate the text transformation and text mining process.

Oracle Data Mining PL/SQL Sample Programs

The PL/SQL sample programs illustrate each algorithm supported by Oracle Data Mining as well as text transformation and text mining using NMF and SVM classification. Transformations that prepare the data for mining are included in the programs.

Execute the PL/SQL sample programs.
| Mining Function | Algorithm | Sample Program |
|---|---|---|
| Anomaly Detection | One-Class Support Vector Machine | dmsvodem.sql |
| Association Rules | Apriori | dmardemo.sql |
| Attribute Importance | Minimum Description Length | dmaidemo.sql |
| Classification | Adaptive Bayes Network (deprecated) | dmabdemo.sql |
| Classification | Decision Tree | dmdtdemo.sql |
| Classification | Decision Tree (cross validation) | dmdtxvlddemo.sql |
| Classification | Logistic Regression | dmglcdem.sql |
| Classification | Naive Bayes | dmnbdemo.sql |
| Classification | Support Vector Machine | dmsvcdem.sql |
| Clustering | k-Means | dmkmdemo.sql |
| Clustering | O-Cluster | dmocdemo.sql |
| Feature Extraction | Non-Negative Matrix Factorization | dmnmdemo.sql |
| Regression | Linear Regression | dmglrdem.sql |
| Regression | Support Vector Machine | dmsvrdem.sql |
| Text Mining | Text transformation using Oracle Text | dmtxtfe.sql |
| Text Mining | Non-Negative Matrix Factorization | dmtxtnmf.sql |
| Text Mining | Support Vector Machine (Classification) | dmtxtsvm.sql |
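To try one of the programs listed above, a minimal SQL*Plus sketch might look like the following. It assumes you are connected as a user that has the ODM demo data installed and that the script sits in the current directory; neither detail is stated on OTN, so adjust to your setup. USER_MINING_MODELS is the standard dictionary view that lists the mining models in your schema.

```sql
-- Run the Naive Bayes sample (assumes the file is in the current directory).
@dmnbdemo.sql

-- List the mining models now present in your schema.
-- Whether the sample leaves its model in place is an assumption;
-- if it cleans up after itself, this query simply returns no new rows.
select model_name, mining_function, algorithm
  from user_mining_models
 order by model_name;
```

Swapping in any other script name from the list works the same way.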
And a particularly cute and nifty example of fraud (as in fraud detection 😉):
    -- clean up any previous run
    drop table CLAIMS_SET;
    exec dbms_data_mining.drop_model('CLAIMSMODEL');

    -- settings table: use SVM with automatic data preparation
    create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));
    insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES');
    insert into CLAIMS_SET values ('PREP_AUTO','ON');
    commit;

    -- build the model on table CLAIMS, with POLICYNUMBER as the case id and no target
    begin
      dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION', 'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');
    end;
    /

    -- accuracy (per-class and overall)
    col actual format a6
    select actual, round(corr*100/total,2) percent, corr, total-corr incorr, total
      from (select actual, sum(decode(actual,predicted,1,0)) corr, count(*) total
              from (select CLAIMS actual, prediction(CLAIMSMODEL using *) predicted
                      from CLAIMS_APPLY)
             group by rollup(actual));

    -- top 5 most suspicious claims where the number of previous claims is 2 or more:
    select *
      from (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud,
                   rank() over (order by prob_fraud desc) rnk
              from (select POLICYNUMBER,
                           prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud
                      from CLAIMS_APPLY
                     where PASTNUMBEROFCLAIMS in ('2 to 4', 'more than 4')))
     where rnk <= 5
     order by percent_fraud desc;
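Because the target column passed to CREATE_MODEL is null, the call above builds a one-class SVM, which is how Oracle Data Mining does anomaly detection: PREDICTION returns 1 for typical rows and 0 for suspicious ones. As a quick sanity check, here is a sketch that reuses the same CLAIMSMODEL and CLAIMS_APPLY names from the example. It confirms which settings were actually applied and counts how many claims land on each side.

```sql
-- Settings recorded for the model (the ones supplied plus the defaults).
select setting_name, setting_value, setting_type
  from user_mining_model_settings
 where model_name = 'CLAIMSMODEL'
 order by setting_name;

-- How many claims the model sees as typical (1) versus anomalous (0).
select predicted, count(*) as claims
  from (select prediction(CLAIMSMODEL using *) as predicted
          from CLAIMS_APPLY)
 group by predicted
 order by predicted;
```

If nearly every claim comes back as 0 or as 1, that is a hint to revisit the input data or settings before trusting the top-5 list.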
Coming up: a series of tutorials on learning these skills without leaving your home.
Hat tip: Karl Rexer, Rexer Analytics, and Charlie Berger, Oracle.