One of the most common uses of statistical software is building models, in particular logistic regression models for propensity scoring in the marketing of goods and services.
If building models is what you do, here is a brief, easy essay on how to build a model in R.
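As a minimal sketch of the kind of propensity model mentioned above, base R's glm() with family = binomial fits a logistic regression. The data here are simulated, and the column names (age, spend, responded) are made up purely for illustration.

```r
# Minimal sketch: a logistic (propensity) model in base R.
# The data are simulated; 'age', 'spend' and 'responded' are hypothetical names.
set.seed(42)
n <- 500
customers <- data.frame(
  age   = round(runif(n, 18, 70)),
  spend = round(rexp(n, rate = 1 / 200), 2)
)

# Simulate a binary response whose log-odds rise with age and spend
log_odds <- -3 + 0.03 * customers$age + 0.004 * customers$spend
customers$responded <- rbinom(n, size = 1, prob = plogis(log_odds))

# Fit the logistic regression
prop_model <- glm(responded ~ age + spend, data = customers, family = binomial)
print(summary(prop_model))

# Score each customer: predicted probability of response (the propensity)
customers$propensity <- predict(prop_model, type = "response")

# Show the customers with the highest propensity first
print(head(customers[order(-customers$propensity), ]))
```

Sorting by the predicted probability is one common way to pick the customers to target first; the right cut-off depends on the campaign.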
1) Packages to be used-
For smaller datasets
- car package http://cran.r-project.org/web/packages/car/index.html
- gvlma package http://cran.r-project.org/web/packages/gvlma/index.html
- ROCR package http://rocr.bioinf.mpi-sb.mpg.de/
- relaimpo package
- DAAG package
- MASS package
- bootstrap package
- leaps package
- rms package http://cran.r-project.org/web/packages/rms/index.html
rms works with almost any regression model, but it was especially written to work with binary or ordinal logistic regression, Cox regression, accelerated failure time models, ordinary linear models, the Buckley-James model, generalized least squares for serially or spatially correlated observations, generalized linear models, and quantile regression.
For bigger datasets, also see the biglm package http://cran.r-project.org/web/packages/biglm/index.html and the RevoScaleR package.
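As a rough sketch of how biglm handles data that does not fit in memory, the model can be fitted on a first chunk and then updated chunk by chunk. This assumes the biglm package is installed; the built-in mtcars data, split into four pieces, stands in for a genuinely large dataset.

```r
# Sketch: fit a linear model in chunks with biglm.
# Assumes the biglm package is installed; mtcars is only a stand-in
# for a dataset too large to load at once.
if (requireNamespace("biglm", quietly = TRUE)) {
  chunks <- split(mtcars, rep(1:4, each = 8))   # four chunks of 8 rows each

  # Fit on the first chunk, then feed in the remaining chunks one at a time
  fit <- biglm::biglm(mpg ~ wt + hp, data = chunks[[1]])
  for (chunk in chunks[-1]) {
    fit <- update(fit, chunk)
  }

  # Compare with an ordinary lm() fit on the full data
  big_coef  <- coef(fit)
  full_coef <- coef(lm(mpg ~ wt + hp, data = mtcars))
  print(cbind(biglm = big_coef, lm = full_coef))
}
```

In practice each chunk would be read from a file or database rather than split from a data frame already in memory.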
- outp <- lm(y ~ x1 + x2 + xn, data = dataset)  # Model equation
- summary(outp)  # Model summary
- par(mfrow = c(2, 2)); plot(outp)  # Model graphs (diagnostic plots)
- vif(outp)  # Multicollinearity (car package)
- gvlma(outp)  # Checks of linear model assumptions, including heteroscedasticity (gvlma package)
- outlierTest(outp)  # Outliers (car package)
- predict(outp)  # Scoring the dataset used to fit the model
- predict(lm.result, data.frame(conc = newconc), level = 0.9, interval = "confidence")  # Scoring new data with 90% confidence intervals
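The steps above can be sketched end to end on a built-in dataset. This is only an illustration: mtcars and the chosen predictors are assumptions, not part of the original essay. The car functions run only if that package is installed, scoring uses base R's predict(), and the gvlma step is left out to keep the sketch short.

```r
# Sketch of the workflow on the built-in mtcars data (an assumed stand-in dataset).
outp <- lm(mpg ~ wt + hp + qsec, data = mtcars)   # model equation
print(summary(outp))                               # model summary

old_par <- par(mfrow = c(2, 2))                    # 2 x 2 grid for the diagnostic plots
plot(outp)                                         # model graphs
par(old_par)                                       # restore the plotting layout

# vif() and outlierTest() come from the car package; skip them if it is missing
if (requireNamespace("car", quietly = TRUE)) {
  print(car::vif(outp))          # multicollinearity
  print(car::outlierTest(outp))  # outliers
}

# Score the data used to fit the model
scores <- predict(outp)

# Score new observations with 90% confidence intervals
# (the values below are made up for illustration)
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 180), qsec = c(18, 17))
new_scores <- predict(outp, new_cars, level = 0.9, interval = "confidence")
print(new_scores)
```

The result of the last call is a matrix with the columns fit, lwr and upr: the predicted value and the lower and upper bounds of the interval.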
For a Reference Card -Cheat Sheet see
3) Also read-
Here are some fabulous applications at http://yeroon.net. If you are in the field of data and/or analytics, you should take a look at this site; it was created at UCLA's Department of Statistics.
You can create stock plots (similar to those on Yahoo and Google Finance, which I have covered earlier)
or create ggplot visualizations
or create a linear model
Just use a browser to upload the dataset; that is all the hardware and software you need.
Note that the back end uses R. It would be interesting to see whether companies like Revolution R, SAS and SPSS could offer this kind of browser-based computing (perhaps charging for it the way the Amazon EC2 APIs do).
Kudos and credits to http://www.stat.ucla.edu/~jeroen/