Quote of the Day:
"It is impossible to be a data scientist without knowing iris."
#Anonymous #Quotes
Revolution Analytics has been nice enough to provide both datasets and code for analyzing Big Data in R.
http://www.revolutionanalytics.com/subscriptions/datasets/
http://packages.revolutionanalytics.com/datasets/
(The site was updated, so the links above are the new ones.)
While the datasets collection is still elementary, as an R instructor I find this list extremely useful. However, I wish they would look at some other repositories and make .xdf and “tidy” CSV versions of those as well. A little bit of RODBC usage should help, and so would some descriptions. Maybe they should partner with Quandl, DataMarket, or Infochimps on this initiative rather than do it alone.
Overall, this could become an R package (like a Big Data version of the famous datasets package in R).
Still, it is a nice and very useful effort.
Revolution R Datasets
- AirOnTime87to12/ 09-Nov-2013 00:46 –
- AirOnTimeCSV2012/ 09-Nov-2013 00:30 –
- AirOnTime2012.xdf 08-Nov-2013 18:08 190110335
- AirOnTime7Pct.xdf 08-Nov-2013 17:42 103317987
- AirlineData87to08.tar.gz 03-May-2013 21:05 5521408
- AirlineData87to08.zip 09-May-2013 14:59 1802240
- AirlineData87to08_11811.tar.gz 08-Nov-2013 03:27 1428527359
- AirlineData87to08_83010.zip 08-Nov-2013 06:37 1477052425
- AirlineDataSubsample.xdf 08-Nov-2013 07:27 390789536
- Census5PCT2000.tar.gz 08-Nov-2013 10:55 871208970
- Census5PCT2000.zip 08-Nov-2013 12:52 925929427
- CensusUS5Pct2000.xdf 08-Nov-2013 21:27 1204906764
- ccFraud.csv 23-Apr-2013 20:57 291737157
- ccFraudScore.csv 23-Apr-2013 21:10 273848249
- ccFraudScore10_CreateLoadTableQuotedColumns.fas..> 23-Apr-2013 21:10 981
- ccFraud_CreateLoadTable_QuotedColumns.fastload 23-Apr-2013 21:10 984
- index.php.txt 09-May-2013 22:17 3983
- mortDefault.tar.gz 08-Nov-2013 12:59 61585580
- mortDefault.zip 08-Nov-2013 13:08 63968310
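Before downloading anything, it helps to see how big these files actually are, since the index only shows raw byte counts. Here is a minimal sketch in Python (used only for illustration; the same thing is easy in R). The listing text is pasted from the index above, with the truncated `.fas..>` filename left exactly as it appears, and `parse_listing` and `human` are hypothetical helper names, not part of any Revolution Analytics tool.

```python
# Minimal sketch: summarize the directory listing by file size.
# LISTING is pasted from the index above; parse_listing() and human()
# are illustrative helper names, not part of any Revolution Analytics tool.

LISTING = """\
AirOnTime87to12/ 09-Nov-2013 00:46 –
AirOnTimeCSV2012/ 09-Nov-2013 00:30 –
AirOnTime2012.xdf 08-Nov-2013 18:08 190110335
AirOnTime7Pct.xdf 08-Nov-2013 17:42 103317987
AirlineData87to08.tar.gz 03-May-2013 21:05 5521408
AirlineData87to08.zip 09-May-2013 14:59 1802240
AirlineData87to08_11811.tar.gz 08-Nov-2013 03:27 1428527359
AirlineData87to08_83010.zip 08-Nov-2013 06:37 1477052425
AirlineDataSubsample.xdf 08-Nov-2013 07:27 390789536
Census5PCT2000.tar.gz 08-Nov-2013 10:55 871208970
Census5PCT2000.zip 08-Nov-2013 12:52 925929427
CensusUS5Pct2000.xdf 08-Nov-2013 21:27 1204906764
ccFraud.csv 23-Apr-2013 20:57 291737157
ccFraudScore.csv 23-Apr-2013 21:10 273848249
ccFraudScore10_CreateLoadTableQuotedColumns.fas..> 23-Apr-2013 21:10 981
ccFraud_CreateLoadTable_QuotedColumns.fastload 23-Apr-2013 21:10 984
index.php.txt 09-May-2013 22:17 3983
mortDefault.tar.gz 08-Nov-2013 12:59 61585580
mortDefault.zip 08-Nov-2013 13:08 63968310
"""


def parse_listing(text):
    """Return (name, size_in_bytes) pairs for files in the listing."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        # Expected fields: name, date, time, size.
        # Directories show a dash instead of a size, so they are skipped.
        if len(parts) == 4 and parts[3].isdigit():
            entries.append((parts[0], int(parts[3])))
    return entries


def human(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GiB"


entries = parse_listing(LISTING)

# Print the files from largest to smallest.
for name, size in sorted(entries, key=lambda e: e[1], reverse=True):
    print(f"{human(size):>12}  {name}")
```

Sorting by size makes it obvious which downloads need a good connection and some free disk space, and which ones (like the two tiny fastload scripts) are just helper files.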
More code:
http://blog.revolutionanalytics.com/2013/08/big-data-sets-for-r.html
Also, a student of mine recently made a project using the Revolution datasets and their blog posts.