Tag Archives: Free
Using R for Cricket Analysis #rstats
ESPN Cricinfo is the best site for cricket data (see an earlier detailed post on the database at http://decisionstats.com/2012/04/07/cricinfo-statsguru-database-for-statistical-and-graphical-analysis/ ), and using the XML package in R we can easily scrape and manipulate the data.
Here is the code.
library(XML)
url="http://stats.espncricinfo.com/ci/engine/stats/index.html?class=1;team=6;template=results;type=batting"
#Note I can also break the url string and use paste command to modify this url with parameters
tables=readHTMLTable(url)
tables$"Overall figures"
#Now see this- since I only got 50 results in each page, I look at the url of next page
table1=tables$"Overall figures"
url="http://stats.espncricinfo.com/ci/engine/stats/index.html?class=1;page=2;team=6;template=results;type=batting"
tables=readHTMLTable(url)
table2=tables$"Overall figures"
#Now I need to join these two tables vertically
table3=rbind(table1,table2)

Note: I can also automate the web scraping. Now that the data is within R, we can use something like Deducer to visualize it.
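The two notes in the code above (building the URL with the paste command, and automating the scraping) could be sketched roughly as follows. This is only a sketch, not the code from the original post: the helper names build_url and scrape_pages are hypothetical, and it assumes that every results page still has a table named "Overall figures" and that page=1 returns the same results as the URL without a page parameter.

```r
# Hypothetical helper: assemble the Statsguru URL from its parameters with paste0
build_url <- function(page = 1, team = 6, class = 1, type = "batting") {
  base <- "http://stats.espncricinfo.com/ci/engine/stats/index.html"
  query <- paste0("class=", class, ";page=", page, ";team=", team,
                  ";template=results;type=", type)
  paste0(base, "?", query)
}

# Hypothetical helper: read the "Overall figures" table from each page
# and stack the pages vertically with rbind.
# The default fetcher is readHTMLTable from the XML package.
scrape_pages <- function(pages, fetch = XML::readHTMLTable) {
  tables <- lapply(pages, function(p) {
    fetch(build_url(page = p))[["Overall figures"]]
  })
  do.call(rbind, tables)
}

# Usage (needs network access and assumes the site layout is unchanged):
# table3 <- scrape_pages(1:2)
```

Passing the fetcher as an argument is a design choice made here so the loop can be tried offline with a stand-in function before pointing it at the live site.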
Running R on Windows Azure #rstats #cloud
Here is a brief tutorial on running R on the Windows Azure cloud (OS = Windows in this case, but four kinds of Linux are also available).
There is a free 90-day trial, so you can run R on the cloud for free (since Google Cloud Compute is still in closed, hush-hush beta).
Go to https://www.windowsazure.com/en-us/pricing/free-trial/
JMP Student Edition
I really liked the initiatives at JMP/Academic. They offer the software bundled with a textbook, which is both good common sense and good business sense, given how fast students can get confused.
(Rant 1: Bundling with textbooks is something I think Revolution Analytics should consider doing instead of just offering the academic version as a free download. It would be interesting to compare the penetration of the R academic market by Revolution’s version and by the open source version under the existing strategy.)
From http://www.jmp.com/academic/textbooks.shtml
Major publishers of introductory statistics textbooks offer a 12-month license to JMP Student Edition, a streamlined version of JMP, with their textbooks.
A glance through http://www.jmp.com/academic/pdf/jmp_se_comparison.pdf shows it is a credible version, not one so whittled down that offering it would be just dishonest.
And I loved this Reference Card at http://www.jmp.com/academic/pdf/jmp10_se_quick_guide.pdf
Oracle, SAP HANA, Revolution Analytics and even SAS/STAT itself can make more reference cards like this: elegant solutions for students and new learners!
More creative rants: honestly, why do corporate sites still use PDFs when they can use Instapaper, or any of the SlideShare/Scribd formats, to show information in a better way without diverting the user from the main webpage?
But I digress, back to JMP
Resources for Faculty Using JMP® Student Edition
Faculty who select a JMP Student Edition bundle for their courses may be eligible for additional resources, including course materials and training.
Special JMP® Student Edition for AP Statistics
JMP Student Edition is available in a convenient five-year license for qualified Advanced Placement statistics programs.
Try and have a look yourself at http://www.jmp.com/academic/student.shtml
Obfuscate using Rapid Miner
ob·fus·cate /ˈäbfəˌskāt/ (verb)
A nice geeky function in Rapid Miner is the Obfuscator
This operator can be used to anonymize your data. It is possible to save the obfuscating map into a file which can be used to remap the old values and names. Please use the operator Deobfuscator for this
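To make the idea concrete, here is a small R sketch of the same pattern: replace values with meaningless codes, keep the map, and use the map to get the old values back. It is only an illustration under my own assumptions and is not how RapidMiner implements the Obfuscator or Deobfuscator operators; the function names obfuscate and deobfuscate and the sample data are hypothetical.

```r
# Illustration only: this is not RapidMiner's implementation.

# Hypothetical helper: replace each distinct value with a shuffled code
# and return the codes together with the map needed to reverse them.
obfuscate <- function(x) {
  originals <- unique(as.character(x))
  codes <- paste0("v", sample(seq_along(originals)))
  map <- setNames(originals, codes)  # names = codes, values = originals
  list(values = names(map)[match(as.character(x), map)], map = map)
}

# Hypothetical helper: look the codes up in the saved map
# to restore the original values.
deobfuscate <- function(values, map) {
  unname(map[values])
}

customers <- c("alice", "bob", "alice", "carol")
hidden <- obfuscate(customers)
restored <- deobfuscate(hidden$values, hidden$map)
```

The map could be written to a file (for example with saveRDS) and kept separate from the anonymized data, so that only someone holding the map can undo the obfuscation.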
RapidMiner is free to download here (it's open source):
http://rapid-i.com/content/view/26/201/
RCOMM 2012 goes live in August
An awesome conference by an awesome software. Rapid Miner remains one of the leading enterprise-grade open source software packages; it can help you do a lot of things, including flow-driven data modeling, web mining and web crawling, which even other software can't.
Presentations include:
- Mining Machine 2 Machine Data (Katharina Morik, TU Dortmund University)
- Handling Big Data (Andras Benczur, MTA SZTAKI)
- Introduction of RapidAnalytics at Telenor (Telenor and United Consult)
- and more
Here is the complete program:
Program
Days: Tuesday, Wednesday, Thursday, Friday

09:00 – 10:30
- Introductory Speech: Ingo Mierswa (Rapid-I); Resource-aware Data Mining or M2M Mining (Invited Talk): Katharina Morik (TU Dortmund University); Data Analysis: NeurophRM: Integration of the Neuroph framework into RapidMiner
- To be announced (Invited Talk): Andras Benczur; Recommender Systems: Extending RapidMiner with Recommender Systems Algorithms; Implementation of User Based Collaborative Filtering in RapidMiner
- Parallel Training / Workshop Session: Advanced Data Mining and Data Transformations or

10:30 – 11:00
- Coffee Break

11:00 – 12:30
- Data Analysis: Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner; Customers’ LifeStyle Targeting on Big Data using Rapid Miner; Robust GPGPU Plugin Development for RapidMiner
- Extensions: Optimization Plugin For RapidMiner; Image Mining Extension – Year After; Incorporating R Plots into RapidMiner Reports

12:30 – 13:30
- Lunch

13:30 – 15:30
- Parallel Training / Workshop Session: Basic Data Mining and Data Transformations or
- Applications: Introduction of RapidAnalytics Enterprise Edition at Telenor Hungary; Application of RapidMiner in Steel Industry Research and Development; A Comparison of Data-driven Models for Forecast River Flow; Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner
- Extensions: An Octave Extension for RapidMiner. Unstructured Data: Processing Data Streams with the RapidMiner Streams-Plugin; Automated Creation of Corpuses for the Needs of Sentiment Analysis. Demonstration: News from the Rapid-I Labs. This short session demonstrates the latest developments from the Rapid-I lab and will show you how you can build powerful analysis processes and routines by using those RapidMiner tools.
- Certification Exam

15:30 – 16:00
- Coffee Break

16:00 – 18:00
- Book Presentation and Game Show: Data Mining for the Masses: A New Textbook on Data Mining for Everyone. Matthew North presents his new book “Data Mining for the Masses”, introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems. Game Show
- User Support: Get some Coffee for free – Writing Operators with RapidMiner Beans; Meta-Modeling Execution Times of RapidMiner operators. Conference day ends at ca. 17:00.

19:30
- Social Event (Conference Dinner)
- Social Event (Visit of Bar District)
and you should have a look at https://rapid-i.com/rcomm2012f/index.php?option=com_content&view=article&id=65
The conference is in Budapest, Hungary, Europe.
(Disclaimer: Rapid Miner is an advertising sponsor of Decisionstats.com, in case you did not notice the two banner-sized ads.)
Interview Rob J Hyndman Forecasting Expert #rstats
Here is an interview with Prof Rob J Hyndman who has created many time series forecasting methods and authored books as well as R packages on the same.
Probably the biggest impact I’ve had is in helping the Australian government forecast the national health budget. In 2001 and 2002, they had underestimated health expenditure by nearly $1 billion in each year which is a lot of money to have to find, even for a national government. I was invited to assist them in developing a new forecasting method, which I did. The new method has forecast errors of the order of plus or minus $50 million which is much more manageable. The method I developed for them was the basis of the ETS models discussed in my 2008 book on exponential smoothing (www.exponentialsmoothing.net)
New Free Online Book by Rob Hyndman on Forecasting using #Rstats
From the creator of some of the most widely used packages for time series in the R programming language comes a brand new book, and it's online!
This time the book is free, will be updated, and 7 chapters are ready (to read!). If you do forecasting professionally, now is the time to suggest your own use cases to be featured as the book gets ready by end-2012. The book is intended as a replacement for Makridakis, Wheelwright and Hyndman (Wiley 1998).
The book is written for three audiences:
(1) people finding themselves doing forecasting in business when they may not have had any formal training in the area;
(2) undergraduate students studying business;
(3) MBA students doing a forecasting elective.
The book is different from other forecasting textbooks in several ways.
- It is free and online, making it accessible to a wide audience.
- It is continuously updated. You don’t have to wait until the next edition for errors to be removed or new methods to be discussed. We will update the book frequently.
- There are dozens of real data examples taken from our own consulting practice. We have worked with hundreds of businesses and organizations helping them with forecasting issues, and this experience has contributed directly to many of the examples given here, as well as guiding our general philosophy of forecasting.
- We emphasise graphical methods more than most forecasters. We use graphs to explore the data, analyse the validity of the models fitted and present the forecasting results.
A print version and a downloadable e-version of the book will be available to purchase on Amazon, but not until a few more chapters are written.
Contents
(Ajay-Support the open textbook movement!)
If you’ve found this book helpful, please consider helping to fund free, open and online textbooks. (Donations via PayPal.)





