Cricinfo StatsGuru Database for Statistical and Graphical Analysis

Cricket data from ESPN Cricinfo is available through its Statsguru site.

The URL is of the form:

http://stats.espncricinfo.com/ci/engine/stats/index.html?class=1;team=6;template=results;type=batting

which splits into the base URL

http://stats.espncricinfo.com/ci/engine/stats/index.html?

and the query string

class=1;team=6;template=results;type=batting

Breaking down this URL shows the parameters you can change to get more cricket statistics.

class
1=Test
2=ODI
3=T20I
11=Test+ODI+T20I

team
1=England
2=Australia
3=South Africa
4=West Indies
5=New Zealand
6=India
7=Pakistan
8=Sri Lanka

type
batting
bowling
fielding
allround
fow
official
team
aggregate
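
The parameter lists above can be assembled into a query URL with a few lines of code. The following is a minimal Python sketch, not an official client: the function name statsguru_url and the dictionary names are made up for illustration, and the only assumptions about the site are the base URL, the semicolon-separated query format, and the parameter values listed above.

```python
# Hypothetical helper: builds a Statsguru query URL from the parameters
# listed in the post. Nothing here contacts the website.
BASE_URL = "http://stats.espncricinfo.com/ci/engine/stats/index.html"

# "class" codes as listed in the post
MATCH_CLASSES = {"Test": 1, "ODI": 2, "T20I": 3, "Test+ODI+T20I": 11}

# "type" values as listed in the post
STAT_TYPES = {"batting", "bowling", "fielding", "allround",
              "fow", "official", "team", "aggregate"}


def statsguru_url(match_class, team, stat_type, template="results"):
    """Return the query URL for a match class code, team code and stat type."""
    if match_class not in MATCH_CLASSES.values():
        raise ValueError(f"unknown class code: {match_class}")
    if stat_type not in STAT_TYPES:
        raise ValueError(f"unknown type: {stat_type}")
    params = [("class", match_class), ("team", team),
              ("template", template), ("type", stat_type)]
    # The site separates parameters with ';' rather than '&'
    return BASE_URL + "?" + ";".join(f"{key}={value}" for key, value in params)


# Test batting statistics for team code 6 (India)
print(statsguru_url(MATCH_CLASSES["Test"], 6, "batting"))
```

Changing the class code to 2 and the type to "bowling" gives the corresponding ODI bowling query in the same way.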

 

The ESPN Terms of Use are here; check them before trying any web scraping.

http://www.espncricinfo.com/ci/content/site/company/terms_use.html

 

However, ESPN has released an API for developers (with both free and premium tiers) at http://developer.espn.com/docs.

The headlines API at http://developer.espn.com/docs/headlines#parameters covers these sports in particular:

/sports News across all sports/sections
/sports/baseball/mlb Major League Baseball (MLB)
/sports/basketball/mens-college-basketball NCAA Men’s College Basketball
/sports/basketball/nba National Basketball Association (NBA)
/sports/basketball/wnba Women’s National Basketball Association (WNBA)
/sports/basketball/womens-college-basketball NCAA Women’s College Basketball
/sports/boxing Boxing
/sports/football/college-football NCAA College Football
/sports/football/nfl National Football League (NFL)
/sports/golf Golf
/sports/hockey/nhl National Hockey League (NHL)
/sports/horse-racing Horse Racing
/sports/mma Mixed Martial Arts
/sports/racing Auto Racing
/sports/racing/nascar NASCAR Racing
/sports/soccer Professional soccer (US focus)
/sports/tennis Tennis

 

I wonder when this will be enabled for cricket as well (including free, academic, premium, and partner APIs).

(Note: you can use R packages such as XML, RCurl, and rjson, among others, to get data from the web.)
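
The note above names R packages; as a rough illustration of the same idea, here is a minimal sketch in Python (a swapped-in language, not what the post uses) that pulls the rows out of an HTML table using only the standard library. The class name TableRows, the function parse_table, and the sample HTML are all made up for illustration; the sample is not real Statsguru output, and nothing here contacts any website.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table cells in html as a list of rows (lists of strings)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample, standing in for a page that has already been downloaded
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Runs</th></tr>
  <tr><td>Player A</td><td>100</td></tr>
  <tr><td>Player B</td><td>42</td></tr>
</table>
"""

for row in parse_table(SAMPLE_HTML):
    print(row)
```

A real page would first have to be downloaded (for example with urllib.request.urlopen), which is exactly the step the Terms of Use should be checked for.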

Plotting is best done using ggplot2 (http://had.co.nz/ggplot2/) or d3.js (http://mbostock.github.com/d3/), and the current state of cricket graphics could surely use a change: they are mostly a single radial plot of shots played/runs scored, or a combined barplot/line graph.

How to learn to be a hacker easily

1) Are you sure? It is tough to be a hacker. And football players get all the attention.

2) Really? Read on

3) Read the Hacker’s Code

http://muq.org/~cynbe/hackers-code.html

The Hacker’s Code

“A hacker of the Old Code.”

  • Hackers come and go, but a great hack is forever.
  • Public goods belong to the public.*
  • Software hoarding is evil.
    Software does the greatest good given to the greatest number.
  • Don’t be evil.
  • Sourceless software sucks.
  • People have rights.
    Organizations live on sufferance.
  • Governments are organizations.
  • If it is wrong when citizens do it,
    it is wrong when governments do it.
  • Information wants to be free.
    Information deserves to be free.
  • Being legal doesn’t make it right.
  • Being illegal doesn’t make it wrong.
  • Subverting tyranny is the highest duty.
  • Trust your technolust!

4) Read How To Become A Hacker by Eric Steven Raymond

http://www.catb.org/~esr/faqs/hacker-howto.html

or just get the Hacker Attitude

The Hacker Attitude

1. The world is full of fascinating problems waiting to be solved.
2. No problem should ever have to be solved twice.
3. Boredom and drudgery are evil.
4. Freedom is good.
5. Attitude is no substitute for competence.
5) If you are tired of reading English, maybe I should move on to the technical stuff.
6) Create your hacking space, a virtual disk on your machine.
You will need to learn a bit of Linux. If you are a Windows user, I recommend creating a VMware partition with Ubuntu.
If you like Mac, I recommend the more aesthetic Linux Mint.
How to create your virtual disk, read here:
Download VMware Player here
http://www.vmware.com/support/product-support/player/
Download the ISO image of the operating system here
http://ubuntu.com
Downloading is the longest step in this exercise.
Now just follow the instructions here
http://www.vmware.com/pdf/vmware_player40.pdf
or, if you want to experiment with other ways to use Windows and Linux together, read this
http://www.decisionstats.com/ways-to-use-both-windows-and-linux-together/
To move data back and forth between your new virtual disk and your old real disk, see
http://www.decisionstats.com/moving-data-between-windows-and-ubuntu-vmware-partition/
7) Get Tor to hide your IP address when on the internet
https://www.torproject.org/docs/tor-doc-windows.html.en
8a) Block ads using the Adblock Plus plugin when surfing the internet (like 14.95 million other users)
https://addons.mozilla.org/en-US/firefox/addon/adblock-plus/
8b) and use MafiaaFire to reach elusive websites
https://addons.mozilla.org/en-US/firefox/addon/mafiaafire-redirector/
9) Get a BitTorrent client at http://www.utorrent.com/
This will help you download stuff.
10) Hacker Culture Alert:
This instruction is purely for sharing the culture, not the techie work, of being a hacker.
The website The Pirate Bay acts like a search engine for BitTorrent files
http://thepiratebay.se/
Visiting it is considered bad since you can get lots of music, videos, movies, etc. for free, without paying copyright fees.
The website 4chan is considered a place to meet other hackers. The site can be visually shocking
http://boards.4chan.org/b/
You need to at least set up these systems, read the websites, and come back in N months’ time for the second part of this series on how to learn to be a hacker. That will be the coding part.
END OF PART 1
Update: sorry, the next part has been a bit delayed. Will post soon.

UseR goes to Nashville, USA

So if Vanderbilt did lose (again) to UT (http://www.govolsxtra.com/news/2011/nov/20/video-tennessee-highlights-vanderbilt-game/), they have something better to look forward to before next football season.

UseR is coming to Tennessee in 2012! This is the premier conference for the R language (>2 million users); it is held annually and alternates between Europe and North America.

Details here

http://biostat.mc.vanderbilt.edu/wiki/Main/UseR-2012

useR! 2012 (12-15 June 2012)
Department of Biostatistics
Vanderbilt University
School of Medicine
Nashville, Tennessee, USA

Pre-conference Survey

If you plan to attend useR! 2012, help us plan by completing a REDCap survey.

Contact

Stephania McNeal-Goddard
Assistant to the Chair
stephania.mcneal-goddard@vanderbilt.edu
Phone: 615.322.2768
Fax: 615.343.4924
Vanderbilt University School of Medicine
Department of Biostatistics
S-2323 Medical Center North
Nashville, TN 37232-2158

Abstracts and Tutorial Proposals

Participants are encouraged to submit an abstract for oral presentation during a Kaleidoscope or Focus session, or for poster presentation. Tutorial proposals are also welcome.

Deadlines

  • Tutorial Submission: Dec 1 – Jan 31
  • Tutorial Acceptance Notification: Feb 1 – Feb 29
  • Abstract Submission: Dec 1 – Mar 12
  • Abstract Acceptance Notification: Mar 13 – Apr 15

Registration

Deadlines

  • Early Registration: Jan 1 – Feb 29
  • Regular Registration: Mar 1 – May 12
  • Late Registration: May 13 – June 11
  • On-site Registration: June 12 – June 15

Travel and Lodging Information

Vanderbilt University is located in Nashville, Tennessee, USA.

Air Travel

The nearest major airport to Vanderbilt University is the Nashville International Airport (BNA). The airport is about 10 miles east of the campus and downtown Nashville. The BNA website maintains a list of ground transportation options for air travelers. The approximate taxi fare from the airport to Vanderbilt University is $27. Shuttles and buses are also available from the airport. The latter is economical (approximate fare is $1.60), but the travel time is more than an hour.

Car Travel

Nashville is located at the intersection of three major interstates. Interstate 40 approaches from the east and west, Interstate 24 from the northwest and southeast, and Interstate 65 from the northeast and south.

Statistics on Social Media

Some official statistics on social media, from the owners themselves:

1) Facebook

http://www.facebook.com/press/info.php?statistics

Date: 17 Nov 2011

Statistics

People on Facebook

R Journal Dec 2010 and R for Business Analytics


I almost missed out on the R Journal for this month. Great reading,

and I liked Dr Hadley Wickham’s article on the stringr package the best. A really useful package, and nice writing too.

http://journal.r-project.org/archive/2010-2/RJournal_2010-2_Wickham.pdf

(Incidentally, I just downloaded a local copy of his ggplot2 website at http://had.co.nz/ggplot2/ggplot-static.zip

I aim to really read up on that one.)

Okay, announcement time

I just signed a contract with Springer for a book on R, due sometime in the first half of 2011:

“R for Business Analytics”

It’s going to take more of a business analytics than a stats perspective (I am an MBA/Mech Engineer),

and the use cases will be business analytics cases. Do write to me if you need help doing some analytics in R (business use cases) or want something featured. The big focus will be on GUIs and easier analytics, using the Einsteinian principle of making things as simple as possible, but no simpler.

Search, Sports, Social Media, SlideShares, Scribd


Some slideshare.net presentations I really liked.

A tutorial on SEO and SEM:

Carole Ann Matignon deals with optimization and scheduling, rules in the... NFL!

Carole, we are waiting for the sequel on analytics for football and the beer game.

Social Media Screw-Ups

Social Media doesn’t matter at all. Social Media matters a lot. Still undecided? Take a look.

Slideshare is a great VISUAL interface for sharing content. I liked Google Docs embedding as well, but Matt Mullenweg and Matt Cutts seem to have stopped talking. Mullenweg is going the way of Zuckerberg, not willing to align with Sergey Mikhaylovich Brin, or maybe they are afraid of Big Brother Brin. Google loves Java and JavaScript (even when they are getting sued for it), while Matt M hates it. Bad for RIA, I guess.

Scribd is also a great way to share content, and is probably small enough for WordPress.com to allow embedding.

That’s the reason why I sometimes prefer Scribd to Slideshare and Google Docs for sharing my poetry. Also, I like the enhanced analytics and the much easier and more evolved interface for reading. Slideshare is much more successful than Scribd because it is open to sharing with everyone; Scribd tries to get you to register… ;)

(* Also see MIT’s beer game at http://beergame.mit.edu/ which is ahem different from Duke’s beer games).