Hearst DataMining Challenge

Check out the Hearst Data Mining Challenge- a new competition-sponsored by DMA, Hearst Magazine, and EXL

THE HEARST CHALLENGE STARTS ON OCTOBER 14TH

CHALLENGE

DESCRIPTION

Over the years, the magazine publishing industry has made significant strides in improving subscription based circulation by developing analytic frameworks that better predict customer response to acquisition and renewal offers. The objective of this contest is to apply the same analytic discipline and effectively predict newsstand locations “response”. Specifically the objective is to predict the number of copies to be placed in each newsstand location to optimize the overall contribution of the newsstand location typically referred to as draw.

Data for the competition is provided by CMG and Experian.

and

RULES

HOW TO ENTER: Beginning October 14th, 2010 at 12:01 AM (ET) throughDecember 3rd, 2010 at 11:59 PM (ET) go to the Hearst Challenge website located at http://www.HearstChallenge.com (the “Site”) and complete and submit the entry form pursuant to the onscreen instructions. Entrants will be provided a historical sample of newsstand location draw, sales and associated location level data to help develop their predictive algorithm. Hearst will in turn hold back two distinct sets of draw/sales data, one to be used as a validation set by the contestant and one to be used as a final contest evaluation set. Entrants may not include any other external variables for the challenge. Additional details will be provided with the data. Entrants will be able to track their performance against the validation set throughout the course of the challenge via a leader tracking board to be made available on the Site. Entries must include the following documentation:

  • Data file with id variables and expected sales values by store and publication
  • The final model/ algorithm code used to score the final data set
  • Any supporting documentation that pertains to the development of the submitted model/algorithm including variable creation. Variables that were used in the model need to be traced through from input to coefficient / node (if using a tree based methodology).

Check out http://www.hearstchallenge.com/index.php for further details.

September Roundup by Revolution

From the monthly newsletter- which I consider quite useful for keeping updated on application of R

——————————————————————————————————————————————————————————————————–

Revolution News
Every month, we’ll bring you the latest news about Revolution’s products and events in this section.
Follow us on Twitter at @RevolutionR for up-to-the-minute news and updates from Revolution Analytics!

Revolution R Enterprise 4.0 for Windows now available. Based on the latest R 2.11.1 and including the RevoScaleR package for big-data analysis in R, Revolution R Enterprise is now available for download for Windows 32-bit and 64-bit systems. Click here to subscribe, or available free to academia.

New! Integrate R with web applications, BI dashboards and more with web services. RevoDeployR is a new Web Services framework that integrates dynamic R-based computations into applications for business users. It will be available September 30 with Revolution R Enterprise Server on RHEL 5. Click here to learn more.

Free Webinar, September 22: In a joint webinar from Revolution Analytics and Jaspersoft, learn how to use RevoDeployR to integrate advanced analytics on-demand in applications, BI dashboards, and on the web. Register here.

Revolution in the News:
SearchBusinessAnalytics.com previews the forthcoming Revolution R GUI; Channel Register introduces RevoDeployR, while IT Business Edge shows off the Web Services architecture; and ReadWriteWeb.com looks at how RevoScaleR tackles the Big Data explosion.

Inside-R: A new site for the R Community. At www.inside-R.org you’ll find the latest information about R from around the Web, searchable R documentation and packages, hints and tips about R, and more. You can even add a “Download R” badge to your own web-page to help spread the word about R.

R News, Tips and Tricks from the Revolutions blog
The Revolutions blog brings you daily news and tips about R, statistics and open source. Here are some highlights from Revolutions from the past month
.

R’s key role in the oil spill response: Read how NIST’s Division Chief of Statistical Engineering used R to provide critical analysis in real time to the Secretaries of Energy and the Interior, and helped coordinate the government’s response.

Animating data with R and Google Earth: Learn how to use R to create animated visualizations of geographical data with Google Earth, such as this video showing how tuna migrations intersect with the location of the Gulf oil spill.

Are baseball games getting longer? Or is it just Red Sox games? Ryan Elmore uses nonparametric regression in R to find out.

Keynote presentations from useR! 2010: the worldwide R user’s conference was a great success, and there’s a wealth of useful tips and information in the presentations. Video of the keynote presentations are available too: check out in particular Frank Harrell’s talk Information Allergy, and Friedrich Leisch’s talk on reproducible statistical research.

Looking for more R tips and tricks? Check out the monthly round-ups at the Revolutions blog.

Upcoming Events
Every month, we’ll highlight some upcoming events from R Community Calendar.

September 23: The San Diego R User Group has a meetup on BioConductor and microarray data analysis.

September 28: The Sydney Users of R Forum has a meetup on building world-class predictive models in R (with dinner to follow).

September 28: The Los Angeles R User Group presents an introduction to statistical finance with R.

September 28: The Seattle R User Group meets to discuss, “What are you doing with R?”

September 29: The Raleigh-Durham-Chapel Hill R Users Group has its first meeting.

October 7: The NYC R User Group features a presentation by Prof. Andrew Gelman.

There are also new R user groups in SingaporeSeoulDenverBrisbane, and New Jersey.  Please let us know if we’re missing your R user group, or if want to get a new one started.

———————————————————————————————-Editor

David Smith, VP Marketing
david@revolutionanalytics.com
Twitter: @revodavid

subscribe here for Revo’s Monthly newsletter-

Protected: SAS Institute lawsuit against WPS Episode 2 The Clone Wars

This content is password protected. To view it please enter your password below:

The R Online WikiBook

I came across the R Programming Wikibook at http://en.wikibooks.org/wiki/R_Programming

It is quite surprisingly good- easy to read for a beginner- handy and concise reference for intermediate users. Some chapters like clustering could do with some more support from the community -see http://en.wikibooks.org/wiki/R_Programming/Clustering

[edit]References

But I really liked the pages on Graphics, Modeling and Maths (including Matrix)

See

http://en.wikibooks.org/wiki/R_Programming/Graphics

and http://en.wikibooks.org/wiki/R_Programming/Linear_Models

I really believe that a consolidated one book online documentation can be achieved for R, only if we follow a moderated-wiki like structure. This can be of a great use- since online help documents for R are currently not concise or present a seemingly professional look (due to multiple formats and styles to the documentation) and they rarely do multiple package comparison. All this has made R books the top selling books on statistics on Amazon but a project like R deserves atleast one comprehensive online and concise book which can be used readily without going through all the scattered multiple documentation- a bit like a R Online Doc.This could help in stage next of the project in getting more users to be comfortable with it.

Any volunteers 🙂 ?