hacker news – DECISION STATS

Stuff I like to Read to Kush: Kush's Blog

I am putting together a list of the top 500 blogs on:

Enterprise Software
Business Intelligence
Companies selling enterprise software I like (SAP, Oracle, IBM, SAS, Salesforce)
Languages I like (SAS, Python, R, SPSS)

Some additional points:

I like YCombinator's Hacker News, so the auto-parsed links are laid out like its main page. They lead to the original websites.
Comments are disabled, the feed is jumbled, and only 40-word excerpts are shown.
The intent is also to show open source blogs and enterprise blogs at the same time (regardless of advertising by vendors 😉).
If your blog feed is there, I will keep it there. Either don't write or don't use RSS if you don't want to share.
If your blog feed is not there, it is probably not there for a reason.
No ads will be shown NOW or FOREVER on that site.

And after all that noise, you can see Kush’s Blog: http://www.kushohri.com/

Full Hacker News database for download (posts, comments, points, date, username) (api.ihackernews.com)
SAS scores big digital marketing win, announces social media engagement solution (customerthink.com)
Magnitude 6.3 earthquake shakes Afghan Kush (foxnews.com)
New Study Finds Retailers Impatient for Insights, Placing New Expectations on BI Solutions (prweb.com)

Dataists shake up R community with a rocking contest

The newly created Dataists are making waves on Hacker News and beyond with their innovative contest: A Recommendation Engine for R Packages.

Not only is the contest useful, it is also likely to teach R users some data hacking skills, as well as the basics of creating a GitHub project.

For that reason, we’ve settled on the more manageable question, “which packages are most often installed by normal R users?”

This last question could potentially be answered in a variety of ways. Our current approach uses a convenience sample of installation data that we’ve collected from volunteers in the R community, who kindly agreed to send us a list of the packages they have on their systems. We’ve anonymized this data and compiled a set of metadata-based predictors that allow us to predict the installation probabilities quite well. We’re releasing all of our current work, including the data we have and all of the code we’ve used so far for our exploratory analyses. The contest itself will go live on Kaggle on Sunday and will end four months from Sunday on February 10, 2011. The rules, prizes and official data sets are all described below.

Rules and Prizes

To win the contest, you need to predict the probability that a user U has a package P installed on their system for every pair, (U, P). We’ll assess your performance using ROC methods, which will be evaluated against a held out test data set. The winning team will receive 3 UseR! books of their choosing. In order to win the contest, you’ll have to provide your analysis code to us by creating a fork of our GitHub repository. You’ll also be required to provide a written description of your approach. We’re asking for so much openness from the winning team because we want this contest to serve as a stepping stone for the R community. We’re also hoping that enterprising data hackers will extend the lessons learned through this contest to other programming languages.

Extract from: http://www.dataists.com/2010/10/using-data-tools-to-find-data-tools-the-yo-dawg-of-data-hacking/

Read the full article there.
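
The scoring rule quoted in the extract (a predicted probability for every (U, P) pair, assessed with ROC methods on held-out data) can be illustrated with a small sketch. The extract does not say which ROC statistic is used, so this assumes the area under the ROC curve (AUC), computed here with the rank-sum (Mann-Whitney) identity. The function name `roc_auc`, the user IDs, and the prediction values are all made up for illustration and are not taken from the contest code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 if package P is installed for user U in that pair, else 0.
    scores: predicted installation probabilities, in the same order.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one installed and one missing pair")

    # Sort pair indices by score, then assign 1-based ranks,
    # giving tied scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalised to the [0, 1] range.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical held-out (user, package, installed?) pairs and made-up predictions.
pairs = [
    ("u1", "ggplot2", 1),
    ("u1", "zoo", 0),
    ("u2", "ggplot2", 1),
    ("u2", "zoo", 1),
    ("u2", "Rcpp", 0),
]
predicted = [0.9, 0.2, 0.8, 0.4, 0.6]

auc = roc_auc([installed for _, _, installed in pairs], predicted)
print(round(auc, 3))
```

An AUC of 1.0 means every installed pair was scored above every missing pair, while 0.5 is what random guessing would achieve. Only the ordering of the predictions matters for this statistic, not how well calibrated the probabilities are.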