There is a 15% discount if you register at these sites and mention that you are a reader of Decisionstats.
Thanks. Click on the appropriate badge above to register.
There is a 15% discount if you want to register for Text Analytics World next month:
Use Discount Code AJAYNY11
October 19-20, 2011 at The Hilton New York
http://www.textanalyticsworld.com/newyork/2011
Text Analytics World Topics & Case Studies - Oct 19-20 in NYC

Text Analytics World NYC (tawgo.com) is the business-focused event for text analytics professionals, managers and commercial practitioners. This conference delivers case studies, expertise and resources to leverage unstructured data for business impact. Text Analytics World NYC is packed with the top predictive analytics experts, practitioners, authors and business thought leaders, including keynote addresses from Thomas Davenport, author of Competing on Analytics: The New Science of Winning, David Gondek from IBM Research on their Jeopardy-Winning Watson and DeepQA, and PAW Program Chair Eric Siegel, plus special sessions from industry heavyweights Usama Fayyad and John Elder.

CASE STUDIES: TAW New York City will feature over 25 sessions with case studies from leading enterprises in automotive, educational, e-commerce, financial services, government, high technology, insurance, retail, social media, and telecom such as: Accident Fund, Amdocs, Bundle.com, Citibank, Florida State College, Google, Intuit, MetLife, Mitchell1, PayPal, Snap-on, Socialmediatoday, Topsy, a Fortune 500 global technology company, plus special examples from U.S. government agencies DoD, DHS, and SSA.

HOT TOPICS: TAW New York City's agenda covers hot topics and advanced methods such as churn risk detection, customer service and call centers, decision support, document discovery, document filtering, financial indicators from social media, fraud detection, government applications, insurance applications, knowledge discovery, open question-answering, parallelized text analysis, risk profiling, sentiment analysis, social media applications, survey analysis, topic discovery, and voice of the customer and other innovative applications that benefit organizations in new and creative ways.
WORKSHOPS: TAW also features a full-day, hands-on text analytics workshop, plus several other pre- and post-conference workshops in analytics that complement the core conference program. For more info: www.tawgo.com/newyork/2011/analytics-workshops

For more information: tawgo.com
Download the conference preview:
View the agenda at-a-glance: textanalyticsworld.com/newyork/2011/agenda
Register by September 2nd for Early Bird Rates (save up to $200): textanalyticsworld.com/newyork/2011/registration
If you'd like our informative event updates, sign up at: http://www.textanalyticsworld.com/subscription.php
To sign up for TAW group on LinkedIn: www.linkedin.com/e/gis/3869759
For inquiries e-mail regsupport@risingmedia.com or call (717) 798-3495.

OTHER ANALYTICS EVENTS:
Predictive Analytics World for Government: Sept 12-13 in DC – www.pawgov.com
Predictive Analytics World New York City: Oct 16-21 – www.pawcon.com/nyc
Text Analytics World New York City: Oct 19-20 – www.tawgo.com/nyc
Predictive Analytics World London: Nov 30-Dec 1 – www.pawcon.com/london
Predictive Analytics World San Francisco: March 4-10, 2012 – www.pawcon.com/sanfrancisco
Predictive Analytics World Videos: Available on-demand – www.pawcon.com/video
It also has two sessions on R:
Half-day Workshop
Room: Madison
R Bootcamp
Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer
Full-day Workshop
Room: Madison
R for Predictive Modeling: A Hands-On Introduction
Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer
Here is a set of basic R functions that let you get started with business analytics, that is, analyzing a dataset (a data.frame). I am listing them both as a reference for myself and for anyone who wants to learn R quickly.
I am not putting in data import functions, because data manipulation is a separate topic altogether. Instead I assume you have a dataset ready for analysis and ask: what are the top R commands you would need to analyze it?
For anyone who thought R was too hard to learn, here are ten functions to get started with R.
1) str(dataset) shows the structure of the dataset
2) names(dataset) gives you the names of the variables
3) colMeans(dataset) returns the mean of each numeric variable (older R versions also accepted mean(dataset) on a data frame; current versions do not)
4) sapply(dataset, sd) returns the standard deviation of each numeric variable (sd(dataset) likewise no longer works on a data frame)
5) summary(dataset) gives the quartiles, median, and mean of each variable
That about gives me the basic stats I need for a dataset.
> data(faithful)
> names(faithful)
[1] "eruptions" "waiting"
> str(faithful)
'data.frame': 272 obs. of 2 variables:
 $ eruptions: num 3.6 1.8 3.33 2.28 4.53 ...
 $ waiting  : num 79 54 74 62 85 55 88 85 51 85 ...
> summary(faithful)
   eruptions        waiting
 Min.   :1.600   Min.   :43.0
 1st Qu.:2.163   1st Qu.:58.0
 Median :4.000   Median :76.0
 Mean   :3.488   Mean   :70.9
 3rd Qu.:4.454   3rd Qu.:82.0
 Max.   :5.100   Max.   :96.0
> colMeans(faithful)
eruptions   waiting
 3.487783 70.897059
> sapply(faithful, sd)
eruptions   waiting
 1.141371 13.594974
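For a dataset that mixes numeric and non-numeric columns, the same statistics can be computed column by column. This is only a sketch: numeric_stats is a hypothetical helper name, not a function from base R or from this post, and dropping missing values with na.rm = TRUE is an assumption.

```r
# Hypothetical helper (not part of base R): mean and standard deviation
# for every numeric column of a data frame, skipping other column types.
numeric_stats <- function(dataset) {
  # Keep only the numeric columns
  numeric_cols <- dataset[sapply(dataset, is.numeric)]
  data.frame(
    mean = sapply(numeric_cols, mean, na.rm = TRUE),  # assumption: ignore NAs
    sd   = sapply(numeric_cols, sd, na.rm = TRUE)
  )
}

data(faithful)
stats <- numeric_stats(faithful)
print(stats)
```

The result has one row per numeric variable, so it can be read alongside the output of summary().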
6) I can do a basic frequency analysis of a particular variable using the table command and $ operator (similar to dataset.variable name in other statistical languages)
> table(faithful$waiting)
43 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 62 63 64 65 66 67 68 69 70
 1  3  5  4  3  5  5  6  5  7  9  6  4  3  4  7  6  4  3  4  3  2  1  1  2  4
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 96
 5  1  7  6  8  9 12 15 10  8 13 12 14 10  6  6  2  6  3  6  1  1  2  1  1
or I can do frequency analysis of the whole dataset using
> table(faithful)
waiting
eruptions 43 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 62 63 64 65 66 67
1.6 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1.667 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1.7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
1.733 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
.....output truncated
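A frequency table of a continuous variable has one cell per distinct value, which is why the output above is so sparse. One way to get a more readable table is to bin the variable first with cut(). This is a sketch only; the 10-minute bin width and the 40 to 100 range are assumptions chosen to fit the waiting times shown earlier.

```r
data(faithful)

# Group waiting times into 10-minute bins before counting.
# The break points are an assumption, not something from the original post.
waiting_bins <- cut(faithful$waiting, breaks = seq(40, 100, by = 10))

# Frequency table over the bins instead of over each distinct value
bin_counts <- table(waiting_bins)
print(bin_counts)
```

The same binned variable can be passed to table() together with a second factor to get a compact cross-tabulation.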
7) plot(dataset)
It plots the dataset: a scatterplot for two numeric variables, or a scatterplot matrix for more.
8) hist(dataset$variable) draws a histogram of a single variable
hist(faithful$waiting)
9) boxplot(dataset)
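The three plotting functions can also be run in a script rather than at the console. A minimal sketch, assuming you want all three plots saved into one PDF; the file name faithful_plots.pdf and the axis labels are hypothetical choices.

```r
data(faithful)

# Hypothetical output file in the session's temporary directory
plot_file <- file.path(tempdir(), "faithful_plots.pdf")

pdf(plot_file)            # open a PDF graphics device; each plot gets a page
plot(faithful)            # scatterplot of eruptions against waiting
hist(faithful$waiting, main = "Waiting time", xlab = "Minutes")
boxplot(faithful)         # one box per variable
invisible(dev.off())      # close the device so the file is written
```

At the interactive console the pdf() and dev.off() calls can be dropped and the plots appear in the graphics window instead.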
10) The tenth function for a beginner would be cor(dataset$var1, dataset$var2) for a pair of variables, or cor(dataset) for the correlation matrix of the whole dataset
> cor(faithful)
          eruptions   waiting
eruptions 1.0000000 0.9008112
waiting   0.9008112 1.0000000
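The pairwise form and the matrix form give the same number for a given pair of variables. The sketch below checks that, and also calls cor.test() from base R's stats package, which is not one of the ten functions above but adds a confidence interval around the estimate.

```r
data(faithful)

# Pairwise correlation between two named variables
pairwise <- cor(faithful$eruptions, faithful$waiting)

# Full correlation matrix; the off-diagonal entry is the same value
cor_matrix <- cor(faithful)
print(all.equal(pairwise, cor_matrix["eruptions", "waiting"]))

# cor.test() reports the same estimate together with a confidence interval
test_result <- cor.test(faithful$eruptions, faithful$waiting)
print(test_result$estimate)
print(test_result$conf.int)
```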
I am assuming that as a beginner you would use one of the GUIs listed at http://rforanalytics.wordpress.com/graphical-user-interfaces-for-r/ to import and export data. I will cover ten steps to data manipulation in R in another post.
Message from PAW and TAW conferences
The PAW and TAW New York City Early Bird discounts end this Friday.
———————–
– NEXT WEEK: PAW for Government, Sept 12-13, in Washington DC. An amazing line-up of keynotes including Congressman Darrell Issa. Coverage of predictive analytics deployment by over a dozen government agencies. See www.pawgov.com
– Predictive Analytics World NYC – Oct 16-21 – Early Bird Pricing ends this Friday, Sept 9 – register now to save $400 over the full price. Three tracks, over 40 sessions, keynotes from Davenport and from IBM Research on their Jeopardy-Winning Watson – plus much more. See www.pawcon.com/nyc
– Text Analytics World NYC (Oct 16-21) also ends Early Bird Pricing this Friday, Sept 9 – register now to save $400 over the full price. Over 25 sessions with case studies from Accident Fund, Amdocs, Bundle.com, Citibank, Google, Intuit, MetLife, PayPal, and much more. See www.tawgo.com/nyc
– PAW London: Nov 30 – Dec 1. Case studies from BBC, GSK, HP, ING, Lloyds TSB, Paychex, US Bank, Yahoo!, and more. See www.pawcon.com/london
– PAW and TAW San Francisco: Mar 4-10 2012 – Save-the-date and call-for-speakers. See www.pawcon.com/submit.php and www.tawgo.com/call-for-speakers
* For informative event updates: www.pawcon.com/signup-us.php
A particularly prominent technology blogger (see http://www.readwriteweb.com/archives/michael_arrington_the_kingmaker_who_would_be_king.php) has now formalized his status as an investor (a role he had played even before) while relinquishing his editorial duties (which were not much, given the blog’s acquisition by AOL and its own formidable line of writers, each of whom is quite influential). Without going into either sermon mode (thou shalt not have conflicts of interest) or adulatory mode (wow, he sold the blog for 30 mill and now he gets another 20 mill for his funds), I shall try to present the case for ethics and ethical lapses as a writer.
I got some good news from the fine people at Predictive Analytics World.
you qualify for 2 free passes to the PAW NYC event October 16-20, 2011. I will be sending you a code to use for registration to receive these passes within the next couple of days.
If you cannot attend our PAW NYC event, please feel free to use these two free passes as a promotional tool within your blog.
Now I have been partnering with PAW for a long time, so it is nice to have free passes. I am grateful for their support of this blog. Therein lies my dilemma. I am in India, and a return ticket from NYC to India costs $1,100. Unless something drastic happens, I don't see myself with that kind of travel money.
Ergo.
I am offering two free passes to Predictive Analytics World. http://predictiveanalyticsworld.com/
All you need to do is – ahem- cough-
AND
What do you get?
One of these – http://www.predictiveanalyticsworld.com/newyork/register.php (details awaited!) to
http://www.predictiveanalyticsworld.com/newyork/2011/

Here is a contest-based community called CrowdANALYTIX.com, which is quite nice and offers you free Revolution R for the statistical and analytical contests hosted there (a bit like Kaggle.com, http://www.kaggle.com/). There are only 3 contests right now, and those are low volume, but I guess that number should increase. They also seem to have a consulting arm.
Latest Analytics website- welcome! http://www.crowdanalytix.com/contests