Funny Photos- it happens only in India 2

Raj Weds Deepika (with spelling error) and no, it's not photoshopped

Translation- No Girlfriend No Tension (back of truck). Note the amazing spelling in the picture above

Polite reminder- Please do not spit

OK, maybe this is an international thing, but it is funny in India (which is hot enough, thank you, and our curries are hot enough!)

 environmental sign

—–

Based on the award winning series of pictures at http://www.decisionstats.com/funny-photo-it-happens-only-in-india/

Interview Prof Benjamin Alamar, Sports Analytics

Here is an interview with Prof Benjamin Alamar, founding editor of the Journal of Quantitative Analysis in Sports, a professor of sports management at Menlo College and the Director of Basketball Analytics and Research for the Oklahoma City Thunder of the NBA.

Ajay – The movie Moneyball recently sparked mainstream interest in analytics in sports. Describe the role of analytics in sports management.

Benjamin- Analytics is impacting sports organizations on both the sport and business side.
On the sport side, teams are using analytics, including advanced data management, predictive analytics, and information systems, to gain a competitive edge. The use of analytics results in more accurate player valuations and projections, and helps determine effective strategies against specific opponents.
On the business side, teams are using the tools of analytics to increase revenue in a variety of ways, including dynamic ticket pricing and optimizing the placement of concession stands.
Ajay- What are the ways analytics is used in specific sports that you have been part of?

Benjamin- A very typical first step for a team is to utilize the tools of predictive analytics to help inform their draft decisions.

Ajay- What are some of the tools, techniques and software that analytics in sports uses?
Benjamin- The tools of sports analytics do not differ much from the tools of business analytics. Regression analysis is fairly common as are other forms of data mining. In terms of software, R is a popular tool as is Excel and many of the other standard analysis tools.
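For readers new to the regression analysis mentioned above, here is a minimal sketch of a one-variable least-squares fit in Python. The numbers and the variable names (college_ppg, pro_ppg) are invented for illustration and are not real player data; the interview does not describe any team's actual model.

```python
# Hypothetical numbers, invented for illustration only (not real player data).
college_ppg = [10.0, 14.0, 18.0, 22.0, 26.0]  # points per game before the draft
pro_ppg = [5.0, 7.0, 9.0, 11.0, 13.0]         # points per game as a professional


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(college_ppg, pro_ppg)

# Project a hypothetical prospect who scored 20 points per game in college.
projected = intercept + slope * 20.0
print(f"slope={slope:.2f}, intercept={intercept:.2f}, projection={projected:.2f}")
```

In practice an analyst would likely use R or a statistics library and many more predictors; the point here is only the shape of the calculation.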
Ajay- Describe your career journey and how you became involved in sports management. What are some of the tips you want to tell young students who wish to enter this field?

Benjamin- I got involved in sports through a company called Protrade Sports. Protrade initially was a fantasy sports company that was looking to develop a fantasy game based on advanced sports statistics and utilize a stock market concept instead of traditional drafting. I was hired due to my background in economics to develop the market aspect of the game.

There I met Roland Beech (who now works for the Mavericks) and Aaron Schatz (owner of footballoutsiders.com) and learned about the developing field of sports statistics. I then changed my research focus from economics to sports statistics and founded the Journal of Quantitative Analysis in Sports. Through the journal and my published research, I was able to establish a reputation of doing quality, useable work.

For students, I recommend developing very strong data management skills (SQL and the like) and thinking carefully about what sort of questions a general manager or coach would care about. Being able to demonstrate analytic skills around actionable research will generally attract the attention of pro teams.

About-

Benjamin Alamar, Professor of Sport Management, Menlo College

Benjamin Alamar

Professor Benjamin Alamar is the founding editor of the Journal of Quantitative Analysis in Sports, a professor of sports management at Menlo College and the Director of Basketball Analytics and Research for the Oklahoma City Thunder of the NBA. He has published academic research in football, basketball and baseball, and has presented at numerous conferences on sports analytics. He is also a co-creator of ESPN’s Total Quarterback Rating and a regular contributor to the Wall Street Journal. He has consulted for teams in the NBA and NFL, provided statistical analysis for author Michael Lewis for his recent book The Blind Side, and worked with numerous startup companies in the field of sports analytics. Professor Alamar is also an award-winning economist who has worked academically and professionally in intellectual property valuation, public finance and public health. He received his PhD in economics from the University of California at Santa Barbara in 2001.

Prof Alamar is a speaker at Predictive Analytics World, San Francisco, and is doing a workshop there.

http://www.predictiveanalyticsworld.com/sanfrancisco/2012/agenda.php#day2-17

2:55-3:15pm

All level tracks Track 1: Sports Analytics
Case Study: NFL, MLB, & NBA
Competing & Winning with Sports Analytics

The field of sports analytics ties together the tools of data management, predictive modeling and information systems to provide sports organizations a competitive advantage. The field is rapidly developing based on new and expanded data sources, greater recognition of the value, and past success of a variety of sports organizations. Teams in the NFL, MLB, NBA, as well as other organizations have found a competitive edge with the application of sports analytics. The future of sports analytics can be seen through drawing on these past successes and the developments of new tools.

You can learn more about Prof Alamar at his blog http://analyticfootball.blogspot.in/ or the journal at http://www.degruyter.com/view/j/jqas. His detailed background can be seen at http://menlo.academia.edu/BenjaminAlamar/CurriculumVitae

Use R for Business- Competition worth $ 20,000 #rstats

All you contest junkies, R lovers and general change the world people, here’s a new contest to use R in a business application

http://www.revolutionanalytics.com/news-events/news-room/2011/revolution-analytics-launches-applications-of-r-in-business-contest.php

REVOLUTION ANALYTICS LAUNCHES “APPLICATIONS OF R IN BUSINESS” CONTEST

$20,000 in Prizes for Users Solving Business Problems with R

 

PALO ALTO, Calif. – September 1, 2011 – Revolution Analytics, the leading commercial provider of R software, services and support, today announced the launch of its “Applications of R in Business” contest to demonstrate real-world uses of applying R to business problems. The competition is open to all R users worldwide and submissions will be accepted through October 31. The Grand Prize winner for the best application using R or Revolution R will receive $10,000.

The bonus-prize winner for the best application using features unique to Revolution R Enterprise – such as its big-data analytics capabilities or its Web Services API for R – will receive $5,000. A panel of independent judges drawn from the R and business community will select the grand and bonus prize winners. Revolution Analytics will present five honorable mention prize winners each with $1,000.

“We’ve designed this contest to highlight the most interesting use cases of applying R and Revolution R to solving key business problems, such as Big Data,” said Jeff Erhardt, COO of Revolution Analytics. “The ability to process higher-volume datasets will continue to be a critical need and we encourage the submission of applications using large datasets. Our goal is to grow the collection of online materials describing how to use R for business applications so our customers can better leverage Big Analytics to meet their analytical and organizational needs.”


Machine Learning Contest

New Contest at http://www.ecmlpkdd2011.org/dcOverview.php

 

 

Discovery Challenge Overview

General description: tasks and dataset

VideoLectures.net is a free and open access multimedia repository of video lectures, mainly of research and educational character. The lectures are given by distinguished scholars and scientists at the most important and prominent events like conferences, summer schools, workshops and science promotional events from many fields of Science. The portal is aimed at promoting science, exchanging ideas and fostering knowledge sharing by providing high quality didactic contents not only to the scientific community but also to the general public. All lectures, accompanying documents, information and links are systematically selected and classified through the editorial process taking into account also users’ comments.

The ECML-PKDD 2011 Discovery Challenge is organized in order to improve the website’s current recommender system. The challenge consists of two main tasks and a “side-by” contest. The provided data is for both of the tasks, and it is up to the contestants how it will be used for learning (building up) a recommender.

Due to the nature of the problem, each of the tasks has its own merit: task 1 simulates new-user and new-item recommendation (cold-start mode), task 2 simulates clickstream-based recommendation (normal mode).
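To make the two modes concrete, here is a small sketch in Python. It is not the challenge's method or data format: the lecture ids, sessions, category labels and the recommend function are all hypothetical. Normal mode is approximated by counting lectures viewed together in a session; cold-start mode falls back to a shared category when a lecture has no viewing history.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical viewing sessions (lecture ids are invented).
sessions = [
    ["ml_intro", "svm_basics", "kernels"],
    ["ml_intro", "svm_basics"],
    ["ml_intro", "bayes_nets"],
]

# Hypothetical metadata, used when a lecture has no click history yet.
categories = {
    "ml_intro": "machine_learning",
    "svm_basics": "machine_learning",
    "kernels": "machine_learning",
    "bayes_nets": "statistics",
    "new_stats_talk": "statistics",  # new item: appears in no session
}

# Normal mode: count how often two lectures are viewed in the same session.
co_views = defaultdict(Counter)
for session in sessions:
    # sorted() keeps the pair order deterministic across runs.
    for a, b in combinations(sorted(set(session)), 2):
        co_views[a][b] += 1
        co_views[b][a] += 1


def recommend(lecture, k=2):
    """Return up to k lectures: co-viewed ones if any, else same-category ones."""
    history = co_views.get(lecture)
    if history:
        return [other for other, _ in history.most_common(k)]
    # Cold start: no clickstream for this lecture, so use the category label.
    same_category = [
        other
        for other, category in categories.items()
        if category == categories.get(lecture) and other != lecture
    ]
    return sorted(same_category)[:k]


print(recommend("ml_intro", k=1))   # lecture with viewing history
print(recommend("new_stats_talk"))  # lecture with no viewing history
```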

Interview- Top Data Mining Blogger on Earth, Sandro Saitta


If you do a Google search for “data mining blog”, one blog has come out on top for the past several years (data mining blog – Google Search http://bit.ly/kEdPlE).

To honor 5 years of Sandro Saitta’s blog (yes, that’s 5 years!), we present an exclusive interview with him where he reveals his unique sauce for cool techie blogging.

Ajay- Describe your journey as a scientist and data miner, from early experiences, to schooling to your work/research/blogging.

Sandro- My first experience with data mining was my master project. I used a decision tree to predict pollen concentration for the following week using input data such as wind, temperature and rain. The fact that an algorithm can make a computer learn from experience was really amazing to me. I found it so interesting that I started a PhD in data mining. This time, the field of application was civil engineering. Civil engineers put a lot of sensors on their structures in order to understand how they behave. With all these sensors they generate a lot of data. To interpret these data, I used data mining techniques such as feature selection and clustering. I started my blog, Data Mining Research, during my PhD, to share with other researchers.

I then started applying data mining in the stock market as my first job in industry. I realized the difference between image recognition, where 99% correct classification rate is state of the art, and stock market, where you’re happy with 55%. However, the company ambiance was not as good as I thought, so I moved to consulting. There, I applied data mining in behavioral targeting to increase click-through rates. When you compare the number of customers who click with the ones who don’t, then you really understand what class imbalance means. A few months ago, I accepted a very good opportunity at SICPA. I’m looking forward to resolving new challenges there.
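The class imbalance point is easy to see with a few lines of Python. This is only an illustration: the 1-in-1000 click rate is an assumed figure, not one from the interview. A "model" that never predicts a click scores very high accuracy while finding no clickers at all.

```python
# Hypothetical click log: 1 = clicked, 0 = did not click.
# The 1-in-1000 click rate is an assumption for illustration.
labels = [1] * 10 + [0] * 9990

# A "model" that always predicts the majority class (no click).
predictions = [0] * len(labels)

# Accuracy: share of predictions that match the label.
correct = sum(1 for y, p in zip(labels, predictions) if y == p)
accuracy = correct / len(labels)

# Recall: share of actual clickers that the model identified.
true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
recall = true_positives / sum(labels)

print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")
```

With these assumed numbers the accuracy is 0.999 and the recall is 0, which is why accuracy alone says little when one class is rare.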

Ajay- Your blog is the top ranked blog for “data mining blog”. Could you share some tips on better blogging for analytics and technical people?

Sandro- It’s always difficult to start a blog, since at the beginning you have no reader. Writing for nobody may seem stupid, but it is not. By writing my first posts during my PhD I was reorganizing my ideas. I was expressing concepts which were not always clear to me. I thus learned a lot and also improved my English level. Of course, it’s still not perfect, but I hope most people can understand me.

Next come the readers. A few dozen each week first. To increase this number, I then started to learn SEO (Search Engine Optimization) by reading books and blogs. I tested many techniques that increased Data Mining Research visibility in the blogosphere. I think SEO is interesting when you already have some content published (which means not at the very beginning of your blog). After a while, once your blog is nicely ranked, the main task is to work on the content of the blog. To be of interest, your content must be particular: original, informative or provocative for example. I also had the chance to have a good visibility thanks to well-known people in the field like Kevin Hillstrom, Gregory Piatetsky-Shapiro, Will Dwinnell / Dean Abbott, Vincent Granville, Matthew Hurst and many others.

Ajay- What's your favorite statistical software, and what other software have you worked with?
Could you compare and contrast these tools as well?

Sandro- My favorite software at this point is SAS. I worked with it for two years. Once you know the language, you can perform ETL and data mining so easily. It’s also very fast compared to others. There are a lot of tools for data mining, but I cannot think of a tool that is as powerful as SAS and, at the same time, has a high-level programming language behind it.

I also worked with R and Matlab. R is very nice since you have all the up-to-date data mining algorithms implemented. However, working in memory is not always a good choice, especially for ETL. Matlab is an excellent tool for prototyping. It’s not so fast and certainly not made for ETL, but the price is low given all the possibilities for data mining. In my opinion, SAS is the best choice for ETL and a good choice for data mining. Of course, there is the price.

Ajay- What are your favorite techniques and training resources for teaching the basics of data mining to, say, statisticians or business management graduates?

Sandro- I’m the kind of guy who likes to read books. I read data mining books one after the other. The fact that the same concepts are explained differently (and by different people) helps a lot in learning a topic like data mining. Of course, nothing replaces experience in the field. You can read hundreds of books, you will still not be a good practitioner until you really apply data mining in specific fields. My second choice after books is blogs. By reading data mining blogs, you will really see the issues and challenges in the field. It’s still not experience, but we are closer. Finally, web resources and networks such as KDnuggets of course, but also AnalyticBridge and LinkedIn.

Ajay- Describe your hobbies and how they help you, if at all, in your professional life.

Sandro- One of my hobbies is reading. I read a lot of books about data mining, SEO, Google as well as Sci-Fi and Fantasy. I’m a big fan of Asimov by the way. My other hobby is playing tennis. I think I simply use my hobbies as a way to find equilibrium in my life. I always try to find the best balance between work, family, friends and sport.

Ajay- What are your plans for your website for 2011-2012?

Sandro- I will continue to publish guest posts and interviews. I think it is important to let other people express themselves about data mining topics. I will not write about my current applications due to the policies of my current employer. But don’t worry, I still have a lot to write, whether it is technical or not. I will also put more emphasis on my experience with data mining, advice for data miners, tips and tricks, and of course book reviews!

Standard Disclosure of Blogging- Sandro awarded me the People's Choice award for his blog for 2010 and carried out my interview. There is a lot of love between our respective WordPress blogs, but to reassure our puritan American readers- it is platonic and intellectual.

About Sandro S-



Sandro Saitta is a Data Mining Research Engineer at SICPA Security Solutions. He is also a blogger at Data Mining Research (www.dataminingblog.com). His interests include data mining, machine learning, search engine optimization and website marketing.

You can contact Mr Saitta at his Twitter address- 

https://twitter.com/#!/dataminingblog

Jump to JMP- the best statistical GUI software as per Google Search

This book just won an international award

producing graphs alongside results. In most cases, each page or two-page spread completes a JMP task, which maximizes the book’s utility as a reference.


Intel® Threading Challenge 2011 Software Contest


One more software contest for you, but in the sub-million-dollar prize range

http://software.intel.com/en-us/contests/intel-threading-challenge-2011/contests.php

Intel® Threading Challenge 2011 – Win a Trip to Intel Developer Forum in San Francisco

Intel® Threading Challenge 2011 is going BIG this year! After three exciting threading competitions, our fourth Threading Challenge is stepping up the excitement with a BIG Grand Prize, a trip to the Intel Developer Forum (IDF) in San Francisco (September 13-15, 2011).

Since 2008, the Intel® Threading Challenge has attracted developers of varying experience from around the world. The active participation from the community has made the Threading Challenge not only a great programming competition, but a great way for community members to engage with each other, trade threading tips, and discover new parallel programming resources.

Last year’s format of two competition levels, Master and Apprentice, generated great excitement and opened the Threading Challenge to a new group of participants. So, we are going to continue the competition with a Master level and Apprentice level, each competing for the Grand Prize for their level, as well as individual problem awards. We know you love a great challenge and great prizes, so our Threading Challenge Team is putting together some exciting threading problems for you.

Monday, April 18, 2011 – Threading Challenge 2011 (Phase 1) Launches (both levels) at 12:00 PM (noon PDT)– The competition for 2011 is very similar to last year’s, but read on whether you’re a previous participant or new to the Threading Challenge, so you will be aware of all elements of the competition and how to compete. Then, you can start threading your way to prizes today!

Choose the right level for you!

 

Threading Challenge 2011:

• Two levels available for entry: Apprentice & Master
• Phase 1: 3 problems in each level
• Phase 2: Stay tuned for details, coming in Autumn 2011
• We will award 1st, 2nd & 3rd place prizes for each problem in each level
• No overlap of problems and each level’s problems will be offered consecutively
• Participants have the option to use the Intel® Manycore Testing Lab (MTL), consisting of 40 cores, 80 threads
• To enter the Threading Challenge 2011, please read the Official Rules and register for the competition with link in the “To Enter” Section.

The Threading Challenge will be implemented in two phases, with the 1st Phase consisting of 3 problems in each level. The details of the 2nd Phase will be announced in September 2011. For Phase 1, a new problem in each level will be launched on the days listed below at 12:00 noon (PDT) and will be open for entry for 22 days (inclusive of the problem starting day), until closing on the final problem day at 12:00 noon (PDT).

Problem Start and Closing Dates (both Master and Apprentice levels):

Problem 1:
Starts: Monday, April 18, 2011 at 12:00pm (PDT)
Ends: Monday, May 9, 2011 at 12:00pm (PDT)

Problem 2:
Starts: Monday, May 9, 2011 at 12:00pm (PDT)
Ends: Monday, May 30, 2011 at 12:00pm (PDT)

Problem 3: (Due to U.S. Memorial Day Holiday, Problem 3 will start on Tuesday, May 31, 2011)
Starts: Tuesday, May 31, 2011 at 12:00pm (PDT)
Ends: Tuesday, June 21, 2011 at 12:00pm (PDT)

*All problems start and end at 12:00 noon (Pacific Daylight Time)
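As a quick check on the 22-day entry window (counting the start day), the listed dates can be run through Python's standard datetime module. This is only a sanity check on the schedule as printed above; the helper name inclusive_days is made up for this sketch.

```python
from datetime import date

# Start and end dates as listed in the schedule above.
schedule = {
    "Problem 1": (date(2011, 4, 18), date(2011, 5, 9)),
    "Problem 2": (date(2011, 5, 9), date(2011, 5, 30)),
    "Problem 3": (date(2011, 5, 31), date(2011, 6, 21)),
}


def inclusive_days(start, end):
    """Count calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


window_lengths = {
    name: inclusive_days(start, end) for name, (start, end) in schedule.items()
}

for name, days in window_lengths.items():
    print(name, days)  # each problem comes out at 22 days
```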

Contestants will have 22 days to complete their entry submission (solution only for Apprentice OR solution and write-up for Master) for each problem. You may enter ONLY 1 problem at a time and will need to choose which level (Apprentice or Master) you wish to participate in during each problem cycle. You will be awarded points based on your solution submitted. Be sure to take advantage of our threading resources and tools, and you may validate your solution (optional) using the Intel® Manycore Testing Lab to solve your problems and get involved in the dedicated forums to earn extra points.

Each problem's winners will be announced on the site after the problem is closed, and prizes will be awarded to those problem winners (see official rules for prize distribution information). The Grand Prize, a Trip to Intel® Developer Forum (IDF) in San Francisco, will be awarded for each level to the participant that has the highest total points earned for the three problems in each level (i.e., highest total points for Master level problems and Apprentice level problems).

The Intel® Threading Challenge attracts some of the most talented developers in the world to solve parallelism code challenges. Now is your chance to take multithreading to the next level and possibly win great prizes. Demonstrate your threading expertise today!

More Details:

Intel® Threading Challenge 2011 is organized so any level of developer can have the opportunity to participate. Two levels of participation are available. The Apprentice level gives those just getting started in multithreading development a chance to try out and improve their threading skills. The Master level will be executed similarly to previous threading challenges, providing those with more experience a chance to test their skills and compete against other experienced developers.

Intel® Manycore Testing Lab – Available as Option for Threading Challenge 2011 Participants

This year competitors will have the optional opportunity to develop and validate their code using the Intel® Manycore Testing Lab. This 40-core, 80-thread development environment has the latest hardware and software available and will be used by this year’s judges to test the winning entries in Threading Challenge 2011 Phase 1.

The Intel® Manycore Testing Lab (MTL) will be made available to Threading Challenge 2011 contestants. Use of the MTL will give participants the opportunity to write and test their code on systems exactly configured to what the judges will be using to score submitted entries. No more guessing about if your code will build or how it will run. (There is no requirement to use the MTL for any part of the contest. It is strictly an optional alternative being made available to those that wish to use it.)