PAW Videos

A message from Predictive Analytics World about newly available session videos. Many of them are free to view, so they are worth checking out.

Predictive Analytics World March 2011 in San Francisco

Access PAW DC Session Videos Now

Predictive Analytics World is pleased to announce on-demand access to the videos of PAW Washington DC, October 2010, including over 30 sessions and keynotes that you may view at your convenience. Access this leading predictive analytics content online now:

View the PAW DC session videos online

Register by January 18th and receive $150 off the full 2-day conference program videos (enter code PAW150 at checkout)

Trial videos – view the following for no charge:

Select individual conference sessions, or recognize savings by registering for access to one or two full days of sessions. These on-demand videos deliver PAW DC right to your desk, covering hot topics and advanced methods such as:

Social data 

Text mining

Search marketing

Risk management

Survey analysis

Consumer privacy

Sales force optimization

Response & cross-sell

Recommender systems

Featuring experts such as:
Usama Fayyad, Ph.D.
CEO, Open Insights
Former Chief Data Officer, Yahoo!

Andrew Pole
Sr Mgr, Media/DB Mktng
Target
View Keynote for Free

John F. Elder, Ph.D.
CEO and Founder
Elder Research

Bruno Aziza
Director, Worldwide Strategy Lead, BI
Microsoft

Eric Siegel, Ph.D.
Conference Chair
Predictive Analytics World

PAW DC videos feature over 25 speakers with case studies from leading enterprises such as: CIBC, CEB, Forrester, Macy’s, MetLife, Microsoft, Miles Kimball, Monster.com, Oracle, Paychex, SunTrust, Target, UPMC, Xerox, Yahoo!, YMCA, and more.

How video access works:

View slides on the left; see and hear the speaker in the right window.

Sign up by January 18 for immediate video access and $150 discount


San Francisco
March 14-15, 2011
Washington DC
October, 2011
London
November, 2011

Session Gallery: Day 1 of 2

Viewing (17) Sessions of (31)

 

Keynote: Five Ways Predictive Analytics Cuts Enterprise Risk  

Eric Siegel, Ph.D., Program Chair, Predictive Analytics World

All business is an exercise in risk management. All organizations would benefit from measuring, tracking and computing risk as a core process, much like insurance companies do.

Predictive analytics does the trick, one customer at a time. This technology is a data-driven means to compute the risk each customer will defect, not respond to an expensive mailer, consume a retention discount even if she were not going to leave in the first place, not be targeted for a telephone solicitation that would have landed a sale, commit fraud, or become a “loss customer” such as a bad debtor or an insurance policy-holder with high claims.

In this keynote session, Dr. Eric Siegel reveals:

– Five ways predictive analytics evolves your enterprise to reduce risk

– Hidden sources of risk across operational functions

– What every business should learn from insurance companies

– How advancements have reversed the very meaning of fraud

– Why “man + machine” teams are greater than the sum of their parts for enterprise decision support

Length – 00:45:57

Price: $195

Platinum Sponsor Presentation: Analytics – The Beauty of Diversity 

Anne H. Milley, Senior Director of Analytic Strategy, Worldwide Product Marketing, SAS

Analytics contributes to, and draws from, multiple disciplines. The unifying theme of “making the world a better place” is bred from diversity. For instance, the same methods used in econometrics might be used in market research, psychometrics and other disciplines. In a similar way, diverse paradigms are needed to best solve problems, reveal opportunities and make better decisions. This is why we evolve capabilities to formulate and solve a wide range of problems through multiple integrated languages and interfaces. Extending that, we have provided integration with other languages so that users can draw on the disciplines and paradigms needed to best practice their craft.

Length – 20:11

Free viewing enabled – no charge

Gold Sponsor Presentation: Predictive Analytics Accelerate Insight for Financial Services 

Finbarr Deely, Director of Business Development, ParAccel

Financial services organizations face immense hurdles in maintaining profitability and building competitive advantage. They must perform “what-if” scenario analysis, identify risks, and detect fraud patterns. The advanced analytic complexity required often makes such analysis slow and painful, if not impossible. This presentation outlines the analytic challenges facing these organizations and provides a clear path to the accelerated insight needed to perform in today’s complex business environment: reducing risk, stopping fraud and increasing profits.

– The value of predictive analytics in accelerating insight

– Financial services analytic case studies

– Brief overview of the ParAccel Analytic Database

Length – 09:06

Free viewing enabled – no charge

TOPIC: BUSINESS VALUE
Case Study: Monster.com
Creating Global Competitive Power with Predictive Analytics 

Jean Paul Isson, Vice President, Global BI & Predictive Analytics, Monster Worldwide

Using predictive analytics to gain a deeper understanding of customer behaviours, increase marketing ROI and drive growth

– Creating global competitive power with business intelligence: Making the right decisions – at the right time

– Avoiding common change management challenges in sales, marketing, customer service, and products

– Developing a BI vision – and implementing it: successful business intelligence implementation models

– Using predictive analytics as a business driver to stay on top of the competition

– Following the Monster Worldwide global BI evolution: How Monster used BI to go from good to great

Length – 51:17

Price: $195

TOPIC: SURVEY ANALYSIS
Case Study: YMCA
Turning Member Satisfaction Surveys into an Actionable Narrative 

Dean Abbott, President, Abbott Analytics

Employees are a key constituency at the Y and previous analysis has shown that their attitudes have a direct bearing on Member Satisfaction. This session will describe a successful approach for the analysis of YMCA employee surveys. Decision trees are built and examined in depth to identify key questions in describing key employee satisfaction metrics, including several interesting groupings of employee attitudes. Our approach will be contrasted with other factor analysis and regression-based approaches to survey analysis that we used initially. The predictive models described are currently in use and resulted in both greater understanding of employee attitudes, and a revised “short-form” survey with fewer key questions identified by the decision trees as the most important predictors.

Length – 50:19

Price: $195

TOPIC: INDUSTRY TRENDS
2010 Data Miner Survey Results: Highlights
 

Karl Rexer, Ph.D., Rexer Analytics

Do you want to know the views, actions, and opinions of the data mining community? Each year, Rexer Analytics conducts a global survey of data miners to find out. This year at PAW we unveil the results of our 4th Annual Data Miner Survey. This session will present the research highlights, such as:

– Analytic goals & key challenges

– Impact of the economy

– Regional differences

– Text mining trends

Length – 15:20

Price: $195

Multiple Case Studies: U.S. DoD, U.S. DHS, SSA
Text Mining: Lessons Learned 

John F. Elder, Chief Scientist, Elder Research, Inc.

Text Mining is the “Wild West” of data mining and predictive analytics – the potential for gain is huge, the capability claims are often tall tales, and the “land rush” for leadership is very much a race.

In solving unstructured (text) analysis challenges, we found that principles from inductive modeling – learning relationships from labeled cases – have great power to enhance text mining. Dr. Elder highlights key technical breakthroughs discovered while working on projects for leading government agencies, including:

– Prioritizing searches for the Dept. of Homeland Security

– Quick decisions for Social Security Admin. disability

– Document discovery for the Dept. of Defense

– Disease discovery for the Dept. of Homeland Security

– Risk profiling for the Dept. of Defense

Length – 48:58

Price: $195

Keynote: How Target Gets the Most out of Its Guest Data to Improve Marketing ROI 

Andrew Pole, Senior Manager, Media and Database Marketing, Target

In this session, you’ll learn how Target leverages its own internal guest data to optimize its direct marketing – with the ultimate goal of enhancing our guests’ shopping experience and driving in-store and online performance. You will hear about what guest data is available at Target, how and where we collect it, and how it is used to improve the performance and relevance of direct marketing vehicles. Furthermore, we will discuss Target’s development and usage of guest segmentation, response modeling, and optimization as means to suppress poor performers from mailings, determine relevant product categories and services for online targeted content, and optimally assign receipt marketing offers to our guests when offer quantities are limited.

Length – 47:49

Free viewing enabled – no charge

Platinum Sponsor Presentation: Driving Analytics Into Decision Making  

Jason Verlen, Director, SPSS Product Strategy & Management, IBM Software Group

Organizations looking to dramatically improve their business outcomes are turning to decision management, a convergence of technology and business processes that is used to streamline and predict the outcome of daily decision-making. IBM SPSS Decision Management technology provides the critical link between analytical insight and recommended actions. In this session you’ll learn how Decision Management software integrates analytics with business rules and business applications for front-line systems such as call center applications, insurance claim processing, and websites. See how you can improve every customer interaction, minimize operational risk, reduce fraud and optimize results.

Length – 17:29

Free viewing enabled – no charge

TOPIC: DATA INFRASTRUCTURE AND INTEGRATION
Case Study: Macy’s
The world is not flat (even though modeling software has to think it is) 

Paul Coleman, Director of Marketing Statistics, Macy’s Inc.

Software for statistical modeling generally uses flat files, where each record represents a unique case with all its variables. In contrast, most large databases are relational, where data are distributed among various normalized tables for efficient storage. Variable creation and model scoring engines are necessary to bridge data mining and storage needs. Development datasets taken from a sampled history require snapshot management. Scoring datasets are taken from the present timeframe and the entire available universe. Organizations with significant data must decide when to store or calculate necessary data and understand the consequences for their modeling program.

Length – 34:54

Price: $195

TOPIC: CUSTOMER VALUE
Case Study: SunTrust
When One Model Will Not Solve the Problem – Using Multiple Models to Create One Solution 

Dudley Gwaltney, Group Vice President, Analytical Modeling, SunTrust Bank

In 2007, SunTrust Bank developed a series of models to identify clients likely to have large changes in deposit balances. The models include three basic binary and two linear regression models.

Based on the models, 15% of SunTrust clients were targeted as those most likely to have large balance changes. These clients accounted for 65% of the absolute balance change and 60% of the large balance change clients. The targeted clients are grouped into a portfolio and assigned to individual SunTrust retail branches. Since 2008, the portfolio has generated a 2.6% increase in balances over control.

Using the SunTrust example, this presentation will focus on:

– Identifying situations requiring multiple models

– Determining what types of models are needed

– Combining the individual component models into one output

Length – 48:22

Price: $195

TOPIC: RESPONSE & CROSS-SELL
Case Study: Paychex
Staying One Step Ahead of the Competition – Development of a Predictive 401(k) Marketing and Sales Campaign 

Jason Fox, Information Systems and Portfolio Manager, Paychex

An in-depth case study of how Paychex, Inc. uses predictive modeling to turn the tide on competitive pressures within its own client base. Paychex, a leading provider of payroll and human resource solutions, will guide you through the development of a Predictive 401(k) Marketing and Sales model. Through the use of sophisticated data mining techniques and regression analysis, the model derives the probability that a client will add retirement services products with Paychex or with a competitor. The session will cover roadblocks that could have ended development, as well as ROI analysis.

Speaker: Frank Fiorille, Director of Enterprise Risk Management, Paychex

Speaker: Jason Fox, Risk Management Analyst, Paychex

Length – 26:29

Price: $195

TOPIC: SEGMENTATION
Practitioner: Canadian Imperial Bank of Commerce
Segmentation Do’s and Don’ts 

Daymond Ling, Senior Director, Modelling & Analytics, Canadian Imperial Bank of Commerce

The concept of segmentation is well accepted in business and has withstood the test of time. Even with the advent of new artificial intelligence and machine learning methods, this old war horse still has its place and is alive and well. Like all analytical methods, when used correctly it can lead to enhanced market positioning and competitive advantage, while improper application can have severe negative consequences.

This session will explore the elements of success and the worst practices that lead to failure. The relationship between segmentation and predictive modeling will also be discussed to clarify when it is appropriate to use one versus the other, and how to use them together synergistically.

Length – 45:57

Price: $195

TOPIC: SOCIAL DATA
Thought Leadership
Social Network Analysis: Killer Application for Cloud Analytics
 

James Kobielus, Senior Analyst, Forrester Research

Social networks such as Twitter and Facebook are a potential goldmine of insights on what is truly going through customers’ minds. Every company wants to know whether, how, how often, and by whom they’re being mentioned across the billowing new cloud of social media. Just as important, every company wants to influence those discussions in their favor, target new business, and harvest maximum revenue potential. In this session, Forrester analyst James Kobielus identifies fruitful applications of social network analysis in customer service, sales, marketing, and brand management. He presents a roadmap for enterprises to leverage their inline analytics initiatives and leverage high-performance data warehousing (DW) clouds and appliances in order to analyze shifting patterns of customer sentiment, influence, and propensity. Leveraging Forrester’s ongoing research in advanced analytics and customer relationship management, Kobielus will discuss industry trends, commercial modeling tools, and emerging best practices in social network analysis, which represents a game-changing new discipline in predictive analytics.

Length – 48:16

Price: $195

TOPIC: HEALTHCARE – INTERNATIONAL TARGETING
Case Study: Life Line Screening
Taking CRM Global Through Predictive Analytics 

Ozgur Dogan,
VP, Quantitative Solutions Group, Merkle Inc

Trish Mathe,
Director of Database Marketing, Life Line Screening

While Life Line is successfully executing a US CRM roadmap, they are also beginning this same evolution abroad. They are beginning in the UK where Merkle procured data and built a response model that is pulling responses over 30% higher than competitors. This presentation will give an overview of the US CRM roadmap, and then focus on the beginning of their strategy abroad, focusing on the data procurement they could not get anywhere else but through Merkle and the successful modeling and analytics for the UK.

Length – 40:12

Price: $195

TOPIC: SURVEY ANALYSIS
Case Study: Forrester
Making Survey Insights Addressable and Scalable – The Case Study of Forrester’s Technographics Benchmark Survey 

Nethra Sambamoorthi, Team Leader, Consumer Dynamics & Analytics, Global Consulting, Acxiom Corporation

Marketers use surveys to create enterprise-wide strategic insights to: (1) develop segmentation schemes, (2) summarize consumer behaviors and attitudes for the whole US population, and (3) use multiple surveys to draw unified views about their target audience. However, these insights are not directly addressable and scalable to the whole consumer universe, which is very important when applying the power of survey intelligence to the one-to-one consumer marketing problems marketers routinely face. Acxiom partnered with Forrester Research, creating addressable and scalable applications of Forrester’s Technographics Survey and applying them successfully to a number of industries and applications.

Length – 39:23

Price: $195

TOPIC: HEALTHCARE
Case Study: UPMC Health Plan
A Predictive Model for Hospital Readmissions 

Scott Zasadil, Senior Scientist, UPMC Health Plan

Hospital readmissions are a significant component of our nation’s healthcare costs. Predicting who is likely to be readmitted is a challenging problem. Using a set of 123,951 hospital discharges spanning nearly three years, we developed a model that predicts an individual’s 30-day readmission should they incur a hospital admission. The model uses an ensemble of boosted decision trees and prior medical claims and captures 64% of all 30-day readmits with a true positive rate of over 27%. Moreover, many of the ‘false’ positives are simply delayed true positives. 53% of the predicted 30-day readmissions are readmitted within 180 days.

Length – 54:18

Price: $195

Multi State Models

A special issue of the Journal of Statistical Software has come out, devoted to Multi-State Models and Competing Risks. It is a must-read for anyone with an interest in pharma analytics or survival analysis, even if you don't know much R.

Here is an extract from “mstate: An R Package for the Analysis of Competing Risks and Multi-State Models”:

Multi-state models are a very useful tool to answer a wide range of questions in survival analysis that cannot, or only in a more complicated way, be answered by classical models. They are suitable for both biomedical and other applications in which time-to-event variables are analyzed. However, they are still not frequently applied. So far, an important reason for this has been the lack of available software. To overcome this problem, we have developed the mstate package in R for the analysis of multi-state models. The package covers all steps of the analysis of multi-state models, from model building and data preparation to estimation and graphical representation of the results. It can be applied to non- and semi-parametric (Cox) models. The package is also suitable for competing risks models, as they are a special category of multi-state models.

 

—————————–

 

Issues for JSS Special Volume 38: Competing Risks and Multi-State Models

Special Issue about Competing Risks and Multi-State Models

Hein Putter
Vol. 38, Issue 1, Jan 2011
Submitted 2011-01-03, Accepted 2011-01-03

Analyzing Competing Risk Data Using the R timereg Package

Thomas H. Scheike, Mei-Jie Zhang
Vol. 38, Issue 2, Jan 2011
Submitted 2009-05-25, Accepted 2010-06-22

p3state.msm: Analyzing Survival Data from an Illness-Death Model

Luís Filipe Meira Machado, Javier Roca-Pardiñas
Vol. 38, Issue 3, Jan 2011
Submitted 2009-06-30, Accepted 2010-03-02

Empirical Transition Matrix of Multi-State Models: The etm Package

Arthur Allignol, Martin Schumacher, Jan Beyersmann
Vol. 38, Issue 4, Jan 2011
Submitted 2009-01-08, Accepted 2010-03-11

Lexis: An R Class for Epidemiological Studies with Long-Term Follow-Up

Martyn Plummer, Bendix Carstensen
Vol. 38, Issue 5, Jan 2011
Submitted 2010-02-09, Accepted 2010-09-16

Using Lexis Objects for Multi-State Models in R

Bendix Carstensen, Martyn Plummer
Vol. 38, Issue 6, Jan 2011
Submitted 2010-02-09, Accepted 2010-09-16

mstate: An R Package for the Analysis of Competing Risks and Multi-State Models

Liesbeth C. de Wreede, Marta Fiocco, Hein Putter
Vol. 38, Issue 7, Jan 2011
Submitted 2010-01-17, Accepted 2010-08-20

Multi-State Models for Panel Data: The msm Package for R

Christopher Jackson
Vol. 38, Issue 8, Jan 2011
Submitted 2009-07-21, Accepted 2010-08-18

_______________________________________________
JSS-Announce mailing list
JSS-Announce@lists.stat.ucla.edu
http://lists.stat.ucla.edu/mailman/listinfo/jss-announce

 

How to balance your online advertising and your offline conscience

I recently found an interesting example of a website that makes a lot of money and yet is much more efficient than any free or non-profit site. It is called Ecosia.

If you are looking for a website that balances its administrative costs and has a transparent way to make the world better, this is a great example.

http://ecosia.org/how.php

HOW IT WORKS

  • You search with Ecosia.
  • Perhaps you click on an interesting sponsored link.
  • The sponsoring company pays Bing or Yahoo for the click.
  • Bing or Yahoo gives the bigger chunk of that money to Ecosia.
  • Ecosia donates at least 80% of this income to support WWF’s work in the Amazon.
  • If you like what we’re doing, help us spread the word!

Key facts about the park:

    • World’s largest tropical forest reserve (38,867 square kilometers, or about the size of Switzerland)
    • Home to about 14% of all amphibian species and roughly 54% of all bird species in the Amazon – not to mention large populations of at least eight threatened species, including the jaguar
    • Includes part of the Guiana Shield containing 25% of world’s remaining tropical rainforests – 80 to 90% of which are still pristine
    • Holds the last major unpolluted water reserves in the Neotropics, containing approximately 20% of all of the Earth’s water
    • One of the last tropical regions on Earth vastly unaltered by humans
    • Significant contributor to climatic regulation via heat absorption and carbon storage

     

    http://ecosia.org/statistics.php

    They claim to have donated 141,529.42 EUR !!!

    http://static.ecosia.org/files/donations.pdf

    Well, suppose you are the web admin of a very popular website like Wikipedia.

    One way to meet server costs is to say openly: hey, I need to balance my costs, so I need some money.

    The other way is to use online advertising.

    I started mine with Google AdSense.

    Cost per mille (CPM), i.e. pricing per 1,000 impressions, gives you a very low return compared with contacting an ad sponsor directly.

    But it's a great data experiment, as you can monitor:

    – which companies are likely to be advertised on your site (assume Google knows more about its algorithms than you will)

    – which formats (banner, text or Flash) have what kind of conversion rates

    – what the expected payoff rates are from various keywords or companies (business intelligence software, predictive analytics software and statistical computing software are similar but have different expected returns, if you remember your economics class)

     

    NOW, based on the above data, you know the minimum baseline to expect from a private advertiser compared with a public, crowd-sourced search engine one (like Google or Bing).

    Let's say you have 100,000 views monthly, and assume one out of 1,000 page views will lead to a click. Say the advertiser will pay you $1 for every click (= 1,000 impressions).

    Then your expected revenue is 100 clicks × $1 = $100. But if your clicks are priced at $2.50 each, and your click-through rate is now 3 out of 1,000 impressions (both very moderate increases that can be achieved by basic placement optimization of ad type, graphics etc.), your new revenue is 300 clicks × $2.50 = $750.
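The revenue arithmetic above can be checked with a minimal sketch. The function name `expected_ad_revenue` and its parameter names are made up for this illustration; only the figures (100,000 views, 1 or 3 clicks per 1,000 views, $1 or $2.50 per click) come from the post.

```python
def expected_ad_revenue(monthly_views, clicks_per_1000_views, price_per_click):
    """Expected monthly revenue for a simple pay-per-click deal.

    Hypothetical helper for illustration: it assumes every page view is
    one ad impression and that the click rate is constant over the month.
    """
    clicks = monthly_views * clicks_per_1000_views / 1000
    return clicks * price_per_click


# Baseline from the post: 1 click per 1,000 views at $1 per click.
baseline = expected_ad_revenue(100_000, 1, 1.00)

# After basic placement optimization: 3 clicks per 1,000 views at $2.50 per click.
optimized = expected_ad_revenue(100_000, 3, 2.50)

print(f"baseline:  ${baseline:,.2f}")
print(f"optimized: ${optimized:,.2f}")
```

Swapping in your own monthly views and click rates gives a quick floor for what to ask of a direct sponsor.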

    Be a good Samaritan: you decide to share some of this with your audience, like 4 Amazon books per month (or 1 free Amazon book per week). That gives you a cost of $200, and leaves you with some $550.

    Wait! It doesn't end there. Adam Smith's invisible hand moves on.

    You say, hmm, let me put $100 towards an annual paper writing contest of $1,000, donate $200 to One Laptop per Child (or to Amazon rain forests or to Haiti etc.), pay $100 for your upgraded server hosting, and put $350 into online advertising: say $200 for search engines and $150 for Facebook.

    Woah!

    Month 1 should see more people visiting you for the first time. If you have a good return rate (returning visitors as a %) and a low bounce rate (visits of less than 5 secs), your traffic should see at least a 20% jump in new arrivals and 5-10% in long-term arrivals. Ignoring bounces, within three months you will have one of the following:

    1) An interesting case study on statistics on online and social media advertising, tangible motivations for increasing community response, and some good data for study

    2) Hopefully, better cost management of your server expenses

    3) Very hopefully, a positive cash flow

    You could even set a percentage and share the monthly (or, better, annual) figures with your readers and advertisers.

    Go ahead, change the world!

    The key paradigms here are:

    – sharing your traffic and revenue figures openly with everyone

    – donating to a suitable cause

    – helping increase awareness of that cause

    – using fixed percentages rather than absolute numbers, to ensure your site and cause are sustained for years.

    2010 in review and WP-Stats

    The following is an auto-generated post, thanks to the WordPress.com stats team. Clearly they have got some stuff wrong:

    1) Defining the speedometer quantitatively

    2) The busiest day numbers are plain wrong (2 views??)

    3) There is still no geographic data in WordPress.com stats (unlike Google Analytics), and I can't enable Google Analytics on a WordPress.com hosted site.

     

    The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

    Healthy blog!

    The Blog-Health-o-Meter™ reads Wow.

    Crunchy numbers

    The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 97,000 times in 2010. If it were an exhibit at The Louvre Museum, it would take 4 days for that many people to see it.
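The Louvre comparison is simple arithmetic. Here is a minimal sketch, assuming visitors arrive evenly across a 365-day year (an assumption of this example; the stats report does not say how it computed the figure). The constant names are invented for illustration.

```python
# Figures quoted in the stats report above.
LOUVRE_VISITORS_PER_YEAR = 8_500_000
BLOG_VIEWS_2010 = 97_000

# Assumption: visitors are spread evenly over 365 days.
visitors_per_day = LOUVRE_VISITORS_PER_YEAR / 365

# Days the museum would need to receive as many visitors as the blog had views.
days_needed = BLOG_VIEWS_2010 / visitors_per_day

print(f"{days_needed:.1f} days")
```

Under that assumption the result is a little over four days, consistent with the "4 days" in the report.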

     

    In 2010, there were 367 new posts, growing the total archive of this blog to 1191 posts. There were 411 pictures uploaded, taking up a total of 121mb. That’s about 1 pictures per day.

    The busiest day of the year was September 22nd with 2 views. The most popular post that day was Top 10 Graphical User Interfaces in Statistical Software.

    Where did they come from?

    The top referring sites in 2010 were r-bloggers.com, reddit.com, rattle.togaware.com, twitter.com, and Google Reader.

    Some visitors came searching, mostly for libre office, facebook analytics, test drive a chrome notebook, test drive a chrome notebook., and wps sas lawsuit.

    Attractions in 2010

    These are the posts and pages that got the most views in 2010.

    1. Top 10 Graphical User Interfaces in Statistical Software (April 2010): 8 comments and 1 Like on WordPress.com

    2. Wealth = function (numeracy, memory recall) (December 2009): 1 Like on WordPress.com

    3. Matlab-Mathematica-R and GPU Computing (September 2010): 1 Like on WordPress.com

    4. About DecisionStats (July 2008)

    5. The Top Statistical Softwares (GUI) (May 2010): 1 comment and 1 Like on WordPress.com

    The Year 2010

    My annual traffic to this blog was almost 99,000 views. Add in additional views on networking sites plus the 400-plus RSS readers, and I can say traffic was about 120,000 for 2010. Nice. Thanks for reading, and I hope it was worth your time. (This is a long post and will take almost 440 secs to read, but the summary has just been given.)

    My intent is either to inform you, give you something useful, or at least something interesting.

    See below:

    Monthly views, 2010

    Month | Views
    Jan | 6,311
    Feb | 4,701
    Mar | 4,922
    Apr | 5,463
    May | 6,493
    Jun | 4,271
    Jul | 5,041
    Aug | 5,403
    Sep | 17,913
    Oct | 16,430
    Nov | 11,723
    Dec | 10,096
    Total | 98,767
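As a quick sanity check on the monthly figures, this minimal sketch totals them and finds the busiest month. The dictionary name `monthly_views_2010` is made up for this example; the numbers are copied from the table above.

```python
# Monthly view counts for 2010, copied from the table above.
monthly_views_2010 = {
    "Jan": 6_311, "Feb": 4_701, "Mar": 4_922, "Apr": 5_463,
    "May": 6_493, "Jun": 4_271, "Jul": 5_041, "Aug": 5_403,
    "Sep": 17_913, "Oct": 16_430, "Nov": 11_723, "Dec": 10_096,
}

# Yearly total, to compare against the "Total" figure in the table.
total_views = sum(monthly_views_2010.values())

# Month with the highest view count.
busiest_month = max(monthly_views_2010, key=monthly_views_2010.get)

print(f"total views: {total_views:,}")
print(f"busiest month: {busiest_month} ({monthly_views_2010[busiest_month]:,} views)")
```

The monthly numbers add up to the reported total of 98,767, with September as the peak.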

     

     

    Sandro Saitta from http://www.dataminingblog.com/ just named me for an award on his blog (but my surname is ohRi; Sandro left me without an R. What would I be without R :)) ).

    Aw! I am touched. Google for “Data Mining Blog”, and Sandro is the best there is in data mining writing.

    DMR People Award 2010
    There are a lot of active people in the field of data mining. You can discuss with them on forums. You can read their blogs. You can also meet them in events such as PAW or KDD. Among the people I follow on a regular basis, I have elected:

    Ajay Ori

    He has been very active in 2010, especially on his blog . Good work Ajay and continue sharing your experience with us!”

    What did I write in 2010? Stuff.

    What did you read on this blog? Well, that's the top posts list.

    2009-12-31 to Today

    Title | Views
    Home page | 21,150
    Top 10 Graphical User Interfaces in Statistical Software | 6,237
    Wealth = function (numeracy, memory recall) | 2,014
    Matlab-Mathematica-R and GPU Computing | 1,946
    The Top Statistical Softwares (GUI) | 1,405
    About DecisionStats | 1,352
    Using Facebook Analytics (Updated) | 1,313
    Test drive a Chrome notebook. | 1,170
    Top ten RRReasons R is bad for you ? | 1,157
    Libre Office | 1,151
    Interview Hadley Wickham R Project Data Visualization Guru | 1,007
    Using Red R- R with a Visual Interface | 854
    SAS Institute files first lawsuit against WPS- Episode 1 | 790
    Interview Professor John Fox Creator R Commander | 764
    R Package Creating | 754
    Windows Azure vs Amazon EC2 (and Google Storage) | 726
    Norman Nie: R GUI and More | 716
    Startups for Geeks | 682
    Google Maps – Jet Ski across Pacific Ocean | 670
    Not so AWkward after all: R GUI RKWard | 579
    Red R 1.8- Pretty GUI | 570
    Parallel Programming using R in Windows | 569
    R is an epic fail or is it just overhyped | 559
    Enterprise Linux rises rapidly:New Report | 537
    Rapid Miner- R Extension | 518
    Creating a Blog Aggregator for free | 504
    So which software is the best analytical software? Sigh- It depends | 473
    Revolution R for Linux | 465
    John Sall sets JMP 9 free to tango with R | 460

    So how do people get here?

    Well, I guess I owe Tal G for just over 9,000 views. (Incidentally, I stopped posting my blog to the R-Bloggers and AnalyticBridge blogs, for SEO keyword reasons and because of some spam I was getting; see below.)

    http://r-bloggers.com is still the cat’s whiskers and I read it a lot.

    I still don't know who linked to my blog from a free sex movie site (good for over 400 views), but I have a few suspects.

    2009-12-31 to Today

    Referrer Views
    r-bloggers.com 9,131
    Reddit 3,829
    rattle.togaware.com 1,500
    Twitter 1,254
    Google Reader 1,215
    linkedin.com 717
    freesexmovie.irwanaf.com 422
    analyticbridge.com 341
    Google 327
    coolavenues.com 322
    Facebook 317
    kdnuggets.com 298
    dataminingblog.com 278
    en.wordpress.com 185
    google.co.in 151
    xianblog.wordpress.com 130
    inside-r.org 124
    decisionstats.com 119
    ifreestores.com 117
    bits.blogs.nytimes.com 108

    Still reading this post? Gosh, let me sell you some advertising. It is only $100 a month (yes, it's a recession).

    Advertisers are treated on a First In, Last Out (FILO) basis.

    I have been told I am obsessed with SEO, but I don't care much for search engines apart from Google, and yes, SEO is an interesting science (they should really rename it GEO, or Google Engine Optimization).

    Apparently Hadley Wickham and Donald Farmer are big keywords for me, so I should be more respectful, I guess.

    Search Terms for 365 days ending 2010-12-31 (Summarized)

    2009-12-31 to Today

    Search Views
    libre office 925
    facebook analytics 798
    test drive a chrome notebook 467
    test drive a chrome notebook. 215
    r gui 203
    data mining 163
    wps sas lawsuit 158
    wordle.net 133
    wps sas 123
    google maps jet ski 123
    test drive chrome notebook 96
    sas wps 89
    sas wps lawsuit 85
    chrome notebook test drive 83
    decision stats 83
    best statistics software 74
    hadley wickham 72
    google maps jetski 72
    libreoffice 70
    doug savage 65
    hive tutorial 58
    funny india 56
    spss certification 52
    donald farmer microsoft 51
    best statistical software 49

    What about outgoing links? Apparently I need to find a way to ask Google to pay me for the free advertising I gave their Chrome notebook launch. But since their search engine and browser are free to me, I guess we are even-steven.

    Clicks for 365 days ending 2010-12-31 (Summarized)

    2009-12-31 to Today

    URL Clicks
    rattle.togaware.com 378
    facebook.com/Decisionstats 355
    rapid-i.com/content/view/182/196 319
    services.google.com/fb/forms/cr48basic 313
    red-r.org 228
    decisionstats.wordpress.com/2010/05/07/the-top-statistical-softwares-gui 199
    teamwpc.co.uk/products/wps 162
    r4stats.com/popularity 148
    r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects 138
    socserv.mcmaster.ca/jfox/Misc/Rcmdr 138
    spss.com/certification 116
    learnr.wordpress.com 114
    dudeofdata.com/decisionstats 108
    r-project.org 107
    documentfoundation.org/faq 104
    goo.gl/maps/UISY 100
    inside-r.org/download 96
    en.wikibooks.org/wiki/R_Programming 92
    nytimes.com/external/readwriteweb/2010/12/07/07readwriteweb-report-google-offering-chrome-notebook-test-11919.html 92
    sourceforge.net/apps/mediawiki/rkward/index.php?title=Main_Page 92
    analyticdroid.togaware.com 88
    yeroon.net/ggplot2 87

    So, in 2010:

    SAS remained top daddy in business analytics,

    R made revolutionary strides in terms of new packages,

    JMP launched a new version,

    SPSS got integrated with Cognos,

    Oracle sued Google and did build a great Data Mining GUI,

    LibreOffice gave you a non-Oracle OpenOffice (or an even more open office).

    2011 looks like a fun year. Have a safe time partying.

    Privacy Browsing Extensions in Google Chrome


    Using two Chrome extensions, Disconnect and AdBlock, you can be sure of having a very, very clean browsing experience. This is recommended especially if you don't like the automatic sharing of your personal preferences and cannot be bothered with the Byzantine maze of social media privacy fine print.

    https://chrome.google.com/extensions/detail/jeoacafpbcihiomhlakheieifhpjdfeo

    Disconnect by Brian Kennish

    (184) – 44,284 users – Weekly installs: 24,086

    Stop major third parties and search engines from tracking the webpages you go to and searches you do.

    * Search depersonalization is now optional and off by default. Click the “d” button then the “Depersonalize searches” checkbox to turn this feature on (or back off in case you have trouble getting to Google or Yahoo services). For help with anything else, see the known issues below and ask questions at http://j.mp/dnewgroup.
    
    §
    
    If you’re a typical web user, you’re unintentionally sending your browsing and search history with your name and other personal information to third parties and search engines whenever you’re online.
    
    Take control of the data you share with Disconnect!
    
    From the developer of the top-10-rated Facebook Disconnect extension, Disconnect lets you:
    
    • Disable tracking by third parties like Digg, Facebook, Google, Twitter, and Yahoo, without requiring any setup or significantly degrading the usability of the web.
    
    • Truly depersonalize searches on search engines like Google and Yahoo (by blocking identifying cookies not just changing the appearance of results pages), while staying logged into other services — e.g., so you can search anonymously on Google and access iGoogle at once.
    
    • See how many resource and cookie requests are blocked, in real time
    
    
    and
    https://chrome.google.com/extensions/detail/gighmmpiobklfepjocnamgkkbiglidom
    

    AdBlock

    (6937) - 1,615,373 users - Weekly installs: 153,032
    The most popular Chrome extension, with over 1.5 million users! Blocks ads all over the web.
    Verified author: chromeadblock.com
    =================
    
    New in version 2.1: Translated into dozens of languages!
    New in version 2.0: Ads are blocked from downloading, instead of just being removed after the fact!
    
    =======================
    
    The official AdBlock For Chrome!  Block all advertisements on all web pages.  Your browser is automatically updated with additions to the filter: just click Install, then visit your favorite website and see the ads disappear!
    
    FAQs: 1. This is the official AdBlock extension: the original ad blocker written from the ground up to be optimized in Chrome.  There's an unrelated, older Firefox project called Adblock Plus, and they're working on making a Chrome version out of the old AdThwart codebase.  At the moment AdBlock blocks some ads that AdThwart only hides, but they're working to improve it.  It's available at bit.ly/id2Gqx; if you have trouble with AdBlock, they're good guys and a fine alternative!

     

    Troubleshooting Rattle Installation- Data Mining R GUI


    I find the Rattle GUI very nice and easy to use for any data mining task. The software is available from http://rattle.togaware.com/

    The only issue is that Rattle can be quite difficult to install because of its dependencies on GTK+.

    After fiddling for a couple of years, this is what I did:

    1) Created a dual-boot OS. Basically, I downloaded the Ubuntu Netbook Remix from http://ubuntu.com and set up dual boot, so you can choose at startup whether to use Windows or Ubuntu Linux in that session. Alternatively, you can download VMware Player (www.vmware.com/products/player/) if you want to run both at once.

    2) Install the GTK+ dependencies first, then download the R packages as Ubuntu packages.

    GTK+ requires:

    1. Libglade
    2. Glib
    3. Cairo
    4. Pango
    5. ATK

    If you are a Linux newbie like me who doesn't get the sudo apt-get, tar, cd, make, install rigmarole, scoot over to the Synaptic Package Manager or just the main Ubuntu Software Centre and download these packages one by one.
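For those who do want the command-line route, here is a minimal sketch of the same step. The package names are my assumption of what the development packages for the libraries listed above are called on Ubuntu; check them in Synaptic or on packages.ubuntu.com for your release. The script only prints the command (a dry run), so nothing is installed until you run the printed line yourself.

```shell
# Assumed Ubuntu package names for the GTK+ stack listed above;
# verify them for your release before installing.
gtk_deps="libgtk2.0-dev libglade2-dev libglib2.0-dev libcairo2-dev libpango1.0-dev libatk1.0-dev"

# Build the install command as a string and print it instead of running it.
cmd="sudo apt-get install $gtk_deps"
echo "$cmd"
```

Copy the printed line into a terminal once you are happy with the package list.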

    3) For the R dependencies, you need:

    • pmml
    • XML
    • RGtk2

    Again, use r-cran- as the prefix to these package names and simply install them (almost as easy as the double-click install on Windows).

    See http://packages.ubuntu.com/search?suite=lucid&searchon=names&keywords=r-cran
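The naming rule can be sketched as a tiny shell helper: lower-case the R package name and put r-cran- in front. The function name deb_name is made up for this illustration, and not every CRAN package is necessarily packaged for Ubuntu, so treat the output as names to search for rather than a guarantee.

```shell
# Map an R package name to its likely Ubuntu package name:
# lower-case it and add the r-cran- prefix.
# (deb_name is a hypothetical helper, not part of any tool.)
deb_name() {
  printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Print the candidate Ubuntu names for the three R dependencies.
for pkg in pmml XML RGtk2; do
  deb_name "$pkg"
done
```

If a name does not turn up in the Ubuntu package search, the package can still be installed from inside R with install.packages().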

    4) Install Rattle from source

    http://rattle.togaware.com/rattle-download.html

    Advanced users can download the Rattle source packages directly:

    Save these to your hard disk (e.g., to your Desktop) but don’t extract them. Then, on GNU/Linux run the install command shown below. This command is entered into a terminal window:

    • R CMD INSTALL rattle_2.6.0.tar.gz
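A common stumble here is running the command from a folder other than the one the file was saved to. A minimal sketch of a guard, assuming the tarball name above (install_rattle is a made-up helper name): it checks that the file is present before handing it to R CMD INSTALL.

```shell
# Hypothetical helper: install a source tarball only if it is
# actually present where we think it is.
install_rattle() {
  pkg="$1"
  if [ ! -f "$pkg" ]; then
    echo "Cannot find $pkg in $(pwd); download it or cd to the right folder first." >&2
    return 1
  fi
  R CMD INSTALL "$pkg"
}

# Try the version used in this post; report if nothing was installed.
install_rattle rattle_2.6.0.tar.gz || echo "Rattle was not installed."
```

Run it from the folder where you saved the download, for example after cd ~/Desktop.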

    After installation:

    5) Type library(rattle) and then rattle.info() to get messages on which R packages to update for Rattle to function properly.

    <code>

    > library(rattle)
    Rattle: Graphical interface for data mining using R.
    Version 2.6.0 Copyright (c) 2006-2010 Togaware Pty Ltd.
    Type 'rattle()' to shake, rattle, and roll your data.
    > rattle.info()
    Rattle: version 2.6.0
    R: version 2.11.1 (2010-05-31) (Revision 52157)

    Sysname: Linux
    Release: 2.6.35-23-generic
    Version: #41-Ubuntu SMP Wed Nov 24 10:18:49 UTC 2010
    Nodename: k1-M725R
    Machine: i686
    Login: k1ng
    User: k1ng

    Installed Dependencies
    RGtk2: version 2.20.3
    pmml: version 1.2.26
    colorspace: version 1.0-1
    cairoDevice: version 2.14
    doBy: version 4.1.2
    e1071: version 1.5-24
    ellipse: version 0.3-5
    foreign: version 0.8-41
    gdata: version 2.8.1
    gtools: version 2.6.2
    gplots: version 2.8.0
    gWidgetsRGtk2: version 0.0-69
    Hmisc: version 3.8-3
    kernlab: version 0.9-12
    latticist: version 0.9-43
    Matrix: version 0.999375-46
    mice: version 2.4
    network: version 1.5-1
    nnet: version 7.3-1
    party: version 0.9-99991
    playwith: version 0.9-53
    randomForest: version 4.5-36 upgrade available 4.6-2
    rggobi: version 2.1.16
    survival: version 2.36-2
    XML: version 3.2-0
    bitops: version 1.0-4.1

    Upgrade the packages with:

    > install.packages(c("randomForest"))

    </code>

    Now upgrade whichever packages rattle.info() tells you to upgrade.
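With a long dependency list it is easy to miss the one line that says "upgrade available". As a rough sketch, assuming you have pasted the rattle.info() output into a shell variable or text file, awk can pull out just the package names that need attention. The three sample lines below are in the same format as the listing above.

```shell
# A few lines in the format printed by rattle.info().
info='RGtk2: version 2.20.3
randomForest: version 4.5-36 upgrade available 4.6-2
XML: version 3.2-0'

# Keep only lines that mention an available upgrade;
# the package name is the text before the first ": ".
outdated=$(printf '%s\n' "$info" | awk -F': ' '/upgrade available/ { print $1 }')

echo "Packages to upgrade: $outdated"
```

Each name it prints is one to pass to install.packages() inside R, exactly as rattle.info() suggests at the end of its output.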

    This is much simpler and less frustrating than some of the other ways to install Rattle.

    If all goes well, you will see this familiar screen pop up when you type

    > rattle()