An Introduction to Data Mining: an online book

I was reading David Smith’s blog http://blog.revolutionanalytics.com/

where he mentioned this interview of Norman Nie, at TDWI

http://tdwi.org/Articles/2010/11/17/R-101.aspx?Page=2

where I saw this link (it’s great if you want to study Data Mining, btw)

http://www.kdnuggets.com/education/usa-canada.html

and I c/liked the U Toronto link

http://chem-eng.utoronto.ca/~datamining/

Best of all, I really liked this online book created by Professor S. Sayad.

It’s succinct and beautiful, and describes all of the Data Mining you want to read in one map (actually 4 images painstakingly assembled to perfection).

The best thing is that in the original map even the sub-items are clickable. Specifics like Pie Chart and Stacked Column Chart are not lumped into one simple drop-down like Charts, but are arranged by the kind of variables that lead to these charts. To see that, you need to go to the site itself. Compare

http://chem-eng.utoronto.ca/~datamining/dmc/categorical_variables.htm

vs

http://chem-eng.utoronto.ca/~datamining/dmc/categorical_numerical.htm

Again, there is no mention of the data visualization software used to create the images, but I think I can take a hint from the Software page, which lists the software used:

Software

See it for yourself: online book (c) Professor S. Sayad

Really good DIY tutorial

http://chem-eng.utoronto.ca/~datamining/dmc/data_mining_map.htm

Rwui: Creating R Web Interfaces on the go

Here is a great R application created by http://sysbio.mrc-bsu.cam.ac.uk

Rwui, for creating R web interfaces.

It has been around for some time now, but presumably rApache is better known.

From-

http://sysbio.mrc-bsu.cam.ac.uk/Rwui/tutorial/Rwui_Rnews_final.pdf

The web application Rwui is used to create web interfaces for running R scripts. All the code is generated automatically so that a fully functional web interface for an R script can be downloaded and up and running in a matter of minutes.

Rwui is aimed at R script writers who have scripts that they want people unversed in R to use. The script writer uses Rwui to create a web application that will run their R script. Rwui allows the script writer to do this without them having to do any web application programming, because Rwui generates all the code for them.

The script writer designs the web application to run their R script by entering information on a sequence of web pages. The script writer then downloads the application they have created and installs it on their own server.

http://sysbio.mrc-bsu.cam.ac.uk/Rwui/tutorial/Technical_Report.pdf

Features of web applications created by Rwui

  1. Whole range of input items available if required – text boxes, checkboxes, file upload etc.
  2. Facility for uploading of an arbitrary number of files (for example, microarray replicates).
  3. Facility for grouping uploaded files (for example, into ‘Diseased’ and ‘Control’ microarray data files).
  4. Results files displayed on results page and available for download.
  5. Results files can be e-mailed to the user.
  6. Interactive results files using image maps.
  7. Repeat analyses with different parameters and data files – new results added to results list, as a link to the corresponding results page.
  8. Real time progress information (text or graphical) displayed when running the application.

Requirements

In order to use the completed web applications created by Rwui you will need:

  1. A Java webserver such as Tomcat version 5.5 or later.
  2. Java version 1.5
  3. R – a version compatible with your R script(s).

Using Rwui

Using Rwui to create a web application for an R script simply involves:

  1. Entering details about your R script on a sequence of web pages.
  2. Rwui is quite flexible so you can backtrack, edit and insert, as you design your application.
  3. Rwui then generates the web application, which is Java based and platform independent.
  4. The application can be downloaded either as a .zip or .tgz file.
  5. Unpacked, the download contains all the source code and a .war file.
  6. Once the .war file is copied to the Tomcat webapps directory, the application is ready to use.
  7. Application details are saved in an ‘application definition file’ for reuse and modification.
Interested? Go and create a new web app at http://sysbio.mrc-bsu.cam.ac.uk/Rwui/ in a matter of minutes.

American Decline: Why outsourcing doesn’t make sense

Here is a celebrated graphic from an American journalist using data from the U.S. Department of Labor’s Bureau of Labor Statistics. It is a good example of using time as a dimension for animation, and heat maps for geography-enabled visualizations.

————————–According to the U.S. Department of Labor’s Bureau of Labor Statistics, there are nearly 31 million people currently unemployed — that’s including those involuntarily working part time and those who want a job, but have given up on trying to find one. In the face of the worst economic upheaval since the Great Depression, millions of Americans are hurting. “The Decline: The Geography of a Recession,” as created by labor writer LaToya Egwuekwe, serves as a vivid representation of just how much. Watch the deteriorating transformation of the U.S. economy from January 2007 — approximately one year before the start of the recession — to the most recent unemployment data available today. Original link: http://www.latoyaegwuekwe.com/geographyofarecession.html. For more information, email latoya.egwuekwe@yahoo.com

————————————————————————————-

 

31 million unemployed. Does a US corporation seriously think that it can build everything OUTSIDE America and SELL INSIDE America? Or think it is okay for intellectual property to keep being stolen as long as labor is cheap?

Shame on you if you outsourced your neighbour’s jobs, or would rather hire in a geography where they steal your intellectual property.

 

This Christmastime, may the Ghost of Unemployed Family Christmases visit you in your sleep instead.

Gartner BI and Inf Mgmt Summit 2011- 30 min One on Ones

From the land Down Under, where Gartner gathers business summit thunder.

http://www.gartner.com/technology/summits/apac/business-intelligence/index.jsp

Gartner Business Intelligence & Information Management Summit 2011

February 22 – 23 • Sydney, AUSTRALIA
gartner.com/ap/bi

From Information to Intelligence:

Evaluate, Execute and Evolve

At Gartner Business Intelligence & Information Management Summit 2011 you will experience a unique mix of Gartner research presentations, guest keynote addresses, real-life case studies and interactive panel discussions to provide you with a holistic view of the business intelligence and performance management landscape. Information, insight and advice are channeled through an increasingly targeted and focused approach, taking you from the high-level strategic view all the way to your specific issue.

AGENDA HIGHLIGHTS

Guest Keynote Address

Future Thinking – Global Trends and Thinking that are Upending your Business

Anders Sorman-Nilsson
Creative Director, Thinque

Best Practice Workshops:

  • How to Become an Effective Data Warehouse Modeler
  • Analytics – Business Intelligence and Performance Management ITScore

Analyst User Roundtables:

  • Enterprise Information Management – Focusing on What Matters to the Business
  • Sharepoint – thin edge of the wedge to the MS family
  • Preparing for the 2020 workplace

Worldwide Expertise at Your Fingertips!
Your questions on Business Intelligence and Performance Management answered. Meet the Gartner Analysts presenting at the Summit and book your exclusive 30-minute one-on-one (laptop dance) with the Analysts of your choice.

AsterData partners with Tableau

Tableau, which has been making waves recently with its great new data visualization tool, announced a partnership with my old friends at AsterData. It’s a really cool piece of data vis and very, very fast on the desktop, so I can imagine what speed it can bring to AsterData’s MPP Row and Column Zingbang AND Parallel Analytical Functions.

Tableau and AsterData also share the common Stanfordian connection (but it seems software is divided quite equally between Stanford, Harvard dropouts and North Carolina).

It remains to be seen from this announcement how much each company can leverage the partnership, whether it turns out like the SAS Institute-AsterData partnership last year, or whether it is just to announce connectors that let their software talk to each other.

See a Tableau vis at

http://public.tableausoftware.com/views/geographyofdiabetes/Dashboard2?:embed=yes&:toolbar=yes

AsterData remains the guys with the potential, but I would be wrong to say MapReduceSQL is as hot in December 2010 as it was in June 2009, and the elephant in the room would be Hadoop. That, and Google’s continued shyness about cashing in on its principal competency of handling Big Data (but hush, I signed an NDA for the Google Prediction API, so things maaaay change very rapidly on, ahem, that cloud).

Disclaimer: AsterData was my internship sponsor during my winter training while at Univ of Tenn.

 

PAWCON Bay Area March

The biggest Predictive Analytics Conference comes back to the SF Bay in March next year.

From

http://www.predictiveanalyticsworld.com/sanfrancisco/2011/

Predictive Analytics World March 2011 in San Francisco is packed with the top predictive analytics experts, practitioners, authors and business thought leaders, including keynote speakers:


Sugato Basu, Ph.D.
Senior Research Scientist
Google
Lessons Learned in Predictive Modeling for Ad Targeting

Eric Siegel, Ph.D.
Conference Chair
Predictive Analytics World
Five Ways Predictive Analytics Cuts Enterprise Risk




Plus special plenary sessions from industry heavy-weights:


Andreas S. Weigend, Ph.D.
weigend.com
Former Chief Scientist, Amazon.com
The State of the Social Data Revolution

John F. Elder, Ph.D.
CEO and Founder
Elder Research
Data Mining Lessons Learned




Predictive Analytics World focuses on concrete examples of deployed predictive analytics. Hear from the horse’s mouth precisely how Fortune 500 analytics competitors and other top practitioners deploy predictive modeling, and what kind of business impact it delivers.

PAW SF 2011 will feature speakers with case studies from leading enterprises.

PAW’s March agenda covers hot topics and advanced methods such as uplift (net lift) modeling, ensemble models, social data, search marketing, crowdsourcing, blackbox trading, fraud detection, risk management, survey analysis and other innovative applications that benefit organizations in new and creative ways.

Join PAW and access the best keynotes, sessions, workshops, exposition, expert panel, live demos, networking coffee breaks, reception, birds-of-a-feather lunches, brand-name enterprise leaders, and industry heavyweights in the business.

 

Using SAS/IML with R

SAS just released updated documentation for the SAS/IML language, with a special chapter devoted to using R.

Here is an example-

CALL EXPORTMATRIXTOR( IMLMatrix, RMatrix ) ;

CALL IMPORTMATRIXFROMR( IMLMatrix, RExpr ) ;
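
To make those two calls a bit more concrete, here is a minimal round-trip sketch. It assumes R is installed on the same machine as SAS and that SAS was started with the RLANG system option. The names x, m, s and result are made up for illustration and do not come from the SAS documentation.

```sas
proc iml;
   /* a small SAS/IML matrix to send across (illustrative data) */
   x = {1 2, 3 4};

   /* copy the IML matrix x into an R matrix named m */
   call ExportMatrixToR(x, "m");

   /* statements between SUBMIT / R and ENDSUBMIT are passed to R */
   submit / R;
      s <- crossprod(m)
   endsubmit;

   /* evaluate the R expression "s" and copy its value into the IML matrix result */
   call ImportMatrixFromR(result, "s");
   print result;
quit;
```

In both calls the second argument is a character string: the name of the R matrix to create on the way out, and an R expression to evaluate on the way back.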

If you have existing SAS licences, existing hardware and lots of data, this may be the best of both worlds, without getting into the mess of technically learning MKL threads/BLAS/Premium Packages/Cloud.

Another thought: it’s a good, professional-looking help book, which is something more R packages could aim for (work on making their help easier to use and update their vignettes).

 

Link: http://support.sas.com/documentation/cdl/en/imlug/63541/HTML/default/viewer.htm#r_toc.htm

 

Calling Functions in the R Language
