A Software Called Rattle

One of my favorite software GUIs. Here is a paper about it, published in the R Journal, that describes Dr Graham Williams's work on it. If you are a software user or creator, it is worth a dekko for ideas on adding analytical extensions to your business platform.

News on R Commercial Development -Rattle- R Data Mining Tool

R RANT- While the European R Core leadership, led by the Great Dane Peter Dalgaard, focuses on the small picture and is virtually handing the whole commercial side to Prof Nie and David Smith at Revo Computing, other smaller package developers have refused to be treated as cheap R and D developers for enterprise software. How are the book sales coming along, Prof Peter? Any plans to write another R book, or are you done with writing your version of Mathematica (Ref-Newton)? Running the R Core project team must be so hard; I recommend the Tarantino movie “Inglorious B…” for Herr Doktors. -END

I believe that individual R package creators like Prof Harrell (Hmisc) or Hadley Wickham (plyr) deserve a share of the royalties or REVENUE earned by Revolution Computing, or ANY software company that uses R.

On this note, some updated news on Rattle, the data mining tool created by Dr Graham Williams. Once again R development is being taken ahead by the Down Under chaps while the Big Guys thrash out the road map across the Pond.

Data Mining Resources

Citation –http://datamining.togaware.com/

Rattle is a free and open source data mining toolkit written in the statistical language R using the Gnome graphical interface. It runs under GNU/Linux, Macintosh OS X, and MS/Windows. Rattle is being used in business, government, research and for teaching data mining in Australia and internationally. Rattle can be purchased on DVD (or made available as a downloadable CD image) as a standalone installation for $450USD ($560AUD).

The free and open source book, The Data Mining Desktop Survival Guide (ISBN 0-9757109-2-3), simply explains the otherwise complex algorithms and concepts of data mining, with examples to illustrate each algorithm using the statistical language R. The book is being written by Dr Graham Williams, based on his 20 years of research and consulting experience in machine learning and data mining. An electronic PDF version is available for a small fee from Togaware ($40AUD/$35USD to cover costs and ongoing development).

Other Resources

  • The Data Mining Software Repository makes available a collection of free (as in libre) open source software tools for data mining
  • The Data Mining Catalogue lists many of the free and commercial data mining tools that are available on the market.
  • The Australasian Data Mining Conferences are supported by Togaware, which also hosts the web site.
  • Information about the Pacific Asia Knowledge Discovery and Data Mining series of conferences is also available.
  • A Data Mining course is taught at the Australian National University.
  • See also the Canberra Analytics Practise Group.
  • A Data Mining Course was held at the Harbin Institute of Technology Shenzhen Graduate School, China, 6 December – 13 December 2006. This course introduced the basic concepts and algorithms of data mining from an applications point of view and introduced the use of R and Rattle for data mining in practice.
  • A Data Mining Workshop was held over two days at the University of Canberra, 27-28 November, 2006. This course introduced the basic concepts and algorithms for data mining and the use of R and Rattle.

Using R for Data Mining

The open source statistical programming language R (based on S) is in daily use in academia and in business and government. We use R for data mining within the Australian Taxation Office. Rattle is used by those wishing to interact with R through a GUI.

R is memory based, so on 32-bit CPUs you are limited to smaller datasets (perhaps 50,000 up to 100,000 rows, depending on what you are doing). Deploying R on 64-bit multiple CPU (AMD64) servers running GNU/Linux with 32GB of main memory provides a powerful platform for data mining.
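To get a feel for what "memory based" means in practice, here is a minimal sketch in base R that estimates and then measures the memory footprint of a data frame. The row and column counts are illustrative assumptions, not figures from Togaware, and real modelling work typically needs several times the raw data size because functions make working copies.

```r
# Rough memory estimate for an in-memory data frame (illustrative sizes only).
n_rows <- 100000   # assumed row count, upper end of the range quoted above
n_cols <- 20       # assumed number of numeric columns

# Each double takes 8 bytes, so the raw data needs about n_rows * n_cols * 8 bytes.
estimated_bytes <- n_rows * n_cols * 8

# Build a data frame of that shape and measure what R actually allocates.
df <- as.data.frame(matrix(rnorm(n_rows * n_cols), nrow = n_rows))
actual_bytes <- as.numeric(object.size(df))

cat("Estimated:", round(estimated_bytes / 1024^2, 1), "MB\n")
cat("Measured: ", round(actual_bytes / 1024^2, 1), "MB\n")
```

The measured figure comes out slightly above the estimate because each column carries a small header and the data frame stores names and attributes.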

R is open source, thus providing assurance that there will always be the opportunity to fix and tune things that suit our specific needs, rather than rely on having to convince a vendor to fix or tune their product to suit our needs.

Also, by being open source, we can be sure that the code will always be available, unlike some of the data mining products that have disappeared (e.g., IBM’s Intelligent Miner).

See the earlier interview with Dr Graham Williams: http://www.decisionstats.com/2009/01/13/interview-dr-graham-williams/


Webfocus RStat: Pervasive BI using R

Here is a great reporting and BI tool from Information Builders that uses the Rattle R GUI (covered earlier here: http://www.decisionstats.com/2009/01/13/interview-dr-graham-williams/).

So if you are looking for a next-generation reporting solution, here is one called WebFOCUS RStat.



Predict the Future and Make Effective Decisions Today

Traditional reporting solutions provide a clear picture of past occurrences, but have little power to shed light on the future. The ability to anticipate and prepare for upcoming events can greatly impact the decisions that need to be made today.

WebFOCUS RStat is the market’s first fully-integrated business intelligence and data mining environment, seamlessly bridging the gap between backward and forward-facing views of business operations. With WebFOCUS RStat, companies can easily and cost-effectively deploy predictive models as intuitive scoring applications. So business users at all levels can make decisions based on accurate, validated future predictions, instead of relying on gut instinct alone.

WebFOCUS RStat provides a single platform for BI, data modeling, and scoring. This eliminates the need to purchase and maintain multiple tools, and frees analysts and other statisticians from spending countless hours extracting and querying data. At the same time, it reduces costs, simplifies maintenance, and optimizes IT resources.

But, the greatest benefit WebFOCUS RStat offers is significantly increased accuracy. With the R engine – a powerful and flexible open source statistical programming language – as its underlying analysis tool, WebFOCUS RStat can deliver results that are consistent, complete, and correct – every time.

WebFOCUS RStat provides:

  • A single tool, fully integrated with Developer Studio and WebFOCUS Reporting Servers with access to over 300 data sources, for both BI developers and data miners
  • Comprehensive data exploration, descriptive statistics, and interactive graphs
  • In-depth data visualization and transformation
  • Hypothesis testing, clustering, and correlation analysis

Other key WebFOCUS RStat features include:

  • The ability to build and export models for prediction and classification
  • Comprehensive model evaluation
  • Rapid application creation through easy incorporation of scoring routines into WebFOCUS reports

Incidentally, the parent company, which is based in New York, has some interesting numbers:


Company At A Glance
  • $300 million in revenue
  • Over 30 years of experience
  • More than 1,400 employees
  • Over 12,000 customers
  • Over 350 business partners
  • 47 offices and 26 worldwide distributors




Decisionstats Interviews

Here is a list of interviews that I have published. These are specific to analytics and data mining and include only the most recent interviews. If I have missed out any notable recent interview related to analytics and data mining, kindly do let me know. Hat tip to Karl Rexer for this suggestion.

Date     Name of Interviewee         Designation and Organization

09-Jun   Karl Rexer                  President, Rexer Analytics
05-Jun   Jim Davis                   CMO, SAS Institute
04-Jun   Paul van Eikeren            President and CEO, Blue Reference
29-May   David Smith                 Director of Community, REvolution Computing
17-May   Dominic Pouzin              CEO, Data Applied
11-May   Bruno Delahaye              VP, KXEN
04-May   Ron Ramos                   Director, Zementis
30-Apr   Olivier Jouve               VP, SPSS Inc
21-Apr   Fabian Dill                 Co-Founder, Knime.com
18-Apr   Alicia McGreevey            Head Marketing, Visual Numerics
27-Mar   Francoise Soulie Fogelman   VP, KXEN
17-Mar   Jon Peck                    Principal Software Engineer, SPSS Inc
06-Mar   Anne Milley                 Director of product marketing, SAS Institute
04-Mar   Anne Milley                 Director of product marketing, SAS Institute
03-Feb   Phil Rack                   Creator, Bridge to R, and CEO, Minequest
03-Feb   Michael Zeller              CEO, Zementis
31-Jan   Richard Schultz             CEO, Revolution Computing
21-Jan   Bob Muenchen                Author, R for SAS and SPSS Users
13-Jan   Dr Graham Williams          Creator, Rattle GUI for R
05-Jan   Roger Haddad                CEO, KXEN
26-Sep   June Dershewitz             VP, Semphonic
04-Sep   Vincent Granville           Head, Analyticbridge

The URLs to specific interviews are also in this sheet.


Fast R Graphics

So you don’t know R because you were always working on office projects and did not have time to learn. The R list looked down on you and told you to read the documentation first. And now you need to create some fast R graphics and some R code.

Help is here-

Download R from http://www.r-project.org and install it.

Open it and go to Packages > Set CRAN mirror, then pick your country from the drop-down.

Type the following in the R GUI at the '>' prompt:

install.packages("rattle", dependencies=TRUE)

so it should look like

> install.packages("rattle", dependencies=TRUE)

Wait about 15 minutes while the downloads happen.

Then go to Packages > Load package > rattle.

Type rattle() at the command prompt.

Now, in the new window called Rattle:

Load data from a .csv file using the Browse option.

Click Execute.

Go straight to Explore and click on Distributions.

Note you can also download Rattle from www.rattle.togaware.com. These guys are the best.
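If you prefer typing to menus, the install-and-launch steps can be sketched as a short script. This is a minimal sketch, not an official installer: the helper names ensure_package and start_rattle are made up for illustration, and the mirror URL https://cloud.r-project.org is an assumption standing in for whichever CRAN mirror you would pick from the menu.

```r
# Install a package only if it is missing; returns TRUE when it is available.
# ensure_package() is a hypothetical helper, not part of R or Rattle.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE, repos = repos)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Load rattle and open its GUI window (same as typing rattle() at the prompt).
start_rattle <- function() {
  if (ensure_package("rattle")) {
    library(rattle)
    rattle()
  }
}

# Only try to open the GUI when R is running interactively.
if (interactive()) start_rattle()
```

The interactive() guard keeps the script from trying to open a window when it is run in batch mode, for example through Rscript.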

Here are the graphs


But what about the code (note some variable names disguised)? The code may be intimidating to a novice R user, but it is auto-generated. It's like jumping straight to SAS Enterprise without learning SAS Editor.

Go to the last tab, Log, and see the auto-generated code.
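The Log tab's output is not reproduced here. As a rough stand-in, this base R sketch draws similar distribution plots (a histogram and a box plot per numeric variable). It is not the code Rattle generates; the column names income and age, the simulated values, and the helper plot_distributions are all assumptions for illustration.

```r
# Stand-in data; for a real file use something like read.csv("your_file.csv").
set.seed(42)
dataset <- data.frame(
  income = rlnorm(500, meanlog = 10, sdlog = 0.5),  # skewed, like many money fields
  age    = round(rnorm(500, mean = 40, sd = 12))
)

# Keep only the numeric columns, since these plots need numbers.
numeric_cols <- names(dataset)[sapply(dataset, is.numeric)]

# Draw a histogram and a box plot side by side for each chosen column.
plot_distributions <- function(data, cols) {
  old_par <- par(mfrow = c(length(cols), 2))  # one row of plots per variable
  on.exit(par(old_par))                       # restore the layout afterwards
  for (col in cols) {
    hist(data[[col]], main = paste("Histogram of", col), xlab = col)
    boxplot(data[[col]], main = paste("Boxplot of", col), horizontal = TRUE)
  }
  invisible(cols)
}

plot_distributions(dataset, numeric_cols)
```

Once this makes sense, the generated code in the Log tab should look less intimidating, since it follows the same pattern of loading data, picking variables, and calling plotting functions.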

