Visual Guides to CRISP-DM ,KDD and SEMMA

UPDATED- Here are three great examples of a visualization making a process easy to understand. Please click on the images to read them clearly.

1) It visualizes CRISP-DM and is made by Nicole Leaper (


2) KDD -Knowledge Discovery in Databases -visualization by Fayyad whom I have interviewed here at

and work By Gregory Piatetsky Shapiro interviewed by this website here


3) I am also attaching a visual representation of SEMMA from



Knowledge Discovery in Databases -KDD using PostgreSQL and #Rstats

Here is a small brief primer for beginners on configuring an open source database and using an open source analytics package.

All you need to know – is to read!


1. download PostgreSQL from PostgreSQL

Remember to store /memorize the password for the user postgres!

Create a connection using pgAdmin feature in Start Menu

2. download ODBC driver from
and the Win 64 edition from

install ODBC driver

3. Go to

Start Menu\Control Panel\All Control Panel Items\Administrative Tools\Data Sources (ODBC)

4. Configure the following details in System DSN and  User DSN using the ADD tabs .Test connection to check if connection is working

5. Start R and install and load library RODBC

6. Use following initial code for R- if you know SQL you can  do the rest
> library(RODBC)

> odbcDataSources(type = c(“all”, “user”, “system”))
SQLServer              PostgreSQL30             PostgreSQL35W
“SQL Server”    “PostgreSQL ANSI(x64)” “PostgreSQL Unicode(x64)”

> ajay=odbcConnect(“PostgreSQL30″, uid = “postgres”, pwd = “XX”)

> sqlTables(ajay)
1        postgres      public      names      TABLE

> crimedat <- sqlFetch(ajay, “names”)