decisionstats.com – DECISION STATS

RCOMM 2012 goes live in August

An awesome conference for an awesome piece of software. RapidMiner remains one of the leading enterprise-grade open source tools, and it can help you do a lot of things, including flow-driven data modeling, web mining, and web crawling, which even some other software can't do.

Presentations include:

Mining Machine 2 Machine Data (Katharina Morik, TU Dortmund University)
Handling Big Data (Andras Benczur, MTA SZTAKI)
Introduction of RapidAnalytics at Telenor (Telenor and United Consult)
and more

Here is the complete program:

Program

Time Slot	Tuesday Training / Workshop 1	Wednesday Conference 1	Thursday Conference 2	Friday Training / Workshop 2
09:00 – 10:30		Introductory Speech Ingo Mierswa (Rapid-I)Resource-aware Data Mining or M2M Mining (Invited Talk) Katharina Morik (TU Dortmund University) More information Data Analysis NeurophRM: Integration of the Neuroph framework into RapidMiner Miloš Jovanović, Jelena Stojanović, Milan Vukićević, Vera Stojanović, Boris Delibašić (University of Belgrade)	To be announced (Invited Talk) Andras Benczur Recommender Systems Extending RapidMiner with Recommender Systems Algorithms Matej Mihelčić, Nino Antulov-Fantulin, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute) Implementation of User Based Collaborative Filtering in RapidMiner Sérgio Morais, Carlos Soares (Universidade do Porto)	Parallel Training / Workshop Session Advanced Data Mining and Data Transformations or Development Workshop Part 2
10:30 – 11:00		Coffee Break	Coffee Break	Coffee Break
11:00 – 12:30		Data Analysis Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner Mennatallah Amer, Markus Goldstein (DFKI) Customers’ LifeStyle Targeting on Big Data using Rapid Miner Maksim Drobyshev (LifeStyle Marketing Ltd) Robust GPGPU Plugin Development for RapidMiner Andor Kovács, Zoltán Prekopcsák (Budapest University of Technology and Economics)	Extensions Optimization Plugin For RapidMiner Venkatesh Umaashankar, Sangkyun Lee (TU Dortmund University; presented by Hendrik Blom) Image Mining Extension – Year After Radim Burget, Václav Uher, Jan Mašek (Brno University of Technology) Incorporating R Plots into RapidMiner Reports Peter Jeszenszky (University of Debrecen)
12:30 – 13:30		Lunch	Lunch	Lunch
13:30 – 15:30	Parallel Training / Workshop Session Basic Data Mining and Data Transformations or Development Workshop Part 1	Applications Introduction of RapidAnalytics Enterprise Edition at Telenor Hungary t.b.a. (Telenor Hungary and United Consult) Application of RapidMiner in Steel Industry Research and Development Bengt-Henning Maas, Hakan Koc, Martin Bretschneider (Salzgitter Mannesmann Forschung) A Comparison of Data-driven Models for Forecast River Flow Milan Cisty, Juraj Bezak (Slovak University of Technology) Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner Gábor Nagy, Tamás Henk, Gergő Barta (Budapest University of Technology and Economics)	Extensions An Octave Extension for RapidMiner Sylvain Marié (Schneider Electric) Unstructured Data Processing Data Streams with the RapidMiner Streams-Plugin Christian Bockermann, Hendrik Blom (TU Dortmund) Automated Creation of Corpuses for the Needs of Sentiment Analysis Peter Koncz, Jan Paralic (Technical University of Kosice) Demonstration: News from the Rapid-I Labs Simon Fischer; Rapid-I This short session demonstrates the latest developments from the Rapid-I lab and will show you how you can build powerful analysis processes and routines by using those RapidMiner tools.	Certification Exam
15:30 – 16:00	Coffee Break	Coffee Break	Coffee Break
16:00 – 18:00		Book Presentation and Game Show Data Mining for the Masses: A New Textbook on Data Mining for Everyone Matthew North (Washington & Jefferson College) Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems. Game Show Did you miss last year’s game show “Who wants to be a data miner?”? Use RapidMiner for problems it was never created for and beat the time and other contestants!	User Support Get some Coffee for free – Writing Operators with RapidMiner Beans Christian Bockermann, Hendrik Blom (TU Dortmund) Meta-Modeling Execution Times of RapidMiner operators Matija Piškorec, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute) Conference day ends at ca. 17:00.
19:30		Social Event (Conference Dinner)	Social Event (Visit of Bar District)

For more details, have a look at https://rapid-i.com/rcomm2012f/index.php?option=com_content&view=article&id=65

The conference is in Budapest, Hungary, Europe.

(Disclaimer: Rapid Miner is an advertising sponsor of Decisionstats.com, in case you did not notice the two banner-sized ads.)

Decisionstats.com is back from a DDoS

The servers were okay; it was the DNS server that got swamped.
I am sorry for the downtime. Hopefully you didn't even notice.
I have faced challenges like domain name hijacking, SQL injection, and malicious WP plugins, and that's why I shifted to professional hosting. I stand by my vendors and their professional judgement; moving away would mean the hackers won.
Swamping the DNS provider was very clever. My compliments to the tech talent behind this.
You would think that every webmaster would have a backup plan in case the site came under a DDoS attack, but surprisingly even corporate websites don't have a backup (under-attack) plan.

Rapid Miner User Conference 2012

One of those cool conferences that is on my bucket list, this time in Hungary (that’s a nice place).

But I am especially interested in seeing how far Radoop has come along!

Disclaimer: Rapid Miner has been a Decisionstats.com sponsor for many years. It is also very cool software, but I like the R Extension facility even more!

—————————————————————

And it is not very expensive compared to other user conferences in Europe:

http://rcomm2012.org/index.php/registration/prices

Information about Registration

Early Bird registration until July 20th, 2012.
Normal registration from July 21st, 2012 until August 13th, 2012.
Latest registration from August 14th, 2012 until August 24th, 2012.
Students have to provide a valid Student ID during registration.
The Dinner is included in the All Days and in the Conference packages.
All prices below are net prices. Value added tax (VAT) has to be added if applicable.

Prices for Regular Visitors

Days and Event	Early Bird Rate	Normal Rate	Latest Registration
Tuesday (Training / Development 1)	190 Euro	230 Euro	280 Euro
Wednesday + Thursday (Conference)	290 Euro	350 Euro	420 Euro
Friday (Training / Development 2 and Exam)	190 Euro	230 Euro	280 Euro
All Days *(Full Package)*	610 Euro	740 Euro	900 Euro

Prices for Authors and Students

In case of students, please note that you will have to provide a valid student ID during registration.

Days and Event	Early Bird Rate	Normal Rate	Latest Registration
Tuesday (Training / Development 1)	90 Euro	110 Euro	140 Euro
Wednesday + Thursday (Conference)	140 Euro	170 Euro	210 Euro
Friday (Training / Development 2 and Exam)	90 Euro	110 Euro	140 Euro
All Days *(Full Package)*	290 Euro	350 Euro	450 Euro

http://rcomm2012.org/index.php/program

Program

Time Slot	Tuesday Training / Workshop 1	Wednesday Conference 1	Thursday Conference 2	Friday Training / Workshop 2
09:00 – 10:30		Introductory Speech Ingo Mierswa; Rapid-I Data Analysis NeurophRM: Integration of the Neuroph framework into RapidMiner Miloš Jovanović, Jelena Stojanović, Milan Vukićević, Vera Stojanović, Boris Delibašić (University of Belgrade)	To be announced (Invited Talk) To be announced Recommender Systems Extending RapidMiner with Recommender Systems Algorithms Matej Mihelčić, Nino Antulov-Fantulin, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute) Implementation of User Based Collaborative Filtering in RapidMiner Sérgio Morais, Carlos Soares (Universidade do Porto)	Parallel Training / Workshop Session Advanced Data Mining and Data Transformations or Development Workshop Part 2
10:30 – 12:30		Data Analysis Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner Mennatallah Amer, Markus Goldstein (DFKI) Customers’ LifeStyle Targeting on Big Data using Rapid Miner Maksim Drobyshev (LifeStyle Marketing Ltd) Robust GPGPU Plugin Development for RapidMiner Andor Kovács, Zoltán Prekopcsák (Budapest University of Technology and Economics)	Extensions Image Mining Extension – Year After Radim Burget, Václav Uher, Jan Mašek (Brno University of Technology) Incorporating R Plots into RapidMiner Reports Peter Jeszenszky (University of Debrecen) An Octave Extension for RapidMiner Sylvain Marié (Schneider Electric)
12:30 – 13:30		Lunch	Lunch	Lunch
13:30 – 15:00	Parallel Training / Workshop Session Basic Data Mining and Data Transformations or Development Workshop Part 1	Applications Application of RapidMiner in Steel Industry Research and Development Bengt-Henning Maas, Hakan Koc, Martin Bretschneider (Salzgitter Mannesmann Forschung) A Comparison of Data-driven Models for Forecast River Flow Milan Cisty, Juraj Bezak (Slovak University of Technology) Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner Gábor Nagy, Tamás Henk, Gergő Barta (Budapest University of Technology and Economics)	Unstructured Data Processing Data Streams with the RapidMiner Streams-Plugin Christian Bockermann, Hendrik Blom (TU Dortmund) Automated Creation of Corpuses for the Needs of Sentiment Analysis Peter Koncz, Jan Paralic (Technical University of Kosice) Demonstration News from the Rapid-I Labs Simon Fischer; Rapid-I This short session demonstrates the latest developments from the Rapid-I lab and will show you how you can build powerful analysis processes and routines by using those RapidMiner tools.	Certification Exam
15:00 – 17:00		Book Presentation and Game Show Data Mining for the Masses: A New Textbook on Data Mining for Everyone Matthew North (Washington & Jefferson College) Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems. Game Show Did you miss last year’s game show “Who wants to be a data miner?”? Use RapidMiner for problems it was never created for and beat the time and other contestants!	User Support Get some Coffee for free – Writing Operators with RapidMiner Beans Christian Bockermann, Hendrik Blom (TU Dortmund) Meta-Modeling Execution Times of RapidMiner operators Matija Piškorec, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute)
19:00		Social Event (Conference Dinner)	Social Event (Visit of Bar District)

Training: Basic Data Mining and Data Transformations

This is a short introductory training course for users who are not yet familiar with RapidMiner or have only a little experience with it so far. The topics of this training session include:

Basic Usage
- User Interface
- Creating and handling RapidMiner repositories
- Starting a new RapidMiner project
- Operators and processes
- Loading data from flat files
- Storing data, processes, and results
Predictive Models
- Linear Regression
- Naïve Bayes
- Decision Trees
Basic Data Transformations
- Changing names and roles
- Handling missing values
- Changing value types by discretization and dichotomization
- Normalization and standardization
- Filtering examples and attributes
Scoring and Model Evaluation
- Applying models
- Splitting data
- Evaluation methods
- Performance criteria
- Visualizing Model Performance

Training: Advanced Data Mining and Data Transformations

This is a short introductory training course for users who already know some basic concepts of RapidMiner and data mining and have already used the software before, for example in the first training on Tuesday. The topics of this training session include

Advanced Data Handling
- Sampling
- Balancing data
- Joins and Aggregations
- Detection and removal of outliers
- Dimensionality reduction
Control process execution
- Remember process results
- Recall process results
- Loops
- Using branches and conditions
- Exception handling
- Definition of macros
- Usage of macros
- Definition of log values
- Clearing log tables
- Transforming log tables to data

Development Workshop Part 1 and Part 2

Want to exchange ideas with the developers of RapidMiner? Or learn more tricks for developing your own operators and extensions? During our development workshops on Tuesday and Friday, we will form small groups of developers, each working on a small development project around RapidMiner. Beginners will get a comprehensive overview of the architecture of RapidMiner before taking the first steps and learning how to write their own operators. Advanced developers will form groups with our experienced developers, identify shortcomings of RapidMiner, and develop a new extension which might be presented during the conference already. Unfinished work can be continued in the second workshop on Friday before results are published on the Marketplace or taken home as a starting point for new custom operators.

Using Google Analytics with R

Here is some code to read in data from Google Analytics. My modifications include adding the SSL authentication code and changing the table.id parameter to choose the correct website from a GA profile with many websites.

The Google Analytics Package files can be downloaded from http://code.google.com/p/r-google-analytics/downloads/list

It provides access to Google Analytics data natively from the R statistical computing language. You can use this library to retrieve an R data.frame with Google Analytics data, and then perform advanced statistical analysis, like time series analysis and regressions.

Supported Features

Access to v2 of the Google Analytics Data Export API Data Feed
A QueryBuilder class to simplify creating API queries
API response is converted directly into R as a data.frame
Library returns the aggregates, and confidence intervals of the metrics, dynamically if they exist
Auto-pagination to return more than 10,000 rows of information by combining multiple data requests. (Upper Limit 1M rows)
Authorization through the ClientLogin routine
Access to all the profiles ids for the authorized user
Full documentation and unit tests

Code:

> library(XML)
> library(RCurl)
Loading required package: bitops
> #Change path name in the following to the folder you downloaded the Google Analytics Package
> source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/RGoogleAnalytics.R")
> source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/QueryBuilder.R")
> # download the file needed for authentication
> download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
trying URL 'http://curl.haxx.se/ca/cacert.pem'
Content type 'text/plain' length 215993 bytes (210 Kb)
opened URL
downloaded 210 Kb
> # set the curl options
> curl <- getCurlHandle()
> options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem",
+ package = "RCurl"),
+ ssl.verifypeer = FALSE))
> curlSetOpt(.opts = list(proxy = 'proxyserver:port'), curl = curl)
An object of class "CURLHandle"
Slot "ref":
<pointer: 0000000006AA2B70>
> # 1. Create a new Google Analytics API object
> ga <- RGoogleAnalytics()
> # 2. Authorize the object with your Google Analytics Account Credentials
> ga$SetCredentials("USERNAME", "PASSWORD")
> # 3. Get the list of different profiles, to help build the query
> profiles <- ga$GetProfileData()
> profiles #Error Check to See if we get the right website
$profile
        AccountName       ProfileName     TableId
1    dudeofdata.com    dudeofdata.com ga:44926237
2   knol.google.com   knol.google.com ga:45564890
3 decisionstats.com decisionstats.com ga:46751946

$total.results
  total.results
1             3

> # 4. Build the Data Export API query
> #Modify the start.date and end.date parameters based on data requirements
> #Modify the table.id at table.id = paste(profiles$profile[X,3]) to get the X th website in your profile
> query <- QueryBuilder()
> query$Init(start.date = "2012-01-09",
+ end.date = "2012-03-20",
+ dimensions = "ga:date",
+ metrics = "ga:visitors",
+ sort = "ga:date",
+ table.id = paste(profiles$profile[3,3]))
> #5. Make a request to get the data from the API
> ga.data <- ga$GetReportData(query)
[1] "Executing query: https://www.google.com/analytics/feeds/data?start-date=2012%2D01%2D09&end-date=2012%2D03%2D20&dimensions=ga%3Adate&metrics=ga%3Avisitors&sort=ga%3Adate&ids=ga%3A46751946"
> #6. Look at the returned data
> str(ga.data)
List of 3
 $ data         :'data.frame': 72 obs. of 2 variables:
  ..$ ga:date    : chr [1:72] "20120109" "20120110" "20120111" "20120112" ...
  ..$ ga:visitors: num [1:72] 394 405 381 390 323 47 169 67 94 89 ...
 $ aggr.totals  :'data.frame': 1 obs. of 1 variable:
  ..$ aggregate.totals: num 28348
 $ total.results: num 72
> head(ga.data$data)
   ga:date ga:visitors
1 20120109         394
2 20120110         405
3 20120111         381
4 20120112         390
5 20120113         323
6 20120114          47
> #Plotting the Traffic
> plot(ga.data$data[,2],type="l")
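
A possible next step, sketched here under assumptions: the ga:date column comes back as text in YYYYMMDD form, so the plot call above puts the row index on the x-axis rather than real dates. The snippet below rebuilds the six rows printed by head() as a stand-in data frame (the names traffic and day are mine, not part of the package), converts the text to Date values, and plots visitors against them. With real data you would start from ga.data$data instead of the stand-in.

```r
# Stand-in for ga.data$data: the six rows shown by head() in the session.
# check.names = FALSE keeps the "ga:" column names exactly as the API returns them.
traffic <- data.frame(
  "ga:date"     = c("20120109", "20120110", "20120111",
                    "20120112", "20120113", "20120114"),
  "ga:visitors" = c(394, 405, 381, 390, 323, 47),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Column names containing ":" need [[ ]] (or backticks) instead of plain $.
# Parse the YYYYMMDD text into Date objects (new column "day" is illustrative).
traffic$day <- as.Date(traffic[["ga:date"]], format = "%Y%m%d")

# Plot visitors against actual dates instead of the row index
plot(traffic$day, traffic[["ga:visitors"]], type = "l",
     xlab = "Date", ylab = "Visitors")
```

Once the dates are real Date values, functions such as weekdays() can be applied to traffic$day to look at weekday versus weekend traffic.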

Update: Some errors come from pasting LaTeX directly into WordPress. Here is the same code, formatted with Pretty R, in case you want to play with the GA API.

library(XML)

library(RCurl)

#Change path name in the following to the folder you downloaded the Google Analytics Package 

source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/RGoogleAnalytics.R")

source("C:/Users/KUs/Desktop/CANADA/R/RGoogleAnalytics/R/QueryBuilder.R")
# download the file needed for authentication
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")

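# set the curl options
curl <- getCurlHandle()
# Note: ssl.verifypeer = FALSE turns off SSL certificate verification
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem",
package = "RCurl"),
ssl.verifypeer = FALSE))
# Only needed behind a proxy: replace 'proxyserver:port' with your own proxy
curlSetOpt(.opts = list(proxy = 'proxyserver:port'), curl = curl)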

# 1. Create a new Google Analytics API object 

ga <- RGoogleAnalytics()

# 2. Authorize the object with your Google Analytics Account Credentials 

ga$SetCredentials("ohri2007@gmail.com", "XXXXXXX")

# 3. Get the list of different profiles, to help build the query

profiles <- ga$GetProfileData()

profiles #Error Check to See if we get the right website

# 4. Build the Data Export API query

#Modify the start.date and end.date parameters based on data requirements

#Modify the table.id at table.id = paste(profiles$profile[X,3]) to get the X th website in your profile
query <- QueryBuilder()
query$Init(start.date = "2012-01-09",
                   end.date = "2012-03-20",
                   dimensions = "ga:date",
                   metrics = "ga:visitors",
                   sort = "ga:date",
                   table.id = paste(profiles$profile[3,3]))

#5. Make a request to get the data from the API 

ga.data <- ga$GetReportData(query)

#6. Look at the returned data 

str(ga.data)

head(ga.data$data)

#Plotting the Traffic 

plot(ga.data$data[,2],type="l")
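
One variation on step 4, offered as a sketch rather than as part of the package: picking the website by row number (the 3 in profiles$profile[3,3]) breaks if the order of profiles changes, so you may prefer to look up the TableId by profile name. The stand-in list below copies the profile listing printed in the session above; the variable names site, is.site, and table.id are my own.

```r
# Stand-in for the result of ga$GetProfileData(), copied from the printed listing
profiles <- list(
  profile = data.frame(
    AccountName = c("dudeofdata.com", "knol.google.com", "decisionstats.com"),
    ProfileName = c("dudeofdata.com", "knol.google.com", "decisionstats.com"),
    TableId     = c("ga:44926237", "ga:45564890", "ga:46751946"),
    stringsAsFactors = FALSE
  ),
  total.results = data.frame(total.results = 3)
)

# Look up the table id by profile name instead of by row position
site <- "decisionstats.com"
is.site <- profiles$profile$ProfileName == site

# Stop early if the name is missing or ambiguous
stopifnot(sum(is.site) == 1)

table.id <- profiles$profile$TableId[is.site]
print(table.id)
```

The resulting table.id can then be passed to query$Init() in place of paste(profiles$profile[3,3]).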

Created by Pretty R at inside-R.org

Updated Blogging Policy for Decisionstats.com


I will be moving all cultural, philosophical, poetry, and political writing to separate blogs.

Decisionstats is for better TECHNICAL decisions by FASTER STATS (on technology).

For better political decisions (how to organize protests in Asia when the govt cuts off the internet) (separate culture blog),
better cultural decisions (which movie should we go to) (separate culture blog),
better poetry reading (separate Tumblr blog)

Thanks,

Ajay Ohri


Browsing update- Dear Decisionstats.com Reader

In view of the recent root-level breach of WordPress, which may include viewing source code for hidden hacks or Trojans, effective immediately, please note that Decisionstats.com takes no responsibility for any viruses or Trojans that you may inadvertently download while on this website. I will be responsible for any deliberate malicious honey traps I put up, but anybody posting an interesting comment with a link on this website can and may direct you to a phishing site.

All disputes will be subject to the jurisdiction of Tis Hazari Court, Delhi, India, as already mentioned.

Getting Flattr on a WordPress.com blog

What is Flattr?

Social micropayments, aka another way for bloggers, tweeters, and Facebookies to make money.

Think of it as PayPal plus a ReTweetmeme button.

Flattr is the new legal business of the creator of The Pirate Bay, the large search engine for BitTorrent data.

How to enable it on WordPress.com

Read some snarky, groovy instructions here, with a screenshot:

http://thereturnofthepublic.wordpress.com/2011/04/10/putting-flattr-on-a-wordpress-com-blog-a-guide-for-drooling-imbeciles/

1.) Open a Flattr.com account here. This should be reasonably straightforward. A monkey hitting keys at random could manage it in about half an hour. It took me less than 45 minutes.

2.) In the top right of ‘Your Flattr Dashboard’ there is a button ‘Submit Thing’. Click on that and enter the details of your blog: the URL (like decisionstats.com for me) and a description (make that at least 3 sentences). Flattr will create a page, for example https://flattr.com/thing/162940/example-blog

Now go to the Sharing tab of your WordPress dashboard:

/wp-admin/options-general.php?page=sharing

Add a new service there and enter the following values in the respective fields:

URL = https://flattr.com/thing/175763/DecisionStats (change this to the one created for you in step 2 above)

ICON = http://api.flattr.com/button/flattr-badge-large.png

If you have a non-WordPress blog, see the instructions at http://markup.io/v/jz3wv155bsfg
