Rapid Miner User Conference 2012

One of those cool conferences that is on my bucket list- this time in Hungary (That’s a nice place)

But I am especially interested in seeing how far Radoop has come along !

Disclaimer- Rapid Miner has been a Decisionstats.com sponsor  for many years. It is also a very cool software but I like the R Extension facility even more!

—————————————————————

and not very expensive too compared to other User Conferences in Europe!-

http://rcomm2012.org/index.php/registration/prices

Information about Registration

  • Early Bird registration until July 20th, 2012.
  • Normal registration from July 21st, 2012 until August 13th, 2012.
  • Latest registration from August 14th, 2012 until August 24th, 2012.
  • Students have to provide a valid Student ID during registration.
  • The Dinner is included in the All Days and in the Conference packages.
  • All prices below are net prices. Value added tax (VAT) has to be added if applicable.

Prices for Regular Visitors

Days and Event
Early Bird Rate
Normal Rate
Latest Registration
Tuesday

(Training / Development 1)

190 Euro 230 Euro 280 Euro
Wednesday + Thursday

(Conference)

290 Euro 350 Euro 420 Euro
Friday

(Training / Development 2 and Exam)

190 Euro 230 Euro 280 Euro
All Days

(Full Package)

610 Euro 740 Euro 900 Euro

Prices for Authors and Students

In case of students, please note that you will have to provide a valid student ID during registration.

Days and Event
Early Bird Rate
Normal Rate
Latest Registration
Tuesday

(Training / Development 1)

90 Euro 110 Euro 140 Euro
Wednesday + Thursday

(Conference)

140 Euro 170 Euro 210 Euro
Friday

(Training / Development 2 and Exam)

90 Euro 110 Euro 140 Euro
All Days

(Full Package)

290 Euro 350 Euro 450 Euro
Time
Slot
Tuesday
Training / Workshop 1
Wednesday
Conference 1
Thursday
Conference 2
Friday
Training / Workshop 2
09:00 – 10:30
Introductory Speech
Ingo Mierswa; Rapid-I 

Data Analysis

 

NeurophRM: Integration of the Neuroph framework into RapidMiner
Miloš Jovanović, Jelena Stojanović, Milan Vukićević, Vera Stojanović, Boris Delibašić (University of Belgrade)

To be announced (Invited Talk)
To be announced

 

Recommender Systems

 

Extending RapidMiner with Recommender Systems Algorithms
Matej Mihelčić, Nino Antulov-Fantulin, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute)

Implementation of User Based Collaborative Filtering in RapidMiner
Sérgio Morais, Carlos Soares (Universidade do Porto)

Parallel Training / Workshop Session

Advanced Data Mining and Data Transformations

or

Development Workshop Part 2

10:30 – 12:30
Data Analysis

Nearest-Neighbor and Clustering based Anomaly Detection Algorithms for RapidMiner
Mennatallah Amer, Markus Goldstein (DFKI)

Customers’ LifeStyle Targeting on Big Data using Rapid Miner
Maksim Drobyshev (LifeStyle Marketing Ltd)

Robust GPGPU Plugin Development for RapidMiner
Andor Kovács, Zoltán Prekopcsák (Budapest University of Technology and Economics)

Extensions

Image Mining Extension – Year After
Radim Burget, Václav Uher, Jan Mašek (Brno University of Technology)

Incorporating R Plots into RapidMiner Reports
Peter Jeszenszky (University of Debrecen)

An Octave Extension for RapidMiner
Sylvain Marié (Schneider Electric)

12:30 – 13:30
Lunch
Lunch
Lunch
13:30 – 15:00
Parallel Training / Workshop Session

Basic Data Mining and Data Transformations

or

Development Workshop Part 1

Applications

Application of RapidMiner in Steel Industry Research and Development
Bengt-Henning Maas, Hakan Koc, Martin Bretschneider (Salzgitter Mannesmann Forschung)

A Comparison of Data-driven Models for Forecast River Flow
Milan Cisty, Juraj Bezak (Slovak University of Technology)

Portfolio Optimization Using Local Linear Regression Ensembles in Rapid Miner
Gábor Nagy, Tamás Henk, Gergő Barta (Budapest University of Technology and Economics)

Unstructured Data


Processing Data Streams with the RapidMiner Streams-Plugin
Christian Bockermann, Hendrik Blom (TU Dortmund)

Automated Creation of Corpuses for the Needs of Sentiment Analysis
Peter Koncz, Jan Paralic (Technical University of Kosice)

 

Demonstration

 

News from the Rapid-I Labs
Simon Fischer; Rapid-I

This short session demonstrates the latest developments from the Rapid-I lab and will let you how you can build powerful analysis processes and routines by using those RapidMiner tools.

Certification Exam
15:00 – 17:00
Book Presentation and Game Show

Data Mining for the Masses: A New Textbook on Data Mining for Everyone
Matthew North (Washington & Jefferson College)

Matthew North presents his new book “Data Mining for the Masses” introducing data mining to a broader audience and making use of RapidMiner for practical data mining problems.

 

Game Show
Did you miss last years’ game show “Who wants to be a data miner?”? Use RapidMiner for problems it was never created for and beat the time and other contestants!

User Support

Get some Coffee for free – Writing Operators with RapidMiner Beans
Christian Bockermann, Hendrik Blom (TU Dortmund)

Meta-Modeling Execution Times of RapidMiner operators
Matija Piškorec, Matko Bošnjak, Tomislav Šmuc (Ruđer Bošković Institute) 

19:00
Social Event (Conference Dinner)
Social Event (Visit of Bar District)

 

Training: Basic Data Mining and Data Transformations

This is a short introductory training course for users who are not yet familiar with RapidMiner or only have a few experiences with RapidMiner so far. The topics of this training session include

  • Basic Usage
    • User Interface
    • Creating and handling RapidMiner repositories
    • Starting a new RapidMiner project
    • Operators and processes
    • Loading data from flat files
    • Storing data, processes, and results
  • Predictive Models
    • Linear Regression
    • Naïve Bayes
    • Decision Trees
  • Basic Data Transformations
    • Changing names and roles
    • Handling missing values
    • Changing value types by discretization and dichotimization
    • Normalization and standardization
    • Filtering examples and attributes
  • Scoring and Model Evaluation
    • Applying models
    • Splitting data
    • Evaluation methods
    • Performance criteria
    • Visualizing Model Performance

 

Training: Advanced Data Mining and Data Transformations

This is a short introductory training course for users who already know some basic concepts of RapidMiner and data mining and have already used the software before, for example in the first training on Tuesday. The topics of this training session include

  • Advanced Data Handling
    • Sampling
    • Balancing data
    • Joins and Aggregations
    • Detection and removal of outliers
    • Dimensionality reduction
  • Control process execution
    • Remember process results
    • Recall process results
    • Loops
    • Using branches and conditions
    • Exception handling
    • Definition of macros
    • Usage of macros
    • Definition of log values
    • Clearing log tables
    • Transforming log tables to data

 

Development Workshop Part 1 and Part 2

Want to exchange ideas with the developers of RapidMiner? Or learn more tricks for developing own operators and extensions? During our development workshops on Tuesday and Friday, we will build small groups of developers each working on a small development project around RapidMiner. Beginners will get a comprehensive overview of the architecture of RapidMiner before making the first steps and learn how to write own operators. Advanced developers will form groups with our experienced developers, identify shortcomings of RapidMiner and develop a new extension which might be presented during the conference already. Unfinished work can be continued in the second workshop on Friday before results might be published on the Marketplace or can be taken home as a starting point for new custom operators.

Radoop 0.3 launched- Open Source Graphical Analytics meets Big Data

What is Radoop? Quite possibly an exciting mix of analytics and big data computing

 

http://blog.radoop.eu/?p=12

What is Radoop?

Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but its data size is limited by the memory available, and a single machine is often not enough to run the analyses on time. In this project, we combine the strengths of both projects and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.

We have closely integrated the highly optimized data analytics capabilities of Hive and Mahout, and the user-friendly interface of RapidMiner to form a powerful and easy-to-use data analytics solution for Hadoop.

 

and what’s new

http://blog.radoop.eu/?p=198

Radoop 0.3 released – fully graphical big data analytics

Today, Radoop had a major step forward with its 0.3 release. The new version of the visual big data analytics package adds full support for all major Hadoop distributions used these days: Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution including Apache Hadoop 3 (CDH3). It also adds support for large clusters by allowing the namenode, the jobtracker and the Hive server to reside on different nodes.

As Radoop’s promise is to make big data analytics easier, the 0.3 release is also focused on improving the user interface. It has an enhanced breakpointing system which allows to investigate intermediate results, and it adds dozens of quick fixes, so common process design mistakes get much easier to solve.

There are many further improvements and fixes, so please consult the release notes for more details. Radoop is in private beta mode, but heading towards a public release in Q2 2012. If you would like to get early access, then please apply at the signup page or describe your use case in email (beta at radoop.eu).

Radoop 0.3 (15 February 2012)

  • Support for Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution Including Apache Hadoop 3 (CDH3) in a single release
  • Support for clusters with separate master nodes (namenode, jobtracker, Hive server)
  • Enhanced breakpointing to evaluate intermediate results
  • Dozens of quick fixes for the most common process design errors
  • Improved process design and error reporting
  • New welcome perspective to help in the first steps
  • Many bugfixes and performance improvements

Radoop 0.2.2 (6 December 2011)

  • More Aggregate functions and distinct option
  • Generate ID operator for convenience
  • Numerous bug fixes and improvements
  • Improved user interface

Radoop 0.2.1 (16 September 2011)

  • Set Role and Data Multiplier operators
  • Management panel for testing Hadoop connections
  • Stability improvements for Hive access
  • Further small bugfixes and improvements

Radoop 0.2 (26 July 2011)

  • Three new algoritms: Fuzzy K-Means, Canopy, and Dirichlet clustering
  • Three new data preprocessing operators: Normalize, Replace, and Replace Missing Values
  • Significant speed improvements in data transmission and interactive analytics
  • Increased stability and speedup for K-Means
  • More flexible settings for Join operations
  • More meaningful error messages
  • Other small bugfixes and improvements

Radoop 0.1 (14 June 2011)

Initial release with 26 operators for data transmission, data preprocessing, and one clustering algorithm.

Note that Rapid Miner also has a great R extension so you can use R, a graphical interface and big data analytics is now easier and more powerful than ever.


Webinar: Using R within Oracle #rstats

Webinar: Using R within Oracle — Nov 30, noon EST

==========================================
Oracle now supports the R open source statistical programming language. Come to this webinar to learn more about using R within an Oracle environment.

— URL for TechCast: https://stbeehive.oracle.com/bconf/confDetails?confID=334B:3BF0:owch:38893C00F42F38A1E0404498C8A6612B0004075AECF7&guest=true&confKey=608880
— Web Conference ID: 303397
— Web Conference Key: 608880
— Dialup:             1-866-682-4770      , ID 5548204, passcode 1234

After a steady rise in the past few years, in 2010 the open source data mining software R overtook other tools to become the tool used by more data miners (43%) than any other (http://www.rexeranalytics.com/Data-Miner-Survey-Results-2010.html).

Several analytic tool vendors have added R-integration to their software. However, Oracle is the largest company to throw their weight behind R. On October 3, Oracle unveiled their integration of R: Oracle R Enterprise (http://www.oracle.com/us/corporate/features/features-oracle-r-enterprise-498732.html) as part of their Oracle Big Data Appliance announcement (http://www.oracle.com/us/corporate/press/512001).

Oracle R Enterprise allows users to perform statistical analysis with advanced visualization on data stored in Oracle Database. Oracle R Enterprise enables scalable R solutions, while facilitating production deployment of R scripts and Hadoop based solutions, as well as integration of R results with Oracle BI Publisher and OBIEE dashboards.

This TechCast introduces the various Oracle R Enterprise components and features, along with R script demonstrations that interface with Oracle Database.

TechCast presenter: Mark Hornick, Senior Manager, Oracle Advanced Analytics Development.
This TechCast is part of the ongoing TechCasts series coordinated by Oracle BIWA: The BI, Warehousing and Analytics SIG (http://www.oracleBIWA.org).

Statistics on Social Media

Some official statistics on social media from the owners themselves

1) Facebook-

http://www.facebook.com/press/info.php?statistics

Date -17 Nov 2011

Statistics

People on Facebook

Creating Pages on Google Plus for some languages

So I decided to create Pages on Google Plus for my favorite programming languages.

a programming language that lets you work more quickly and integrate your systems more effectively

Add to circles

  –  Comment  –  Share

Ajay Ohri

Ajay Ohri's profile photo

Ajay Ohri  –
Ajay Ohri shared a Google+ page with you.
Structured Query Language
Leading statistical language since 1960’s especially in sociology and market research
The leading statistical language in the world
The leading statistical language since 1970’s

Python

https://plus.google.com/107930407101060924456/posts

 

These are in accordance with Google’s Policies http://www.google.com/intl/en/+/policy/pagesterm.html  Continue reading “Creating Pages on Google Plus for some languages”

Contest : 2 free passes to Predictive Analytics World

I got some good news from the fine people at Predictive Analytics World.

 you qualify for 2 free passes to the PAW NYC event October 16-20, 2011.  I will be sending you a code to use for registration to receive these passes within the next couple of days.

If you cannot attend our PAW NYC event, please feel free to use these two free passes as a promotional tool within your blog.

Now I have been partnering with PAW for a long time, so it is nice to have free passes. I am grateful for their support of this blog. Therein lies my dilemma. I am in India, and a return ticket from NYC to India costs 1100$. Unless something drastic happens , I dont see myself with that kind of travel money.

Ergo.

I am offering two free passes to Predictive Analytics World . http://predictiveanalyticsworld.com/

All you need to do is – ahem- cough-

  1. like the Facebook Page of Decisionstats.  https://www.facebook.com/pages/Decisionstats/217450141605435 OR
  2.  Add me to a Google circle https://plus.google.com/116302364907696741272/posts OR
  3. Follow me on Twitter https://twitter.com/#!/0_h_r_1

AND


  1. Read one of my poems at my poetry blog at http://poemsforkush.wordpress.com/ and leave a comment with your email id please . It’s a promotion for my next book “Poets and Hackers” due for release in 2 weeks.
The 2 free passes are for any 2 days of the PAW NYC event.  This free pass may not be used for Text Analytics World conference being held the same week.  Please have your Contest winners use the Free Code:  XXXXXXXX.  This code will be good for two uses in registering. 
Thats ‘it. Two free passes , and go for it if you are around NYC in October. NY is a lovely place and I am wearing my red FDNY T shirt as I am typing this.

What do you get?

One of these –http://www.predictiveanalyticsworld.com/newyork/register.php (details awaited!) to

http://www.predictiveanalyticsworld.com/newyork/2011/

Predictive Analytics World Header Image

 

 

Web Analytics Certifications by Google

Google has a whole list of certifications for people wanting to be certified in analytics, and advertising related to internet.

Continue reading “Web Analytics Certifications by Google”