Workflows and MyExperiment.org

Here is a great website for sharing workflows: myExperiment.org. It can host workflows from many different software packages.

myExperiment currently has 4742 members, 270 groups, 1842 workflows, 423 files and 173 packs.

Could it also include workflows from Red-R (from #rstats) or Enterprise Miner?

Red-R is a data-flow-based GUI for R, available at

http://red-r.org/

Red-R: An open source visual programming GUI interface for R

Red-R is an open source visual programming interface for R designed to bring the power of the R statistical environment to a broader audience. The goal of this project is to provide access to the massive library of packages in R (and even non-R packages) without any programming expertise. The Red-R framework uses concepts of data-flow programming to make data the center of attention while hiding all the programming complexity.

 

What's the difference between a workflow and a data flow? Probably the GUI interface.

For good workflows in R, I see this post: http://blog.revolutionanalytics.com/2010/10/a-workflow-for-r.html

According to it, the best workflows achieve the following goals:

  • Transparency
  • Maintainability
  • Modularity
  • Portability
  • Reproducibility
  • Efficiency

And from http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing:

I generally break my projects into 4 pieces:

  1. load.R
  2. clean.R
  3. func.R
  4. do.R

load.R: Takes care of loading in all the data required. Typically this is a short file, reading in data from files, URLs and/or ODBC. Depending on the project at this point I’ll either write out the workspace using save() or just keep things in memory for the next step.

clean.R: This is where all the ugly stuff lives – taking care of missing values, merging data frames, handling outliers.

func.R: Contains all of the functions needed to perform the actual analysis. source()’ing this file should have no side effects other than loading up the function definitions. This means that you can modify this file and reload it without having to go back and repeat steps 1 & 2, which can take a long time to run for large data sets.

do.R: Calls the functions defined in func.R to perform the analysis and produce charts and tables.

The main motivation for this set up is for working with large data whereby you don’t want to have to reload the data each time you make a change to a subsequent step. Also, keeping my code compartmentalized like this means I can come back to a long forgotten project and quickly read load.R and work out what data I need to update, and then look at do.R to work out what analysis was performed.
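As a rough illustration of the four-stage split described above, here is a minimal sketch written in Python rather than R. The function names, the pickle cache file and the column name "x" are all assumptions made for this example; they are not part of the quoted answer. The point it shows is that the expensive load step runs once and later stages reuse its cached output.

```python
import pickle
from pathlib import Path


def load(cache: Path, reader):
    """Stage 1 (like load.R): read raw data once, then reuse the cached copy."""
    if cache.exists():
        with cache.open("rb") as fh:
            return pickle.load(fh)
    data = reader()  # hypothetical, possibly slow, data source
    with cache.open("wb") as fh:
        pickle.dump(data, fh)
    return data


def clean(rows):
    """Stage 2 (like clean.R): drop records that contain missing values."""
    return [r for r in rows if None not in r.values()]


def mean_of(rows, column):
    """Stage 3 (like func.R): an analysis function with no side effects."""
    values = [r[column] for r in rows]
    return sum(values) / len(values)


def run(cache: Path, reader):
    """Stage 4 (like do.R): call the stages in order and return the result."""
    return mean_of(clean(load(cache, reader)), "x")
```

Because only `load` touches the data source, the analysis functions can be edited and re-run without repeating the slow first step.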

 

And yes, there is a package if you want a workflow without a GUI:

https://github.com/johnmyleswhite/ProjectTemplate#readme

Introduction

The ProjectTemplate package lets you automatically build a directory for a new R project with a standardized subdirectory structure. Using this structure, ProjectTemplate automates data and package loading. The hope is that standardized data loading, automatic importing of best practice packages, integrated unit testing and useful nudges towards keeping a cleanly organized codebase will improve the quality of R coding.

 

What kinds of workflows exist on myExperiment.org?

The listing can be filtered by type and by licence.

 

Speaking of which, an interesting workflow from SAS would be using SAS EM on a network/grid/cloud/time-sharing computer.

 

So, just checking the RapidMiner workflows at

http://www.myexperiment.org/workflows?filter=TYPE_ID%28%2262%22%29

 


Workflow Image Mining with RapidMiner (v1) 

Created: 28/04/10 @ 11:00:37 | Last updated: 28/04/10 @ 11:01:04

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License


 

This is an image mining process using the image mining Web service provided by NHRF within e-Lico. It first uploads a set of images found in a directory, then preprocesses the images and visualizes the result. Furthermore, references to the uploaded images are stored in the local RapidMiner repository so they can later be used for further processing without uploading images a second time.

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 1 | Citations: 0

Viewed: 487 times | Downloaded: 234 times

Tags (5)

Workflow Transaction Analysis Demo from RM 5 Intro Day (v1)

Created: 30/04/10 @ 08:19:39 | Last updated: 05/05/10 @ 09:58:51

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This is the demo process presented at the RapidMiner 5 Intro Day. It combines customer segmentation with direct mailing. It loads some transaction data, then aggregates and pivots the data so it can be used by a clustering to perform a customer segmentation. Then, additional data is joined with the clustered data. First, response/no-response data is joined, and then some additional information about the users is added. Finally, customers are classified into response/no-response classes. The dat…

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 1 | Citations: 0

Viewed: 254 times | Downloaded: 135 times

This Workflow has no tags!

Workflow Ważenie atrybutów wg testu chi-kwadrat [Attribute weighting by the chi-squared test] (v1)

Created: 23/03/11 @ 07:37:04 | Last updated: 23/03/11 @ 08:57:19

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This workflow shows how the Weight by Chi-Squared Statistics operator works, which (as the name suggests) tests the agreement between attributes using the chi-squared test. For each pair of attributes, the test computes the value of the chi-squared statistic and tries to identify pairs of strongly correlated attributes.

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 6 times | Downloaded: 21 times

This Workflow has no tags!
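To make the chi-squared workflow above concrete, here is a minimal sketch of the Pearson chi-squared statistic for one pair of categorical attributes. It is not RapidMiner's implementation of the Weight by Chi-Squared Statistics operator; the function name `chi_squared` and the list-based input format are assumptions for illustration only.

```python
from collections import Counter


def chi_squared(xs, ys):
    """Pearson chi-squared statistic for two equal-length categorical columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed count for each (x, y) pair
    row = Counter(xs)             # marginal counts of the first attribute
    col = Counter(ys)             # marginal counts of the second attribute
    stat = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n  # count expected under independence
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat
```

A larger value means the two attributes depart more strongly from independence, so a high statistic flags a strongly associated pair.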


Workflow Convert Nominal to Binominal to Numerical (v1)

Created: 15/05/10 @ 09:44:19

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This is a standard preprocessing subprocess that takes nominal (categorical) attributes and introduces binominal dummy attributes, which are then transformed to numerical attributes that can be used by learning schemes like SVM or Logistic Regression.

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 66 times | Downloaded: 36 times

This Workflow has no tags!
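The nominal-to-binominal-to-numerical step described in the workflow above amounts to dummy (one-hot) coding. Here is a minimal sketch, assuming a single column given as a list of strings. The function name `to_dummies` and the `is_` column prefix are made up for this example and are not RapidMiner names.

```python
def to_dummies(values):
    """Map a nominal column to 0/1 dummy columns, one per distinct category."""
    categories = sorted(set(values))
    return {
        f"is_{c}": [1 if v == c else 0 for v in values]
        for c in categories
    }
```

For example, `to_dummies(["red", "blue", "red"])` yields one numeric column per colour, which a learner that only accepts numbers can then consume.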


Workflow Linear Regression of Italian bookshops sellout data (v1)

Created: 28/05/10 @ 16:31:01

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

Could someone help me? Why is the correlation between the actual value and the predicted value of the attribute “Quantita” so low?

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 4 | Citations: 0

Viewed: 79 times | Downloaded: 16 times

This Workflow has no tags!

Workflow Execute Program on Windows 7 (v1)

Created: 09/06/10 @ 08:32:06

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This simple process demonstrates how to execute a program on Windows 7 even if the program path contains spaces. The process will start Internet Explorer if the path exists. Tags: Rapidminer, Execute Program, Windows 7

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 27 times | Downloaded: 11 times

This Workflow has no tags!

Workflow LSI content based recommender system template (v1)

Created: 06/05/11 @ 20:40:24 | Last updated: 09/05/11 @ 13:59:18

Credits: Ninoaf, Matko Bošnjak

Attributions: Workflow “Content based recommender system template”; Blob “Datasets for the pack: RCOMM2011 recommender systems workflow templates”

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This workflow performs LSI text-mining content based recommendation. We use SVD to capture latent semantics between items and words and to obtain a low-dimensional representation of items. Latent Semantic Indexing (LSI) takes the k greatest singular values and the corresponding left and right singular vectors to obtain the matrix A_k = U_k * S_k * V_k^T. Items are represented as word-vectors in the original space, where each row in matrix A represents the word-vector of a particular item. Matrix U_k, on the other hand …

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 40 times | Downloaded: 16 times

Tags (8)

Workflow Macro Propagation into Subprocesses and Execute Process Operators (v1)

Created: 25/06/10 @ 08:50:18

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License

This small process demonstrates how to propagate macros into a subprocess or an Execute Process operator.

Once macros are set, they are available in subprocesses immediately (see Generate Attributes in the second Subprocess operator).

To propagate a macro into an Execute Process operator, one needs to edit the “macros” parameter in the Parameters list of the Execute Process operator.

Tags: Subprocess, Execute Proc…

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 40 times | Downloaded: 17 times

This Workflow has no tags!

Workflow Prepares data for gene correlation analysis (v1) 

Created: 23/05/11 @ 12:39:21 | Last updated: 23/05/11 @ 12:43:34

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License


 

Prepares data for gene correlation analysis.

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 9 times | Downloaded: 2 times

This Workflow has no tags!

Workflow 222 (v1) 

Created: 04/09/10 @ 21:18:22

License: Creative Commons Attribution-No Derivative Works 3.0 Unported License


 

This process shows how several different classifiers could be graphically compared by means of multiple ROC curves.

 

Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0

Viewed: 21 times | Downloaded: 13 times

This Workflow has no tags!
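The ROC comparison in the last workflow above can be sketched without any plotting: compute the curve points for each classifier and compare the areas under them. This is a minimal sketch, not the RapidMiner process itself. It assumes binary labels coded as 0/1, at least one example of each class, and distinct scores (ties are not handled); the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label:
            tp += 1  # a positive ranked above the threshold
        else:
            fp += 1  # a negative ranked above the threshold
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

Calling `roc_points` once per classifier on the same labels gives one curve each; the curves can then be drawn on shared axes or summarised with `auc`.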

 

Author: Ajay Ohri

http://about.me/ajayohri
