GrapheR

GrapheR is a graphical user interface (GUI) for drawing simple graphs in R.

Depends: R (>= 2.10.0), tcltk, mgcv
Description: GrapheR is a multiplatform user interface for drawing highly customizable graphs in R. It aims to be a valuable help to quickly draw publishable graphs without any knowledge of R commands. Six kinds of graphs are available: histogram, box-and-whisker plot, bar plot, pie chart, curve and scatter plot.
License: GPL-2
LazyLoad: yes
Packaged: 2011-01-24 17:47:17 UTC; Maxime
Repository: CRAN
Date/Publication: 2011-01-24 18:41:47

Advantages of using GrapheR

It is bilingual (English and French) and can import data from text and CSV files.
The intention is to let even non-users of R make simple types of graphs.
The user interface is quite cleanly designed. It is aimed at being a data visualization GUI, but at a more basic level than Deducer.
It is easy to rename axes and graph titles, and to use sliders for changing line thickness and color.

Disadvantages of using GrapheR

Lack of documentation or help. In particular, mouseover tips for some of the options would be useful.
Some of the terms, like abscissa or ordinate axis, may not be easily understood by a business user.
Default values of color are quite plain (black font on white background).
It can flood the terminal with lots of repetitive warnings (although the warnings() function limits what is shown to 50).
Some of the axis names could be auto-suggested based on which variable is being chosen for that axis.
The package name GrapheR resembles Grapher, the graphing calculator application in Mac OS; this can hinder search engine results.

Using GrapheR

Data input: the import can be customized for CSV and text files.
GrapheR gives information on the loaded variables (numeric versus factors).
It asks you to choose the type of graph.
It then asks for the usual graph inputs. Note that colors can be customized (partial window). The number of graphs per window can also be easily customized.
The graph is ready for publication.
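
For readers who do want to see the commands, the steps above correspond roughly to a few lines of base R. This is only a minimal sketch, not what GrapheR runs internally; the example data, column names and output file are made up for illustration.

```r
# Hypothetical example data, written to a temporary CSV file
# to stand in for the user's own file
csv_path <- tempfile(fileext = ".csv")
example_data <- data.frame(
  height = c(150.5, 160.2, 165.1, 170.8, 172.3, 180.4, 185.9),
  group  = c("a", "b", "a", "b", "a", "b", "a")
)
write.csv(example_data, csv_path, row.names = FALSE)

# Data input: import the CSV file
dat <- read.csv(csv_path, stringsAsFactors = TRUE)

# Information on loaded variables: numeric versus factor
var_types <- sapply(dat, class)
print(var_types)

# Choose a graph type (here a histogram), set the usual inputs,
# and save the result to a PDF file
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path, width = 6, height = 4)
hist(dat$height, main = "Height distribution", xlab = "Height (cm)",
     col = "steelblue", border = "white")
invisible(dev.off())
```

The point of GrapheR is that each of these arguments (title, axis label, colors) becomes a field or slider in a window instead of something you have to type.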

LibreOffice 3.3 released

What does LibreOffice give you?

http://www.libreoffice.org/features/

WRITER is the word processor inside LibreOffice. Use it for everything, from dashing off a quick letter to producing an entire book with tables of contents, embedded illustrations, bibliographies and diagrams. The while-you-type auto-completion, auto-formatting and automatic spelling checking make difficult tasks easy (but are easy to disable if you prefer). Writer is powerful enough to tackle desktop publishing tasks such as creating multi-column newsletters and brochures. The only limit is your imagination.

CALC tames your numbers and helps with difficult decisions when you’re weighing the alternatives. Analyze your data with Calc and then use it to present your final output. Charts and analysis tools help bring transparency to your conclusions. A fully-integrated help system makes easier work of entering complex formulas. Add data from external databases such as SQL or Oracle, then sort and filter them to produce statistical analyses. Use the graphing functions to display large number of 2D and 3D graphics from 13 categories, including line, area, bar, pie, X-Y, and net – with the dozens of variations available, you’re sure to find one that suits your project.

IMPRESS is the fastest and easiest way to create effective multimedia presentations. Stunning animation and sensational special effects help you convince your audience. Create presentations that look even more professional than the standard presentations you commonly see at work. Get your colleagues’ and bosses’ attention by creating something a little bit different.

DRAW lets you build diagrams and sketches from scratch. A picture is worth a thousand words, so why not try something simple with box and line diagrams? Or else go further and easily build dynamic 3D illustrations and special effects. It’s as simple or as powerful as you want it to be.

BASE is the database front-end of the LibreOffice suite. With Base, you can seamlessly integrate into your existing database structures. Based on imported and linked tables and queries from MySQL, PostgreSQL or Microsoft Access and many other data sources, you can build powerful databases containing forms, reports, views and queries. Full integration is possible with the in-built HSQL database.

MATH is a simple equation editor that lets you lay-out and display your mathematical, chemical, electrical or scientific equations quickly in standard written notation. Even the most-complex calculations can be understandable when displayed correctly. E=mc²

The Document Foundation just announced release candidate 3 of LibreOffice.

New Features-

http://www.libreoffice.org/download/new-features/

General

Added the LibreColors to the palette;
Added Quickstarter for Unix builds;
Introduced Linux “Libertine G” and Linux “Biolinum G” fonts;
Implement import of alpha channel for RGBA .tiffs [http://bugs.freedesktop.org/show_bug.cgi?id=30472];
Show all appropriate formats by default on “Save As” [http://qa.openoffice.org/issues/show_bug.cgi?id=113141];
Use radio buttons for mutually exclusive menu options;
Replace the “Help Support” menu item by the “License Information” one;
Load and save documents in flat XML;
Made Help system available via the WikiHelp;
Option to enable saving of documents at all times (see Tools -> Options -> LibreOffice -> General -> “Allow to save document…”).

Calc

[http://bugs.freedesktop.org/show_bug.cgi?id=30559]: Added new tab page ‘Compatibility’ in the Options dialog;
Better default key bindings;
Use Ctrl-Shift-D to launch selection list in LibreOffice;
Added new image file used in the “insert new sheet” button. This image is not visible in read-only mode;
Fix fake small caps resizing factor [http://qa.openoffice.org/issues/show_bug.cgi?id=1526];
Added dotted/dashed borders in Calc;
Added icons for toggling sheet grids in Calc;
Better performance and interoperability on Excel doc import;
Better performance on DBF import;
Slightly better performance on ODS import;
Possibility to use English formula names;
Distributed alignment – allows one to specify ‘distributed’ horizontal alignment and ‘justified’ and ‘distributed’ vertical alignments within cells. This is notably useful for CJK locales;
Support for 3 different formula syntaxes: Calc A1, Excel A1 and Excel R1C1;
Configurable argument and array separators in formula expressions;
External reference works within OFFSET function;
Hitting TAB during auto-complete commits current selection and moves to the next cell;
Shift-TAB cycles through auto-complete selections;
Find and replace skips those cells that are filtered out (thus hidden);
Protecting sheet provides two additional sheet protection options, to optionally limit cursor placement in protected and unprotected areas;
Copying a range highlights the range being copied. It also allows you to paste it by hitting ENTER key. Hitting ESC removes the range highlight;
Jumping to and from references in formula cells via “Ctrl-[” and “Ctrl-]”;
Cell cursor stays at the original cell during range selection.

Writer

AutoCorrections match case of the words that AutoCorrect replaces. (Issuezilla 2838);
Ability to turn off number recognition in Writer;
RTF export (from GSoc);
Port of Lotus Word Pro filter;
New dialog box for title page.

Impress/Draw

PPTX chart import feature;
[http://qa.openoffice.org/issues/show_bug.cgi?id=112421] make “Presenter Screen” default to laptop, not projector;
Improve randomization in “Dissolve” transition.

Math

Default to just printing the formula itself in Math;
[http://qa.openoffice.org/issues/show_bug.cgi?id=113400] Maths brackets malformed in presentation mode.

Base

[http://qa.openoffice.org/issues/show_bug.cgi?id=112597] Added display properties to control shapes.

Development

UNO APIs for size and moveProtect of notes;
Via Issuezilla bug #i80184: allow addition of drawing documents to gallery via API.

Productivity Enhancements

New custom properties handling;
Embedding of standard PDF fonts;
New “Narrow” font family;
Increased document protection in Writer and Calc;
Automatic decimals digits for “General” format in Calc;
1 million rows in a spreadsheet;
New options for CSV (Comma-Separated Value) importation in Calc;
Insert drawing objects in charts;
Hierarchical axis labels for charts;
Improved slide layout handling in Impress;
Manual setting for primary key support for databases;
Support of Read-Only database registration;
New Math command: ‘nospace’.

Internationalization

Additional locale data.

Usability and Interface

Common search toolbar;
New easier-to-use print interface;
More options for changing case;
Redesign of thesaurus;
Resetting of text to the default language in Writer;
Text rendering of form controls in Writer;
Changed defaults for charts;
Colored sheet tabs in Calc;
Adaptation to marked selection for filter area in Calc;
Sort dialog box for DataPilot in Calc;
Display custom names for DataPilot fields, items and totals in Calc.

Developer Features and Extensibility

Grid control enhancements;
New MetaData node for database;
Extending database drivers using extensions.

Trying out Google Prediction API from R

So I saw the news at the NY R Meetup and decided to have a go at the Prediction API package (which first started off as a blog post at http://onertipaday.blogspot.com/2010/11/r-wrapper-for-google-prediction-api.html ).

1) My OS was Ubuntu 10.10 Netbook.

Ubuntu has a slight glitch, plus a workaround, for installing the RCurl package on which the Google Prediction API package depends: for RCurl to install, you first need to install the Ubuntu package libcurl4-gnutls-dev.

Once you have installed that using Synaptic, simply start R.

2) Install the packages rjson and RCurl using install.packages and choosing a CRAN mirror.

3) Since googlepredictionapi is not yet on CRAN, download that package from

https://code.google.com/p/google-prediction-api-r-client/downloads/detail?name=googlepredictionapi_0.1.tar.gz&can=2&q=

4) You need to copy this downloaded package to your “first library” folder.

When you start R, simply run

.libPaths()[1]

and that's the folder to which you copy the googlepredictionapi package you downloaded.

5) Now the following line works at the R prompt:

> install.packages("googlepredictionapi_0.1.tar.gz", repos=NULL, type="source")
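
One caveat with the command above: a bare file name is looked up relative to R's current working directory, which may not be the first library folder. Here is a minimal sketch that builds the full path explicitly, assuming the tarball was copied to .libPaths()[1] as described (the existence check is my addition):

```r
# Folder that R searches first for installed packages
first_lib <- .libPaths()[1]

# Full path to the downloaded source package (assumed to have been copied there)
tarball <- file.path(first_lib, "googlepredictionapi_0.1.tar.gz")

if (file.exists(tarball)) {
  # Install from the local source tarball rather than from a repository
  install.packages(tarball, repos = NULL, type = "source")
} else {
  message("Tarball not found at: ", tarball)
}
```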

6) Uploading data to Google Storage using the GUI (rather than gsutil)

Just go to https://sandbox.google.com/storage/

and that's the Google Storage Manager.

Notes on training data:

Use a CSV file.

The first column is the score column (like 1, 0 or a prediction score).

There are no headers, so delete the headers from the data file and move the dependent variable to the first column. (Note: I used data from the Kaggle contest for R package recommendation at http://kaggle.com/R?viewtype=data )
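
The reshaping described in these notes can be done in R itself before uploading. A minimal sketch, assuming a made-up data frame whose dependent variable is a column called installed (the column names and values are hypothetical, not the actual Kaggle data):

```r
# Hypothetical data: 'installed' is the dependent variable (the score column)
raw <- data.frame(
  package_views = c(10, 250, 31),
  depends_count = c(2, 5, 0),
  installed     = c(0, 1, 0)
)

# Move the dependent variable to the first column
target <- "installed"
training <- raw[, c(target, setdiff(names(raw), target))]

# Write a CSV with no header row and no row names
out_path <- file.path(tempdir(), "training_data.csv")
write.table(training, out_path, sep = ",",
            col.names = FALSE, row.names = FALSE)

# Check the first line of the file that would be uploaded
first_line <- readLines(out_path, n = 1)
print(first_line)
```

Using write.table rather than write.csv is a deliberate choice here, because write.csv does not let you switch off the header row.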

7) The good stuff:

Once you type in the basic syntax, the first time it will ask for your Google credentials (email and password).

It then starts showing you the time elapsed for training.

Now you can disconnect and go off (actually, I got disconnected by accident and came back after about 5 minutes, so this part is only my guess at what happened; don't blame me, test it for yourself),

and when you come back (hopefully before the token expires) you can see the status of your request (see below).

> library(rjson)
> library(RCurl)
Loading required package: bitops
> library(googlepredictionapi)
> my.model <- PredictionApiTrain(data="gs://numtraindata/training_data")
The request for training has sent, now trying to check if training is completed
Training on numtraindata/training_data: time:2.09 seconds
Training on numtraindata/training_data: time:7.00 seconds

Note that I changed the format of the URL where my data is located. Simply go to your Google Storage Manager and right-click on the file name to get the link address ( https://sandbox.google.com/storage/numtraindata/training_data.csv ), then change it to gs://numtraindata/training_data (that kind of helps avoid any syntax error).
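
That rewrite is easy to get wrong by hand, so here is a small sketch of it as a function. The helper name to_gs_path is hypothetical (it is not part of the googlepredictionapi package), and it assumes links always start with the sandbox address shown above and that the .csv extension should be dropped, as I did.

```r
# Hypothetical helper: turn a Google Storage Manager link into a gs:// path
to_gs_path <- function(url, base = "https://sandbox.google.com/storage/") {
  if (substr(url, 1, nchar(base)) != base) {
    stop("URL does not start with the expected base: ", base)
  }
  path <- substring(url, nchar(base) + 1)  # bucket/object part of the link
  path <- sub("\\.csv$", "", path)         # drop the extension, as in the post
  paste0("gs://", path)
}

gs_path <- to_gs_path(
  "https://sandbox.google.com/storage/numtraindata/training_data.csv"
)
print(gs_path)
```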

8) From the fairly high-level instructions at https://code.google.com/p/google-prediction-api-r-client/, you could also try this on a local file.

Usage

## Load googlepredictionapi and dependent libraries
library(rjson)
library(RCurl)
library(googlepredictionapi)

## Make a training call to the Prediction API against data in the Google Storage.
## Replace MYBUCKET and MYDATA with your data.
my.model <- PredictionApiTrain(data="gs://MYBUCKET/MYDATA")

## Alternatively, make a training call against training data stored locally as a CSV file.
## Replace MYPATH and MYFILE with your data.
my.model <- PredictionApiTrain(data="MYPATH/MYFILE.csv")

At the time of writing my data was still getting trained, so I will keep you posted on what happens.

Deleting Twitter, Facebook, LinkedIn - Accepting Life

This Thanksgiving as I prayed to God for my family– I prayed to him to give me more time with my loving family. An insight or revelation struck me-

I was spending more time with my computer than with my loved ones.

Is Twitter, Facebook, LinkedIn essential to living? No

I have 1700 followers on Twitter

1100 “Friends” on Facebook, and 9429 “Connections” on LinkedIn

Deleting Facebook was an emotionally wrenching decision (see this screenshot). I tried to download my whole account, including family photos (320 MB), but the connection kept breaking,

so I had to just deactivate and not delete the account. You win, Zuckerberg.

How to-

Top right-hand corner > Account Settings > Deactivate Account

After Facebook deactivates your account, it mocks you by saying this in YELLOW: “

Your Facebook account has been deactivated.

To reactivate your account, log in using your old login email and password. You will be able to use the site like you used to.

We hope you come back soon.”

I go back to Facebook to download all my family photos before final deletion (and not just deactivation), and I get this message:

It may take a little while for us to gather all of your photos, wall posts, messages, and other information. We will then ask you to verify your identity in order to help protect the security of your account.

Yeah Yeah Mark.

One Down Two to Go

Deleting Twitter

Twitter was disappointingly easy-

Go to http://twitter.com/settings/account

At bottom left you see Deactivate my account.

Twitter tries to scare me now-

Is this goodbye?

This action is permanent.

Are you sure you don’t want to reconsider? Was it something we said? Tell us.

Before you deactivate your account, know this:

This action is permanent: account restoration is currently disabled.
You do not need to deactivate your account to change your username. (You can change it on the settings page. All @replies and followers will remain unchanged.)
Your account may be viewable on twitter.com for a few days after deactivation.
We have no control over content indexed by search engines like Google.
If you’re creating a new account and want to use the same user name, phone number and/or email address associated with this account, you must first change them on this account before you deactivate it. If you don’t, the information will be tied to this account and unavailable for use.
Okay, fine, deactivate my account (that's the button)

I clicked the Okay fine Button.

One more pop up-

Re-enter your Twitter password to deactivate @DecisionStats.

Ok Done-

Twitter tries to scare me again —-

You deactivated your account.

Account restoration is currently unavailable. Here is the message you agreed to before deactivating your account:

This action is permanent.

Before you deactivate your account, know this:

This action is permanent: account restoration is currently disabled.
You do not need to deactivate your account to change your username. (You can change it on the settings page. All @replies and followers will remain unchanged.)
Your account may be viewable on twitter.com for a few days after deactivation.
We have no control over content indexed by search engines like Google.
If you’re creating a new account and want to use the same user name, phone number and/or email address associated with this account, you must first change them on this account before you deactivate it. If you don’t, the information will be tied to this account and unavailable for use.

So Long Twitter, I gotta spend more time with my offline family okay.

And probably anyone trying to do sentiment analysis on my twitter feed for social media analytics now has an incomplete data point (hehe)

Last one: LinkedIn. 9349 connections are valuable. I was thinking of auctioning this on eBay, but they kicked me out.

So I just go for deletion.

I spend 10 minutes looking for the delete account button; this is getting a bit annoying now.

I finally go to

http://help.linkedin.com/

Item 4- Closing my Account-

LinkedIn neither scares me nor emotionally coddles me. This is what it says:

Closing Your Account

How do I close my account?

Log into the account you wish to close.
Hover your cursor over your name in the top right of your home page and then click “Settings”.
Click on “Close Your Account” under Personal Information.
Select a reason for closing your account.
Click on “Continue”.

Members should only have one LinkedIn account. Multiple accounts can prevent the ability to accept an Invitation. Closing additional accounts should resolve this dilemma. Prior to closing any secondary accounts:

Inventory all connections and identify any that may be missing from the primary account you wish to keep.
Send Invitations to those connections missing from the primary account.
Update any profile information that may be on other account profiles.

So I try exporting all 9300+ connections using http://www.linkedin.com/addressBookExport

I don't think I will send that many invites again, but some of these people have been good to me (18 of them even wrote recommendations, which are non-exportable, it seems).

I check my downloaded CSV file: yup, all 9379 email addresses are there.

Final round-

Update: LinkedIn does NOT get deleted. I get this:

Your close account request must be processed by customer support for the following reason:

You have more than 250 connections.

You will receive a confirmation email from customer support indicating that they received your request to close your account.

The account that customer support will process for closure is below:

Ajay Ohri
9,429 Connections
16 Recommendations
ohri2007@gmail.com (primary address)

and the email says

Member Comment: ajay ohri

11/24/2010 23:10

Member ID: 6691344
Member Name: Ajay Ohri

The member has attempted to self close this account and was unable because:

The member has a large network of connections to close. Please close during non peak hours.

Please confirm with the member when his or her account has been successfully closed.

So long, people. You know where to find me: on this blog (and some on Skype).

And if you don't know how to find me on my blog-

Happy Thanksgiving-and kill that Turkey 🙂

Cloud Computing with R

Here is a short list of resources and material I put together as starting points for R and cloud computing. It’s a bit messy, but overall it should serve quite comprehensively.

Cloud computing is a commonly used expression to imply a generational change in computing from desktops and servers to remote and massive computing connections and shared computers, enabled by high bandwidth across the internet.

As per the National Institute of Standards and Technology Definition,
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

(Citation: The NIST Definition of Cloud Computing

Authors: Peter Mell and Tim Grance
Version 15, 10-7-09
National Institute of Standards and Technology, Information Technology Laboratory
http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc)

R is an integrated suite of software facilities for data manipulation, calculation and graphical display.

From http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Web-Interfaces

R Web Interfaces

Rweb is developed and maintained by Jeff Banfield. The Rweb Home Page provides access to all three versions of Rweb—a simple text entry form that returns output and graphs, a more sophisticated JavaScript version that provides a multiple window environment, and a set of point and click modules that are useful for introductory statistics courses and require no knowledge of the R language. All of the Rweb versions can analyze Web accessible datasets if a URL is provided.
The paper “Rweb: Web-based Statistical Analysis”, providing a detailed explanation of the different versions of Rweb and an overview of how Rweb works, was published in the Journal of Statistical Software (http://www.jstatsoft.org/v04/i01/).

Ulf Bartel has developed R-Online, a simple on-line programming environment for R which intends to make the first steps in statistical programming with R (especially with time series) as easy as possible. There is no need for a local installation since the only requirement for the user is a JavaScript capable browser. See http://osvisions.com/r-online/ for more information.

Rcgi is a CGI WWW interface to R by MJ Ray. It had the ability to use “embedded code”: you could mix user input and code, allowing the HTML author to do anything from load in data sets to enter most of the commands for users without writing CGI scripts. Graphical output was possible in PostScript or GIF formats and the executed code was presented to the user for revision. However, it is not clear if the project is still active.

Currently, a modified version of Rcgi by Mai Zhou (actually, two versions: one with (bitmap) graphics and one without) as well as the original code are available from http://www.ms.uky.edu/~statweb/.

CGI-based web access to R is also provided at http://hermes.sdu.dk/cgi-bin/go/. There are many additional examples of web interfaces to R which basically allow to submit R code to a remote server, see for example the collection of links available from http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/StatCompCourse.

David Firth has written CGIwithR, an R add-on package available from CRAN. It provides some simple extensions to R to facilitate running R scripts through the CGI interface to a web server, and allows submission of data using both GET and POST methods. It is easily installed using Apache under Linux and in principle should run on any platform that supports R and a web server provided that the installer has the necessary security permissions. David’s paper “CGIwithR: Facilities for Processing Web Forms Using R” was published in the Journal of Statistical Software (http://www.jstatsoft.org/v08/i10/). The package is now maintained by Duncan Temple Lang and has a web page at http://www.omegahat.org/CGIwithR/.

Rpad, developed and actively maintained by Tom Short, provides a sophisticated environment which combines some of the features of the previous approaches with quite a bit of JavaScript, allowing for a GUI-like behavior (with sortable tables, clickable graphics, editable output), etc.
Jeff Horner is working on the R/Apache Integration Project which embeds the R interpreter inside Apache 2 (and beyond). A tutorial and presentation are available from the project web page at http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RApacheProject.

Rserve is a project actively developed by Simon Urbanek. It implements a TCP/IP server which allows other programs to use facilities of R. Clients are available from the web site for Java and C++ (and could be written for other languages that support TCP/IP sockets).

OpenStatServer is being developed by a team led by Greg Warnes; it aims “to provide clean access to computational modules defined in a variety of computational environments (R, SAS, Matlab, etc) via a single well-defined client interface” and to turn computational services into web services.

Two projects use PHP to provide a web interface to R. R_PHP_Online by Steve Chen (though it is unclear if this project is still active) is somewhat similar to the above Rcgi and Rweb. R-php is actively developed by Alfredo Pontillo and Angelo Mineo and provides both a web interface to R and a set of pre-specified analyses that need no R code input.

webbioc is “an integrated web interface for doing microarray analysis using several of the Bioconductor packages” and is designed to be installed at local sites as a shared computing resource.

Rwui is a web application to create user-friendly web interfaces for R scripts. All code for the web interface is created automatically. There is no need for the user to do any extra scripting or learn any new scripting techniques. Rwui can also be found at http://rwui.cryst.bbk.ac.uk.

Finally, the R.rsp package by Henrik Bengtsson introduces “R Server Pages”. Analogous to Java Server Pages, an R server page is typically HTML with embedded R code that gets evaluated when the page is requested. The package includes an internal cross-platform HTTP server implemented in Tcl, so provides a good framework for including web-based user interfaces in packages. The approach is similar to the use of the brew package with Rapache with the advantage of cross-platform support and easy installation.

Also additional R Cloud Computing Use Cases
http://wwwdev.ebi.ac.uk/Tools/rcloud/

ArrayExpress R/Bioconductor Workbench

Remote access to R/Bioconductor on EBI’s 64-bit Linux Cluster

Start the workbench by downloading the package for your operating system (Macintosh or Windows), or via Java Web Start, and you will get access to an instance of R running on one of EBI’s powerful machines. You can install additional packages, upload your own data, work with graphics and collaborate with colleagues, all as if you are running R locally, but unlimited by your machine’s memory, processor or data storage capacity.

Most up-to-date R version built for multicore CPUs
Access to all Bioconductor packages
Access to our computing infrastructure
Fast access to data stored in EBI’s repositories (e.g., public microarray data in ArrayExpress)

Using R with Google Docs
http://www.omegahat.org/RGoogleDocs/run.pdf
It uses the XML and RCurl packages and illustrates that it is relatively quick and easy
to use their primitives to interact with Web services.

Using R with Amazon
Citation
http://rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

Amazon’s EC2 is a type of cloud that provides on-demand computing infrastructure through what are called Amazon Machine Images, or AMIs. In general, these types of cloud provide several benefits:

Simple and convenient to use. An AMI contains your applications, libraries, data and all associated configuration settings. You simply access it. You don’t need to configure it. This applies not only to applications like R, but also can include any third-party data that you require.
On-demand availability. AMIs are available over the Internet whenever you need them. You can configure the AMIs yourself without involving the service provider. You don’t need to order any hardware and set it up.
Elastic access. With elastic access, you can rapidly provision and access the additional resources you need. Again, no human intervention from the service provider is required. This type of elastic capacity can be used to handle surge requirements when you might need many machines for a short time in order to complete a computation.
Pay per use. The cost of 1 AMI for 100 hours and 100 AMI for 1 hour is the same. With pay per use pricing, which is sometimes called utility pricing, you simply pay for the resources that you use.
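
The pay-per-use point is simple arithmetic, and a few lines of R make it concrete. The hourly rate below is an assumed figure for illustration only, not an actual Amazon price, and the cost function is a made-up helper.

```r
# Assumed hourly rate per machine, for illustration only
rate_per_machine_hour <- 0.10

# Pay per use: cost depends only on total machine-hours consumed
cost <- function(machines, hours, rate = rate_per_machine_hour) {
  machines * hours * rate
}

cost_one_slow  <- cost(machines = 1,   hours = 100)
cost_many_fast <- cost(machines = 100, hours = 1)

print(c(one_machine_100_hours = cost_one_slow,
        hundred_machines_1_hour = cost_many_fast))
```

Both calls consume 100 machine-hours, so they cost the same; the difference is that the second finishes in one hour, which is the case for using the elastic capacity when a computation can be split across machines.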

Connecting to R on Amazon EC2- Detailed tutorials
Ubuntu Linux version
https://decisionstats.com/2010/09/25/running-r-on-amazon-ec2/
and Windows R version
https://decisionstats.com/2010/10/02/running-r-on-amazon-ec2-windows/

Connecting R to Data on Google Storage and Computing on Google Prediction API
https://github.com/onertipaday/predictionapirwrapper
R wrapper for working with Google Prediction API

This package consists of a bunch of functions allowing the user to test Google Prediction API from R.
It requires the user to have access to both Google Storage for Developers and Google Prediction API:
see http://code.google.com/apis/storage/ and http://code.google.com/apis/predict/ for details.

Example usage:

#This example requires you had previously created a bucket named data_language on your Google Storage and you had uploaded a CSV file named language_id.txt (your data) into this bucket – see for details
library(predictionapirwrapper)

and Elastic R for Cloud Computing
http://user2010.org/tutorials/Chine.html

Abstract

Elastic-R is a new portal built using the Biocep-R platform. It enables statisticians, computational scientists, financial analysts, educators and students to use cloud resources seamlessly; to work with R engines and use their full capabilities from within simple browsers; to collaborate, share and reuse functions, algorithms, user interfaces, R sessions, servers; and to perform elastic distributed computing with any number of virtual machines to solve computationally intensive problems.
Also see Karim Chine’s http://biocep-distrib.r-forge.r-project.org/

R for Salesforce.com

At the point of writing this, there seem to be zero R-based apps on Salesforce.com. This could be a big opportunity for developers, as both Apex and R have similar structures. Developers could write free code in R and charge for their translated version in Apex on Salesforce.com.

Force.com and Salesforce have many (1009) apps at
http://sites.force.com/appexchange/home for cloud computing for
businesses, but very few forecasting and statistical simulation apps.

An example of a Monte Carlo based app is here:
http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016cT9EAI#

These are like iPhone apps, except that they are meant for business purposes. (I am not aware of any university offering Salesforce.com integration, though research related to Google Apps and Amazon seems to be under way.)

Force.com uses a language called Apex; see
http://wiki.developerforce.com/index.php/App_Logic and
http://wiki.developerforce.com/index.php/An_Introduction_to_Formulas
Apex is similar to R in that it is object-oriented.

SAS Institute has an existing product for taking in Salesforce.com data.

A new SAS data surveyor is available to access data from the Customer Relationship Management (CRM) software vendor Salesforce.com; see
http://support.sas.com/documentation/cdl/en/whatsnew/62580/HTML/default/viewer.htm#datasurveyorwhatsnew902.htm

Personal note: mentioning SAS in an email to an R list is a big no-no in terms of getting a response and love. The same goes for being careless about which R list you email (R-devel, R-packages or R-help).

For a Python-based cloud, see http://pi-cloud.com

Data Visualization using Tableau


Here is a great piece of software for data visualization, and the public version is free.

You can use it for desktop analytics, and the BI/server versions are available at very low cost.

About Tableau Software:

http://www.tableausoftware.com/press_release/tableau-massive-growth-hiring-q3-2010

Tableau was named by Software Magazine as the fastest growing software company in the $10 million to $30 million range in the world, and the second fastest growing software company worldwide overall. The ranking stems from the publication’s 28th annual Software 500 ranking of the world’s largest software service providers.

“We’re growing fast because the market is starving for easy-to-use products that deliver rapid-fire business intelligence to everyone. Our customers want ways to unlock their databases and produce engaging reports and dashboards,” said Christian Chabot CEO and co-founder of Tableau.

http://www.tableausoftware.com/about/who-we-are

History in the Making

Put together an Academy-Award winning professor from the nation’s most prestigious university, a savvy business leader with a passion for data, and a brilliant computer scientist. Add in one of the most challenging problems in software – making databases and spreadsheets understandable to ordinary people. You have just recreated the fundamental ingredients for Tableau.

The catalyst? A Department of Defense (DOD) project aimed at increasing people’s ability to analyze information and brought to famed Stanford professor, Pat Hanrahan. A founding member of Pixar and later its chief architect for RenderMan, Pat invented the technology that changed the world of animated film. If you know Buzz and Woody of “Toy Story”, you have Pat to thank.

Under Pat’s leadership, a team of Stanford Ph.D.s got together just down the hall from the Google folks. Pat and Chris Stolte, the brilliant computer scientist, realized that data visualization could produce large gains in people’s ability to understand information. Rather than analyzing data in text form and then creating visualizations of those findings, Pat and Chris invented a technology called VizQL™ by which visualization is part of the journey and not just the destination. Fast analytics and visualization for everyone was born.

While satisfying the DOD project, Pat and Chris met Christian Chabot, a former data analyst who turned into Jello when he saw what had been invented. The three formed a company and spun out of Stanford like so many before them (Yahoo, Google, VMWare, SUN). With Christian on board as CEO, Tableau rapidly hit one success after another: its first customer (now Tableau’s VP, Operations, Tom Walker), an OEM deal with Hyperion (now Oracle), funding from New Enterprise Associates, a PC Magazine award for “Product of the Year” just one year after launch, and now over 50,000 people in 50+ countries benefiting from the breakthrough.

also see http://www.tableausoftware.com/about/leadership

http://www.tableausoftware.com/about/board

—————————————————————————-

And now a demo I ran on the Kaggle contest data (a CSV dataset with 95,000 rows).

I found that Tableau works extremely well at pivoting data and visualizing it, almost like Excel on steroids. Download the free version here. (I don't know about an academic program, see the links below, but the software is not expensive at all.)

http://buy.tableausoftware.com/

Desktop Personal Edition

The Personal Edition is a visual analysis and reporting solution for data stored in Excel, MS Access or Text Files. Available via download.


$999*

Desktop Professional Edition

The Professional Edition is a visual analysis and reporting solution for data stored in MS SQL Server, MS Analysis Services, Oracle, IBM DB2, Netezza, Hyperion Essbase, Teradata, Vertica, MySQL, PostgreSQL, Firebird, Excel, MS Access or Text Files. Available via download.


$1800*

Tableau Server

Tableau Server enables users of Tableau Desktop Professional to publish workbooks and visualizations to a server where users with web browsers can access and interact with the results. Available via download.


* Price is per Named User and includes one year of maintenance (upgrades and support). Products are made available as a download immediately after purchase. You may revisit the download site at any time during your current maintenance period to access the latest releases.


Here comes PySpread- 85,899,345 rows and 14,316,555 columns

What's new: one more open source analytics package, built like a spreadsheet with the ability to import a million cells.

From http://pyspread.sourceforge.net/index.html

About

Pyspread is a cross-platform Python spreadsheet application. It is based on and written in the programming language Python. Instead of spreadsheet formulas, Python expressions are entered into the spreadsheet cells. Each expression returns a Python object that can be accessed from other cells. These objects can represent anything including lists or matrices.

Features

In pyspread, cells expect Python expressions and return Python objects. Therefore, complex data types such as lists, trees or matrices can be handled within a single cell. Macros can be used for functions that are too complex for a single expression.

Since Python modules can be easily used without external scripts, arbitrary size rational numbers (via gmpy), fixed point decimal numbers for business calculations (via the decimal module from the standard library) and advanced statistics including plotting functions (via RPy) can be used in the spreadsheet. Everything is directly available from each cell. Just use the grid.

Data can be imported and exported using csv files or the clipboard. Other forms of data exchange are possible using external Python modules.

In order to simplify sparse matrix editing, pyspread features a three dimensional grid that can be sized up to 85,899,345 rows and 14,316,555 columns (64 bit-systems, depends on row height and column width). Note that importing a million cells requires about 500 MB of memory.

The concept of pyspread allows doing everything from each cell that a Python script can do. This may very well include deleting your hard drive or sending your data via the Internet. Of course this is a non-issue if you sandbox properly or if you only use self developed spreadsheets. Since this is not the case for everyone (see the discussion at lwn.net), a GPG signature based trust model for spreadsheet files has been introduced. It ensures that only your own trusted files are executed on loading. Untrusted files are displayed in safe mode. You can trust a file manually. Inspect carefully.

Requirements

Pyspread runs on Linux, Windows and *nix platforms with GTK+ support. There are reports that it works with MacOS X as well. If you would like to contribute by testing on OS X please contact me.

Dependencies: Python >=2.4 <3.0, numpy >=1.1.0 and wxPython >=2.8.10.1.

Highly recommended for full functionality: PyMe >=0.8.1, gmpy >=1.1.0 and rpy >=1.0.3. Note for Windows™ users: If you want to use signatures without compiling PyMe try out Gpg4win.

Maturity

Pyspread is in early Beta release. This means that the core functionality is fully implemented but the program needs testing and polish.
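The version requirements quoted above are easy to get wrong when compared by eye (2.8.9 is older than 2.8.10, for instance). A minimal sketch of such a check follows. The helper names parse_version and satisfies are illustrative only, and the "installed" versions are made-up sample values rather than anything detected from a real system.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.8.10.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, minimum, below=None):
    """True if installed >= minimum and, when 'below' is given, installed < below."""
    current = parse_version(installed)
    if current < parse_version(minimum):
        return False
    if below is not None and current >= parse_version(below):
        return False
    return True


# Bounds as quoted on the pyspread page: (minimum, exclusive upper bound or None).
requirements = {
    "Python": ("2.4", "3.0"),
    "numpy": ("1.1.0", None),
    "wxPython": ("2.8.10.1", None),
}

# Hypothetical sample values, for illustration only.
installed = {"Python": "2.6.5", "numpy": "1.3.0", "wxPython": "2.8.9.2"}

for name, (minimum, below) in requirements.items():
    ok = satisfies(installed[name], minimum, below)
    print(name, installed[name], "OK" if ok else "outside the required range")
```

Comparing tuples of integers instead of raw strings is the point of the sketch: as strings, "2.8.9.2" would sort after "2.8.10.1".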

and from the wiki

http://sourceforge.net/apps/mediawiki/pyspread/index.php?title=Main_Page

a spreadsheet with more powerful functions and data structures that are accessible inside each cell. Something like Python that empowers you to do things quickly. And yes, it should be free and it should run on Linux as well as on Windows. I looked around and found nothing that suited me. Therefore, I started pyspread.

Concept

Each cell accepts any input that works in a Python command line.
The inputs are parsed and evaluated by Python’s eval command.
The result objects are accessible via a 3D numpy object array.
String representations of the result objects are displayed in the cells.
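The four points above can be sketched in a few lines of Python. This is only an illustration of the idea, not pyspread's actual code: the function name evaluate_cells and the variable name grid are assumptions, a nested list stands in for the 3D numpy object array, and cells are evaluated in sorted key order, so a cell can only refer to cells that come before it.

```python
def evaluate_cells(sources, shape):
    """Evaluate cell source strings with eval and return a 3D grid of result objects.

    sources maps (row, col, table) to a Python expression string.
    """
    rows, cols, tables = shape
    # Nested lists stand in for the 3D numpy object array mentioned in the text.
    results = [[[None] * tables for _ in range(cols)] for _ in range(rows)]
    for key in sorted(sources):
        row, col, table = key
        # Each expression can read earlier results through the name 'grid'.
        results[row][col][table] = eval(sources[key], {"grid": results})
    return results


sources = {
    (0, 0, 0): "[1, 2, 3]",                 # a list living in one cell
    (1, 0, 0): "sum(grid[0][0][0])",        # refers to the cell above
    (2, 0, 0): "{'total': grid[1][0][0]}",  # any Python object is allowed
}

results = evaluate_cells(sources, (3, 1, 1))

# What the spreadsheet would display is the string form of each result object.
for row in range(3):
    print(repr(results[row][0][0]))
```

Because eval runs arbitrary expressions, a sketch like this should only ever be fed cell contents you trust, which is the same concern the trust model described earlier addresses.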

Benefits

Each cell returns a Python object. This object can be anything including arrays and third party library objects.
Generator expressions can be used efficiently for data manipulation.
Efficient numpy slicing is used.
numpy methods are accessible for the data.
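As a rough illustration of the generator-expression point, the snippet below filters and transforms a column of values lazily, without building an intermediate list. The data is made up, and a plain Python list with ordinary slicing stands in for the numpy array and numpy slicing that pyspread is described as using.

```python
# Hypothetical column of cell results (sample values only).
column = [4, 8, 15, 16, 23, 42]

# Generator expression: squares of the even values, produced one at a time.
even_squares = (value * value for value in column if value % 2 == 0)
total = sum(even_squares)

# Ordinary list slicing, standing in for numpy slicing of the result array.
middle = column[1:4]

print(total)
print(middle)
```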

Installation

Download the pyspread tarball or zip and unzip at a convenient place
If you do not have them already, get and install Python, wxPython and numpy

If you want the examples to work, install gmpy, R and rpy

Really do check the version requirements that are mentioned on http://pyspread.sf.net

Get install privileges (e.g. become root)
Change into the directory and type

python setup.py install

Windows: Replace “python” with your Python interpreter (absolute path)

Become normal user again
Start pyspread by typing

pyspread

Enjoy

Ajay Ohri
