Sumo is a fully-functional web application template that exposes an authenticated user’s R session within java server pages.
Sumo: An Authenticating Web Application with an Embedded R Session by Timothy T. Bergsma and Michael S. Smith Abstract Sumo is a web application intended as a template for developers. It is distributed as a Java ‘war’ file that deploys automatically when placed in a Servlet container’s ‘webapps’
directory. If a user supplies proper credentials, Sumo creates a session-specific Secure Shell connection to the host and a user-specific R session over that connection. Developers may write dynamic server pages that make use of the persistent R session and user-specific file space.
and for Apple fanboys-
We created the xgrid package (Horton and Anoke, 2012) to provide a simple interface to this distributed computing system. The package facilitates use of an Apple Xgrid for distributed processing of a simulation with many independent repetitions, by simplifying job submission (or grid stuffing) and collation of results. It provides a relatively thin but useful layer between R and Apple’s ‘xgrid’ shell command, where the user constructs input scripts to be run remotely. A similar set of routines, optimized for parallel estimation of JAGS (just another Gibbs sampler) models is available within the runjags package (Denwood, 2010). However, with the exception of runjags, none of the previously mentioned packages support parallel computation over an Apple Xgrid.
Hmm I guess parallel computing enabled by Wifi on mobile phones would be awesome too ! So would be anything using iOS . See the rest of the R Journal at http://journal.r-project.org/current.html
Microsoft Robotics Developer Studio 4 beta (RDS4 beta) provides a wide range of support to help make it easy to develop robot applications. RDS4 beta includes a programming model that helps make it easy to develop asynchronous, state-driven applications. RDS4 beta provides a common programming framework that can be applied to support a wide variety of robots, enabling code and skill transfer.
RDS4 beta includes a lightweight asynchronous services-oriented runtime, a set of visual authoring and simulation tools, as well as templates, tutorials, and sample code to help you get started.
This release has extensive support for the Kinect sensor hardware throug the Kinect for Windows SDK allowing developers to create Kinect-enabled robots in the Visual Simulation Environment and in real life. Along with this release comes a standardized reference spec for building a Kinect-based robot.
Concurrency and Coordination Runtime (CCR) helps make it easier to handle asynchronous input and output by eliminating the conventional complexities of manual threading, locks, and semaphores. Lightweight state-oriented Decentralized Software Services (DSS) framework enables you to create program modules that can interoperate on a robot and connected PCs by using a relatively simple, open protocol.
Visual Programming Language (VPL) provides a relatively simple drag-and-drop visual programming language tool that helps make it easy to create robotics applications. VPL also provides the ability to take a collection of connected blocks and reuse them as a single block elsewhere in your program. VPL is also capable of generating human-readable C#.
DSS Manifest Editor (DSSME) provides a relatively simple creation of application configuration and distribution scenarios.
The DSS Log Analyzer tool allows you to view message flows across multiple DSS services. DSS Log Analyzer also allows you to inspect message details.
Visual Simulation Environment (VSE) provides the ability to simulate and test robotic applications using a 3D physics-based simulation tool. This allows developers to create robotics applications without the hardware. Sample simulation models and environments enable you to test your application in a variety of 3D virtual environments.
A bit early but the latest editions of both SAS and R were released last week.
SAS 9.3is clearly a major release with multiple enhancements to make SAS both relevant and pertinent in enterprise software in the age of big data. Also many more R specific, JMP specific and partners like Teradata specific enhancements.
SAS® Data Integration Server includes DataFlux® Data Management Platform for enhanced data quality
Master Data Management (DataFlux® qMDM)
Provides support for master hub of trusted entity data.
Analytics
SAS® Enterprise Miner™
New survival analysis predicts when an event will happen, not just if it will happen.
New rate making capability for insurance predicts optimal insurance premium for individuals based on attributes known at application time.
Time Series Data Mining node (experimental) applies data mining techniques to transactional, time-stamped data.
Support Vector Machines node (experimental) provides a supervised machine learning method for prediction and classification.
SAS® Forecast Server
SAS Forecast Server is integrated with the SAP APO Demand Planning module to provide SAP users with access to a superior forecasting engine and automatic forecasting capabilities.
SAS® Model Manager
Seamless integration of R models with the ability to register and manage R models in SAS Model Manager.
Ability to perform champion/challenger side-by-side comparisons between SAS and R models to see which model performs best for a specific need.
SAS/OR® and SAS® Simulation Studio
Optimization
Simulation
Automatic input distribution fitting using JMP with SAS Simulation Studio.
Text analytics
SAS® Text Miner
SAS® Enterprise Content Categorization
SAS® Sentiment Analysis
Scalability and high-performance
SAS® Analytics Accelerator for Teradata (new product)
SAS® Grid Manager
and latest from http://www.r-project.org/ I was a bit curious to know why the different licensing for R now (from GPL2 to GPL2- GPL 3)
• No parts of R are now licensed solely under GPL-2. The licences for packages rpart and survival have been changed, which means that the licence terms for R as distributed are GPL-2 | GPL-3.
This is a maintenance release to consolidate various minor fixes to 2.13.0.
CHANGES IN R VERSION 2.13.1:
NEW FEATURES:
• iconv() no longer translates NA strings as "NA".
• persp(box = TRUE) now warns if the surface extends outside the
box (since occlusion for the box and axes is computed assuming
the box is a bounding box). (PR#202.)
• RShowDoc() can now display the licences shipped with R, e.g.
RShowDoc("GPL-3").
• New wrapper function showNonASCIIfile() in package tools.
• nobs() now has a "mle" method in package stats4.
• trace() now deals correctly with S4 reference classes and
corresponding reference methods (e.g., $trace()) have been added.
• xz has been updated to 5.0.3 (very minor bugfix release).
• tools::compactPDF() gets more compression (usually a little,
sometimes a lot) by using the compressed object streams of PDF
1.5.
• cairo_ps(onefile = TRUE) generates encapsulated EPS on platforms
with cairo >= 1.6.
• Binary reads (e.g. by readChar() and readBin()) are now supported
on clipboard connections. (Wish of PR#14593.)
• as.POSIXlt.factor() now passes ... to the character method
(suggestion of Joshua Ulrich). [Intended for R 2.13.0 but
accidentally removed before release.]
• vector() and its wrappers such as integer() and double() now warn
if called with a length argument of more than one element. This
helps track down user errors such as calling double(x) instead of
as.double(x).
INSTALLATION:
• Building the vignette PDFs in packages grid and utils is now part
of running make from an SVN checkout on a Unix-alike: a separate
make vignettes step is no longer required.
These vignettes are now made with keep.source = TRUE and hence
will be laid out differently.
• make install-strip failed under some configuration options.
• Packages can customize non-standard installation of compiled code
via a src/install.libs.R script. This allows packages that have
architecture-specific binaries (beyond the package's shared
objects/DLLs) to be installed in a multi-architecture setting.
SWEAVE & VIGNETTES:
• Sweave() and Stangle() gain an encoding argument to specify the
encoding of the vignette sources if the latter do not contain a
\usepackage[]{inputenc} statement specifying a single input
encoding.
• There is a new Sweave option figs.only = TRUE to run each figure
chunk only for each selected graphics device, and not first using
the default graphics device. This will become the default in R
2.14.0.
• Sweave custom graphics devices can have a custom function
foo.off() to shut them down.
• Warnings are issued when non-portable filenames are found for
graphics files (and chunks if split = TRUE). Portable names are
regarded as alphanumeric plus hyphen, underscore, plus and hash
(periods cause problems with recognizing file extensions).
• The Rtangle() driver has a new option show.line.nos which is by
default false; if true it annotates code chunks with a comment
giving the line number of the first line in the sources (the
behaviour of R >= 2.12.0).
• Package installation tangles the vignette sources: this step now
converts the vignette sources from the vignette/package encoding
to the current encoding, and records the encoding (if not ASCII)
in a comment line at the top of the installed .R file.
DEPRECATED AND DEFUNCT:
• The internal functions .readRDS() and .saveRDS() are now
deprecated in favour of the public functions readRDS() and
saveRDS() introduced in R 2.13.0.
• Switching off lazy-loading of code _via_ the LazyLoad field of
the DESCRIPTION file is now deprecated. In future all packages
will be lazy-loaded.
• The off-line help() types "postscript" and "ps" are deprecated.
UTILITIES:
• R CMD check on a multi-architecture installation now skips the
user's .Renviron file for the architecture-specific tests (which
do read the architecture-specific Renviron.site files). This is
consistent with single-architecture checks, which use
--no-environ.
• R CMD build now looks for DESCRIPTION fields BuildResaveData and
BuildKeepEmpty for per-package overrides. See ‘Writing R
Extensions’.
BUG FIXES:
• plot.lm(which = 5) was intended to order factor levels in
increasing order of mean standardized residual. It ordered the
factor labels correctly, but could plot the wrong group of
residuals against the label. (PR#14545)
• mosaicplot() could clip the factor labels, and could overlap them
with the cells if a non-default value of cex.axis was used.
(Related to PR#14550.)
• dataframe[[row,col]] now dispatches on [[ methods for the
selected column (spotted by Bill Dunlap).
• sort.int() would strip the class of an object, but leave its
object bit set. (Reported by Bill Dunlap.)
• pbirthday() and qbirthday() did not implement the algorithm
exactly as given in their reference and so were unnecessarily
inaccurate.
pbirthday() now solves the approximate formula analytically
rather than using uniroot() on a discontinuous function.
The description of the problem was inaccurate: the probability is
a tail probablity (‘2 _or more_ people share a birthday’)
• Complex arithmetic sometimes warned incorrectly about producing
NAs when there were NaNs in the input.
• seek(origin = "current") incorrectly reported it was not
implemented for a gzfile() connection.
• c(), unlist(), cbind() and rbind() could silently overflow the
maximum vector length and cause a segfault. (PR#14571)
• The fonts argument to X11(type = "Xlib") was being ignored.
• Reading (e.g. with readBin()) from a raw connection was not
advancing the pointer, so successive reads would read the same
value. (Spotted by Bill Dunlap.)
• Parsed text containing embedded newlines was printed incorrectly
by as.character.srcref(). (Reported by Hadley Wickham.)
• decompose() used with a series of a non-integer number of periods
returned a seasonal component shorter than the original series.
(Reported by Rob Hyndman.)
• fields = list() failed for setRefClass(). (Reported by Michael
Lawrence.)
• Reference classes could not redefine an inherited field which had
class "ANY". (Reported by Janko Thyson.)
• Methods that override previously loaded versions will now be
installed and called. (Reported by Iago Mosqueira.)
• addmargins() called numeric(apos) rather than
numeric(length(apos)).
• The HTML help search sometimes produced bad links. (PR#14608)
• Command completion will no longer be broken if tail.default() is
redefined by the user. (Problem reported by Henrik Bengtsson.)
• LaTeX rendering of markup in titles of help pages has been
improved; in particular, \eqn{} may be used there.
• isClass() used its own namespace as the default of the where
argument inadvertently.
• Rd conversion to latex mis-handled multi-line titles (including
cases where there was a blank line in the \title section).
The Analytics 2011 Conference Series combines the power of SAS’s M2010 Data Mining Conference and F2010 Business Forecasting Conference into one conference covering the latest trends and techniques in the field of analytics. Analytics 2011 Conference Series brings the brightest minds in the field of analytics together with hundreds of analytics practitioners. Join us as these leading conferences change names and locations. At Analytics 2011, you’ll learn through a series of case studies, technical presentations and hands-on training. If you are in the field of analytics, this is one conference you can’t afford to miss.
Conference Details
October 24-25, 2011
Grande Lakes Resort
Orlando, FL
Here is a short list of resources and material I put together as starting points for R and Cloud Computing It’s a bit messy but overall should serve quite comprehensively.
Cloud computing is a commonly used expression to imply a generational change in computing from desktop-servers to remote and massive computing connections,shared computers, enabled by high bandwidth across the internet.
As per the National Institute of Standards and Technology Definition,
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
Rweb is developed and maintained by Jeff Banfield. The Rweb Home Page provides access to all three versions of Rweb—a simple text entry form that returns output and graphs, a more sophisticated JavaScript version that provides a multiple window environment, and a set of point and click modules that are useful for introductory statistics courses and require no knowledge of the R language. All of the Rweb versions can analyze Web accessible datasets if a URL is provided.
The paper “Rweb: Web-based Statistical Analysis”, providing a detailed explanation of the different versions of Rweb and an overview of how Rweb works, was published in the Journal of Statistical Software (http://www.jstatsoft.org/v04/i01/).
Ulf Bartel has developed R-Online, a simple on-line programming environment for R which intends to make the first steps in statistical programming with R (especially with time series) as easy as possible. There is no need for a local installation since the only requirement for the user is a JavaScript capable browser. See http://osvisions.com/r-online/ for more information.
Rcgi is a CGI WWW interface to R by MJ Ray. It had the ability to use “embedded code”: you could mix user input and code, allowing the HTMLauthor to do anything from load in data sets to enter most of the commands for users without writing CGI scripts. Graphical output was possible in PostScript or GIF formats and the executed code was presented to the user for revision. However, it is not clear if the project is still active.
Currently, a modified version of Rcgi by Mai Zhou (actually, two versions: one with (bitmap) graphics and one without) as well as the original code are available from http://www.ms.uky.edu/~statweb/.
David Firth has written CGIwithR, an R add-on package available from CRAN. It provides some simple extensions to R to facilitate running R scripts through the CGI interface to a web server, and allows submission of data using both GET and POST methods. It is easily installed using Apache under Linux and in principle should run on any platform that supports R and a web server provided that the installer has the necessary security permissions. David’s paper “CGIwithR: Facilities for Processing Web Forms Using R” was published in the Journal of Statistical Software (http://www.jstatsoft.org/v08/i10/). The package is now maintained by Duncan Temple Lang and has a web page athttp://www.omegahat.org/CGIwithR/.
Rpad, developed and actively maintained by Tom Short, provides a sophisticated environment which combines some of the features of the previous approaches with quite a bit of JavaScript, allowing for a GUI-like behavior (with sortable tables, clickable graphics, editable output), etc.
Jeff Horner is working on the R/Apache Integration Project which embeds the R interpreter inside Apache 2 (and beyond). A tutorial and presentation are available from the project web page at http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RApacheProject.
Rserve is a project actively developed by Simon Urbanek. It implements a TCP/IP server which allows other programs to use facilities of R. Clients are available from the web site for Java and C++ (and could be written for other languages that support TCP/IP sockets).
OpenStatServer is being developed by a team lead by Greg Warnes; it aims “to provide clean access to computational modules defined in a variety of computational environments (R, SAS, Matlab, etc) via a single well-defined client interface” and to turn computational services into web services.
Two projects use PHP to provide a web interface to R. R_PHP_Online by Steve Chen (though it is unclear if this project is still active) is somewhat similar to the above Rcgi and Rweb. R-php is actively developed by Alfredo Pontillo and Angelo Mineo and provides both a web interface to R and a set of pre-specified analyses that need no R code input.
webbioc is “an integrated web interface for doing microarray analysis using several of the Bioconductor packages” and is designed to be installed at local sites as a shared computing resource.
Rwui is a web application to create user-friendly web interfaces for R scripts. All code for the web interface is created automatically. There is no need for the user to do any extra scripting or learn any new scripting techniques. Rwui can also be found at http://rwui.cryst.bbk.ac.uk.
Finally, the R.rsp package by Henrik Bengtsson introduces “R Server Pages”. Analogous to Java Server Pages, an R server page is typically HTMLwith embedded R code that gets evaluated when the page is requested. The package includes an internal cross-platform HTTP server implemented in Tcl, so provides a good framework for including web-based user interfaces in packages. The approach is similar to the use of the brew package withRapache with the advantage of cross-platform support and easy installation.
Remote access to R/Bioconductor on EBI’s 64-bit Linux Cluster
Start the workbench by downloading the package for your operating system (Macintosh or Windows), or via Java Web Start, and you will get access to an instance of R running on one of EBI’s powerful machines. You can install additional packages, upload your own data, work with graphics and collaborate with colleagues, all as if you are running R locally, but unlimited by your machine’s memory, processor or data storage capacity.
Most up-to-date R version built for multicore CPUs
Access to all Bioconductor packages
Access to our computing infrastructure
Fast access to data stored in EBI’s repositories (e.g., public microarray data in ArrayExpress)
Using R Google Docs http://www.omegahat.org/RGoogleDocs/run.pdf
It uses the XML and RCurl packages and illustrates that it is relatively quick and easy
to use their primitives to interact with Web services.
Amazon’s EC2 is a type of cloud that provides on demand computing infrastructures called an Amazon Machine Images or AMIs. In general, these types of cloud provide several benefits:
Simple and convenient to use. An AMI contains your applications, libraries, data and all associated configuration settings. You simply access it. You don’t need to configure it. This applies not only to applications like R, but also can include any third-party data that you require.
On-demand availability. AMIs are available over the Internet whenever you need them. You can configure the AMIs yourself without involving the service provider. You don’t need to order any hardware and set it up.
Elastic access. With elastic access, you can rapidly provision and access the additional resources you need. Again, no human intervention from the service provider is required. This type of elastic capacity can be used to handle surge requirements when you might need many machines for a short time in order to complete a computation.
Pay per use. The cost of 1 AMI for 100 hours and 100 AMI for 1 hour is the same. With pay per use pricing, which is sometimes called utility pricing, you simply pay for the resources that you use.
#This example requires you had previously created a bucket named data_language on your Google Storage and you had uploaded a CSV file named language_id.txt (your data) into this bucket – see for details
library(predictionapirwrapper)
Elastic-R is a new portal built using the Biocep-R platform. It enables statisticians, computational scientists, financial analysts, educators and students to use cloud resources seamlessly; to work with R engines and use their full capabilities from within simple browsers; to collaborate, share and reuse functions, algorithms, user interfaces, R sessions, servers; and to perform elastic distributed computing with any number of virtual machines to solve computationally intensive problems.
Also see Karim Chine’s http://biocep-distrib.r-forge.r-project.org/
R for Salesforce.com
At the point of writing this, there seem to be zero R based apps on Salesforce.com This could be a big opportunity for developers as both Apex and R have similar structures Developers could write free code in R and charge for their translated version in Apex on Salesforce.com
Force.com and Salesforce have many (1009) apps at http://sites.force.com/appexchange/home for cloud computing for
businesses, but very few forecasting and statistical simulation apps.
These are like iPhone apps except meant for business purposes (I am
unaware if any university is offering salesforce.com integration
though google apps and amazon related research seems to be on)
Personal Note-Mentioning SAS in an email to a R list is a big no-no in terms of getting a response and love. Same for being careless about which R help list to email (like R devel or R packages or R help)
A beautiful role playing game is Eve Online available at http://www.eve-online.com/. Its like a Star Wars fantasy game and the graphics are mind blowing . The downside- its so addictive that you are forewarned.
Also it employs economists who monitor the interactions of the thousands of people and millions of online interactions and take steps to ensure balance and fair play. Thus a useful simulation of people without the diruptive pain of the real world.