Renjin is now FOSS!
What is Renjin
Renjin is a JVM-based interpreter for the R language for statistical computing. This project is an initiative of BeDataDriven, a company providing consulting in analytics and decision support systems.
R on the JVM
Over the past two decades, the R language for statistical computing has emerged as the de facto standard for analysts, statisticians, and scientists. Today, a wide range of enterprises, from pharmaceuticals to insurance, depend on R for key business uses. Renjin is a new implementation of the R language and environment for the Java Virtual Machine (JVM), whose goal is to enable transparent analysis of big data sets and seamless integration with other enterprise systems such as databases and application servers.
Renjin is still under development, with a target of a version “1.0” in late 2013, but in the meantime it is being used in production for a number of our client projects, and supports most CRAN packages, including some with C/Fortran dependencies.
We built Renjin, a new interpreter for the JVM, because we wanted the beauty, flexibility, and power of R with the performance of the Java Virtual Machine.
R has traditionally been limited by the need to fit data sets into memory, and working with even modest sets of data can quickly exhaust memory due to historical limitations in the GNU R interpreter’s implementation.
Renjin will allow R scripts to transparently interact with data wherever it’s stored, whether that’s on disk, in a remote database, or in the cloud.
While there have been attempts to bring big data to the original interpreter, these have generally provided a parallel set of data structures and algorithms, threatening a fragmentation of the language and platform. Renjin, in contrast, will allow existing R code to run on larger datasets with no modification, using R’s familiar and standard data structures and algorithms.
Renjin offers performance improvements in executing R code on several fronts:
- Vector operations: Renjin’s deferred computation engine automatically parallelizes and optimizes vector operations to run an order of magnitude faster, without the memory demands of computing intermediate structures.
- Matrix operations: Renjin allows the user to plug in best-in-class implementations of BLAS, LAPACK, and FFT.
- Scalar operations: Renjin will compile frequently used portions of R code to JVM byte code on the fly, dramatically improving R’s notoriously slow performance on for loops and other predominantly scalar code [2013Q3].
These improvements make it possible to perform real-time analyses using complex models.
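To make the "deferred computation" idea from the vector operations point concrete, here is a toy sketch in Java. It is not Renjin's actual engine: the class `DeferredVectorSketch` and its `Vec` interface are hypothetical names invented for this illustration. The point is only that element-wise operations can return lazy views instead of allocating intermediate vectors, with all the work done in one fused, parallel pass when a result is finally requested.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.IntStream;

/** Toy illustration of deferred vector computation (hypothetical, not Renjin's code). */
class DeferredVectorSketch {

    /** A vector whose elements are computed only when asked for. */
    interface Vec {
        int length();
        double get(int i);

        /** Element-wise transformation: returns a lazy view, allocates no array. */
        default Vec map(DoubleUnaryOperator f) {
            Vec self = this;
            return new Vec() {
                public int length() { return self.length(); }
                public double get(int i) { return f.applyAsDouble(self.get(i)); }
            };
        }

        /** Element-wise combination of two equal-length vectors, also lazy. */
        default Vec zip(Vec other, DoubleBinaryOperator f) {
            if (other.length() != length()) {
                throw new IllegalArgumentException("length mismatch");
            }
            Vec self = this;
            return new Vec() {
                public int length() { return self.length(); }
                public double get(int i) {
                    return f.applyAsDouble(self.get(i), other.get(i));
                }
            };
        }

        /** Forces evaluation: one fused, parallel pass over the elements. */
        default double sum() {
            return IntStream.range(0, length()).parallel().mapToDouble(this::get).sum();
        }
    }

    /** Wraps an existing array without copying it. */
    static Vec of(double... values) {
        return new Vec() {
            public int length() { return values.length; }
            public double get(int i) { return values[i]; }
        };
    }

    /** The sequence 1..n as a virtual vector that is never stored in memory. */
    static Vec seq(int n) {
        return new Vec() {
            public int length() { return n; }
            public double get(int i) { return i + 1; }
        };
    }

    public static void main(String[] args) {
        // Roughly the R expression sum(x^2 + 1) for x <- 1:1000000
        Vec x = seq(1_000_000);
        Vec expr = x.map(v -> v * v).map(v -> v + 1); // nothing is computed yet
        System.out.println(expr.sum());               // single parallel pass here
    }
}
```

In this sketch neither `x^2` nor `x^2 + 1` is ever materialized as a million-element array, which is the kind of memory saving the list above describes.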
Renjin enables R developers to deploy their code to Platform-as-a-Service providers like Google App Engine, AWS Elastic Beanstalk or Heroku without worrying about scale or infrastructure. Renjin is pure Java – it can run anywhere.
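Because Renjin runs on the JVM, one plausible way to call R from a Java application is through the standard `javax.script` (JSR-223) API. The sketch below assumes the Renjin jar is on the classpath and registers a script engine under the name "Renjin"; the class name `RenjinEmbedSketch` and the R snippet are made up for illustration. If no such engine is found, the program just says so and exits.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

/** Minimal sketch of embedding an R engine via JSR-223 (illustrative only). */
class RenjinEmbedSketch {

    /** Looks up a script engine by name; returns null if none is registered. */
    static ScriptEngine findEngine(String name) {
        return new ScriptEngineManager().getEngineByName(name);
    }

    public static void main(String[] args) throws ScriptException {
        // Assumption: the Renjin jar is on the classpath and exposes an
        // engine called "Renjin"; otherwise the lookup returns null.
        ScriptEngine engine = findEngine("Renjin");
        if (engine == null) {
            System.out.println("No Renjin script engine found on the classpath");
            return;
        }
        // Ordinary R code, evaluated inside the running JVM.
        engine.eval("x <- c(1, 2, 3, 4); print(mean(x))");
    }
}
```

The same pattern would apply inside a servlet or an application server, which is presumably what makes the PaaS deployment described above practical.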
However, I did test it, and I think the R and Clojure communities, and even the professional R product companies, can do a bit more to support R on the JVM.
I would also be careful about the licenses of the Java flavor used 😉
Nope, Brian Ripley is still benevolent dictator for life at R. He won’t be losing any sleep over this new fork of R!
But seriously 😉 !