Oracle adds R to Big Data Appliance - Use #Rstats

From the press release: Oracle gets on board with R and, me-too style, with NoSQL.

http://www.oracle.com/us/corporate/press/512001

The Oracle Big Data Appliance is a new engineered system that includes an open source distribution of Apache™ Hadoop™, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.

From

http://www.theregister.co.uk/2011/10/03/oracle_big_data_appliance/

the Big Data Appliance also includes the R programming language, a popular open source statistical-analysis tool. This R engine will integrate with 11g R2, so presumably if you want to do statistical analysis on unstructured data stored in and chewed by Hadoop, you will have to move it to Oracle after the chewing has subsided.

This approach to R-Hadoop integration is different from that announced last week between Revolution Analytics, the so-called Red Hat for stats that is extending and commercializing the R language and its engine, and Cloudera, which sells a commercial Hadoop setup called CDH3 and which was one of the early companies to offer support for Hadoop. Both Revolution Analytics and Cloudera now have Oracle as their competitor, which was no doubt no surprise to either.

In any event, the way they do it, the R engine is put on each node in the Hadoop cluster, and those R engines just see the Hadoop data as a native format that they can do analysis on individually. As statisticians do analyses on data sets, the summary data from all the nodes in the Hadoop cluster is sent back to their R workstations; they have no idea that they are using MapReduce on unstructured data.

Oracle did not supply configuration and pricing information for the Big Data Appliance, and also did not say when it would be for sale or shipping to customers.
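To make the per-node idea in the excerpt above concrete, here is a minimal Python sketch of the pattern it describes: each node summarizes only its own slice of the data, and only the small summaries travel back to be combined. This is my own illustration, not Oracle's, Revolution Analytics' or Cloudera's code; the function names and the sample partitions are made up.

```python
from functools import reduce

def summarize(partition):
    """Map step: one node reduces its local records to (count, sum, sum of squares)."""
    return (len(partition), sum(partition), sum(x * x for x in partition))

def combine(left, right):
    """Reduce step: merge two partial summaries element by element."""
    return tuple(a + b for a, b in zip(left, right))

# Hypothetical data, already split across three "nodes".
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

# Only the three-number summaries are combined, never the raw records.
count, total, total_sq = reduce(combine, map(summarize, partitions))
mean = total / count
variance = total_sq / count - mean ** 2  # population variance

print(count, mean, variance)
```

The statistician at the workstation only ever sees `count`, `mean` and `variance`; how many nodes produced the partial summaries is invisible, which is the point the excerpt makes.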

From

http://www.oracle.com/us/corporate/features/feature-oracle-nosql-database-505146.html

A Horizontally Scaled, Key-Value Database for the Enterprise
Oracle NoSQL Database is a commercial grade, general-purpose NoSQL database using a key/value paradigm. It allows you to manage massive quantities of data, cope with changing data formats, and submit simple queries. Complex queries are supported using Hadoop or Oracle Database operating upon Oracle NoSQL Database data.

Oracle NoSQL Database delivers scalable throughput with bounded latency, easy administration, and a simple programming model. It scales horizontally to hundreds of nodes with high availability and transparent load balancing. Customers might choose Oracle NoSQL Database to support Web applications, acquire sensor data, scale authentication services, or support online services and social media.
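For readers new to the key/value paradigm, a generic sketch in Python may help. It is not Oracle's API (the excerpt does not show one); the class name, method names and the hash-based placement of keys on shards are all assumptions chosen for illustration of how a key/value store can spread data across nodes.

```python
import hashlib

class ShardedKeyValueStore:
    """Toy key/value store: each key lives on exactly one of several in-memory shards."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash, so the same key always lands on the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self._shard_for(key)].get(key, default)

store = ShardedKeyValueStore(shard_count=4)
store.put("sensor/42/temperature", 21.5)
store.put("user/alice/session", {"token": "abc"})

print(store.get("sensor/42/temperature"))
```

Lookups are simple queries by key; anything more complex (joins, aggregates) has to be done elsewhere, which matches the excerpt's note that complex queries go through Hadoop or Oracle Database.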

and

from

http://siliconangle.com/blog/2011/09/30/oracle-adopting-open-source-r-to-connect-legacy-systems/

Oracle says it will integrate R with its Oracle Database. Other signs from Oracle show the deeper interest in using the statistical framework for integration with Hadoop to potentially speed statistical analysis. This has particular value with analyzing vast amounts of unstructured data, which has overwhelmed organizations, especially over the past year.

and

from

http://www.oracle.com/us/corporate/features/features-oracle-r-enterprise-498732.html

Oracle R Enterprise

Integrates the Open-Source Statistical Environment R with Oracle Database 11g
Oracle R Enterprise allows analysts and statisticians to run existing R applications and use the R client directly against data stored in Oracle Database 11g—vastly increasing scalability, performance and security. The combination of Oracle Database 11g and R delivers an enterprise-ready, deeply integrated environment for advanced analytics. Users can also use analytical sandboxes, where they can analyze data and develop R scripts for deployment while results stay managed inside Oracle Database.

Rcpp Workshop in San Francisco Oct 8th

Following the successful one-day master class on Rcpp preceding this year’s R/Finance conference, a full-day master class on Rcpp and related topics will be held on Saturday, October 8, in San Francisco.

Join Dirk Eddelbuettel for six hours of detailed and hands-on instructions and discussions around Rcpp, inline, RInside, RcppArmadillo, RcppGSL, RcppEigen and other packages, in an intimate small-group setting.

The full-day format allows combining an introductory morning session with a more advanced afternoon session while leaving room for sufficient breaks. We plan on having about six hours of instructions, a one-hour lunch break and two half-hour coffee breaks (and lunch and refreshments will be provided).

Morning session: “A Hands-on Introduction to R and C++”

The morning session will provide a practical introduction to the Rcpp package (and other related packages).  The focus will be on simple and straightforward applications of Rcpp in order to extend R and/or to significantly accelerate the execution of simple functions.

The tutorial will cover the inline package which permits embedding of self-contained C, C++ or FORTRAN code in R scripts. We will also discuss RInside, to easily embed the R engine code in C++ applications, as well as standard Rcpp extension packages such as RcppArmadillo and RcppEigen for linear algebra (via highly expressive templated C++ libraries) and RcppGSL.

Afternoon session: “Advanced R and C++ Topics”

The afternoon tutorial will provide a hands-on introduction to more advanced Rcpp features. It will cover topics such as writing packages that use Rcpp, how Rcpp modules and the new R ReferenceClasses interact, and how Rcpp sugar lets us write C++ code that is often as expressive as R code. Another possible topic, time permitting, may be writing glue code to extend Rcpp to other C++ projects.

We also expect to leave some time to discuss problems brought by the class participants.

October 8, 2011 – San Francisco

AMA Executive Conference Center
@ the Marriott Hotel
55 4th Street, 2nd Level
San Francisco, CA 94103
Tel. 415-442-6770

Register Now!

Instructor Bio

Dirk Eddelbuettel has been contributing packages to CRAN for nearly a decade. Among these are RQuantLib, digest, littler, random, RPostgreSQL, as well as the Rcpp family of packages comprising Rcpp, RInside, RcppClassic, RcppExamples, RcppDE, RcppArmadillo and RcppEigen. He maintains the CRAN Task Views for Finance as well as High-Performance Computing, and is a founding co-organiser of the annual R / Finance conferences in Chicago. He has a Ph.D. in Financial Econometrics from EHESS (Paris), and works in Chicago as a Quantitative Strategist.

Jaspersoft releasing new version – 4.2

Jaspersoft is planning to launch its version 4.2 to the world.

http://www.jaspersoft.com/event/upcoming-webinar-introducing-jaspersoft-42?elq=c0e7a97601f84a8399b1abc5cc84bbe5

Upcoming Webinar: Introducing Jaspersoft 4.2

Webinar

Date: September 29, 2011
Time: 10:00 AM PT/1:00 PM ET
Duration: 60 minutes
Language: English

Whether you're building business intelligence (BI) solutions for your organization or for your customers, one thing is likely: your users want access to information anytime, anywhere. The challenge is getting the right information, to the right person, on the right device, without breaking your budget.

You can see precisely what we mean at the Jaspersoft 4.2 launch webinar on Thursday, September 29th.

Join us and see how Jaspersoft 4.2 can deliver superior choice for organizations looking to deliver information to end users, wherever they are.  Jaspersoft is focused on providing modern, usable, affordable BI for everyone.
  • Discover the new product capabilities that will improve BI access for your users
  • See Jaspersoft 4.2 live demos
  • Join Jaspersoft experts and fellow technology professionals in a real-time, interactive discussion.

Register to reserve your seat, today!

Open Source Analytics

My guest blog at Allanalytics.com is now up.

http://www.allanalytics.com/ is an exciting community that looks at the business aspects of the analytics market, with a great lineup of pedigreed writers.

It is basically a point/counterpoint for and against open source analytics. I feel there is scope for a lot of improvement before open source dominates the world of analytics software the way Android and Linux web servers do in their markets. Part of the reason is that there needs to be more, much more, investment in analytics research, development, easier-to-use interfaces, Big Data integration, and rewarding ALL the writers of code, regardless of whether the code is proprietary or open source.

A last word: I think open source analytics AND proprietary analytics software will have to learn to live with one another, with game theory dictating their response and counter-response. More competition is good, and open source is an AND option, not an OR option, to the existing status quo.

You can read the full blog discussion at http://www.allanalytics.com/author.asp?section_id=1408&doc_id=233454&piddl_msgorder=thrd#msgs

Hopefully the discussion will be more analytical than passionate 🙂 and greater investments will be made in analytics by all sides.

SAP HANA Contest

From SAP, a new contest based on HANA:

http://wiki.sdn.sap.com/wiki/display/events/SAP%20HANA%20InnoJam%20Online%202011?bc=true

Become a champion and join our new developer challenge! Explore SAP HANA, share a critical business problem you want to solve on Idea Place for SAP HANA InnoJam Online, and describe your idea. Your participation starts here today, 13th September. The first 100 submitters will play on SAP HANA and get a chance to shine and win big prizes! To be prepared for the new challenge, please read “All sources you might need for your SAP HANA training” and the first chapters of the SAP HANA Pocketbook (coming soon), and read more on the official SAP HANA webpage.

Phase 1

  • Share a critical business problem you want to solve on Idea Place; the place is open from September 13th, 2011
  • First 100 submitters go to Phase 2

Phase 2

  • Join SAP HANA Champions community, get started with HANA, build teams
  • Access SAP HANA sandbox
  • Use SAP HANA to solve your business problem
  • Get help from SAP experts if and when needed

Phase 3

  • Share your solution: written description and a short video
  • Top 8 finalists go to Phase 4

Phase 4

  • 8 finalists present their problem/solution/feedback in front of SAP executives and a panel of judges (early January in Palo Alto)
  • Winner prizes will be identified soon

Use R for Business - Competition worth $20,000 #rstats

All you contest junkies, R lovers and general change-the-world people, here’s a new contest to use R in a business application.

http://www.revolutionanalytics.com/news-events/news-room/2011/revolution-analytics-launches-applications-of-r-in-business-contest.php

REVOLUTION ANALYTICS LAUNCHES “APPLICATIONS OF R IN BUSINESS” CONTEST

$20,000 in Prizes for Users Solving Business Problems with R

 

PALO ALTO, Calif. – September 1, 2011 – Revolution Analytics, the leading commercial provider of R software, services and support, today announced the launch of its “Applications of R in Business” contest to demonstrate real-world uses of applying R to business problems. The competition is open to all R users worldwide and submissions will be accepted through October 31. The Grand Prize winner for the best application using R or Revolution R will receive $10,000.

The bonus-prize winner for the best application using features unique to Revolution R Enterprise – such as its big-data analytics capabilities or its Web Services API for R – will receive $5,000. A panel of independent judges drawn from the R and business community will select the grand and bonus prize winners. Revolution Analytics will present five honorable mention prize winners each with $1,000.

“We’ve designed this contest to highlight the most interesting use cases of applying R and Revolution R to solving key business problems, such as Big Data,” said Jeff Erhardt, COO of Revolution Analytics. “The ability to process higher-volume datasets will continue to be a critical need and we encourage the submission of applications using large datasets. Our goal is to grow the collection of online materials describing how to use R for business applications so our customers can better leverage Big Analytics to meet their analytical and organizational needs.”

To enter Revolution Analytics’ “Applications of R in Business” competition, see the announcement linked above.

Cloud Computing using Python

I liked the new features in PiCloud, which is a cloud computing way to use Python. Python is increasingly popular as a computational language, and the cloud is where hardware is headed, at least as of 2011-12.

http://www.picloud.com/

The new feature allows you to publish your own functions as URLs.

Why would you want to publish your Python functions to URLs?

  • To call your Python functions from a programming language other than Python.
  • To use PiCloud from Google AppEngine, which does not support our native client library.
  • To easily setup a scalable RPC system.

Here’s a peek at the interface:

You publish a Python function

cloud.rest.publish(your_func, 'myfunction')

We give you a URL back

https://api.picloud.com/r/2/myfunction/

You make an HTTP request using your method of choice to the URL

curl -k -u 'key:secret_key' https://api.picloud.com/r/2/myfunction/
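If you would rather make that call from Python than from curl, here is a rough sketch using only the standard library. It assumes, as curl's -u flag suggests, that the endpoint accepts HTTP Basic authentication; the helper name build_request is mine, and 'key' and 'secret_key' are the same placeholders as in the curl line, not real credentials.

```python
import base64
import urllib.request

def build_request(url, key, secret_key):
    """Build (but do not send) a GET request carrying HTTP Basic auth, like curl -u."""
    token = base64.b64encode(f"{key}:{secret_key}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request

request = build_request("https://api.picloud.com/r/2/myfunction/", "key", "secret_key")

# Sending is left commented out: it needs real credentials and a published function.
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Note that curl's -k switches off certificate verification; the sketch keeps Python's default verification instead.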

It certainly is an interesting development and I am wondering how other languages can adopt this paradigm as well.
For R, as of now http://www.cloudnumbers.com/ seems to be the only player in the cloud.
It would be exciting to see more players in the cloud statistical analytics space.