2011 Analytics Recap

Events in the field of data that impacted us in 2011

1) Oracle unveiled plans for R Enterprise. This is one of the strongest statements of its focus on in-database analytics. Oracle also unveiled plans for a Public Cloud

2) SAS Institute released version 9.3 , a major analytics software in industry use.

3) IBM acquired many companies in analytics and high tech. Again.However the expected benefits from Cognos-SPSS integration are yet to show a spectacular change in market share.

2011 Selected acquisitions

Emptoris Inc. December 2011

Cúram Software Ltd. December 2011

DemandTec December 2011

Platform Computing October 2011

 Q1 Labs October 2011

Algorithmics September 2011

 i2 August 2011

Tririga March 2011

 

4) SAP promised a lot with SAP HANA- again no major oohs and ahs in terms of market share fluctuations within analytics.

http://www.sap.com/india/news-reader/index.epx?articleID=17619

5) Amazon continued to lower prices of cloud computing and offer more options.

http://aws.amazon.com/about-aws/whats-new/2011/12/21/amazon-elastic-mapreduce-announces-support-for-cc2-8xlarge-instances/

6) Google continues to dilly -dally with its analytics and cloud based APIs. I do not expect all the APIs in the Google APIs suit to survive and be viable in the enterprise software space.  This includes Google Cloud Storage, Cloud SQL, Prediction API at https://code.google.com/apis/console/b/0/ Some of the location based , translation based APIs may have interesting spin offs that may be very very commercially lucrative.

7) Microsoft -did- hmm- I forgot. Except for its investment in Revolution Analytics round 1 many seasons ago- very little excitement has come from MS plans in data mining- The plugins for cloud based data mining from Excel remain promising yet , while Azure remains a stealth mode starter.

8) Revolution Analytics promised us a GUI and didnt deliver (till yet 🙂 ) . But it did reveal a much better Enterprise software Revolution R 5.0 is one of the strongest enterprise software in the R /Stat Computing space and R’s memory handling problem is now an issue of perception than actual stuff thanks to newer advances in how it is used.

9) More conferences, more books and more news on analytics startups in 2011. Big Data analytics remained a strong buzzword. Expect more from this space including creative uses of Hadoop based infrastructure.

10) Data privacy issues continue to hamper and impede effective analytics usage. So does rational and balanced regulation in some of the most advanced economies. We expect more regulation and better guidelines in 2012.

Revolution Webinar Series #Rstats

Revolution Analytics Webinar-

 

Featured Webinar
David Champagne REGISTER NOW
Presenter David Champagne
CTO, Revolution Analytics
Date Tuesday, December 20th
Time 11:00AM – 11:30AM Pacific 
Click here for the webinar time in your local time zone

Big Data Starts with R

Traditional IT infrastructure is simply unable to meet

the demands of the new “Big Data Analytics” landscape.   Many enterprises are turning to the “R” statistical programming language and Hadoop (both open source projects) as a potential solution. This webinar will introduce the statistical capabilities of R within the Hadoop ecosystem.  We’ll cover:

  • An introduction to new packages developed by Revolution Analytics to facilitate interaction with the data stores HDFS and HBase so that they can be leveraged from the R environment
  • An overview of how to write Map Reduce jobs in R using Hadoop
  • Special considerations that need to be made when working with R and Hadoop.

We’ll also provide additional resources that are available to people interested in integrating R and Hadoop.

 

Upcoming Webinars
Wed, Dec 14th
11:00AM – 11:30AM PT
Revolution R Enterprise – 100% R and MoreR users already know why the R language is the lingua franca of statisticians today: because it’s the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise. In this webinar, author and blogger David Smith will introduce the additional capabilities of Revolution R Enterprise.
 Archived Webinars-
Revolution Webinar: New Features in Revolution R Enterprise 5.0 (including RevoScaleR) to Support Scalable Data AnalysisRevolution R Enterprise 5.0 is Revolution Analytics’ scalable analytics platform.  At its core is Revolution Analytics’ enhanced Distribution of R, the world’s most widely-used project for statistical computing.  In this webinar, Dr. Ranney will discuss new features and show examples of the new functionality, which extend the platform’s usability, integration and scalability

 

Product Review – Revolution R 5.0

So I got the email from Revolution R. Version 5.0 is ready for download, and unlike half hearted attempts by many software companies they make it easy for the academics and researchers to get their free copy. Free as in speech and free as in beer.

Some thoughts-

1) R ‘s memory problem is now an issue of marketing and branding. Revolution Analytics has definitely bridged this gap technically  beautifully and I quote from their documentation-

The primary advantage 64-bit architectures bring to R is an increase in the amount of memory available to a given R process.
The first benefit of that increase is an increase in the size of data objects you can create. For example, on most 32-bit versions of R, the largest data object you can create is roughly 3GB; attempts to create 4GB objects result in errors with the message “cannot allocate vector of length xxxx.”
On 64-bit versions of R, you can generally create larger data objects, up to R’s current hard limit of 231 􀀀 1 elements in a vector (about 2 billion elements). The functions memory.size and memory.limit help you manage the memory used byWindows versions of R.
In 64-bit Revolution R Enterprise, R sets the memory limit by default to the amount of physical RAM minus half a gigabyte, so that, for example, on a machine with 8GB of RAM, the default memory limit is 7.5GB:

2) The User Interface is best shown as below or at  https://docs.google.com/presentation/pub?id=1V_G7r0aBR3I5SktSOenhnhuqkHThne6fMxly_-4i8Ag&start=false&loop=false&delayms=3000

-(but I am still hoping for the GUI ,Revolution Analytics promised us for Christmas)

3) The partnership with Microsoft HPC is quite awesome given Microsoft’s track record in enterprise software penetration

but I am also interested in knowing more about the Oracle version of R and what it will do there.

Oracle adds R to Big Data Appliance -Use #Rstats

From the press release, Oracle gets on R and me too- NoSQL

http://www.oracle.com/us/corporate/press/512001

The Oracle Big Data Appliance is a new engineered system that includes an open source distribution of Apache™ Hadoop™, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.

From

http://www.theregister.co.uk/2011/10/03/oracle_big_data_appliance/

the Big Data Appliance also includes the R programming language, a popular open source statistical-analysis tool. This R engine will integrate with 11g R2, so presumably if you want to do statistical analysis on unstructured data stored in and chewed by Hadoop, you will have to move it to Oracle after the chewing has subsided.

This approach to R-Hadoop integration is different from that announced last week between Revolution Analytics, the so-called Red Hat for stats that is extending and commercializing the R language and its engine, and Cloudera, which sells a commercial Hadoop setup called CDH3 and which was one of the early companies to offer support for Hadoop. Both Revolution Analytics and Cloudera now have Oracle as their competitor, which was no doubt no surprise to either.

In any event, the way they do it, the R engine is put on each node in the Hadoop cluster, and those R engines just see the Hadoop data as a native format that they can do analysis on individually. As statisticians do analyses on data sets, the summary data from all the nodes in the Hadoop cluster is sent back to their R workstations; they have no idea that they are using MapReduce on unstructured data.

Oracle did not supply configuration and pricing information for the Big Data Appliance, and also did not say when it would be for sale or shipping to customers

From

http://www.oracle.com/us/corporate/features/feature-oracle-nosql-database-505146.html

A Horizontally Scaled, Key-Value Database for the Enterprise
Oracle NoSQL Database is a commercial grade, general-purpose NoSQL database using a key/value paradigm. It allows you to manage massive quantities of data, cope with changing data formats, and submit simple queries. Complex queries are supported using Hadoop or Oracle Database operating upon Oracle NoSQL Database data.

Oracle NoSQL Database delivers scalable throughput with bounded latency, easy administration, and a simple programming model. It scales horizontally to hundreds of nodes with high availability and transparent load balancing. Customers might choose Oracle NoSQL Database to support Web applications, acquire sensor data, scale authentication services, or support online serves and social media.

and

from

http://siliconangle.com/blog/2011/09/30/oracle-adopting-open-source-r-to-connect-legacy-systems/

Oracle says it will integrate R with its Oracle Database. Other signs from Oracle show the deeper interest in using the statistical framework for integration with Hadoop to potentially speed statistical analysis. This has particular value with analyzing vast amounts of unstructured data, which has overwhelmed organizations, especially over the past year.

and

from

http://www.oracle.com/us/corporate/features/features-oracle-r-enterprise-498732.html

Oracle R Enterprise

Integrates the Open-Source Statistical Environment R with Oracle Database 11g
Oracle R Enterprise allows analysts and statisticians to run existing R applications and use the R client directly against data stored in Oracle Database 11g—vastly increasing scalability, performance and security. The combination of Oracle Database 11g and R delivers an enterprise-ready, deeply integrated environment for advanced analytics. Users can also use analytical sandboxes, where they can analyze data and develop R scripts for deployment while results stay managed inside Oracle Database.

Use R for Business- Competition worth $ 20,000 #rstats

All you contest junkies, R lovers and general change the world people, here’s a new contest to use R in a business application

http://www.revolutionanalytics.com/news-events/news-room/2011/revolution-analytics-launches-applications-of-r-in-business-contest.php

REVOLUTION ANALYTICS LAUNCHES “APPLICATIONS OF R IN BUSINESS” CONTEST

$20,000 in Prizes for Users Solving Business Problems with R

 

PALO ALTO, Calif. – September 1, 2011 – Revolution Analytics, the leading commercial provider of R software, services and support, today announced the launch of its “Applications of R in Business” contest to demonstrate real-world uses of applying R to business problems. The competition is open to all R users worldwide and submissions will be accepted through October 31. The Grand Prize winner for the best application using R or Revolution R will receive $10,000.

The bonus-prize winner for the best application using features unique to Revolution R Enterprise – such as itsbig-data analytics capabilities or its Web Services API for R – will receive $5,000. A panel of independent judges drawn from the R and business community will select the grand and bonus prize winners. Revolution Analytics will present five honorable mention prize winners each with $1,000.

“We’ve designed this contest to highlight the most interesting use cases of applying R and Revolution R to solving key business problems, such as Big Data,” said Jeff Erhardt, COO of Revolution Analytics. “The ability to process higher-volume datasets will continue to be a critical need and we encourage the submission of applications using large datasets. Our goal is to grow the collection of online materials describing how to use R for business applications so our customers can better leverage Big Analytics to meet their analytical and organizational needs.”

To enter Revolution Analytics’ “Applications of R in Business” competition Continue reading “Use R for Business- Competition worth $ 20,000 #rstats”

Revolution #Rstats Webinar

David Smith of Revo presents a nice webinar on the capabilities and abilities of Revolution R- if you are R curious and wonder how the commercial version has matured- you may want to take a look.

click below to view an executive Webinar

——————————————————————————————-

Revolution R Enterprise—presented by author and blogger David Smith:

Revolution R: 100% R and More
On-Demand Webinar

This Webinar covers how R users can upgrade to:

  • Multi-processor speed improvements and parallel processing
  • Productivity and debugging with an integrated development environment (IDE) for the R language
  • “Big Data” analysis, with out-of-memory storage of multi-gigabyte data sets
  • Web Services for R, to integrate R computations and graphics into 3rd-Party applications like Excel and BI Dashboards
  • Expert technical support and consulting services for R

This webinar will be of value to current R users who want to learn more about the additional capabilities of Revolution R Enterprise to enhance the productivity, ease of use, and enterprise readiness of open source R. R users in academia will also find this webinar valuable: we will explain how all members of the academic community can obtain Revolution R Enterprise free of charge.

—————————————————————————————

contact -1-855-GET-REVO or via online form.
info@revolutionanalytics.com | (650) 330-0553 | Twitter @RevolutionR

Interview Mike Boyarski Jaspersoft

Here is an interview with Mike Boyarski , Director Product Marketing at Jaspersoft

.

 

the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries.

Ajay- Describe your career in science from Biology to marketing great software.
Mike- I studied Biology with the assumption I’d pursue a career in medicine. It took about 2 weeks during an internship at a Los Angeles hospital to determine I should do something else.  I enjoyed learning about life science, but the whole health care environment was not for me.  I was initially introduced to enterprise-level software while at Applied Materials within their Microcontamination group.  I was able to assist with an internal application used to collect contamination data.  I later joined Oracle to work on an Oracle Forms application used to automate the production of software kits (back when documentation and CDs had to be physically shipped to recognize revenue). This gave me hands on experience with Oracle 7, web application servers, and the software development process.
I then transitioned to product management for various products including application servers, software appliances, and Oracle’s first generation SaaS based software infrastructure. In 2006, with the Siebel and PeopleSoft acquisitions underway, I moved on to Ingres to help re-invigorate their solid yet antiquated technology. This introduced me to commercial open source software and the broader Business Intelligence market.  From Ingres I joined Jaspersoft, one of the first and most popular open source Business Intelligence vendors, serving as head of product marketing since mid 2009.
Ajay- Describe some of the new features in Jaspersoft 4.1 that help differentiate it from the rest of the crowd. What are the exciting product features we can expect from Jaspersoft down the next couple of years.
Mike- Jaspersoft 4.1 was an exciting release for our customers because we were able to extend the latest UI advancements in our ad hoc report designer to the data analysis environment. Now customers can use a unified intuitive web-based interface to perform several powerful and interactive analytic functions across any data source, whether its relational, non-relational, or a Big Data source.
 The reality is that most (roughly 70%) of todays BI adoption is in the form of reports and dashboards. These tools are used to drive and measure an organizations business, however, data analysis presents the most strategic opportunity for companies because it can identify new opportunities, efficiencies, and competitive differentiation.  As more data comes online, the difference between those companies that are successful and those that are not will likely be attributed to their ability to harness data analysis techniques to drive and improve business performance. Thus, with Jaspersoft 4.1, and our improved ad hoc reporting and analysis UI we can effectively address a broader set of BI requirements for organizations of all sizes.
Ajay-  What do you think is a good metric to measure influence of an open source software product – is it revenue or is it number of downloads or number of users. How does Jaspersoft do by these counts.
Mike- History has shown that open source software is successful as a “bottoms up” disrupter within IT or the developer market.  Today, many new software projects and startup ventures are birthed on open source software, often initiated with little to no budget. As the organization achieves success with a particular project, the next initiative tends to be larger and more strategic, often displacing what was historically solved with a proprietary solution. These larger deployments strengthen the technology over time.
Thus, the more proven and battle tested an open source solution is, often measured via downloads, deployments, community size, and community activity, usually equates to its long term success. Linux, Tomcat, and MySQL have plenty of statistics to model this lifecycle. This model is no different for open source BI.
The success to date of Jaspersoft is directly tied to its solid proven technology and the vibrancy of the community.  We proudly and openly claim to have the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries.  Every day, 30,000 developers are using Jaspersoft to build BI applications.  Behind Excel, its hard to imagine a more widely used BI tool in the market.  Jaspersoft could not reach these kind of numbers with crippled or poorly architected software.
Ajay- What are your plans for leveraging cloud computing, mobile and tablet platforms and for making Jaspersoft more easy and global  to use.