Marry Big Data Analytics to High Performance Computing, and you get the buzzword of this season: High Performance Analytics.
It basically consists of parallelized code running on custom hardware, in-database analytics for speed, and cloud computing or high performance computing environments. On an operational level, it consists of software (as in analytics) partnering with software (as in databases, MapReduce, Hadoop) plus some hardware (HP or IBM mostly). It is considered a high-margin, highly profitable business, with a small number of deals compared to, say, desktop licenses.
According to HPC Wire, which is a great newsletter for keeping up to date on HPC, SAS Institute has been busy on this front, partnering with EMC Greenplum and Teradata (which also acquired SAS partner Aster Data to gain a much-needed foothold in the MapReduce/SQL space), while Revolution Analytics has been trying out a partnership with IBM (via its acquisition of Netezza).
SAS is considered the undisputed leader in advanced analytics — that according to IDC who, in 2009, pegged the company with a 34.7 percent market share in this category. A subset of business analytics, advanced analytics uses compute-intensive data mining and statistical software techniques to extract complex relationships from databases. For SAS, it’s a half a billion dollar business.
In early April, SAS demonstrated the power of high performance analytics at its Global Forum meeting. In the first case, two racks (16 nodes) of Greenplum’s Data Computing Appliance (DCA) were used to run a logistic regression of bank loan defaults across a database with a billion records, applying just a few variables. The regression was able to complete in less than 80 seconds (as compared to 20 hours for an unspecified serial implementation). Another demonstration, this time on a 24-node Teradata platform, used 1,800 variables applied to 50 million observations. In this case, the analysis finished in 42 seconds.
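To give a feel for why a regression like this spreads well across nodes, here is a minimal sketch in plain Python. It is not SAS's or Greenplum's implementation; the loan data is synthetic, and the function names, the node count and the learning rate are all assumptions made for illustration. Each partition stands in for one node: it computes a partial gradient over its own rows, and the partial gradients are summed before the weights are updated, which is the role an all-reduce step would play on a real cluster.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def partial_gradient(rows, weights):
    """Gradient of the log-loss over one partition (one hypothetical node)."""
    grad = [0.0] * len(weights)
    for features, label in rows:
        error = sigmoid(sum(w * x for w, x in zip(weights, features))) - label
        for j, x in enumerate(features):
            grad[j] += error * x
    return grad


def fit_logistic(partitions, n_features, steps=200, lr=0.5):
    """Gradient descent where each step sums the per-partition gradients."""
    n_rows = sum(len(p) for p in partitions)
    weights = [0.0] * n_features
    for _ in range(steps):
        # On a cluster these calls would run concurrently, one per node.
        # Here they run one after another in a single process.
        grads = [partial_gradient(p, weights) for p in partitions]
        # Combine step: element-wise sum of the partial gradients.
        total = [sum(g[j] for g in grads) for j in range(n_features)]
        weights = [w - lr * t / n_rows for w, t in zip(weights, total)]
    return weights


def make_toy_loans(n, seed=0):
    """Synthetic rows: [bias, debt_ratio] -> default flag. Purely illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        debt_ratio = rng.uniform(-1.0, 1.0)
        p_default = sigmoid(-0.5 + 3.0 * debt_ratio)
        rows.append(([1.0, debt_ratio], 1 if rng.random() < p_default else 0))
    return rows


rows = make_toy_loans(2000)
n_nodes = 4  # hypothetical node count
partitions = [rows[i::n_nodes] for i in range(n_nodes)]
weights = fit_logistic(partitions, n_features=2)
print(weights)
```

Because the gradient is just a sum over rows, splitting the rows across partitions gives the same fit as a single pass over all the data (up to floating-point rounding). Only the small gradient vectors need to travel between nodes, not the records themselves.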
You should read the complete HPC Wire article. It is an excellent example of how technology should be written about, with complete details of hardware and software across two platforms, and very little of the lazy copy and paste from briefings, decks and PR that some other tech journalists are prone to.
An additional resource for keeping track of database technologies is DBMS2, written by Curt Monash. What I really like is that Curt takes the time to climb down from the pundit’s pulpit and explain things coherently and concisely, in terms people like me can understand. You should read his full article; what follows is just a summary.
- SAS no longer plans to go as far with in-database modeling as it previously intended.
- Rather, SAS plans to run in RAM on MPP DBMS appliances, exploiting MPI (Message Passing Interface).
- SAS HPA does make sense after all (dbms2.com)
- Greenplum & SAS Pair on ‘Big Analytics’ (java.sys-con.com)
- EMC Greenplum inks SAS partnership, launches new appliances (zdnet.com)
- Revolution Analytics Offers Free Software for Kaggle Competitors (readwriteweb.com)
- Revolution Analytics update (dbms2.com)
- Unpacking the EMC Greenplum Q1 sales disaster rumors (dbms2.com)
- IBM and Revolution team to create new in-database R (decisionstats.com)
- Cisco UCS Leads the Industry in Server Performance and Productivity (blogs.cisco.com)