#Rstats gets into Enterprise Cloud Software

Defense Agencies of the United States Departme...
Image via Wikipedia

Here is an excellent example of how websites should help rather than hinder new customers take a demo of the software without being overwhelmed by sweet talking marketing guys who dont know the difference between heteroskedasticity, probability, odds and likelihood.

It is made by Zementis (Dr Michael Zeller has been a frequent guest here) and Revolution Analytics is still the best shot in Enterprise software for #Rstats

Now if only Revo could get into the lucrative Department of Energy or Department of Defense business- they could change the world AND earn some more revenue than they have been doing. But seriously.

Check out http://deployr.revolutionanalytics.com/zementis/ and play with it. or better still mash it with some data viz and ROC curves.- or extend it with some APIS 😉

Analyzing Judas of Lady Gaga

Youtube, the big tube of the fat pipes of the internet’s shallow side (we shall discuss the deep Dark Net later)-

well Youtube enhanced video analytics greatly.

Have a look at Lady Gaga Judas video analytics data viz- do you think you can pack more or better data viz here.

Analytics is great- but Youtube please be a dear and hire some graphics designers once in a while. Nopes- not the ones your engineers are dating, but real graphics designers.

Tom Davenport to Keynote at PAW New York

Unidentified building, Babson College - IMG 0443
Image via Wikipedia

message from Predictive Analytics World. If you are NY based you may want to drop in and listen.———————————————————————————-Tom Davenport to Keynote at
Predictive Analytics World New York

Take advantage of Super Early Bird Pricing by May 20th and recognize savings of $400. Additional savings when you bring the team*

Announcing Tom Davenport Keynote:
Thomas Davenport Every Day Analytics:
Making Leading Edge Commonplace
Thomas Davenport
President’s Distinguished Prof, Babson College
Author, Competing on Analytics & Analytics at Work

Join your peers October 17-21, 2011 at the Hilton New York for Predictive Analytics World, the business event for predictive analytics professionals, managers and commercial practitioners, covering today’s commercial deployment of predictive analytics, across industries and across software vendors.

PAW NYC
 promises to once again break records as the biggest cross-vendor predictive analytics event ever. The conference program is packed with the top predictive analytics experts, practitioners, authors and business thought leaders, including keynote addresses from Thomas Davenport, author of Competing on Analytics: The New Science of Winning, and PAW Program Chair Eric Siegel, plus special sessions from industry heavy-weights Usama Fayyad and John Elder.

RAVE REVIEWS:I came to PAW because it provides case studies relevant to my industry. It has lived up to the expectation and I think it’s the best analytics conference I’ve ever attended!

Shaohua Zhang, Senior Data Mining Analyst
Rogers Telecommunications

Hands down, best applied analytics conference I have ever attended. Great exposure to cutting-edge predictive techniques and I was able to turn around and apply some of those learnings to my work immediately. I’ve never been able to say that after any conference I’ve attended before!

Jon Francis, Senior Statistician
T-Mobile

PAW NYC’s agenda covers black box trading, churn modeling, crowdsourcing, demand forecasting, ensemble models, fraud detection, healthcare, insurance applications, law enforcement, litigation, market mix modeling, mobile analytics, online marketing, risk management, social data, supply chain management, targeting direct marketing, uplift modeling (net lift), and other innovative applications that benefit organizations in new and creative ways.


Take advantage of Super Early Bird Pricing and realize
$400 in savings before May 20, 2011.

Note:  Each additional attendee from the same company registered at the same time receives an extra $200 off the Conference Pass.

Register Now!


eMetrics New York

Heritage offers 3 million chump change for Monkeys

My perspective is life is not fair, and if someone offers me 1 mill a year so they make 1 bill a year, I would still take it, especially if it leads to better human beings and better humanity on this planet. Health care isnt toothpaste.

Unless there are even more fine print changes involved- there exist several players in the pharma sector who do build and deploy models internally for denying claims or prospecting medical doctors with freebies, but they might just get caught with the new open data movement

————————————————————————————————–

A note from KDNuggets-

Heritage Health Prizereleased a second set of data on May 4. They also recently modified their ruleswhich now demand complete exclusivity and seem to disallow use of other tools (emphasis mine – Gregory PS)

21. LICENSE
By registering for the Competition, each Entrant (a) grants to Sponsor and its designees a worldwide, exclusive (except with respect to Entrant) , sub-licensable (through multiple tiers), transferable, fully paid-up, royalty-free, perpetual, irrevocable right to use, not use, reproduce, distribute (through multiple tiers), create derivative works of, publicly perform, publicly display, digitally perform, make, have made, sell, offer for sale and import the entry and the algorithm used to produce the entry, as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition (collectively, the “Licensed Materials”), in any media now known or hereafter developed, for any purpose whatsoever, commercial or otherwise, without further approval by or payment to Entrant (the “License”) and
(b) represents that he/she/it has the unrestricted right to grant the License. 
Entrant understands and agrees that the License is exclusive except with respect to Entrant: Entrant may use the Licensed Materials solely for his/her/its own patient management and other internal business purposes but may not grant or otherwise transfer to any third party any rights to or interests in the Licensed Materials whatsoever.

This has lead to a call to boycott the competition by Tristan, who also notes that academics cannot publish their results without prior written approval of the Sponsor.

Anthony Goldbloom, CEO of Kaggle, emailed the HHP participants on May 4

HPN have asked me to pass on the following message: “The Heritage Provider Network is sponsoring the Heritage Health Prize to spur innovation and creative thinking in healthcare. HPN, however, is a medical group and must retain an exclusive license to the algorithms created using its data so as to ensure that the algorithms are used responsibly, and are only used to provide better health care to patients and not for improper purposes.
Put simply, while the competition hopes to spur innovation, this is not a competition regarding movie ratings or chess results. We hope that the clarifications we have made to the Rules and the FAQ adequately address your concerns and look forward to your participation in the competition.”

What do you think? Will the exclusive license prevent you from participating?

Heritage prize= 3mill now open

I am still angry with THE netflix for 1 mill I lost out. No sweat! this time the money is 3 times as much, it is legit, and yes baby you can change the world, make it a better place and get rich.! see details below-http://www.heritagehealthprize.com/c/hhp/Data

HERITAGE HEALTH PRIZE DATA FILES

You must accept this competition’s rules before you’ll be able to download data files.

IMPORTANT NOTE: The information provided below is intended only to provide general guidance to participants in the Heritage Health Prize Competition and is subject to the Competition Official Rules. Any capitalized term not defined below is defined in the Competition Official Rules. Please consult the Competition Official Rules for complete details.

Heritage Provider Network is providing Competition Entrants with deidentified member data collected during a forty-eight month period that is allocated among three data sets (the “Data Sets”). Competition Entrants will use the Data Sets to develop and test their algorithms for accurately predicting the number of days that the members will spend in a hospital (inpatient or emergency room visit) during the 12-month period following the Data Set cut-off date.

HHP_release2.zip contains the latest files, so you can ignore HHP_release1.zip. SampleEntry.CSV shows you how an entry should look.

Data Sets will be released to Entrants after registration on the Website according to the following schedule:

April 4, 2011 Claims Table – Y1 and DaysInHospital Table – Y2

May 4, 2011

All other Data Sets except Labs Table and Rx Table

From https://www.kaggle.com/

The $3 million Heritage Health Prize opens to entries

It’s been one month since the launch of the Heritage Health Prize. The prize has attracted some great publicity, receiving coverage from the Wall Street JournalThe EconomistSlate andForbes.

By now, people have had a good chance to poke around the first portion of the data. Now the fun starts! HPN have released two more years’-worth of data, set the accuracy threshold and are opening up the competition to entries. The data are available from the Heritage Health Prize page. Good luck to all participants!

The Deloitte/FIDE Chess Ratings Competition results

The Deloitte/FIDE Chess Ratings Competition attracted one of the strongest fields ever seen in a Kaggle Competition. The competition attracted 189 teams, ranging from chess ratings  experts to Netflix Prize winners. As Jeff Sonas wrote on the Kaggle blog last week, the  competition has far exceeded his expectations. A big congratulations the provisional winner, Tim Salimans, an econometrician at Erasmus University in Rotterdam. We look forward to reading about the approaches used by top performers on the Kaggle blog. We also look forward to the results of the FIDE prize, which could see the introduction of a new chess ratings system.

ICDAR 2011 Competition Results

The ICDAR 2011 competition also finished recently. The competiiton required participants to develop an algorithm that correctly matched handwriting samples. The winners were Lewis Griffin and Andrew Newell from the University College London who achieved Kaggle’s first ever perfect score by managing to match every sample correctly! Andrew and Lewis have posted a description of their winning method on the Kaggle blog.

Revolution R Enterprise

Since R is the most popular language used by Kaggle members, the Revolution Analytics team is making Revolution R Enterprise (the pre-eminent commercial version of R) available free of charge to Kaggle members. Revolution R Enterprise has several advantages over standard R, including the ability to seemlessly handle larger datasets. To get your free copy, visit http://info.revolutionanalytics.com/Kaggle.html.
Kaggle-in-Class

As many of you know, Kaggle offers a free platform, Kaggle-in-Class, for instructors who want to host competitions for their students. For those interested in hearing more about the use of Kaggle-in-Class as a teaching tool, Susan Holmes and Nelson Ray from Stanford University share their experience in a webinar organized by the Consortium for the Advancement of Undergraduate Statistics Education.

Google Storage for Developers goes into Enterprise Mode

Schematic representation of the SSL handshake ...
Image via Wikipedia

To help unify and uniform, collobrative work and data management and business models across the enterprise in secure SSL cloud environments- Google Storage has been rolling out some changes (read below)-this also gives you more options on the day Amazon goes ahem down (cough cough) because they didn’t think someone in their data environment could be sympathetic to free data.

——————————————————————————————————————————————————————–

https://groups.google.com/group/gs-announce

And now to the actual update.

We’re making some changes to Google Storage for Developers to make team-based development easier. As part of this work, we are introducing the concept of a project. In preparation for this feature, we will be creating projects for every user and migrating their buckets to it.

What does this mean for you?

Everything will continue to work as it always has. However, you will notice that if you perform a get-acl operation on any of your buckets, you will see extra ACL entries. These entries correspond to project groups. Each group has only one member – the person who owned the buckets before the bucket migration;  no additional rights have been granted to any of your buckets or objects. You should preserve these new ACL grants if you modify bucket ACLs.

An example entry for a modified ACL would look like this:

We’ll be rolling out these changes over the next few days,

http://blog.cloudberrylab.com/2011/04/cloudberry-explorer-for-google-storage.html

Detailed Note on GS-

https://code.google.com/apis/storage/

Google Storage for Developers is a RESTful service for storing and accessing your data on Google’s infrastructure. The service combines the performance and scalability of Google’s cloud with advanced security and sharing capabilities. Highlights include:

Fast, scalable, highly available object store

  • All data replicated to multiple U.S. data centers
  • Read-your-writes data consistency
  • Objects of hundreds of gigabytes in size per request with range-get support
  • Domain-scoped bucket namespace

Easy, flexible authentication and sharing

  • Key-based authentication
  • Authenticated downloads from a web browser
  • Individual- and group-level access controls

In addition, Google Storage for Developers offers a web-based interface for managing your storage and GSUtil, an open source command line tool and library. The service is also compatible with many existing cloud storage tools and libraries. With pay-as-you-go pricing, it’s easy to get started and scale as your needs grow.

Google Storage for Developers is currently only available to a limited number of developers. Please sign up to join the waiting list.

AsterData still alive;/launches SQL-MapReduce Developer Portal

so apparantly ole client AsterData continues to thrive under gentle touch of Terrific Data

———————————————————————————————————————————————————

Aster Data today launched the SQL-MapReduce Developer Portal, a new online community for data scientists and analytic developers. For your convenience, I copied the release below and it can also be found here. Please let me know if you have any questions or if there is anything else I can help you with.

Sara Korolevich

Point Communications Group for Aster Data

sarak@pointcgroup.com

Office: 602.279.1137

Mobile: 623.326.0881

Teradata Accelerates Big Data Analytics with First Collaborative Community for SQL-MapReduce®

New online community for data scientists and analytic developers enables development and sharing of powerful MapReduce analytics


San Carlos, California – Teradata Corporation (NYSE:TDC) today announced the launch of the Aster Data SQL-MapReduce® Developer Portal. This portal is the first collaborative online developer community for SQL-MapReduce analytics, an emerging framework for processing non-relational data and ultra-fast analytics.

“Aster Data continues to deliver on its unique vision for powerful analytics with a rich set of tools to make development of those analytics quick and easy,” said Tasso Argyros, vice president of Aster Data Marketing and Product Management, Teradata Corporation. “This new developer portal builds on Aster Data’s continuing SQL-MapReduce innovation, leveraging the flexibility and power of SQL-MapReduce for analytics that were previously impossible or impractical.”

The developer portal showcases the power and flexibility of Aster Data’s SQL-MapReduce – which uniquely combines standard SQL with the popular MapReduce distributed computing technology for processing big data – by providing a collaborative community for sharing SQL-MapReduce expert insights in addition to sharing SQL-MapReduce analytic functions and sample code. Data scientists, quantitative analysts, and developers can now leverage the experience, knowledge, and best practices of a community of experts to easily harness the power of SQL-MapReduce for big data analytics.

A recent report from IDC Research, “Taking Care of Your Quants: Focusing Data Warehousing Resources on Quantitative Analysts Matters,” has shown that by enabling data scientists with the tools to harness emerging types and sources of data, companies create significant competitive advantage and become leaders in their respective industry.

“The biggest positive differences among leaders and the rest come from the introduction of new types of data,” says Dan Vesset, program vice president, Business Analytics Solutions, IDC Research. “This may include either new transactional data sources or new external data feeds of transactional or multi-structured interactional data — the latter may include click stream or other data that is a by-product of social networking.”

Vesset goes on to say, “Aster Data provides a comprehensive platform for analytics and their SQL-MapReduce Developer Portal provides a community for sharing best practices and functions which can have an even greater impact to an organization’s business.”

With this announcement Aster Data extends its industry leadership in delivering the most comprehensive analytic platform for big data analytics — not only capable of processing massive volumes of multi-structured data, but also providing an extensive set of tools and capabilities that make it simple to leverage the power of MapReduce analytics. The Aster Data

SQL-MapReduce Developer Portal brings the power of SQL-MapReduce accessible to data scientists, quantitative analysis, and analytic developers by making it easy to share and collaborate with experts in developing SQL-MapReduce analytics. This portal builds on Aster Data’s history of SQL-MapReduce innovations, including:

  • The first deep integration of SQL with MapReduce
  • The first MapReduce support for .NET
  • The first integrated development environment, Aster Data
    Developer Express
  • A comprehensive suite of analytic functions, Aster Data
    Analytic Foundation

Aster Data’s patent-pending SQL-MapReduce enables analytic applications and functions that can deliver faster, deeper insights on terabytes to petabytes of data. These applications are implemented using MapReduce but delivered through standard SQL and business intelligence (BI) tools.

SQL-MapReduce makes it possible for data scientists and developers to empower business analysts with the ability to make informed decisions, incorporating vast amounts of data, regardless of query complexity or data type. Aster Data customers are using SQL-MapReduce for rich analytics including analytic applications for social network analysis, digital marketing optimization, and on-the-fly fraud detection and prevention.

“Collaboration is at the core of our success as one of the leading providers, and pioneers of social software,” said Navdeep Alam, director of Data Architecture at Mzinga. “We are pleased to be one of the early members of The Aster Data SQL-MapReduce Developer Portal, which will allow us the ability to share and leverage insights with others in using big data analytics to attain a deeper understanding of customers’ behavior and create competitive advantage for our business.”

SQL-MapReduce is one of the core capabilities within Aster Data’s flagship product. Aster DatanCluster™ 4.6, the industry’s first massively parallel processing (MPP) analytic platform has an integrated analytics engine that stores and processes both relational and non-relational data at scale. With Aster Data’s unique analytics framework that supports both SQL and
SQL-MapReduce™, customers benefit from rich, new analytics on large data volumes with complex data types. Aster Data analytic functions are embedded within the analytic platform and processed locally with data, which allows for faster data exploration. The SQL-MapReduce framework provides scalable fault-tolerance for new analytics, providing users with superior reliability, regardless of number of users, query size, or data types.


About Aster Data
Aster Data is a market leader in big data analytics, enabling the powerful combination of cost-effective storage and ultra-fast analysis of new sources and types of data. The Aster Data nCluster analytic platform is a massively parallel software solution that embeds MapReduce analytic processing with data stores for deeper insights on new data sources and types to deliver new analytic capabilities with breakthrough performance and scalability. Aster Data’s solution utilizes Aster Data’s patent-pending SQL-MapReduce to parallelize processing of data and applications and deliver rich analytic insights at scale. Companies including Barnes & Noble, Intuit, LinkedIn, Akamai, and MySpace use Aster Data to deliver applications such as digital marketing optimization, social network and relationship analysis, and fraud detection and prevention.


About Teradata
Teradata is the world’s leader in data warehousing and integrated marketing management through itsdatabase softwaredata warehouse appliances, and enterprise analytics. For more information, visitteradata.com.

# # #

Teradata is a trademark or registered trademark of Teradata Corporation in the United States and other countries.