RapidMiner launches extensions marketplace

For some time now, I had been hoping for a place where new package or algorithm developers get at least a fraction of the money that iPad or iPhone application developers get. RapidMiner has taken the lead in establishing a marketplace for extensions. Are there going to be paid extensions as well? I hope so!

This probably makes it the first “app” marketplace in open source and the second app marketplace in analytics after salesforce.com.

It is hard work to think of new algorithms, and some of them can be really useful.

Can we hope for an #rstats marketplace where people downloading, say, ggplot3.0 at least get a prompt to donate 99 cents per download to Hadley Wickham’s Amazon wishlist? http://www.amazon.com/gp/registry/1Y65N3VFA613B

Do you think it is okay to pay 99 cents per iTunes song, but not pay a cent for open source software?

I don’t know, but I am just a capitalist born in a country that was socialist for the first 13 years of my life. Congratulations once again to RapidMiner for innovating and leading the way.

http://rapid-i.com/component/option,com_myblog/show,Rapid-I-Marketplace-Launched.html/Itemid,172

RapidMiner Marketplace Extensions 30 May 2011
Rapid-I Marketplace Launched by Simon Fischer

Over the years, many of you have been developing new RapidMiner Extensions dedicated to a broad set of topics. Whereas these extensions are easy to install in RapidMiner – just download and place them in the plugins folder – the hard part is to find them in the vastness that is the Internet. Extensions made by ourselves at Rapid-I, on the other hand, are distributed by the update server making them searchable and installable directly inside RapidMiner.

We thought that this was a bit unfair, so we decided to open up the update server to the public, and not only this, we even gave it a new look and name. The Rapid-I Marketplace is available in beta mode at http://rapidupdate.de:8180/ . You can use the Web interface to browse, comment, and rate the extensions, and you can use the update functionality in RapidMiner by going to the preferences and entering http://rapidupdate.de:8180/UpdateServer/ as the update server URL. (Once the beta test is complete, we will change the port back to 80 so we won’t have any firewall problems.)

As an Extension developer, just register with the Marketplace and drop me an email (fischer at rapid-i dot com) so I can give you permissions to upload your own extension. Upload is simple provided you use the standard RapidMiner Extension build process and will boost visibility of your extension.

Looking forward to seeing many new extensions there soon!

Disclaimer: Decisionstats is a partner of RapidMiner. I have liked the software for a long, long time, and recently agreed to partner with them, just as I did with KXEN some years back, and with the Predictive Analytics Conference and Aster Data until last year.

I still think RapidMiner is very, very good software, and a globally created software after SAP.

Here is the actual marketplace

http://rapidupdate.de:8180/UpdateServer/faces/index.xhtml

Welcome to the Rapid-I Marketplace Public Beta Test

The Rapid-I Marketplace will soon replace the RapidMiner update server. Using this marketplace, you can share your RapidMiner extensions and make them available for download by the community of RapidMiner users. Currently, we are beta testing this server. If you want to use this server in RapidMiner, you must go to the preferences and enter http://rapidupdate.de:8180/UpdateServer for the update url. After the beta test, we will change the port back to 80, which is currently occupied by the old update server. You can test the marketplace as a user (downloading extensions) and as an Extension developer. If you want to publish your extension here, please let us know via the contact form.

Hot Downloads
The Image Processing Extension provides operators for handling image data. You can extract attributes describing colour and texture in the image, and you can apply several transformations to image data, which allow you to perform segmentation and detection of suspicious areas in image data. The extension provides many image transformation and extraction operators, ranging from Wavelet Decomposition and Hough Circle to Block Difference of Inverse Probabilities.

RapidMiner is unquestionably the world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for the integration into own products. Thousands of applications of RapidMiner in more than 40 countries give their users a competitive edge.

  • Data Integration, Analytical ETL, Data Analysis, and Reporting in one single suite
  • Powerful but intuitive graphical user interface for the design of analysis processes
  • Repositories for process, data and meta data handling
  • Only solution with meta data transformation: forget trial and error and inspect results already during design time
  • Only solution which supports on-the-fly error recognition and quick fixes
  • Complete and flexible: Hundreds of data loading, data transformation, data modeling, and data visualization methods
All modeling methods and attribute evaluation methods from the Weka machine learning library are available within RapidMiner. After installing this extension you will get access to about 100 additional modelling schemes including additional decision trees, rule learners and regression estimators. This extension combines two of the most widely used open source data mining solutions. By installing it, you can extend RapidMiner to everything that is possible with Weka while keeping the full analysis, preprocessing, and visualization power of RapidMiner.

Finally, the two most widely used data analysis solutions – RapidMiner and R – are connected. Arbitrary R models and scripts can now be directly integrated into the RapidMiner analysis processes. The new R perspective offers the known R console together with the great plotting facilities of R. All variables and R scripts can be organized in the RapidMiner Repository. A directly included online help and multi-line editing makes the creation of R scripts much more comfortable.

Interview with Rob La Gesse Chief Disruption Officer Rackspace

Here is an interview with Rob La Gesse, Chief Disruption Officer, Rackspace Hosting.
Ajay- Describe your career journey from not finishing college to writing software to your present projects?
Rob- I joined the Navy right out of High School. I had neither the money for college, nor a real desire for it. I had several roles in the Navy, to include a Combat Medic station with the US Marine Corps and eventually becoming a Neonatal Respiratory Therapist.

After the Navy I worked as a Respiratory Therapist, a roofer, and I repaired print shop equipment. Basically whatever it took to make a buck or two.  Eventually I started selling computers.  That led me to running a multi-line dial-up BBS and I taught myself how to program.  Eventually that led to a job with a small engineering company where we developed WiFi.

After the WiFi project I started consulting on my own.  I used Rackspace to host my clients, and eventually they hired me.  I’ve been here almost three years and have held several roles. I currently manage Social Media and building43, and am involved in several other projects such as the Rackspace Startup Program.

Ajay-  What is building43 all about ?

Rob- Building43 is a web site devoted to telling the stories behind technology startups. Basically, after we hired Robert Scoble and Rocky Barbanica we were figuring out how best we could work with them to both highlight Rackspace and customers.  That idea expanded beyond customers to highlighting anyone doing something incredible in the technology industry – mostly software startups.  We’ve had interviews with people like Mark Zuckerberg, CEO and Founder of FaceBook.  We’ve broken some news on the site, but it isn’t really a news site. It is a story telling site.

Rackspace has met some amazing new customers through the relationships that started with an interview.

Ajay-  How is life as Robert Scoble’s boss? Is he an easy guy to work with? Does he have super powers while he types?

Rob- Robert isn’t much different to manage than the rest of my employees. He is a person – no super powers.  But he does establish a unique perspective on things because he gets to see so much new technology early.  Often earlier than almost anyone else. It helps him to spot trends that others might not be seeing yet.
Ajay – There are so many hosting companies. What makes Rackspace special for different kinds of customers?
Rob- I think what we do better than anyone is add that human touch – the people really care about your business.  We are a company that is focused on building one of the greatest service companies on the planet.  We sell support.  Hosting is secondary to service. Our motto is Fanatical Support®, and we actually look for people focused on delivering amazing customer experiences during our interviewing and hiring practices. People that find a personal sense of pride and reward by helping others should apply at Rackspace.  We are hiring like crazy!

Ajay – Where do you see technology and the internet 5 years down the line? (We will revisit the answers in 5 years 🙂)
Rob- I think the shift to Cloud computing is going to be dramatic.  I think in five years we will be much further down that path.  The scaling, cost-effectiveness, and on-demand nature of the Cloud are just too compelling for companies not to embrace. This changes business in fundamental ways – lower capital expenses, no need for in-house IT staff, etc. will save companies a lot of money and let them focus more on their core businesses. Computing will become another utility.  I also think mobile use of computing will be much more common than it is today.  And it is VERY common today.  Phones will replace car keys and credit cards (they already are). This too will drive use of Cloud computing because we all want our data wherever we are – on whatever computing device we happen to be using.
Ajay- GoDaddy CEO shoots elephants. What do you do in your spare time, if any?
Rob- Well, I don’t hunt.  We do shoot a lot of video though! I enjoy playing poker, specifically Texas Hold ’em.  It is a very people oriented game, and people are my passion.

Brief Biography- (in his own words from http://www.lagesse.org/about/)

My technical background includes working on the development of WiFi, writing wireless applications for the Apple Newton, mentoring/managing several software-based start-ups, running software quality assurance teams and more. In 2008 I joined Rackspace as an employee – a “Racker”.  I was previously a 7 year customer and the company impressed me. My initial role was as Director of Software Development for the Rackspace Cloud.  It was soon evident that I was better suited to a customer facing role since I LOVE talking to customers. I am currently the Director of Customer Development Chief Disruption Officer.  I manage building43 and enjoy working with Robert Scoble and Rocky Barbanica to make that happen.  The org chart says they work for me.  Reality tells me the opposite :)

Go take a look – I’m proud of what we are building there (pardon the pun!).

I do a lot of other stuff at Rackspace – mostly because they let me!  I love a company that lets me try. Rackspace does that.

Going further back, I have been a Mayor (in Hawaii). I have written successful shareware software. I have managed employees all over the world. I have been all over the world. I have also done roofing, repaired high end print-shop equipment, been a Neonatal Respiratory Therapist, done CPR on a boat, in a plane, and in a hardware store (and of course in hospitals).

I have treated jumpers from the Golden Gate Bridge – and helped save a few. I have lived in Illinois (Kankakee), California (San Diego, San Francisco and Novato), Texas (Corpus Christi and San Antonio), Florida (Pensacola and Palm Bay), Hawaii (Honolulu/Fort Shafter) and several other places for shorter durations.

For the last 8+ years I have been a single parent – and have done an amazing job (yes, I am a proud papa) thanks to having great kids.  They are both in College now – something I did NOT manage to accomplish. I love doing anything someone thinks I am not qualified to do.

I can be contacted at rob (at) lagesse (dot) org

You can follow Rob at http://twitter.com/kr8tr

Heritage offers 3 million chump change for Monkeys

My perspective is that life is not fair, and if someone offers me 1 mill a year so they make 1 bill a year, I would still take it, especially if it leads to better human beings and better humanity on this planet. Health care isn’t toothpaste.

Unless there are even more fine-print changes involved, there exist several players in the pharma sector who do build and deploy models internally for denying claims or prospecting medical doctors with freebies, but they might just get caught with the new open data movement.

————————————————————————————————–

A note from KDNuggets-

Heritage Health Prize released a second set of data on May 4. They also recently modified their rules, which now demand complete exclusivity and seem to disallow use of other tools (emphasis mine – Gregory PS)

21. LICENSE
By registering for the Competition, each Entrant (a) grants to Sponsor and its designees a worldwide, exclusive (except with respect to Entrant) , sub-licensable (through multiple tiers), transferable, fully paid-up, royalty-free, perpetual, irrevocable right to use, not use, reproduce, distribute (through multiple tiers), create derivative works of, publicly perform, publicly display, digitally perform, make, have made, sell, offer for sale and import the entry and the algorithm used to produce the entry, as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition (collectively, the “Licensed Materials”), in any media now known or hereafter developed, for any purpose whatsoever, commercial or otherwise, without further approval by or payment to Entrant (the “License”) and
(b) represents that he/she/it has the unrestricted right to grant the License. 
Entrant understands and agrees that the License is exclusive except with respect to Entrant: Entrant may use the Licensed Materials solely for his/her/its own patient management and other internal business purposes but may not grant or otherwise transfer to any third party any rights to or interests in the Licensed Materials whatsoever.

This has led to a call to boycott the competition by Tristan, who also notes that academics cannot publish their results without prior written approval of the Sponsor.

Anthony Goldbloom, CEO of Kaggle, emailed the HHP participants on May 4

HPN have asked me to pass on the following message: “The Heritage Provider Network is sponsoring the Heritage Health Prize to spur innovation and creative thinking in healthcare. HPN, however, is a medical group and must retain an exclusive license to the algorithms created using its data so as to ensure that the algorithms are used responsibly, and are only used to provide better health care to patients and not for improper purposes.
Put simply, while the competition hopes to spur innovation, this is not a competition regarding movie ratings or chess results. We hope that the clarifications we have made to the Rules and the FAQ adequately address your concerns and look forward to your participation in the competition.”

What do you think? Will the exclusive license prevent you from participating?

Heritage prize= 3mill now open

I am still angry with Netflix over the 1 mill I lost out on. No sweat! This time the money is 3 times as much, it is legit, and yes baby, you can change the world, make it a better place and get rich! See details below: http://www.heritagehealthprize.com/c/hhp/Data

HERITAGE HEALTH PRIZE DATA FILES

You must accept this competition’s rules before you’ll be able to download data files.

IMPORTANT NOTE: The information provided below is intended only to provide general guidance to participants in the Heritage Health Prize Competition and is subject to the Competition Official Rules. Any capitalized term not defined below is defined in the Competition Official Rules. Please consult the Competition Official Rules for complete details.

Heritage Provider Network is providing Competition Entrants with deidentified member data collected during a forty-eight month period that is allocated among three data sets (the “Data Sets”). Competition Entrants will use the Data Sets to develop and test their algorithms for accurately predicting the number of days that the members will spend in a hospital (inpatient or emergency room visit) during the 12-month period following the Data Set cut-off date.

HHP_release2.zip contains the latest files, so you can ignore HHP_release1.zip. SampleEntry.CSV shows you how an entry should look.

Data Sets will be released to Entrants after registration on the Website according to the following schedule:

April 4, 2011: Claims Table – Y1 and DaysInHospital Table – Y2

May 4, 2011: All other Data Sets except Labs Table and Rx Table

From https://www.kaggle.com/

The $3 million Heritage Health Prize opens to entries

It’s been one month since the launch of the Heritage Health Prize. The prize has attracted some great publicity, receiving coverage from the Wall Street Journal, The Economist, Slate and Forbes.

By now, people have had a good chance to poke around the first portion of the data. Now the fun starts! HPN have released two more years’-worth of data, set the accuracy threshold and are opening up the competition to entries. The data are available from the Heritage Health Prize page. Good luck to all participants!

The Deloitte/FIDE Chess Ratings Competition results

The Deloitte/FIDE Chess Ratings Competition attracted one of the strongest fields ever seen in a Kaggle Competition. The competition attracted 189 teams, ranging from chess ratings experts to Netflix Prize winners. As Jeff Sonas wrote on the Kaggle blog last week, the competition has far exceeded his expectations. A big congratulations to the provisional winner, Tim Salimans, an econometrician at Erasmus University in Rotterdam. We look forward to reading about the approaches used by top performers on the Kaggle blog. We also look forward to the results of the FIDE prize, which could see the introduction of a new chess ratings system.

ICDAR 2011 Competition Results

The ICDAR 2011 competition also finished recently. The competition required participants to develop an algorithm that correctly matched handwriting samples. The winners were Lewis Griffin and Andrew Newell from University College London, who achieved Kaggle’s first ever perfect score by managing to match every sample correctly! Andrew and Lewis have posted a description of their winning method on the Kaggle blog.

Revolution R Enterprise

Since R is the most popular language used by Kaggle members, the Revolution Analytics team is making Revolution R Enterprise (the pre-eminent commercial version of R) available free of charge to Kaggle members. Revolution R Enterprise has several advantages over standard R, including the ability to seamlessly handle larger datasets. To get your free copy, visit http://info.revolutionanalytics.com/Kaggle.html.

Kaggle-in-Class

As many of you know, Kaggle offers a free platform, Kaggle-in-Class, for instructors who want to host competitions for their students. For those interested in hearing more about the use of Kaggle-in-Class as a teaching tool, Susan Holmes and Nelson Ray from Stanford University share their experience in a webinar organized by the Consortium for the Advancement of Undergraduate Statistics Education.

Google Storage for Developers goes into Enterprise Mode

To help unify collaborative work, data management and business models across the enterprise in secure SSL cloud environments, Google Storage has been rolling out some changes (read below). This also gives you more options on the day Amazon goes, ahem, down (cough cough) because they didn’t think someone in their data environment could be sympathetic to free data.

——————————————————————————————————————————————————————–

https://groups.google.com/group/gs-announce

And now to the actual update.

We’re making some changes to Google Storage for Developers to make team-based development easier. As part of this work, we are introducing the concept of a project. In preparation for this feature, we will be creating projects for every user and migrating their buckets to it.

What does this mean for you?

Everything will continue to work as it always has. However, you will notice that if you perform a get-acl operation on any of your buckets, you will see extra ACL entries. These entries correspond to project groups. Each group has only one member – the person who owned the buckets before the bucket migration;  no additional rights have been granted to any of your buckets or objects. You should preserve these new ACL grants if you modify bucket ACLs.

An example entry for a modified ACL would look like this:

We’ll be rolling out these changes over the next few days.
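The advice above about preserving the new ACL grants boils down to a read-modify-write habit: fetch the current ACL, add your grant to it, and write the whole thing back, rather than overwriting it with a fresh list. Here is a minimal sketch of that idea in Python. It assumes a made-up in-memory representation (a list of dicts with `scope` and `permission` keys) and a made-up group name; it is not the actual Google Storage ACL format or API.

```python
# Hypothetical model of a bucket ACL: a list of {"scope", "permission"} dicts.
# The real service has its own ACL format; these names are illustrative only.

def add_grant(existing_acl, scope, permission):
    """Return a new ACL with one grant added, keeping every existing entry."""
    updated = list(existing_acl)  # copy, so the project-group entries survive
    entry = {"scope": scope, "permission": permission}
    if entry not in updated:      # avoid adding the same grant twice
        updated.append(entry)
    return updated


# ACL as it might look after the migration (group name is an assumption).
current_acl = [
    {"scope": "project-owners-group", "permission": "FULL_CONTROL"},
    {"scope": "alice@example.com", "permission": "FULL_CONTROL"},
]

new_acl = add_grant(current_acl, "bob@example.com", "READ")
```

The point of the design is that `new_acl` starts from what the server returned, so any entry you did not create yourself is carried along unchanged.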

http://blog.cloudberrylab.com/2011/04/cloudberry-explorer-for-google-storage.html

Detailed Note on GS-

https://code.google.com/apis/storage/

Google Storage for Developers is a RESTful service for storing and accessing your data on Google’s infrastructure. The service combines the performance and scalability of Google’s cloud with advanced security and sharing capabilities. Highlights include:

Fast, scalable, highly available object store

  • All data replicated to multiple U.S. data centers
  • Read-your-writes data consistency
  • Objects of hundreds of gigabytes in size per request with range-get support
  • Domain-scoped bucket namespace
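Range-get support matters for objects this large, because a client can fetch a file in pieces with the standard HTTP `Range` header instead of one giant request. A minimal sketch of how the header values could be computed, assuming a hypothetical helper name and chunk size (nothing here is specific to Google Storage; it only relies on the standard `bytes=start-end` form, where both ends are inclusive):

```python
def byte_ranges(total_size, chunk_size):
    """Return HTTP Range header values that cover total_size bytes in chunks."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_size, chunk_size):
        # The end offset is inclusive in the HTTP Range syntax.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte object fetched in 4-byte pieces.
example = byte_ranges(10, 4)
```

Each string would be sent as the value of a `Range` request header, one request per chunk, and the pieces concatenated in order on the client side.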

Easy, flexible authentication and sharing

  • Key-based authentication
  • Authenticated downloads from a web browser
  • Individual- and group-level access controls

In addition, Google Storage for Developers offers a web-based interface for managing your storage and GSUtil, an open source command line tool and library. The service is also compatible with many existing cloud storage tools and libraries. With pay-as-you-go pricing, it’s easy to get started and scale as your needs grow.

Google Storage for Developers is currently only available to a limited number of developers. Please sign up to join the waiting list.

High Performance Analytics

Marry Big Data Analytics to High Performance Computing, and you get the buzzword of this season- High Performance Analytics.

It basically consists of parallelized code running on custom hardware, in-database analytics for speed, and cloud computing/high performance computing environments. On an operational level, it consists of software (as in analytics) partnering with software (as in databases, MapReduce, Hadoop) plus some hardware (HP or IBM mostly). It is considered a high-margin, highly profitable business with a small number of deals compared to, say, desktop licenses.
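To make the "parallelized code" part concrete, here is a toy sketch of the split, compute, combine pattern that sits underneath most of these products: each worker computes a partial result on its own chunk, and the partials are merged at the end. The function names and the choice of a mean are my own illustration, and Python threads stand in for what would really be separate processes, database nodes or cluster machines.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(chunk):
    """Map step: each worker returns (sum, count) for its own chunk."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_chunks=4):
    """Compute a mean by combining per-chunk partial sums and counts."""
    if not values:
        raise ValueError("values must not be empty")
    size = -(-len(values) // n_chunks)  # ceiling division: items per chunk
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(partial_stats, chunks))
    # Reduce step: merge the partial results into one answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


result = parallel_mean(list(range(1, 101)))
```

The useful property is that only the small `(sum, count)` pairs travel back to the coordinator, not the data itself, which is the same reason in-database analytics is fast.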

As per HPC Wire, which is a great tool/newsletter to keep updated on HPC, SAS Institute has been busy on this front, partnering with EMC Greenplum and Teradata (who also acquired SAS partner Aster Data to gain a much-needed foothold in the MR/SQL space).

iTunes finally gets some competition? Amazon Cloud Player

An interesting development is Amazon’s Cloud Player (though Canonical may be credited for thinking of the idea first with Ubuntu One). Since Ubuntu One is dependent on the OS (and not the browser), this makes Amazon’s version more of a mobile Cloud Player, as it seems to be an Android app and not an app that is independent of any platform, OS or browser.

Since Android and Ubuntu are both Linux flavors, I am not sure if Canonical has an existing mobile app for Ubuntu One. Apple’s cloud plans also seem kind of ambiguous compared to Microsoft’s (Azure et al).

I guess we will have to wait for a true Cloud player.

http://www.amazon.com/b/ref=tsm_1_tw_s_dm_liujd5?node=2658409011&tag=cloudplayer-20

How to Get Started with Cloud Drive and Cloud Player

Step 1. Add music to Cloud Drive

Purchase a song or album from the Amazon MP3 Store and click the Save to Amazon Cloud Drive button when your purchase is complete. Your purchase will be saved for free.

Step 2. Play your music in Cloud Player for Web

Click the Launch Amazon Cloud Player button to start listening to your purchase. Add more music from your library by clicking the Upload to Cloud Drive button from the Cloud Player screen. Start with 5 GB of free Cloud Drive storage. Upgrade to 20 GB with an MP3 album purchase (see details). Use Cloud Player to browse and search your library, create playlists, and download to your computer.

Step 3. Enjoy your music on the go with Cloud Player for Android

Install the Amazon MP3 for Android app to use Cloud Player on your Android device. Shop the full Amazon MP3 store, save your purchases to Cloud Drive, stream your Cloud Player library, and download to your device right from your Android phone or tablet.

Compare this with:

https://one.ubuntu.com/music/

A cloud-enabled music store

The Ubuntu One Music Store is integrated with the Ubuntu One service making it a cloud-enabled digital music store. All purchases are transferred to your Ubuntu One personal cloud for safe storage and then conveniently downloaded to your synchronizing computers. And don’t worry about going over your storage quota with music purchases. You won’t need to pay more for personal cloud storage of music purchased from the Ubuntu One Music Store.

An Ubuntu One subscription is required to purchase music from the Ubuntu One Music Store. Choose from either the free 2 GB option or the 50 GB plan for $10 (USD) per month to synchronize more of your digital life.

5 regional stores and more in the works

  • The Ubuntu One Music Store requires Ubuntu 10.04 LTS and offers digital music through five regional stores.
  • The US, UK, and Germany stores offer music from all major and independent labels.
  • The EU store serves most of the EU member countries and offers music from fewer major label artists.
  • The World store offers only independent label music and serves the countries not covered by the other regional stores.