Analytics 2011 Conference

From http://www.sas.com/events/analytics/us/

The Analytics 2011 Conference Series combines the power of SAS’s M2010 Data Mining Conference and F2010 Business Forecasting Conference into one conference covering the latest trends and techniques in the field of analytics. Analytics 2011 Conference Series brings the brightest minds in the field of analytics together with hundreds of analytics practitioners. Join us as these leading conferences change names and locations. At Analytics 2011, you’ll learn through a series of case studies, technical presentations and hands-on training. If you are in the field of analytics, this is one conference you can’t afford to miss.

Conference Details

October 24-25, 2011
Grande Lakes Resort
Orlando, FL

Analytics 2011 topic areas include:

Amazon EC2 goes Red Hat

A message from amazing Amazon’s cloud team. This will also help #rstats users, given that the full versions of Revolution Analytics are available on RHEL.

—————————————————-

on-demand instances of Amazon EC2 running Red Hat Enterprise Linux (RHEL) for as little as $0.145 per instance hour. The offering combines the cost-effectiveness, scalability and flexibility of running in Amazon EC2 with the proven reliability of Red Hat Enterprise Linux.

Highlights of the offering include:

  • Support is included through subscription to AWS Premium Support with back-line support by Red Hat
  • Ongoing maintenance, including security patches and bug fixes, via update repositories available in all Amazon EC2 regions
  • Amazon EC2 running RHEL currently supports RHEL 5.5, RHEL 5.6, RHEL 6.0 and RHEL 6.1 in both 32 bit and 64 bit formats, and is available in all Regions.
  • Customers who already own Red Hat licenses will continue to be able to use those licenses at no additional charge.
  • Like all services offered by AWS, Amazon EC2 running Red Hat Enterprise Linux offers a low-cost, pay-as-you-go model with no long-term commitments and no minimum fees.

For more information, please visit the Amazon EC2 Red Hat Enterprise Linux page.

which is

Amazon EC2 Running Red Hat Enterprise Linux

Amazon EC2 running Red Hat Enterprise Linux provides a dependable platform to deploy a broad range of applications. By running RHEL on EC2, you can leverage the cost effectiveness, scalability and flexibility of Amazon EC2, the proven reliability of Red Hat Enterprise Linux, and AWS premium support with back-line support from Red Hat. Red Hat Enterprise Linux on EC2 is available in versions 5.5, 5.6, 6.0, and 6.1, in both 32-bit and 64-bit architectures.

Amazon EC2 running Red Hat Enterprise Linux provides seamless integration with existing Amazon EC2 features including Amazon Elastic Block Store (EBS), Amazon CloudWatch, Elastic Load Balancing, and Elastic IPs. Red Hat Enterprise Linux instances are available in multiple Availability Zones in all Regions.

Pricing

Pay only for what you use with no long-term commitments and no minimum fee.

On-Demand Instances

On-Demand Instances let you pay for compute capacity by the hour with no long-term commitments.

Region: US – N. Virginia | US – N. California | EU – Ireland | APAC – Singapore | APAC – Tokyo

Standard Instances (Red Hat Enterprise Linux)
  • Small (Default): $0.145 per hour
  • Large: $0.40 per hour
  • Extra Large: $0.74 per hour

Micro Instances (Red Hat Enterprise Linux)
  • Micro: $0.08 per hour

High-Memory Instances (Red Hat Enterprise Linux)
  • Extra Large: $0.56 per hour
  • Double Extra Large: $1.06 per hour
  • Quadruple Extra Large: $2.10 per hour

High-CPU Instances (Red Hat Enterprise Linux)
  • Medium: $0.23 per hour
  • Extra Large: $0.78 per hour

Cluster Compute Instances (Red Hat Enterprise Linux)
  • Quadruple Extra Large: $1.70 per hour

Cluster GPU Instances (Red Hat Enterprise Linux)
  • Quadruple Extra Large: $2.20 per hour

Pricing is per instance-hour consumed for each instance type. Partial instance-hours consumed are billed as full hours.
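To make the billing rule concrete, here is a minimal sketch of how a charge could be estimated when partial hours are rounded up to full hours. The rates are copied from the price list above; the dictionary and the function name `on_demand_cost` are only illustrative, not part of any AWS tool.

```python
import math

# Hourly on-demand RHEL rates in USD, taken from the price list above.
# Keys are the EC2 API names of the Standard and Micro instance types.
RHEL_HOURLY_RATE = {
    "m1.small": 0.145,
    "m1.large": 0.40,
    "m1.xlarge": 0.74,
    "t1.micro": 0.08,
}


def on_demand_cost(api_name, hours_used, instance_count=1):
    """Estimate the charge in USD, billing each partial hour as a full hour."""
    if hours_used < 0 or instance_count < 0:
        raise ValueError("hours_used and instance_count must be non-negative")
    # A partial instance-hour is billed as a full hour, so round up.
    billable_hours = math.ceil(hours_used)
    return round(RHEL_HOURLY_RATE[api_name] * billable_hours * instance_count, 3)


if __name__ == "__main__":
    # 2 hours 15 minutes on one Small instance is billed as 3 full hours.
    print(on_demand_cost("m1.small", 2.25))
```

The rounding is applied per instance, so ten instances that each run for ten minutes would be billed as ten instance-hours under this sketch.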

and

Available Instance Types

Standard Instances

Instances of this family are well suited for most applications.

Small Instance – default*

1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage
32-bit platform
I/O Performance: Moderate
API name: m1.small

Large Instance

7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.large

Extra Large Instance

15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge

Micro Instances

Instances of this family provide a small amount of consistent CPU resources and allow you to burst CPU capacity when additional cycles are available. They are well suited for lower throughput applications and web sites that consume significant compute cycles periodically.

Micro Instance

613 MB memory
Up to 2 EC2 Compute Units (for short periodic bursts)
EBS storage only
32-bit or 64-bit platform
I/O Performance: Low
API name: t1.micro

High-Memory Instances

Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.

High-Memory Extra Large Instance

17.1 GB of memory
6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
420 GB of instance storage
64-bit platform
I/O Performance: Moderate
API name: m2.xlarge

High-Memory Double Extra Large Instance

34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.2xlarge

High-Memory Quadruple Extra Large Instance

68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.4xlarge

High-CPU Instances

Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.

High-CPU Medium Instance

1.7 GB of memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
350 GB of instance storage
32-bit platform
I/O Performance: Moderate
API name: c1.medium

High-CPU Extra Large Instance

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: c1.xlarge

Cluster Compute Instances

Instances of this family provide proportionally high CPU resources with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications. Learn more about use of this instance type for HPC applications.

Cluster Compute Quadruple Extra Large Instance

23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge

Cluster GPU Instances

Instances of this family provide general-purpose graphics processing units (GPUs) with proportionally high CPU and increased network performance for applications benefitting from highly parallelized processing, including HPC, rendering and media processing applications. While Cluster Compute Instances provide the ability to create clusters of instances connected by a low latency, high throughput network, Cluster GPU Instances provide an additional option for applications that can benefit from the efficiency gains of the parallel computing power of GPUs over what can be achieved with traditional processors. Learn more about use of this instance type for HPC applications.

Cluster GPU Quadruple Extra Large Instance

22 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
2 x NVIDIA Tesla “Fermi” M2050 GPUs
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cg1.4xlarge
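Since the instance families differ mainly in memory and price, a small lookup can help when picking one. The sketch below is only an illustration: it pairs the memory figures listed above with the hourly RHEL prices from the pricing section, and the helper name `cheapest_instance_for` is made up for this example. The 613 MB of the Micro instance is approximated as 0.6 GB.

```python
# (API name, memory in GB, RHEL on-demand price in USD per hour),
# combined from the instance list above and the pricing section.
INSTANCE_TYPES = [
    ("m1.small", 1.7, 0.145),
    ("m1.large", 7.5, 0.40),
    ("m1.xlarge", 15.0, 0.74),
    ("t1.micro", 0.6, 0.08),   # 613 MB, approximated as 0.6 GB
    ("m2.xlarge", 17.1, 0.56),
    ("m2.2xlarge", 34.2, 1.06),
    ("m2.4xlarge", 68.4, 2.10),
    ("c1.medium", 1.7, 0.23),
    ("c1.xlarge", 7.0, 0.78),
    ("cc1.4xlarge", 23.0, 1.70),
    ("cg1.4xlarge", 22.0, 2.20),
]


def cheapest_instance_for(min_memory_gb):
    """Return the cheapest (name, memory_gb, price) with enough memory, or None."""
    candidates = [t for t in INSTANCE_TYPES if t[1] >= min_memory_gb]
    if not candidates:
        return None
    # Sort key is the hourly price; ties keep the order of the list above.
    return min(candidates, key=lambda t: t[2])


if __name__ == "__main__":
    for need in (1, 10, 30, 100):
        print(need, "GB ->", cheapest_instance_for(need))
```

One thing the lookup makes visible: for a 10 GB requirement the High-Memory Extra Large (m2.xlarge) comes out cheaper per hour than the Standard Extra Large (m1.xlarge), even though it has more memory.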

 


Getting Started

To get started using Red Hat Enterprise Linux on Amazon EC2, perform the following steps:

  • Open and log into the AWS Management Console
  • Click on Launch Instance from the EC2 Dashboard
  • Select the Red Hat Enterprise Linux AMI from the QuickStart tab
  • Specify additional details of your instance and click Launch
  • Additional details can be found on each AMI’s Catalog Entry page

The AWS Management Console is an easy tool to start and manage your instances. If you are looking for more details on launching an instance, a quick video tutorial on how to use Amazon EC2 with the AWS Management Console is also available.
A full list of Red Hat Enterprise Linux AMIs can be found in the AWS AMI Catalog.


Support

All customers running Red Hat Enterprise Linux on EC2 will receive access to repository updates from Red Hat. Moreover, AWS Premium support customers can contact AWS to get access to a support structure from both Amazon and Red Hat.



About Red Hat

Red Hat, the world’s leading open source solutions provider, is headquartered in Raleigh, NC with over 50 satellite offices spanning the globe. Red Hat provides high-quality, low-cost technology with its operating system platform, Red Hat Enterprise Linux, together with applications, management and Services Oriented Architecture (SOA) solutions, including the JBoss Enterprise Middleware Suite. Red Hat also offers support, training and consulting services to its customers worldwide.

 

Also from Revolution Analytics, in case you want to run #rstats in the cloud and thus kill all that talk of RAM dependency and of R being slower than other software (to keep it simple, just pick an instance above with more RAM), or of Revolution not being open enough:

http://www.revolutionanalytics.com/downloads/gpl-sources.php

GPL SOURCES

Revolution Analytics uses an Open-Core Licensing model. We provide open-source R bundled with proprietary modules from Revolution Analytics that provide additional functionality for our users. Open-source R is distributed under the GNU Public License (version 2), and we make our software available under a commercial license.

Revolution Analytics respects the importance of open source licenses and has contributed code to the open source R project and will continue to do so. We have carefully reviewed our compliance with GPLv2 and have worked with Mark Radcliffe of DLA Piper, the outside General Legal Counsel of the Open Source Initiative, to ensure that we fully comply with the obligations of the GPLv2.

For our Revolution R distribution, we may make some minor modifications to the R sources (the ChangeLog file lists all changes made). You can download these modified sources of open-source R under the terms of the GPLv2, using either the links below or those in the email sent to you when you download a specific version of Revolution R.

Download GPL Sources

Product                   Version  Platform        Modified R Sources
Revolution R Community    3.2      Windows         R 2.10.1
Revolution R Community    3.2      MacOS           R 2.10.1
Revolution R Enterprise   3.1.1    RHEL            R 2.9.2
Revolution R Enterprise   4.0      Windows         R 2.11.1
Revolution R Enterprise   4.0.1    RHEL            R 2.11.1
Revolution R Enterprise   4.1.0    Windows         R 2.11.1
Revolution R Enterprise   4.2      Windows         R 2.11.1
Revolution R Enterprise   4.2      RHEL            R 2.11.1
Revolution R Enterprise   4.3      Windows & RHEL  R 2.12.2

 

 

 

Why don’t open source companies dance?

I have been pondering this seeming paradox for some time now:

1) Why are open source solutions considered technically better but not customer friendly?

2) Why do startups and app creators in social media or mobile get much more press coverage than profitable startups in enterprise software?

3) How does tech journalism differ in covering open source projects in enterprise versus retail software?

4) What are the hidden rules of the game of enterprise software?

Some observations:

1) Open source companies often focus much more on technical community management and crowdsourcing code. Traditional software companies focus much more on managing the marketing community of customers and influencers. Accordingly, the balance of power is skewed in favor of techies and R&D in open source companies, and in favor of marketing and analyst relations in traditional software companies.

Traditional companies also spend much more on hiring top-notch press release and public relations agencies, while open source companies are both financially and sometimes ideologically opposed to older methods of marketing software. The flip side is that you are much more likely to see videos and tutorials from an open source company than from a traditional company. You can compare the websites of Cloudera, DataStax, Hadapt, Appistry and MapR and contrast them with Teradata or Oracle (which have a much bigger and very different marketing strategy).

Social media for marketing is also more efficiently utilized by smaller companies (open source) while bigger companies continue to pay influential analysts for expensive white papers that help present the brand.

Lack of budgets is a major factor that limits access to influential marketing for open source companies particularly in enterprise software.

2 and 3) Retail software is priced at $2-100 and sells by volume. Accordingly, technology coverage of this software is based on volume.

Enterprise software is priced much higher and has far more discrete volumes or sales points. Accordingly, the technology coverage of enterprise software is more discrete: a white paper every quarter, a webinar every month and a press release every week. Retail software is covered nonstop, but those journalists typically do not charge for “briefings”.

Journalists covering retail software generally earn money through ads or by hosting conferences, so they have an interest in covering new or interestingly disruptive stuff. Journalists or analysts covering enterprise software generally earn money through white papers, webinars, attending rather than hosting conferences, and writing books. They thus have a much stronger economic incentive to cover the existing landscape and technologies than smaller startups.

4) What are the hidden rules of the game of enterprise software?

  • It is mostly a white man’s world. This can be proved by statistical demographic analysis.
  • There is incestuous intermingling between influencers, marketers, and PR people. This can be proved by simple social network analysis of who talks to whom and how much. A simple time series of sponsorship against analyst coverage will also prove this (I am working on quantifying this).
  • There are much larger switching costs for enterprise software than for retail software. This leads to shoddy legacy software getting many more chances than would be allowed in an efficient marketplace.
  • Enterprise software is a less efficient marketplace than retail software in all definitions of the term “efficient markets”.
  • Cloud computing, SaaS and open source threaten to disrupt the jobs and careers of a large number of people. In the long term, they will create many more jobs, but in the short term, people used to the comfortable living of enterprise software (making, selling, or writing) will actively and passively resist these changes to the paradigms of the current software status quo.
  • Open source companies don’t dance and don’t play ball. They prefer to hire 4 more college grads than commission 2 more white papers.
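The time series comparison of sponsorship and analyst coverage mentioned in the list above could be started with something as simple as a lagged correlation. The following is a rough sketch under assumptions: the two function names are made up for this post, and the input series would have to be collected by hand, for example as sponsorship spend and number of analyst mentions per quarter.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sponsorship, coverage, lag=1):
    """Correlate sponsorship in period t with coverage in period t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(sponsorship, coverage)
    # Drop the last `lag` sponsorship points and the first `lag` coverage points
    # so that each sponsorship value lines up with the coverage that follows it.
    return pearson(sponsorship[:-lag], coverage[lag:])
```

A high value at a lag of one or two periods would be suggestive rather than conclusive, so it would need to be read alongside the social network analysis.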

And the following, with slight changes, is from a comment I made on a fellow blogger’s blog:

  • While the paradigm on how to create new software has evolved from primarily silo-driven R&D departments to a broader collaborative effort, the biggest drawback is that software marketing has not evolved.
  • If you want your own version of the open source community editions to be more popular, some standardization is necessary for the corporate decision makers, and we need better marketing paradigms.
  • While code creation is crowdsourced, solution implementation cannot be crowdsourced. Customers want solutions to a problem, not code.
  • Just as open source as a production and licensing paradigm threatens to disrupt enterprise software, it will lead to newer ways of marketing software, given the hostility of the existing status quo.

 

 

RapidMiner launches extensions marketplace

For some time now, I had been hoping for a place where new package or algorithm developers get at least a fraction of the money that iPad or iPhone application developers get. RapidMiner has taken the lead in establishing a marketplace for extensions. Are there going to be paid extensions as well? I hope so!

This probably makes it the first “app” marketplace in open source and the second app marketplace in analytics after salesforce.com.

It is hard work to think of new algorithms, and some of them can be really useful.

Can we hope for an #rstats marketplace where people downloading, say, ggplot3.0 at least get a prompt to donate 99 cents per download to Hadley Wickham’s Amazon wishlist? http://www.amazon.com/gp/registry/1Y65N3VFA613B

Do you think it is okay to pay 99 cents per iTunes song, but not pay a cent for open source software?

I don’t know, but I am just a capitalist born in a country that was socialist for the first 13 years of my life. Congratulations once again to RapidMiner for innovating and leading the way.

http://rapid-i.com/component/option,com_myblog/show,Rapid-I-Marketplace-Launched.html/Itemid,172

RapidMiner Marketplace Extensions, 30 May 2011
Rapid-I Marketplace Launched by Simon Fischer

Over the years, many of you have been developing new RapidMiner Extensions dedicated to a broad set of topics. Whereas these extensions are easy to install in RapidMiner – just download and place them in the plugins folder – the hard part is to find them in the vastness that is the Internet. Extensions made by ourselves at Rapid-I, on the other hand,  are distributed by the update server making them searchable and installable directly inside RapidMiner.

We thought that this was a bit unfair, so we decided to open up the update server to the public, and not only this, we even gave it a new look and name. The Rapid-I Marketplace is available in beta mode at http://rapidupdate.de:8180/ . You can use the Web interface to browse, comment, and rate the extensions, and you can use the update functionality in RapidMiner by going to the preferences and entering http://rapidupdate.de:8180/UpdateServer/ as the update server URL. (Once the beta test is complete, we will change the port back to 80 so we won’t have any firewall problems.)

As an Extension developer, just register with the Marketplace and drop me an email (fischer at rapid-i dot com) so I can give you permissions to upload your own extension. Upload is simple provided you use the standard RapidMiner Extension build process and will boost visibility of your extension.

Looking forward to seeing many new extensions there soon!

Disclaimer: Decisionstats is a partner of RapidMiner. I have liked the software for a long, long time, and recently agreed to partner with them, just as I did with KXEN some years back, and with the Predictive Analytics Conference and Aster Data until last year.

I still think RapidMiner is very, very good software, and globally created software after SAP.

Here is the actual marketplace

http://rapidupdate.de:8180/UpdateServer/faces/index.xhtml

Welcome to the Rapid-I Marketplace Public Beta Test

The Rapid-I Marketplace will soon replace the RapidMiner update server. Using this marketplace, you can share your RapidMiner extensions and make them available for download by the community of RapidMiner users. Currently, we are beta testing this server. If you want to use this server in RapidMiner, you must go to the preferences and enter http://rapidupdate.de:8180/UpdateServer for the update url. After the beta test, we will change the port back to 80, which is currently occupied by the old update server. You can test the marketplace as a user (downloading extensions) and as an Extension developer. If you want to publish your extension here, please let us know via the contact form.

Hot Downloads
The Image Processing Extension provides operators for handling image data. You can extract attributes describing colour and texture in the image, and you can apply several transformations to the image data, which allow you to perform segmentation and detection of suspicious areas in image data. The extension provides many image transformation and extraction operators, ranging from Wavelet Decomposition and Hough Circle to Block Difference of Inverse Probabilities.

RapidMiner is unquestionably the world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for the integration into own products. Thousands of applications of RapidMiner in more than 40 countries give their users a competitive edge.

  • Data Integration, Analytical ETL, Data Analysis, and Reporting in one single suite
  • Powerful but intuitive graphical user interface for the design of analysis processes
  • Repositories for process, data and meta data handling
  • Only solution with meta data transformation: forget trial and error and inspect results already during design time
  • Only solution which supports on-the-fly error recognition and quick fixes
  • Complete and flexible: Hundreds of data loading, data transformation, data modeling, and data visualization methods
All modeling methods and attribute evaluation methods from the Weka machine learning library are available within RapidMiner. After installing this extension you will get access to about 100 additional modelling schemes including additional decision trees, rule learners and regression estimators. This extension combines two of the most widely used open source data mining solutions. By installing it, you can extend RapidMiner to everything that is possible with Weka while keeping the full analysis, preprocessing, and visualization power of RapidMiner.

Finally, the two most widely used data analysis solutions – RapidMiner and R – are connected. Arbitrary R models and scripts can now be directly integrated into the RapidMiner analysis processes. The new R perspective offers the known R console together with the great plotting facilities of R. All variables and R scripts can be organized in the RapidMiner Repository. A directly included online help and multi-line editing make the creation of R scripts much more comfortable.

Who writes white papers?

There are four main types of commercial white papers:

  • Business benefits: Makes a business case for a certain technology or methodology.
  • Technical: Describes how a certain technology works.
  • Hybrid: Combines business benefits with technical details in a single document.
  • Policy: Makes a case for a certain political solution to a societal or economic challenge.
What is the best white paper you have ever read? (Tell me in the comment field.)
Which category of white papers is the best?
Do you think white papers are too expensive, or do they give adequate ROI?
To be continued, including:

  1. Demographic and social network analysis of analysts and white paper sponsors to measure interaction effects.
  2. White papers segmented by type of software company.
  3. PROC FREQ analysis and data visualization of word frequencies in white papers written by the same analysts for different companies on the same topics.
  4. Race and ethnic analysis of influencers and analysts in Business Analytics and Business Intelligence. Null hypothesis: it is not a white man’s world; women, Hispanics and other minorities are adequately represented.
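For readers without SAS, the word frequency count in item 3 could be approximated in Python. This is only a sketch of the idea, not the analysis itself: the function names are invented here, and the texts of the white papers would have to be supplied as plain strings.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=10):
    """Return the top_n most common words in text as (word, count) pairs."""
    # Lowercase everything and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


def shared_top_words(text_a, text_b, top_n=10):
    """Return, sorted, the words that are in the top_n of both texts."""
    top_a = {word for word, _ in word_frequencies(text_a, top_n)}
    top_b = {word for word, _ in word_frequencies(text_b, top_n)}
    return sorted(top_a & top_b)
```

Running `shared_top_words` on two papers by the same analyst for different sponsors gives a first, crude look at how much of the vocabulary is reused. A real comparison would also need to drop common stop words.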
Why am I doing this?
I am writing a white paper on WHO writes a white paper.
Sponsorships are invited, but academics and startups in analytics may be preferred.

What is a White Paper?

As per Jimmy Wales and his merry band at Wiki (pedia, not leaky-ah). The emphasis is mine.

What is the best white paper you have read in the past 15 years?

Categories are:

  • Business benefits: Makes a business case for a certain technology or methodology.
  • Technical: Describes how a certain technology works.
  • Hybrid: Combines business benefits with technical details in a single document.
  • Policy: Makes a case for a certain political solution to a societal or economic challenge.
——————————————————————————————————————————————————



A white paper is an authoritative report or guide that helps solve a problem. White papers are used to educate readers and help people make decisions, and are often requested and used in politics, policy, business, and technical fields. In commercial use, the term has also come to refer to documents used by businesses as a marketing or sales tool. Policy makers frequently request white papers from universities or academic personnel to inform policy developments with expert opinions or relevant research.

Government white papers

In the Commonwealth of Nations, “white paper” is an informal name for a parliamentary paper enunciating government policy; in the United Kingdom these are mostly issued as “Command papers”. White papers are issued by the government and lay out policy, or proposed action, on a topic of current concern. Although a white paper may on occasion be a consultation as to the details of new legislation, it does signify a clear intention on the part of a government to pass new law. White Papers are a “… tool of participatory democracy … not [an] unalterable policy commitment.”[1] “White Papers have tried to perform the dual role of presenting firm government policies while at the same time inviting opinions upon them.”[2]

In Canada, a white paper “is considered to be a policy document, approved by Cabinet, tabled in the House of Commons and made available to the general public.”[3] A Canadian author notes that the “provision of policy information through the use of white and green papers can help to create an awareness of policy issues among parliamentarians and the public and to encourage an exchange of information and analysis. They can also serve as educational techniques”.[4]

“White Papers are used as a means of presenting government policy preferences prior to the introduction of legislation”; as such, the “publication of a White Paper serves to test the climate of public opinion regarding a controversial policy issue and enables the government to gauge its probable impact”.[5]

By contrast, green papers, which are issued much more frequently, are more open ended. These green papers, also known as consultation documents, may merely propose a strategy to be implemented in the details of other legislation or they may set out proposals on which the government wishes to obtain public views and opinion.

White papers published by the European Commission are documents containing proposals for European Union action in a specific area. They sometimes follow a green paper released to launch a public consultation process.

Commercial white papers

Since the early 1990s, the term white paper has also come to refer to documents used by businesses and so-called think tanks as marketing or sales tools. White papers of this sort argue that the benefits of a particular technology, product or policy are superior for solving a specific problem.

These types of white papers are almost always marketing communications documents designed to promote a specific company’s or group’s solutions or products. As a marketing tool, these papers will highlight information favorable to the company authorizing or sponsoring the paper. Such white papers are often used to generate sales leads, establish thought leadership, make a business case, or to educate customers or voters.

There are four main types of commercial white papers:

  • Business benefits: Makes a business case for a certain technology or methodology.
  • Technical: Describes how a certain technology works.
  • Hybrid: Combines business benefits with technical details in a single document.
  • Policy: Makes a case for a certain political solution to a societal or economic challenge.

Resources

  • Stelzner, Michael (2007). Writing White Papers: How to capture readers and keep them engaged. Poway, California: WhitePaperSource Publishing. pp. 214. ISBN 9780977716937.
  • Bly, Robert W. (2006). The White Paper Marketing Handbook. Florence, Kentucky: South-Western Educational Publishing. pp. 256. ISBN 9780324300826.
  • Kantor, Jonathan (2009). Crafting White Paper 2.0: Designing Information for Today’s Time and Attention Challenged Business Reader. Denver, Colorado: Lulu Publishing. pp. 167. ISBN 9780557163243.