CrowdANALYTIX

CrowdANALYTIX.com is a contest-based community, a bit like Kaggle.com (http://www.kaggle.com/), which is quite nice and offers you free Revolution R for the statistical and analytical contests hosted there. There are only 3 contests right now, and those are low volume, but I guess that number should increase. They also seem to have a consulting arm.

Latest Analytics website- welcome! http://www.crowdanalytix.com/contests

Careers in #Rstats

I saw a careers posting from Revolution Analytics. I am probably on the wrong side of an H1 visa and the C/R skill-o-meter, but these look great for any aspiring R coder. The list includes freelance opportunities as well.

http://www.revolutionanalytics.com/aboutus/careers.php

We have many opportunities opening up—among them:

Job Title                                        Location
Pre-sales Consultants / Technical Sales          Palo Alto, CA
Parallel Computing Developer                     Palo Alto, CA or Seattle, WA
R Programmer (Freelance)                         Palo Alto, CA
Software Training Course Developer (Freelance)   Palo Alto, CA
Build / Release Engineer                         Seattle, WA
QA Engineer                                      Seattle, WA
Technical Writer                                 Seattle, WA

Please send your resume to careers@revolutionanalytics.com

2) Indeed.com

Searching for “R” in quotes, rather than a bare R, gives better results in search engines and on job sites. It is still a tough keyword to search for, but it is getting better.

To get alerts, use the send-by-email option or this RSS feed: http://www.indeed.co.in/rss?q=%22R%22++analytics+jobs
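
The quoted-keyword trick is visible in the feed URL itself: the double quotes around R are percent-encoded as %22 and each space becomes a +. As a minimal sketch (the helper name build_feed_url is made up for illustration; only the base URL and the query come from the feed above), Python's standard library produces the same query string:

```python
from urllib.parse import quote_plus

# Base of the RSS feed URL quoted above.
FEED_BASE = "http://www.indeed.co.in/rss"

def build_feed_url(query: str) -> str:
    """Return the feed URL for a search query (hypothetical helper)."""
    # quote_plus percent-encodes the double quotes as %22 and turns spaces into '+'.
    return FEED_BASE + "?q=" + quote_plus(query)

# The quoted keyword followed by two spaces reproduces the feed URL above.
url = build_feed_url('"R"  analytics jobs')
print(url)
```

Swapping in another quoted keyword gives a feed for that search, assuming the site keeps the same query parameter.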

3) http://icrunchdata.com/

I Crunch Data has a good number of analytics jobs, and again, searching with the keyword “R” in quotes turns up lots of jobs:

http://www.icrunchdata.com/ViewJob.aspx?id=334914&keys=%22R%22

There used to be a Google Group on R jobs, but it is too low volume compared to the actual number of R jobs out there.

Note that the big demand is for analytics, and knowing more than one platform helps you more in the job search than knowing just a single language.

Revolution #Rstats Webinar

David Smith of Revo presents a nice webinar on the capabilities of Revolution R. If you are R-curious and wonder how the commercial version has matured, you may want to take a look.

click below to view an executive Webinar

——————————————————————————————-

Revolution R Enterprise—presented by author and blogger David Smith:

Revolution R: 100% R and More
On-Demand Webinar

This Webinar covers how R users can upgrade to:

  • Multi-processor speed improvements and parallel processing
  • Productivity and debugging with an integrated development environment (IDE) for the R language
  • “Big Data” analysis, with out-of-memory storage of multi-gigabyte data sets
  • Web Services for R, to integrate R computations and graphics into 3rd-Party applications like Excel and BI Dashboards
  • Expert technical support and consulting services for R

This webinar will be of value to current R users who want to learn more about the additional capabilities of Revolution R Enterprise to enhance the productivity, ease of use, and enterprise readiness of open source R. R users in academia will also find this webinar valuable: we will explain how all members of the academic community can obtain Revolution R Enterprise free of charge.

—————————————————————————————

Contact: 1-855-GET-REVO or via the online form.
info@revolutionanalytics.com | (650) 330-0553 | Twitter @RevolutionR

Interview Mike Boyarski Jaspersoft

Here is an interview with Mike Boyarski, Director of Product Marketing at Jaspersoft, which claims the largest BI community, with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments and 14,000 customers across 100 countries.

Ajay- Describe your career in science from Biology to marketing great software.
Mike- I studied Biology with the assumption I’d pursue a career in medicine. It took about 2 weeks during an internship at a Los Angeles hospital to determine I should do something else.  I enjoyed learning about life science, but the whole health care environment was not for me.  I was initially introduced to enterprise-level software while at Applied Materials within their Microcontamination group.  I was able to assist with an internal application used to collect contamination data.  I later joined Oracle to work on an Oracle Forms application used to automate the production of software kits (back when documentation and CDs had to be physically shipped to recognize revenue). This gave me hands on experience with Oracle 7, web application servers, and the software development process.
I then transitioned to product management for various products including application servers, software appliances, and Oracle’s first generation SaaS based software infrastructure. In 2006, with the Siebel and PeopleSoft acquisitions underway, I moved on to Ingres to help re-invigorate their solid yet antiquated technology. This introduced me to commercial open source software and the broader Business Intelligence market.  From Ingres I joined Jaspersoft, one of the first and most popular open source Business Intelligence vendors, serving as head of product marketing since mid 2009.
Ajay- Describe some of the new features in Jaspersoft 4.1 that help differentiate it from the rest of the crowd. What are the exciting product features we can expect from Jaspersoft over the next couple of years?
Mike- Jaspersoft 4.1 was an exciting release for our customers because we were able to extend the latest UI advancements in our ad hoc report designer to the data analysis environment. Now customers can use a unified, intuitive web-based interface to perform several powerful and interactive analytic functions across any data source, whether it’s relational, non-relational, or a Big Data source.
The reality is that most (roughly 70%) of today’s BI adoption is in the form of reports and dashboards. These tools are used to drive and measure an organization’s business; however, data analysis presents the most strategic opportunity for companies because it can identify new opportunities, efficiencies, and competitive differentiation. As more data comes online, the difference between those companies that are successful and those that are not will likely be attributed to their ability to harness data analysis techniques to drive and improve business performance. Thus, with Jaspersoft 4.1 and our improved ad hoc reporting and analysis UI, we can effectively address a broader set of BI requirements for organizations of all sizes.
Ajay- What do you think is a good metric to measure the influence of an open source software product: is it revenue, number of downloads, or number of users? How does Jaspersoft do by these counts?
Mike- History has shown that open source software is successful as a “bottom-up” disrupter within IT or the developer market. Today, many new software projects and startup ventures are birthed on open source software, often initiated with little to no budget. As the organization achieves success with a particular project, the next initiative tends to be larger and more strategic, often displacing what was historically solved with a proprietary solution. These larger deployments strengthen the technology over time.
Thus, how proven and battle-tested an open source solution is, often measured via downloads, deployments, community size, and community activity, usually equates to its long-term success. Linux, Tomcat, and MySQL have plenty of statistics to model this lifecycle. This model is no different for open source BI.
The success to date of Jaspersoft is directly tied to its solid, proven technology and the vibrancy of the community. We proudly and openly claim to have the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries. Every day, 30,000 developers are using Jaspersoft to build BI applications. Behind Excel, it’s hard to imagine a more widely used BI tool in the market. Jaspersoft could not reach these kinds of numbers with crippled or poorly architected software.
Ajay- What are your plans for leveraging cloud computing, mobile and tablet platforms, and for making Jaspersoft easier to use globally?

Revolution Analytics Product Launches for #rstats in 2011

Revolution Analytics just launched a roadmap detailing their product plan for 2011.

In particular, I am excited about the upcoming GUI, the Hadoop packages, the new K-means and data sort/merge using RevoScaleR for bigger datasets, and the option to offer support for community packages like ggplot2, titled “More value in Community Version”.

Contribution to #Rstats by Revolution

I have been watching Revolution Analytics’ products almost since the inception of the company. It has managed to sail past storms, naysayers and critics with a simple and effective strategy of launching good software, making good partnerships and keeping up media visibility with white papers, joint webinars, blogs, conferences and events.

Here, however, is a listing of all technical contributions made by Revolution Analytics products to the #rstats project.

1) Useful Packages mostly in parallel processing or more efficient computing like

 

2) RevoScaleR package to beat R’s memory problem (this is probably the best in my opinion, as it is yet to be replicated by the open source version and is a clear-cut reason for going in for the paid version)

http://www.revolutionanalytics.com/products/enterprise-big-data.php

  • Efficient XDF File Format designed to efficiently handle huge data sets.
  • Data Step Functionality to quickly clean, transform, explore, and visualize huge data sets.
  • Data selection functionality to store huge data sets out of memory, and select subsets of rows and columns for in-memory operation with all R functions.
  • Visualize Large Data sets with line plots and histograms.
  • Built-in Statistical Algorithms for direct analysis of huge data sets:
    • Summary Statistics
    • Linear Regression
    • Logistic Regression
    • Crosstabulation
  • On-the-fly data transformations to include derived variables in models without writing new data files.
  • Extend Existing Analyses by writing user-defined R functions to “chunk” through huge data sets.
  • Direct import of fixed-format text data files and SAS data sets into .xdf format
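
The “chunk” idea in the list above can be illustrated without any Revolution software. The sketch below is plain Python, not the RevoScaleR API: it reads CSV rows a fixed number at a time and keeps only running totals in memory, which is how a mean can be computed over a file larger than RAM. The function name, column name and chunk size are assumptions made for illustration.

```python
import csv
import io
from itertools import islice

def chunked_mean(rows, column, chunk_size=1000):
    """Mean of `column`, holding at most `chunk_size` rows in memory at once."""
    reader = csv.DictReader(rows)
    total, count = 0.0, 0
    while True:
        chunk = list(islice(reader, chunk_size))  # only this chunk is in memory
        if not chunk:
            break
        total += sum(float(r[column]) for r in chunk)
        count += len(chunk)
    return total / count if count else float("nan")

# Small in-memory stand-in for a large file on disk.
data = io.StringIO("x\n1\n2\n3\n4\n5\n")
result = chunked_mean(data, "x", chunk_size=2)
print(result)
```

With a real file, the StringIO object would be replaced by an open file handle; the running-total pattern stays the same.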

 

3) RevoDeployR for API-based R solutions. I somehow think this feature will get more important as time goes on, but it seems a lower-visibility offering right now.

http://www.revolutionanalytics.com/products/enterprise-deployment.php

  • Collection of Web services implemented as a RESTful API.
  • JavaScript and Java client libraries, allowing users to easily build custom Web applications on top of R.
  • .NET Client library — includes a COM interoperability to call R from VBA
  • Management Console for securely administrating servers, scripts and users through HTTP and HTTPS.
  • XML and JSON format for data exchange.
  • Built-in security model for authenticated or anonymous invocation of R Scripts.
  • Repository for storing R objects and R Script execution artifacts.

 

4) Revolution’s IDE (or Productivity Environment) for a faster coding environment than the command line. The GUI by Revolution Analytics is in the works. Having used the IDE, I find only the Code Snippets function to be a clear differentiator from newer IDEs and GUIs. The code snippets are awesome, though, and even someone who doesn’t know much R can get an analysis set up quite fast and accurately.

http://www.revolutionanalytics.com/products/enterprise-productivity.php

  • Full-featured Visual Debugger for debugging R scripts, with call stack window and step-in, step-over, and step-out capability.
  • Enhanced Script Editor with hover-over help, word completion, find-across-files capability, automatic syntax checking, bookmarks, and navigation buttons.
  • Run Selection, Run to Line and Run to Cursor evaluation
  • R Code Snippets to automatically generate fill-in-the-blank sections of R code with tooltip help.
  • Object Browser showing available data and function objects (including those in packages), with context menus for plotting and editing data.
  • Solution Explorer for organizing, viewing, adding, removing, rearranging, and sourcing R scripts.
  • Customizable Workspace with dockable, floating, and tabbed tool windows.
  • Version Control Plug-in available for the open source Subversion version control software.

 

Marketing contributions from Revolution Analytics-

1) Sponsoring R sessions and user meets

2) Evangelizing R at conferences and partnering with corporate partners including JasperSoft, Microsoft, IBM and others at http://www.revolutionanalytics.com/partners/

3) Helping with online initiatives like http://www.inside-r.org/ (which is curiously dormant and now largely superseded by R-Bloggers.com) and the syntax highlighting tool at http://www.inside-r.org/pretty-r. In addition, Revolution has been proactive in reaching out to the community.

4) Helping pioneer blogging about R and Twitter hashtag discussions, and contributing to Stack Overflow discussions. Within a short while, the #rstats online community has overtaken a lot of more established names, partly due to the decentralized nature of its working.

 

Did I miss something out? Yes, they share their code under the GPL.

Let me know through your feedback.

Amazon EC2 goes Red Hat

A message from Amazing Amazon’s cloud team. This will also help #rstats users, given that Revolution Analytics’ full versions are available on RHEL.

—————————————————-

on-demand instances of Amazon EC2 running Red Hat Enterprise Linux (RHEL) for as little as $0.145 per instance hour. The offering combines the cost-effectiveness, scalability and flexibility of running in Amazon EC2 with the proven reliability of Red Hat Enterprise Linux.

Highlights of the offering include:

  • Support is included through subscription to AWS Premium Support with back-line support by Red Hat
  • Ongoing maintenance, including security patches and bug fixes, via update repositories available in all Amazon EC2 regions
  • Amazon EC2 running RHEL currently supports RHEL 5.5, RHEL 5.6, RHEL 6.0 and RHEL 6.1 in both 32 bit and 64 bit formats, and is available in all Regions.
  • Customers who already own Red Hat licenses will continue to be able to use those licenses at no additional charge.
  • Like all services offered by AWS, Amazon EC2 running Red Hat Enterprise Linux offers a low-cost, pay-as-you-go model with no long-term commitments and no minimum fees.

For more information, please visit the Amazon EC2 Red Hat Enterprise Linux page.

which is

Amazon EC2 Running Red Hat Enterprise Linux

Amazon EC2 running Red Hat Enterprise Linux provides a dependable platform to deploy a broad range of applications. By running RHEL on EC2, you can leverage the cost effectiveness, scalability and flexibility of Amazon EC2, the proven reliability of Red Hat Enterprise Linux, and AWS premium support with back-line support from Red Hat. Red Hat Enterprise Linux on EC2 is available in versions 5.5, 5.6, 6.0, and 6.1, both in 32-bit and 64-bit architectures.

Amazon EC2 running Red Hat Enterprise Linux provides seamless integration with existing Amazon EC2 features including Amazon Elastic Block Store (EBS), Amazon CloudWatch, Elastic-Load Balancing, and Elastic IPs. Red Hat Enterprise Linux instances are available in multiple Availability Zones in all Regions.

Pricing

Pay only for what you use with no long-term commitments and no minimum fee.

On-Demand Instances

On-Demand Instances let you pay for compute capacity by the hour with no long-term commitments.

Region: US – N. Virginia / US – N. California / EU – Ireland / APAC – Singapore / APAC – Tokyo

Instance type                     Red Hat Enterprise Linux
Standard Instances
  Small (Default)                 $0.145 per hour
  Large                           $0.40 per hour
  Extra Large                     $0.74 per hour
Micro Instances
  Micro                           $0.08 per hour
High-Memory Instances
  Extra Large                     $0.56 per hour
  Double Extra Large              $1.06 per hour
  Quadruple Extra Large           $2.10 per hour
High-CPU Instances
  Medium                          $0.23 per hour
  Extra Large                     $0.78 per hour
Cluster Compute Instances
  Quadruple Extra Large           $1.70 per hour
Cluster GPU Instances
  Quadruple Extra Large           $2.20 per hour

Pricing is per instance-hour consumed for each instance type. Partial instance-hours consumed are billed as full hours.
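
Because partial instance-hours are billed as full hours, the billed duration is the run time rounded up to the next whole hour. A minimal sketch using rates listed above (the function name is an assumption, and amounts are kept as Decimal values to avoid floating-point drift):

```python
import math
from decimal import Decimal

def on_demand_cost(hours_run: float, rate_per_hour: str) -> Decimal:
    """Cost of one instance, rounding partial hours up to full hours."""
    billed_hours = math.ceil(hours_run)  # e.g. 2.25 hours is billed as 3
    return billed_hours * Decimal(rate_per_hour)

# A Small instance ($0.145 per hour) that runs for 2 hours 15 minutes.
small_cost = on_demand_cost(2.25, "0.145")
print(small_cost)
```

So a job that finishes a few minutes past the hour costs the same as one that uses the whole next hour, which is worth keeping in mind when sizing short R runs.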

and

Available Instance Types

Standard Instances

Instances of this family are well suited for most applications.

Small Instance – default*

1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage
32-bit platform
I/O Performance: Moderate
API name: m1.small

Large Instance

7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.large

Extra Large Instance

15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge

Micro Instances

Instances of this family provide a small amount of consistent CPU resources and allow you to burst CPU capacity when additional cycles are available. They are well suited for lower throughput applications and web sites that consume significant compute cycles periodically.

Micro Instance

613 MB memory
Up to 2 EC2 Compute Units (for short periodic bursts)
EBS storage only
32-bit or 64-bit platform
I/O Performance: Low
API name: t1.micro

High-Memory Instances

Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.

High-Memory Extra Large Instance

17.1 GB of memory
6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
420 GB of instance storage
64-bit platform
I/O Performance: Moderate
API name: m2.xlarge

High-Memory Double Extra Large Instance

34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.2xlarge

High-Memory Quadruple Extra Large Instance

68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.4xlarge

High-CPU Instances

Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.

High-CPU Medium Instance

1.7 GB of memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
350 GB of instance storage
32-bit platform
I/O Performance: Moderate
API name: c1.medium

High-CPU Extra Large Instance

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: c1.xlarge

Cluster Compute Instances

Instances of this family provide proportionally high CPU resources with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications. Learn more about use of this instance type for HPC applications.

Cluster Compute Quadruple Extra Large Instance

23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge

Cluster GPU Instances

Instances of this family provide general-purpose graphics processing units (GPUs) with proportionally high CPU and increased network performance for applications benefitting from highly parallelized processing, including HPC, rendering and media processing applications. While Cluster Compute Instances provide the ability to create clusters of instances connected by a low latency, high throughput network, Cluster GPU Instances provide an additional option for applications that can benefit from the efficiency gains of the parallel computing power of GPUs over what can be achieved with traditional processors. Learn more about use of this instance type for HPC applications.

Cluster GPU Quadruple Extra Large Instance

22 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
2 x NVIDIA Tesla “Fermi” M2050 GPUs
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cg1.4xlarge
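
Since the usual complaint about R is that the data must fit in RAM, one practical use of the instance list and the price list is to find the cheapest instance type whose memory covers a given working set. The sketch below pairs each API name with the price of the matching family and size, which is an inference from the two lists, and it assumes the quoted prices apply to the region you launch in. The function name and the memory requirements are illustrative.

```python
# (API name, memory in GB, RHEL on-demand price in USD per hour), from the lists above.
# 613 MB for the Micro instance is written as roughly 0.613 GB.
INSTANCES = [
    ("t1.micro", 0.613, 0.08),
    ("m1.small", 1.7, 0.145),
    ("m1.large", 7.5, 0.40),
    ("m1.xlarge", 15.0, 0.74),
    ("m2.xlarge", 17.1, 0.56),
    ("m2.2xlarge", 34.2, 1.06),
    ("m2.4xlarge", 68.4, 2.10),
    ("c1.medium", 1.7, 0.23),
    ("c1.xlarge", 7.0, 0.78),
    ("cc1.4xlarge", 23.0, 1.70),
    ("cg1.4xlarge", 22.0, 2.20),
]

def cheapest_with_memory(required_gb):
    """API name of the cheapest instance with at least `required_gb` of memory, or None."""
    candidates = [inst for inst in INSTANCES if inst[1] >= required_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda inst: inst[2])[0]

print(cheapest_with_memory(16))
```

This looks at memory only. It ignores the 32-bit versus 64-bit platform difference and CPU capacity, both of which matter for real R workloads.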

 


Getting Started

To get started using Red Hat Enterprise Linux on Amazon EC2, perform the following steps:

  • Open and log into the AWS Management Console
  • Click on Launch Instance from the EC2 Dashboard
  • Select the Red Hat Enterprise Linux AMI from the QuickStart tab
  • Specify additional details of your instance and click Launch
  • Additional details can be found on each AMI’s Catalog Entry page

The AWS Management Console is an easy tool to start and manage your instances. If you are looking for more details on launching an instance, a quick video tutorial on how to use Amazon EC2 with the AWS Management Console is also available.
A full list of Red Hat Enterprise Linux AMIs can be found in the AWS AMI Catalog.


Support

All customers running Red Hat Enterprise Linux on EC2 will receive access to repository updates from Red Hat. Moreover, AWS Premium support customers can contact AWS to get access to a support structure from both Amazon and Red Hat.


About Red Hat

Red Hat, the world’s leading open source solutions provider, is headquartered in Raleigh, NC with over 50 satellite offices spanning the globe. Red Hat provides high-quality, low-cost technology with its operating system platform, Red Hat Enterprise Linux, together with applications, management and Services Oriented Architecture (SOA) solutions, including the JBoss Enterprise Middleware Suite. Red Hat also offers support, training and consulting services to its customers worldwide.

 

Also from Revolution Analytics, in case you want to run #rstats in the cloud and thus kill all that talk of RAM dependency, of R being slower than other software (to keep it simple, just pick an instance with more RAM from the list above), or of Revolution not being open enough:

http://www.revolutionanalytics.com/downloads/gpl-sources.php

GPL SOURCES

Revolution Analytics uses an Open-Core Licensing model. We provide open-source R bundled with proprietary modules from Revolution Analytics that provide additional functionality for our users. Open-source R is distributed under the GNU Public License (version 2), and we make our software available under a commercial license.

Revolution Analytics respects the importance of open source licenses and has contributed code to the open source R project and will continue to do so. We have carefully reviewed our compliance with GPLv2 and have worked with Mark Radcliffe of DLA Piper, the outside General Legal Counsel of the Open Source Initiative, to ensure that we fully comply with the obligations of the GPLv2.

For our Revolution R distribution, we may make some minor modifications to the R sources (the ChangeLog file lists all changes made). You can download these modified sources of open-source R under the terms of the GPLv2, using either the links below or those in the email sent to you when you download a specific version of Revolution R.

Download GPL Sources

Product                   Version   Platform         Modified R Sources
Revolution R Community    3.2       Windows          R 2.10.1
Revolution R Community    3.2       MacOS            R 2.10.1
Revolution R Enterprise   3.1.1     RHEL             R 2.9.2
Revolution R Enterprise   4.0       Windows          R 2.11.1
Revolution R Enterprise   4.0.1     RHEL             R 2.11.1
Revolution R Enterprise   4.1.0     Windows          R 2.11.1
Revolution R Enterprise   4.2       Windows          R 2.11.1
Revolution R Enterprise   4.2       RHEL             R 2.11.1
Revolution R Enterprise   4.3       Windows & RHEL   R 2.12.2