Amazon cloud gets more exciting. We are still waiting for the Oracle and Google public clouds (compute) to open up out of beta! See their (rather cluttered) blog
Today, we are excited to announce a new generation of the original Amazon EC2 instance family. Second generation Standard instances (M3 instances) provide customers with the same balanced set of CPU and memory resources as first generation Standard instances (M1 instances) while providing customers with 50% more computational capability/core.
M3 instances are currently available in two instance types: extra-large (m3.xlarge) and double extra-large (m3.2xlarge). Examples of applications that can benefit from the additional CPU horsepower of these new instances include media encoding, batch processing, web servers, caching fleets, and many others. Currently, M3 instances are available in the US East (N. Virginia) Region starting at a Linux On-Demand price of $0.58/hr for extra-large instances. Customers can also purchase M3 instances as Reserved Instances or as Spot instances. We will introduce M3 instances in additional regions in the coming months.
To learn more about Amazon EC2 instance types and to find out which instance type might be useful for you, please visit the Amazon EC2 Instance type page.
Pricing Change for M1 Standard Instances
Along with the introduction of the M3 Standard instance family, we are announcing a reduction in Linux On-Demand pricing for M1 Standard instances in the US East (N. Virginia) and US West (Oregon) Regions by almost 19%. The new pricing is effective from November 1 and is described in the following table (prices per hour):
Instance Type    Previous Price    New Price
m1.small         $0.080            $0.065
m1.medium        $0.160            $0.130
m1.large         $0.320            $0.260
m1.xlarge        $0.640            $0.520
You can find out more about pricing for all Amazon EC2 instances by visiting the Amazon EC2 pricing page.
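As a quick check of the "almost 19%" figure, here is a minimal Python sketch that recomputes the cut from the M1 prices quoted above. The dictionary and function names are my own, chosen only for illustration.

```python
# Linux On-Demand prices (USD per hour) copied from the M1 pricing change above.
# Each value is (previous_price, new_price).
M1_PRICES = {
    "m1.small": (0.080, 0.065),
    "m1.medium": (0.160, 0.130),
    "m1.large": (0.320, 0.260),
    "m1.xlarge": (0.640, 0.520),
}


def percent_reduction(previous, new):
    """Return the price cut as a percentage of the previous price."""
    return (previous - new) / previous * 100.0


reductions = {
    name: percent_reduction(old, new) for name, (old, new) in M1_PRICES.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}% cheaper")
```

Every instance type works out to the same 18.75% cut, which is where the "almost 19%" comes from.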
This promotional offer enables you to try a limited amount of the Windows Azure platform at no charge. The subscription includes a base level of monthly compute hours, storage, data transfers, a SQL Azure database, Access Control transactions and Service Bus connections at no charge. Please note that any usage over this introductory base level will be charged at standard rates.
Included each month at no charge:
Windows Azure
25 hours of a small compute instance
500 MB of storage
10,000 storage transactions
SQL Azure
1GB Web Edition database (available for first 3 months only)
Windows Azure platform AppFabric
100,000 Access Control transactions
2 Service Bus connections
Data Transfers (per region)
500 MB in
500 MB out
Any monthly usage in excess of the above amounts will be charged at the standard rates. This introductory special will end on March 31, 2011 and all usage will then be charged at the standard rates.
As part of AWS’s Free Usage Tier, new AWS customers can get started with Amazon EC2 for free. Upon sign-up, new AWS customers receive the following EC2 services each month for one year:
750 hours of EC2 running Linux/Unix Micro instance usage
750 hours of Elastic Load Balancing plus 15 GB data processing
10 GB of Amazon Elastic Block Storage (EBS) plus 1 million IOs, 1 GB snapshot storage, 10,000 snapshot Get Requests and 1,000 snapshot Put Requests
15 GB of bandwidth in and 15 GB of bandwidth out aggregated across all AWS services
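To put the two free offers side by side, the sketch below works out what share of a month a single instance could stay up on the free hours alone. It only uses the hour counts quoted above; the function names and the example month are my own assumptions.

```python
import calendar

FREE_MICRO_HOURS = 750   # AWS free tier: Micro instance hours per month (quoted above)
AZURE_SMALL_HOURS = 25   # Azure introductory offer: small compute instance hours per month


def hours_in_month(year, month):
    """Number of hours in the given calendar month."""
    days = calendar.monthrange(year, month)[1]
    return days * 24


def uptime_fraction(free_hours, year, month):
    """Fraction of the month one instance can run within the free hours."""
    return min(1.0, free_hours / hours_in_month(year, month))


# January 2011 is used purely as an example of a 31-day month.
print(hours_in_month(2011, 1))
print(round(uptime_fraction(FREE_MICRO_HOURS, 2011, 1), 3))
print(round(uptime_fraction(AZURE_SMALL_HOURS, 2011, 1), 3))
```

A 31-day month has 744 hours, so 750 free hours keep one Micro instance running around the clock, while 25 hours cover only about 3% of the month.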
Paid Instances-
Instance Type                           Linux/UNIX Usage    Windows Usage
Standard On-Demand Instances
  Small (Default)                       $0.085 per hour     $0.12 per hour
  Large                                 $0.34 per hour      $0.48 per hour
  Extra Large                           $0.68 per hour      $0.96 per hour
Micro On-Demand Instances
  Micro                                 $0.02 per hour      $0.03 per hour
High-Memory On-Demand Instances
  Extra Large                           $0.50 per hour      $0.62 per hour
  Double Extra Large                    $1.00 per hour      $1.24 per hour
  Quadruple Extra Large                 $2.00 per hour      $2.48 per hour
High-CPU On-Demand Instances
  Medium                                $0.17 per hour      $0.29 per hour
  Extra Large                           $0.68 per hour      $1.16 per hour
Cluster Compute Instances
  Quadruple Extra Large                 $1.60 per hour      N/A*
Cluster GPU Instances
  Quadruple Extra Large                 $2.10 per hour      N/A*
* Windows is not currently available for Cluster Compute or Cluster GPU Instances.
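Hourly prices are hard to eyeball, so here is a small sketch that turns the On-Demand rates above into a bill for a given number of hours. The lookup table and the `usage_cost` helper are hypothetical names of mine, and the sketch assumes `hours` is already the number of billed instance-hours.

```python
# On-Demand prices in USD per hour, copied from the pricing list above.
# Keys are (family, size); values are (linux_price, windows_price or None).
ON_DEMAND = {
    ("Standard", "Small"): (0.085, 0.12),
    ("Standard", "Large"): (0.34, 0.48),
    ("Standard", "Extra Large"): (0.68, 0.96),
    ("Micro", "Micro"): (0.02, 0.03),
    ("High-Memory", "Extra Large"): (0.50, 0.62),
    ("High-Memory", "Double Extra Large"): (1.00, 1.24),
    ("High-Memory", "Quadruple Extra Large"): (2.00, 2.48),
    ("High-CPU", "Medium"): (0.17, 0.29),
    ("High-CPU", "Extra Large"): (0.68, 1.16),
    ("Cluster Compute", "Quadruple Extra Large"): (1.60, None),
    ("Cluster GPU", "Quadruple Extra Large"): (2.10, None),
}


def usage_cost(family, size, hours, os_name="linux"):
    """Cost in USD of running one instance for `hours` billed hours."""
    linux_rate, windows_rate = ON_DEMAND[(family, size)]
    rate = linux_rate if os_name == "linux" else windows_rate
    if rate is None:
        raise ValueError(f"{os_name} is not offered for {family} {size}")
    return round(rate * hours, 2)


# Example: a Windows Small instance used 8 hours a day for 22 working days.
print(usage_cost("Standard", "Small", 8 * 22, os_name="windows"))
```

The example comes to $21.12 for the month; asking for Windows on a Cluster Compute or Cluster GPU instance raises an error, matching the footnote above.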
NOTE- Amazon Instance definitions differ slightly from Azure definitions
Standard Instances
Instances of this family are well suited for most applications.
Small Instance – default*
1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage
32-bit platform
I/O Performance: Moderate
API name: m1.small
Large Instance
7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.large
Extra Large Instance
15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge
Micro Instances
Instances of this family provide a small amount of consistent CPU resources and allow you to burst CPU capacity when additional cycles are available. They are well suited for lower throughput applications and web sites that consume significant compute cycles periodically.
Micro Instance
613 MB memory
Up to 2 EC2 Compute Units (for short periodic bursts)
EBS storage only
32-bit or 64-bit platform
I/O Performance: Low
API name: t1.micro
High-Memory Instances
Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.
High-Memory Extra Large Instance
17.1 GB of memory
6.5 EC2 Compute Units (2 virtual cores with 3.25 EC2 Compute Units each)
420 GB of instance storage
64-bit platform
I/O Performance: Moderate
API name: m2.xlarge
High-Memory Double Extra Large Instance
34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.2xlarge
High-Memory Quadruple Extra Large Instance
68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.4xlarge
High-CPU Instances
Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.
High-CPU Medium Instance
1.7 GB of memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
350 GB of instance storage
32-bit platform
I/O Performance: Moderate
API name: c1.medium
High-CPU Extra Large Instance
7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: c1.xlarge
Cluster Compute Instances
Instances of this family provide proportionally high CPU resources with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications. Learn more about use of this instance type for HPC applications.
Cluster Compute Quadruple Extra Large Instance
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
Cluster GPU Instances
Instances of this family provide general-purpose graphics processing units (GPUs) with proportionally high CPU and increased network performance for applications benefitting from highly parallelized processing, including HPC, rendering and media processing applications. While Cluster Compute Instances provide the ability to create clusters of instances connected by a low latency, high throughput network, Cluster GPU Instances provide an additional option for applications that can benefit from the efficiency gains of the parallel computing power of GPUs over what can be achieved with traditional processors. Learn more about use of this instance type for HPC applications.
Cluster GPU Quadruple Extra Large Instance
22 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
2 x NVIDIA Tesla “Fermi” M2050 GPUs
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cg1.4xlarge
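One way to compare these families is Linux price per EC2 Compute Unit. The sketch below combines the Compute Unit counts from the definitions above with the Linux On-Demand prices from the earlier pricing list. The table layout and function names are mine, and t1.micro is left out because its "up to 2" burst rating is not a steady figure.

```python
# API name -> (EC2 Compute Units, memory in GB, Linux On-Demand USD per hour),
# all copied from the instance definitions and price list above.
INSTANCES = {
    "m1.small": (1, 1.7, 0.085),
    "m1.large": (4, 7.5, 0.34),
    "m1.xlarge": (8, 15, 0.68),
    "m2.xlarge": (6.5, 17.1, 0.50),
    "m2.2xlarge": (13, 34.2, 1.00),
    "m2.4xlarge": (26, 68.4, 2.00),
    "c1.medium": (5, 1.7, 0.17),
    "c1.xlarge": (20, 7, 0.68),
    "cc1.4xlarge": (33.5, 23, 1.60),
    "cg1.4xlarge": (33.5, 22, 2.10),
}


def cost_per_ecu(api_name):
    """USD per hour for one EC2 Compute Unit on the given instance type."""
    ecus, _memory_gb, usd_per_hour = INSTANCES[api_name]
    return usd_per_hour / ecus


# Cheapest compute first.
ranked = sorted(INSTANCES, key=cost_per_ecu)

for name in ranked:
    print(f"{name:12s} ${cost_per_ecu(name):.4f} per ECU-hour")
```

On these numbers the High-CPU types are the cheapest compute at about 3.4 cents per ECU-hour, and the Standard types are the most expensive at 8.5 cents. This ignores memory, storage and network, so treat it as a rough guide only.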
versus-
Windows Azure compute instances come in five unique sizes to enable complex applications and workloads.
Compute Instance Size    CPU            Memory     Instance Storage    I/O Performance
Extra Small              1 GHz          768 MB     20 GB*              Low
Small                    1.6 GHz        1.75 GB    225 GB              Moderate
Medium                   2 x 1.6 GHz    3.5 GB     490 GB              High
Large                    4 x 1.6 GHz    7 GB       1,000 GB            High
Extra large              8 x 1.6 GHz    14 GB      2,040 GB            High
*There is a limitation on the Virtual Hard Drive (VHD) size if you are deploying a Virtual Machine role on an extra small instance. The VHD can only be up to 15 GB.
Amazon just delivered a cluster Christmas present for us tech geek lizards, before Google could out doogle them with the end of the Betas (cough, it's under NDA)
Clusters used by Academic Departments now have a great chance to reduce cost without downsizing, but only if the CIO gets the email.
Meanwhile Professor Goodnight of SAS / North Carolina State University is still playing time sharing versus mind sharing games with analytical birdies, and his 70 million server farm from last February is about to get ready
(I heard they got public subsidies for the environment, but that's historic for SAS: taking public things private. Right, Prof? SAS itself began as a publicly funded project, and that was in the 1960s, when they didn't even have lobbyists.)
In related R news, Dirk E has been thinking of an R HPC book without paying attention to Amazon, but would now have to include Amazon
(he has been thinking of writing that book for 5 years, but hey, he's got a day job, consulting gigs with Revo, photo ops at Google, a blog, and packages to maintain without binaries. Dirk E, we await thy book with bated holes.)
Who's Dirk E? Well, http://dirk.eddelbuettel.com/ is like the Terminator of the R project (in terms of unpronounceable surnames)
Unique to Cluster Compute and Cluster GPU instances is the ability to group them into clusters of instances for use with HPC applications. This is particularly valuable for those applications that rely on protocols like Message Passing Interface (MPI) for tightly coupled inter-node communication.
Cluster Compute and Cluster GPU instances function just like other Amazon EC2 instances but also offer the following features for optimal performance with HPC applications:
When run as a cluster of instances, they provide low latency, full bisection 10 Gbps bandwidth between instances. Cluster sizes up through and above 128 instances are supported.
Cluster Compute and Cluster GPU instances include the specific processor architecture in their definition to allow developers to tune their applications by compiling applications for that specific processor architecture in order to achieve optimal performance.
The Cluster Compute instance family currently contains a single instance type, the Cluster Compute Quadruple Extra Large with the following specifications:
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
The Cluster GPU instance family currently contains a single instance type, the Cluster GPU Quadruple Extra Large with the following specifications:
22 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
2 x NVIDIA Tesla “Fermi” M2050 GPUs
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cg1.4xlarge
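To get a feel for what a cluster of these adds up to, the sketch below totals memory, Compute Units, physical cores and Linux On-Demand cost for a given number of nodes. The per-node figures come from the specifications and price list quoted in this post; the `ClusterNode` class and `cluster_totals` function are hypothetical helpers of mine, and the core count simply assumes 2 sockets x 4 cores per node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterNode:
    api_name: str
    memory_gb: float
    ecus: float
    physical_cores: int        # assumed: 2 x quad-core Xeon X5570 per node
    linux_usd_per_hour: float


CC1 = ClusterNode("cc1.4xlarge", 23, 33.5, 8, 1.60)
CG1 = ClusterNode("cg1.4xlarge", 22, 33.5, 8, 2.10)


def cluster_totals(node, count):
    """Aggregate capacity and hourly cost for `count` identical nodes."""
    return {
        "memory_gb": node.memory_gb * count,
        "ecus": node.ecus * count,
        "physical_cores": node.physical_cores * count,
        "usd_per_hour": round(node.linux_usd_per_hour * count, 2),
    }


# 128 nodes is the cluster size mentioned in the text above.
print(cluster_totals(CC1, 128))
```

At 128 Cluster Compute nodes that is 1,024 physical cores and 2,944 GB of memory for $204.80 an hour on demand, which is the kind of figure an academic department can set against the cost of its own cluster.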
Running R on Amazon EC2 has the following benefits-
1) Elastic memory and number of processors for heavy computation
2) Affordable micro instances for smaller datasets (from 2 cents per hour for Unix to 3 cents per hour for Windows)
3) An easy to use console for managing datasets as well as processes
Running R on an Amazon EC2 Windows instance has the following additional benefits-
1) Remote Desktop makes operation of R very easy
2) 64 Bit R can be used
3) You can also use your evaluation of Revolution R Enterprise (which is free to academics and, for enterprise software, quite inexpensive for corporates)
You can thus combine R GUIs (like Rattle, R Cmdr or Deducer, based upon your need for statistical analysis, data mining or graphical analysis) with a 64 Bit OS and Revolution’s RevoScaleR package to manage huge datasets in a very easy to use analytics solution.
(note if you select SQL Server it will cost you extra)
Then go through the following steps and launch the instance
Select the EC2 compute size depending on number of cores, memory needs and budget
Create a key pair (a .pem file holding your private key, which works like a password) and download it.
For tags, etc. just click through (or read and create some tags to help you remember and organize multiple instances)
In configure firewall, remember to enable access to RDP (Remote Desktop) and HTTP. You can choose to allow the whole internet or only your own IP address/es for logging in
Review and launch the instance
Go to Instances (leftmost margin) and see the status (yellow for pending)
Click on Instance Actions-Connect on the top bar to see the following
Download the .RDP shortcut file and click on Instance Actions-Request Admin Password
Wait 15 minutes, burning a few cents, while Microsoft creates a password for you
Have coffee (or tea if you are health minded)
Click again on Instance Actions-Request Admin Password
Open the key pair file (the .pem file created earlier) using Notepad, copy and paste the Private Key (looks like gibberish), and click Decrypt.
Retrieve Password for logging on.
Note the new password generated: this is your Remote Desktop password.
Click on the .rdp file (the shortcut file created earlier). It will connect to your Windows instance.
Enter the newly generated password in Remote Desktop
Login
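The firewall step in the list above boils down to two allow rules, one for RDP and one for HTTP, opened either to everyone or only to your own address. Here is a minimal sketch of that choice as plain data. It is not the AWS API: `ingress_rules` and the rule dictionaries are hypothetical, and 203.0.113.7 is a documentation-only example address.

```python
import ipaddress

RDP_PORT = 3389   # standard Remote Desktop Protocol port (TCP)
HTTP_PORT = 80    # standard HTTP port (TCP)


def ingress_rules(source="0.0.0.0/0"):
    """Build allow rules for RDP and HTTP from one source range.

    `source` may be a single address ("203.0.113.7") or a CIDR block
    ("203.0.113.0/24"). The default, 0.0.0.0/0, means the whole internet.
    """
    # ip_network validates the input and normalises a bare address to /32.
    network = ipaddress.ip_network(source, strict=False)
    return [
        {"protocol": "tcp", "port": port, "cidr": str(network)}
        for port in (RDP_PORT, HTTP_PORT)
    ]


# Open to the whole internet (convenient, less safe).
print(ingress_rules())

# Restricted to a single example address.
print(ingress_rules("203.0.113.7"))
```

Restricting RDP to your own address is the safer choice for a machine that logs in with a password; the open setting is mostly useful when your IP address changes often.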
This looks like a new clean machine with just the Windows OS installed on it.
Install Chrome (or any other browser) if you do not use Internet Explorer
Install Acrobat Reader (for documentation), Revolution R Enterprise (~490 MB; it will automatically ask to install the .NET Framework 4 files) and/or R
Install packages (I recommend installing R Commander, Rattle and Deducer). Apart from the fact that these GUIs are quite complementary, they will also install almost all the main packages that you need for analysis (as their dependencies).
Revolution R installs parallel programming packages by default.
If you want to save your files for working later, you can make a snapshot: go to the Amazon console, EC2, left margin, EBS, Snapshots. You will see an attached volume (green light); click on create snapshot to save your files.
If you want to use my Windows snapshot, you can work on it: when you start your Amazon EC2 session, click on snapshots and enter the details (see snapshot name below) to make a copy or work on it, whether for exploring 64 bit R, multi core cloud computing, or just trying out Revolution R’s new packages for academic purposes.
Parallel Computing Toolbox™ lets you solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. High-level constructs—parallel for-loops, special array types, and parallelized numerical algorithms—let you parallelize MATLAB® applications without CUDA or MPI programming. You can use the toolbox with Simulink® to run multiple simulations of a model in parallel.
The toolbox provides eight workers (MATLAB computational engines) to execute applications locally on a multicore desktop. Without changing the code, you can run the same application on a computer cluster or a grid computing service (using MATLAB Distributed Computing Server™). You can run parallel applications interactively or in batch.
The gputools package by Buckner provides several common data-mining algorithms which are implemented using a mixture of NVIDIA's CUDA language and cublas library. Given a computer with an NVIDIA GPU these functions may be substantially more efficient than native R routines. The rpud package provides an optimised distance metric for NVIDIA-based GPUs.
The cudaBayesreg package by da Silva implements the rhierLinearModel from the bayesm package using NVIDIA's CUDA language and tools to provide high-performance statistical analysis of fMRI voxels.
The rgpu package (see below for link) aims to speed up bioinformatics analysis by using the GPU.
The magma package provides an interface to the hybrid GPU/CPU library Magma (see below for link).
The gcbd package implements a benchmarking framework for BLAS and GPUs (using gputools).
I tried to search for SAS and GPU and for SPSS and GPU but got nothing. Maybe they would do well to at least test this alternative hardware-
Also see Matlab on GPU comparison for the product Jacket vs Parallel Computing Toolbox