New Amazon Instance: High I/O for NoSQL

Latest from the Amazon Cloud-

hi1.4xlarge instances come with eight virtual cores that can deliver 35 EC2 Compute Units (ECUs) of CPU performance, 60.5 GiB of RAM, and 2 TiB of storage capacity across two SSD-based storage volumes. Customers using hi1.4xlarge instances for their applications can expect over 120,000 4 KB random write IOPS, and as many as 85,000 random write IOPS (depending on active LBA span). These instances are available on a 10 Gbps network, with the ability to launch instances into cluster placement groups for low-latency, full-bisection bandwidth networking.

High I/O instances are currently available in three Availability Zones in US East (N. Virginia) and two Availability Zones in EU West (Ireland) regions. Other regions will be supported in the coming months. You can launch hi1.4xlarge instances as On Demand instances starting at $3.10/hour, and purchase them as Reserved Instances

http://aws.amazon.com/ec2/instance-types/

High I/O Instances

Instances of this family provide very high instance storage I/O performance and are ideally suited for many high performance database workloads. Example applications include NoSQL databases like Cassandra and MongoDB. High I/O instances are backed by Solid State Drives (SSD), and also provide high levels of CPU, memory and network performance.

High I/O Quadruple Extra Large Instance

60.5 GB of memory
35 EC2 Compute Units (8 virtual cores with 4.4 EC2 Compute Units each)
2 SSD-based volumes each with 1024 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
Storage I/O Performance: Very High*
API name: hi1.4xlarge

*Using Linux paravirtual (PV) AMIs, High I/O Quadruple Extra Large instances can deliver more than 120,000 4 KB random read IOPS and between 10,000 and 85,000 4 KB random write IOPS (depending on active logical block addressing span) to applications. For hardware virtual machines (HVM) and Windows AMIs, performance is approximately 90,000 4 KB random read IOPS and between 9,000 and 75,000 4 KB random write IOPS. The maximum sequential throughput on all AMI types (Linux PV, Linux HVM, and Windows) per second is approximately 2 GB read and 1.1 GB write.

How big is R on CRAN #rstats

3.87 GB and 3786 packages. Thats what you need to install the whole of R as on CRAN

( Note- Many IT administrators /Compliance Policies in enterprises forbid installing from the Internet in work offices.

Which is where the analytics,$$, and people are)

As downloaded from the CRAN Mirror at UCLA.

Takes 3 hours to download at 1 mbps (I was on an Amazon Ec2 instance)

See screenshot.

Next question- who is the man responsible in the R project for deleting old /depreciated/redundant packages if the authors dont do it.

 

Visualizing Bigger Data in R using Tabplot

The amazing tabplot package creates the tableplot feature for visualizing huge chunks of data. This is a great example of creative data visualization that is resource lite and extremely fast in a first look at the data. (note- The tabplot package is being used and table plot function is being used . The TABLEPLOT package is different and is NOT being used here).

library(ggplot2)
data(diamonds)
library(tabplot)
tableplot(diamonds)
system.time(tableplot(diamonds))

visualizing a 50000 row by 10 variable dataset in 0.7 s is fast !!

click on screenshot to see it

and some say R is slow 😉

 

Note I used a free Windows Amazon EC2 Instance for it-

See screenshot for hardware configuration

 

the best thing is there is a handy GTK GUI for this package. You can check it out at

 

 

Using Cloud Computing for Hacking

This is not about hacking the cloud. Instead this is about using the cloud to hack

 

Some articles last year wrote on how hackers used Amazon Ec2 for hacking/ddos attacks.

http://www.pcworld.com/businesscenter/article/216434/cloud_computing_used_to_hack_wireless_passwords.html

Roth claims that a typical wireless password can be guessed by EC2 and his software in about six minutes. He proved this by hacking networks in the area where he lives. The type of EC2 computers used in the attack costs 28 cents per minute, so $1.68 is all it could take to lay open a wireless network.

and

http://www.bloomberg.com/news/2011-05-15/sony-attack-shows-amazon-s-cloud-service-lures-hackers-at-pennies-an-hour.html

Cloud services are also attractive for hackers because the use of multiple servers can facilitate tasks such as cracking passwords, said Ray Valdes, an analyst at Gartner Inc. Amazon could improve measures to weed out bogus accounts, he said.

 

and this article by Anti-Sec pointed out how one can obtain a debit card anonymously

https://www.facebook.com/notes/lulzsec/want-to-be-a-ghost-on-the-internet/230293097062823

VPN Account without paper trail

  • Purchase prepaid visa card with cash
  • Purchase Bitcoins with Money Order
  • Donate Bitcoins to different account

 

Masking your IP address to log on is done by TOR

https://www.torproject.org/download/download.html.en

and the actual flooding is done by tools like LOIC or HOIC

http://sourceforge.net/projects/loic/

and

http://www.4shared.com/rar/UmCu0ds1/hoic.html

 

So what safeguards can be expected from the next wave of Teenage Mutant Ninjas..?

 

Amazon gives away 750 hours /month of Windows based computing

and an additional 750 hours /month of Linux based computing. The windows instance is really quite easy for users to start getting the hang of cloud computing. and it is quite useful for people to tinker around, given Google’s retail cloud offerings are taking so long to hit the market

But it is only for new users.

http://aws.typepad.com/aws/2012/01/aws-free-usage-tier-now-includes-microsoft-windows-on-ec2.html

WS Free Usage Tier now Includes Microsoft Windows on EC2

The AWS Free Usage Tier now allows you to run Microsoft Windows Server 2008 R2 on an EC2 t1.micro instance for up to 750 hours per month. This benefit is open to new AWS customers and to those who are already participating in the Free Usage Tier, and is available in all AWS Regions with the exception of GovCloud. This is an easy way for Windows users to start learning about and enjoying the benefits of cloud computing with AWS.

The micro instances provide a small amount of consistent processing power and the ability to burst to a higher level of usage from time to time. You can use this instance to learn about Amazon EC2, support a development and test environment, build an AWS application, or host a web site (or all of the above). We’ve fine-tuned the micro instances to make them even better at running Microsoft Windows Server.

You can launch your instance from the AWS Management Console:

We have lots of helpful resources to get you started:

Along with 750 instance hours of Windows Server 2008 R2 per month, the Free Usage Tier also provides another 750 instance hours to run Linux (also on a t1.micro), Elastic Load Balancer time and bandwidth, Elastic Block Storage, Amazon S3 Storage, and SimpleDB storage, a bunch of Simple Queue Service and Simple Notification Service requests, and some CloudWatch metrics and alarms (see the AWS Free Usage Tier page for details). We’ve also boosted the amount of EBS storage space offered in the Free Usage Tier to 30GB, and we’ve doubled the I/O requests in the Free Usage Tier, to 2 million.

 

Amazon CC2 – The Big Cloud is finally here

Finally a powerful enough cloud computing instance from Amazon EC2 – called CC2 priced at 3$ per hour (for Windows instances) and 2.4$/hour for Linux

It would be interesting to see how SAS, IBM SPSS or R can leverage these

Storage – On the storage front, the CC2 instance type is packed with 60.5 GB of RAM and 3.37 TB of instance storage.

Processing – The CC2 instance type includes 2 Intel Xeon processors, each with 8 hardware cores. We’ve enabled Hyper-Threading, allowing each core to process a pair of instruction streams in parallel. Net-net, there are 32 hardware execution threads and you can expect 88 EC2 Compute Units (ECU’s) from this 64-bit instance type

On a somewhat smaller scale, you can launch your own array of 290 CC2 instances and create a Top500 supercomputer (63.7 teraFLOPS) at a cost of less than $1000 per hour

http://aws.typepad.com/aws/2011/11/next-generation-cluster-computing-on-amazon-ec2-the-cc2-instance-type.html

 

 

and

http://aws.amazon.com/hpc-applications/

 

 

Cluster Compute Eight Extra Large specifications:
88 EC2 Compute Units (Eight-core 2 x Intel Xeon)
60.5 GB of memory
3370 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc2.8xlarge
Price: Starting from $2.40 per hour

But some caveats

  • The instances are available in a single Availability Zone in the US East (Northern Virginia) Region. We plan to add capacity in other EC2 Regions throughout 2012.
  • You can run 2 CC2 instances by default.
  • You cannot currently launch instances of this type within a Virtual Private Cloud (VPC).

Interview Markus Schmidberger ,Cloudnumbers.com

Here is an interview with Markus Schmidberger, Senior Community Manager for cloudnumbers.com. Cloudnumbers.com is the exciting new cloud startup for scientific computing. It basically enables transition to a R and other platforms in the cloud and makes it very easy and secure from the traditional desktop/server model of operation.

Ajay- Describe the startup story for setting up Cloudnumbers.com

Markus- In 2010 the company founders Erik Muttersbach (TU München), Markus Fensterer (TU München) and Moritz v. Petersdorff-Campen (WHU Vallendar) started with the development of the cloud computing environment. Continue reading “Interview Markus Schmidberger ,Cloudnumbers.com”