One more addition to the GPU stack, and one that adds up to considerable power when CPUs and GPUs are combined. For numeric computing, mixed GPU-CPU software may become essential, since almost all hardware vendors now offer GPU-CPU products. Perhaps software companies will be inspired to build a new kind of software for GPU-CPU blade servers.
But for “true” supercomputing applications, the SL390s G7 is the go-to server. Like its sibling, the SL390s comes with Xeon 5600 processors, but the option to pair the CPUs with up to three on-board NVIDIA “Fermi” 20-series GPUs puts a lot more floating point performance into this design. Customers can choose from either the M2050 or M2070 Tesla GPU modules, the only difference being the amount of graphics memory — 3 GB of GDDR5 for the M2050 versus 6 GB for the M2070. Each GPU module is served by its own PCIe Gen2 x16 channel in order to maximize bandwidth to the graphics chips. At the maximum configuration with all three Fermi GPUs and two Westmere CPUs, a single server delivers on the order of 1 teraflop of double precision performance. “So this is very much a server that has been designed for HPC,” said Turkel.
With GPUs on board, the SL390s fill out a 2U half-width tray, so up to four of these can be packed into a 4U SL6500 chassis. A CPU-only version is also available and takes up just half the space (half-width 1U), enabling twice as many Xeons to occupy the same chassis. This configuration will likely be the server of choice for the majority of HPC setups, given that GPGPU deployment is really just getting started. Pricing on the CPU-only model starts at $2,259.
The ProLiant SL390s G7 provides more raw FLOPS per square inch than any server HP has delivered to date, and is the basis for the 2.4 petaflop TSUBAME 2.0 supercomputer currently being deployed at the Tokyo Institute of Technology.
Revolution Analytics has just released Revolution R Enterprise 4.0.1 for Red Hat Enterprise Linux, a significant step forward in enterprise data analytics. Revolution R Enterprise 4.0.1 is built on R 2.11.1, the latest release of the open-source environment for data analysis and graphics. Also available is the initial release of our deployment server solution, RevoDeployR 1.0, designed to help you deliver R analytics via the Web. And coming soon to Linux: RevoScaleR, a new package for fast and efficient multi-core processing of large data sets.
As a registered user of the Academic version of Revolution R Enterprise for Linux, you can take advantage of these improvements by downloading and installing Revolution R Enterprise 4.0.1 today. You can install Revolution R Enterprise 4.0.1 side-by-side with your existing Revolution R Enterprise installations; there is no need to uninstall previous versions.
The following information is all you will need to download and install the Academic Edition.
Revolution R Enterprise Academic edition and RevoDeployR are supported on Red Hat® Enterprise Linux® 5.4 or greater (64-bit processors).
Approximately 300MB free disk space is required for a full install of Revolution R Enterprise. We recommend at least 1GB of RAM to use Revolution R Enterprise.
For the full list of system requirements for RevoDeployR, refer to the RevoDeployR™ Installation Guide for Red Hat® Enterprise Linux®.
You will first need to download the Revolution R Enterprise installer.
Installation Instructions for Revolution R Enterprise Academic Edition
After downloading the installer, do the following to install the software:
Unpack the installer using the following command:
tar -xzf Revo-Ent-4.0.1-RHEL5-desktop.tar.gz
Change directory to the RevolutionR_4.0.1 directory created.
Run the installer by typing ./install.py and following the on-screen prompts.
Getting Started with Revolution R Enterprise
After you have installed the software, launch Revolution R Enterprise by typing Revo64 at the shell prompt.
Documentation is available in the form of PDF documents installed as part of the Revolution R Enterprise distribution. Type Revo.home("doc") at the R prompt to locate the directory containing the manuals Getting Started with Revolution R (RevoMan.pdf) and the ParallelR User’s Guide (parRman.pdf).
Installation Instructions for RevoDeployR (and RServe)
After downloading the RevoDeployR distribution, use the following steps to install the software:
Note: These instructions are for an automatic install. For more details or for manual install instructions, refer to RevoDeployR_Installation_Instructions_for_RedHat.pdf.
Log into the operating system as root.
Change directory to the directory containing the downloaded distribution for RevoDeployR and RServe.
Unpack the contents of the RevoDeployR tar file. At the prompt, type:
tar -xzf deployrRedHat.tar.gz
Change directories. At the prompt, type:
Launch the automated installation script and follow the on-screen prompts. At the prompt, type:
./installRedHat.sh
Note: Red Hat installs MySQL without a password.
Getting Started with RevoDeployR
After installing RevoDeployR, you will be directed to the RevoDeployR landing page. The landing page has links to documentation, the RevoDeployR management console, the API Explorer development tool, and sample code.
The simple R-benchmark-25.R test script is a quick-running survey of general R performance. The community-developed test consists of three sets of small benchmarks, referred to in the script as Matrix Calculation, Matrix Functions, and Program Control.
Revolution Analytics has created its own tests to simulate common real-world computations. They are described below.
Linear Algebra Computation | Base R 2.9.2 | Revolution R (1-core) | Revolution R (4-core) | Speedup (4 core)
Singular Value Decomposition
Principal Components Analysis
Linear Discriminant Analysis
Speedup = Slower time / Faster time - 1
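Because the speedup definition subtracts 1, a run that is four times as fast counts as a speedup of 3, not 4. A minimal sketch in R, using a hypothetical helper name (speedup) and made-up timings that are purely illustrative and are not measurements from the benchmark table:

```r
# Hypothetical helper implementing: Speedup = Slower time / Faster time - 1
speedup <- function(slower, faster) {
  stopifnot(slower >= faster, faster > 0)
  slower / faster - 1
}

# Illustrative timings only (not taken from the benchmark table)
speedup(10, 2.5)   # 10 / 2.5 - 1 = 3
speedup(8, 8)      # equal times give a speedup of 0
```

Under this convention a value of 0 means "no faster", which makes the column easy to read as "how many times faster, beyond parity".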
This routine creates a random uniform 10,000 x 5,000 matrix A, and then times the computation of the matrix product transpose(A) * A.
m <- 10000
n <- 5000
A <- matrix (runif (m*n),m,n)
system.time (B <- crossprod(A))
The system will respond with a message in this format:
   user  system elapsed
  37.22    0.40    9.68
The “elapsed” times indicate total wall-clock time to run the timed code.
The table above reflects the elapsed time for this and the other benchmark tests. The test system was an Intel® Xeon® 8-core CPU (model X55600) at 2.5 GHz with 18 GB system RAM running the Windows Server 2008 operating system. For the Revolution R benchmarks, the computations were limited to 1 core and 4 cores by calling setMKLthreads(1) and setMKLthreads(4) respectively. Note that Revolution R performs very well even in single-threaded tests: this is a result of the optimized algorithms in the Intel MKL library linked to Revolution R. The slightly greater-than-linear speedup may be due to the greater total cache available to all CPU cores, or simply to better OS CPU scheduling; no attempt was made to pin execution threads to physical cores. Consult Revolution R’s documentation to learn how to run benchmarks that use fewer cores than your hardware offers.
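The thread-limited runs described above might be scripted roughly as follows. This is only a sketch: time_crossprod is a hypothetical helper name, the matrix is deliberately much smaller than the 10,000 x 5,000 one in the text so that it runs quickly, and the call to setMKLthreads() is guarded because that function is assumed to exist only in Revolution R builds.

```r
# Hypothetical helper: elapsed time of crossprod(A) at a given MKL thread count
time_crossprod <- function(A, threads) {
  # setMKLthreads() comes with Revolution R; on other R builds this is skipped
  if (exists("setMKLthreads", mode = "function")) setMKLthreads(threads)
  system.time(crossprod(A))[["elapsed"]]
}

set.seed(1)
A <- matrix(runif(2000 * 500), 2000, 500)   # reduced size for a quick run

t1 <- time_crossprod(A, 1)   # limited to 1 core
t4 <- time_crossprod(A, 4)   # limited to 4 cores
cat("1 thread:", t1, "s; 4 threads:", t4, "s\n")
```

On a plain R installation both timings use whatever BLAS is linked in, so the two numbers will differ only by noise.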
The Cholesky matrix factorization may be used to compute the solution of linear systems of equations with a symmetric positive definite coefficient matrix, to compute correlated sets of pseudo-random numbers, and other tasks. We re-use the matrix B computed in the example above:
system.time (C <- chol(B))
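The two uses of the Cholesky factor named above can be sketched in base R as follows. The matrix here is a small stand-in built the same way as B (a cross-product of a random uniform matrix), and the right-hand side b and the sample size are arbitrary choices made for illustration.

```r
set.seed(42)
# Small symmetric positive definite matrix, built like B in the text
A <- matrix(runif(200 * 5), 200, 5)
B <- crossprod(A)
R <- chol(B)                       # upper triangular, t(R) %*% R equals B

# 1) Solve B x = b with two triangular solves
b <- rep(1, 5)
y <- forwardsolve(t(R), b)         # solves t(R) y = b
x <- backsolve(R, y)               # solves R x = y
max(abs(B %*% x - b))              # residual, should be near zero

# 2) Correlated pseudo-random numbers with covariance B
Z <- matrix(rnorm(10000 * 5), 10000, 5)   # independent standard normals
X <- Z %*% R                              # rows have covariance t(R) %*% R = B
```

Once R is available, each additional right-hand side costs only two triangular solves, which is why the factorization is usually timed separately from the solve.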
Singular Value Decomposition with Applications
The Singular Value Decomposition (SVD) is a numerically stable and very useful matrix decomposition. The SVD is often used to compute Principal Components and Linear Discriminant Analysis.
# Singular Value Decomposition
m <- 10000
n <- 2000
A <- matrix (runif (m*n),m,n)
system.time (S <- svd (A,nu=0,nv=0))
# Principal Components Analysis
m <- 10000
n <- 2000
A <- matrix (runif (m*n),m,n)
system.time (P <- prcomp(A))
# Linear Discriminant Analysis
require('MASS')
g <- 5
k <- round (m/2)
A <- data.frame (A, fac=sample (LETTERS[1:g],m,replace=TRUE))
train <- sample(1:m, k)
system.time (L <- lda(fac ~., data=A, prior=rep(1,g)/g, subset=train))
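The statement that the SVD is used to compute Principal Components can be checked directly in base R. The sketch below uses a much smaller matrix than the benchmark, and compares prcomp() against an SVD of the column-centred data; the variable names Ac, S and P are chosen here for illustration.

```r
set.seed(1)
m <- 200; n <- 5                       # much smaller than the benchmark sizes
A <- matrix(runif(m * n), m, n)

Ac <- scale(A, center = TRUE, scale = FALSE)   # column-centre the data
S  <- svd(Ac, nu = 0)                          # singular values and right vectors
P  <- prcomp(A)

# Component standard deviations are singular values / sqrt(m - 1)
all.equal(P$sdev, S$d / sqrt(m - 1))
# Loadings match the right singular vectors up to sign
all.equal(abs(unname(P$rotation)), abs(S$v))
```

This is why the PCA timing tracks the SVD timing so closely: most of the work in prcomp() is the decomposition itself.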
PALO ALTO, Calif., Sept. 20 — Revolution Analytics, the leading commercial provider of software and support for the popular open source R statistics language, today announced it will deliver Revolution R Enterprise for Microsoft Windows HPC Server 2008 R2, released today, enabling users to analyze very large data sets in high-performance computing environments.
R is a powerful open source statistics language and the modern system for predictive analytics. Revolution Analytics recently introduced RevoScaleR, new “Big Data” analysis capabilities, to its R distribution, Revolution R Enterprise. RevoScaleR solves the performance and capacity limitations of the R language with parallelized algorithms that stream data across multiple cores on a laptop, workstation or server. Users can now process, visualize and model terabyte-class data sets at top speeds, without the need for specialized hardware.
“Revolution Analytics is pleased to support Microsoft’s Technical Computing initiative, whose efforts will benefit scientists, engineers and data analysts,” said David Champagne, CTO at Revolution. “We believe the engineering we have done for Revolution R Enterprise, in particular our work on big-data statistics and multicore computing, along with Microsoft’s HPC platform for technical computing, makes an ideal combination for high-performance large scale statistical computing.”
“Processing and analyzing this ‘big data’ is essential to better prediction and decision making,” said Bill Hamilton, director of technical computing at Microsoft Corp. “Revolution R Enterprise for Windows HPC Server 2008 R2 gives customers an extremely powerful tool that handles analysis of very large data and high workloads.”
REvolution R Enterprise is designed for both novice and experienced R users looking for a production-grade R distribution to perform mission-critical predictive analytics tasks right from the desktop and scale across multiprocessor environments. It features RPE™, REvolution’s R Productivity Environment for Windows.
Of course, R Enterprise is available on Linux, but only on Red Hat Enterprise Linux; it would be nice to see Amazon Machine Images and Ubuntu versions as well.
Like all virtual appliances, the main component of an AMI is a read-only filesystem image which includes an operating system (e.g., Linux, UNIX, or Windows) and any additional software required to deliver a service or a portion of it.
The AMI filesystem is compressed, encrypted, signed, split into a series of 10MB chunks and uploaded into Amazon S3 for storage. An XML manifest file stores information about the AMI, including name, version, architecture, default kernel id, decryption key and digests for all of the filesystem chunks.
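The chunk-and-manifest idea can be illustrated with a toy sketch in base R. Everything here is an assumption made for illustration: chunk_bytes is a hypothetical helper, the "image" is random bytes, the chunks are 10 KB rather than 10 MB, MD5 is used only because it ships with base R's tools package, the manifest is a data frame rather than XML, and the compress, encrypt and sign steps are omitted entirely. It is not Amazon's bundling tool.

```r
# Hypothetical helper: split a raw vector into fixed-size chunks
chunk_bytes <- function(bytes, chunk_size) {
  starts <- seq(1, length(bytes), by = chunk_size)
  lapply(starts, function(s) bytes[s:min(s + chunk_size - 1, length(bytes))])
}

set.seed(7)
image  <- as.raw(sample(0:255, 25 * 1024, replace = TRUE))  # stand-in for an image
chunks <- chunk_bytes(image, 10 * 1024)                     # 10 KB here, not 10 MB

# Write each chunk to a temporary file and record its MD5 digest
paths <- vapply(chunks, function(ch) {
  p <- tempfile(fileext = ".part")
  writeBin(ch, p)
  p
}, character(1))

manifest <- data.frame(
  part   = seq_along(chunks),
  bytes  = lengths(chunks),
  digest = unname(tools::md5sum(paths)),
  stringsAsFactors = FALSE
)
manifest
```

Recording a digest per chunk lets a downloader verify each piece independently before reassembling the image.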
An AMI does not include a kernel image, only a pointer to the default kernel id, which can be chosen from an approved list of safe kernels maintained by Amazon and its partners (e.g., RedHat, Canonical, Microsoft). Users may choose kernels other than the default when booting an AMI.
Paid: a for-pay AMI image that is registered with Amazon DevPay and can be used by anyone who subscribes to it. DevPay allows developers to mark up Amazon’s usage fees and optionally add monthly subscription fees.
This one is a work in progress, but it can be used as a generic tutorial for creating and publishing R packages and then running them in an HPC environment. It can also be used for translating existing algorithms into R or creating new statistical algorithms in R.
The best tutorial for fast and easy learning of parallel computation, code optimization and general High Performance Computing is the one Dr. Dirk gave at User9. He is a nice guy with a pleasant manner and temperament, though we disagree on Linux distributions: he chooses Debian and I choose Ubuntu.