Parallel Programming using R in Windows

Ashamed at my lack of parallel programming, I decided to learn some R Parallel Programming (after all, parallel blogging is not really respect-worthy in tech-geek-ninja circles).

So I did the usual Google-CRAN-search-like-a-dog thing, only to find some obstacles.

Obstacles:

Some Parallel Programming Packages like doMC are not available on Windows

http://cran.r-project.org/web/packages/doMC/index.html

Some Parallel Programming Packages like doSMP depend on Revolution’s Enterprise R; see

http://blog.revolutionanalytics.com/2009/07/simple-scalable-parallel-computing-in-r.html

and http://www.r-statistics.com/2010/04/parallel-multicore-processing-with-r-on-windows/ (No, the latest hack didn't work)

Others, like multicore, are still in testing for Windows, so the Windows version is not available on CRAN

http://cran.r-project.org/web/packages/multicore/index.html

though fortunately it is available on RForge

http://www.rforge.net/multicore/files/

Revolution did make doSNOW and foreach available on CRAN

see http://blog.revolutionanalytics.com/2009/08/parallel-programming-with-foreach-and-snow.html

but the documentation for SNOW is overwhelming (hint: I use Windows, what does that tell you about my tech acumen?)

http://sekhon.berkeley.edu/snow/html/makeCluster.html and

http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html

What is a PVM or an MPI? And SOCKS were for wearing, or for getting lost in washers, until I encountered them in SNOW.
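For anyone equally lost: PVM and MPI are message-passing systems that need extra software installed, while a socket cluster only needs R itself, which is why it is the easy option on a Windows desktop. A minimal sketch, assuming the snow package is installed and that two local workers are wanted (the worker count and the PID check are my own illustration, not something from the SNOW docs linked above):

```r
library(snow)

# type = "SOCK" asks snow for plain socket workers on the local machine;
# no PVM or MPI installation is involved.
cl <- makeCluster(2, type = "SOCK")

# Ask each worker for its process ID to confirm that two separate
# R processes are really running.
pids <- clusterCall(cl, function() Sys.getpid())
print(unlist(pids))

stopCluster(cl)
```

If two different process IDs are printed, the workers are up and the cluster can be handed to registerDoSNOW().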


Finally I did the following, and got parallel programming working on Windows using R:

library(doSNOW)
cl <- makeCluster(2) # I have two cores
registerDoSNOW(cl)

# create a function to run in each iteration of the loop
check <- function(n) {
  for (i in 1:1000) {
    sme <- matrix(rnorm(100), 10, 10)
    solve(sme)
  }
}

times <- 100 # times to run the loop

# parallel version, using the registered cluster
system.time(x <- foreach(j = 1:times) %dopar% check(j))
#   user  system elapsed
#   0.16    0.02   19.17

# plain sequential loop, for comparison
system.time(for (j in 1:times) x <- check(j))
#   user  system elapsed
#  39.66    0.00   40.46

stopCluster(cl)

And it works!
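The timing run above throws its results away, since check() returns nothing useful. Here is a hedged sketch of how the same setup might collect values and pick the worker count from the machine instead of hard-coding it. It assumes doSNOW is installed and uses detectCores() from the parallel package that ships with recent versions of R; inv_trace() is a hypothetical stand-in for check(), not part of any package.

```r
library(doSNOW)   # also attaches foreach and snow

# Ask the machine how many cores it reports; detectCores() can return NA,
# so fall back to 2. Capped at 2 workers here to keep the sketch light.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 2
cl <- snow::makeCluster(min(n_cores, 2), type = "SOCK")
registerDoSNOW(cl)

# Hypothetical variant of check(): returns a single number per call
# so that there is something to gather from the workers.
inv_trace <- function(n) {
  sme <- matrix(rnorm(100), 10, 10)
  sum(diag(solve(sme)))
}

# .combine = c glues the per-iteration results into one numeric vector
# instead of the default list.
res <- foreach(j = 1:8, .combine = c) %dopar% inv_trace(j)

stopCluster(cl)
print(length(res))
```

Without .combine, foreach returns a list with one element per iteration, which is what x holds in the timing run above.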

Author: Ajay Ohri

http://about.me/ajayohri

9 thoughts on “Parallel Programming using R in Windows”

  1. that time i’m quite sure i pasted it, maybe i’ve hit on a wordpress bug..

    system.time(x <- foreach(j=1:times ) %dopar% check(j))
    user system elapsed
    0.10 0.01 22.01
    ========
    system.time(for(j in 1:times ) x <- check(j))
    user system elapsed
    21.98 0.02 22.03
    ========

    1. It is interesting to see the 64-bit OS's effect on processing time, as there is hardly any improvement. I redid the same thing in the Amazon EC2 environment as well.

      In fact, in my next blog post I am using the same example but on an Amazon large instance (dual core, 64-bit, 7.5 GB RAM), and I find it is still faster than doing the foreach loop locally on a 32-bit, 3 GB RAM machine.

      1. if it were a four or eight core machine, would it make a difference? or does the 64bit windows r build just completely squelch out the possibility of any time improvement from parallel processing?

        thanks!

  2. i have two cores also, but when i run it, i get

    > system.time(x system.time(for(j in 1:times ) x <- check(j))
    user system elapsed
    21.89 0.01 22.00

    i am running it in windows r 2.11.1 x64..

  3. Hi Ajay,

    I didn’t know about “doSNOW”, that’s cool to know.

    I’ve been able to make doSMP work fine (R 2.11.1 with win-7, and also on 2.10 with win XP), but I am glad there are more solutions out there for windows.

    Cheers,
    Tal

    1. I am using Windows XP SP2 with R 2.11.1. I did the same, copied the folders from R Enterprise, and downloaded the RI.. from the site, but it still didn't recognize doSMP as a package. Yes, doSNOW is a cool package, and REVO is to be thanked.
