So many R Packages Everywhere, which one do I use? #rstats

Some thoughts on R Packages

  • CRAN is no longer the sole repository for useful R packages. Alternatives include R-Forge, Google Code and, increasingly, GitHub.
  • CRAN lacks the flexibility and social aspect of GitHub.
  • CRAN Task Views are the only subject-wide listing of R packages. However, the categorization is based more on methods than on use cases or business domains.
  • There are multiple R packages for the same thing. Which one do I use? Only Stack Overflow helps with that. There is no rating and no recommendation system.
  • The list of packages suggested by an R package needs better, automatic association analysis. Right now it is manual and depends on the package author and maintainer.
  • Quis custodiet ipsos custodes? Who guards the guardians of R packages? In an era of cyber security, we need better transparency on security measures within R packages, especially given the international nature of the project. I am very sure I (or anyone) can create R code to communicate discreetly, especially on Windows.
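
The automatic association analysis asked for above could start from simple co-occurrence statistics. The following is a minimal sketch, not an existing CRAN feature: the usage data is made up, the function name `suggest` is hypothetical, and Python is used only for illustration (the same idea carries over to R). It ranks packages by lift, that is, how much more often two packages appear together than chance would predict.

```python
from collections import Counter
from itertools import combinations

# Hypothetical usage data: each set lists the packages loaded together in one script.
scripts = [
    {"ggplot2", "dplyr", "tidyr"},
    {"ggplot2", "dplyr"},
    {"data.table", "ggplot2"},
    {"dplyr", "tidyr"},
    {"data.table"},
]


def suggest(package, scripts, top_n=3):
    """Rank other packages by their lift with respect to `package`."""
    n = len(scripts)
    # How many scripts use each package.
    counts = Counter(p for s in scripts for p in s)
    if counts[package] == 0:
        return []  # Nothing is known about this package.
    # How many scripts use each unordered pair of packages.
    pair_counts = Counter(
        frozenset(pair) for s in scripts for pair in combinations(sorted(s), 2)
    )
    scores = {}
    for other in counts:
        if other == package:
            continue
        together = pair_counts[frozenset((package, other))]
        if together == 0:
            continue  # Never seen together, so not a candidate.
        # lift = P(A and B) / (P(A) * P(B))
        p_both = together / n
        scores[other] = p_both / ((counts[package] / n) * (counts[other] / n))
    # Highest lift first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


print(suggest("dplyr", scripts))
```

With this toy data, tidyr ranks ahead of ggplot2 for dplyr, because tidyr appears with dplyr in every script that uses it. A real system would need actual usage data (for example, dependency fields or public scripts) and a minimum-support threshold to avoid noisy suggestions.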

  • I would rather not install anything on my local machine, and instead load the package directly from CRAN. CRAN was designed in an era of low bandwidth; this needs to be upgraded.
  • Note that I am respectfully refraining from commenting on the atrocious aesthetics of the home website. Many statisticians see no use in making R user friendly. My professors at U Tenn (from which I dropped out after 2 semesters) were horrified when I took courses in graphic design, as I wanted to know more about the A and B that make up the A/B testing of statistical design. Now that I am getting older, I am horrified by the lack of HTML, CSS and jQuery skills among some of the brightest programmers in this project.
  • Please comment below.


Author: Ajay Ohri

3 thoughts on “So many R Packages Everywhere, which one do I use? #rstats”

  1. Are all the 5000 packages on the CRAN server in China the same as the packages on the CRAN server in America?

    That’s what I am asking. I trust R Core. I don’t trust all the world.

  2. Hej. Many valid points. I think I would agree with most of them, except the security issue. Obviously, this is always true. No matter to what lengths you go (external auditing, whatever), there will always remain one guardian that is not guarded (that almost sounds theological). So I think open source is about as much security and transparency as you can get. Perhaps checksums on, or signing of, the packages would be a clever addition to make potential tampering at mirrors more difficult, if they are not already in place. But ultimately yes, there is trust involved, just as when reading forum posts, and just as when using commercial software. Does anyone remember Amazon remotely deleting e-books?

    Regarding your wish to not install R packages locally, I guess that’s possible by using e.g. an RStudio web server. For the truly paranoid (or those working with truly sensitive data), you could always have an R instance run in a virtual machine without network access and delete it once you are done. But quite frankly, if someone is working on Windows, I think there are more pressing security issues to consider than potentially malicious R packages.

    Cheers, Christoph
