Changes in R software

The newest version of R is now available for download. R 2.13.0 is ready!

 

http://cran.at.r-project.org/bin/windows/base/CHANGES.R-2.13.0.html

 

Windows-specific changes to R

CHANGES IN R VERSION 2.13.0

 

WINDOWS VERSION

 

  • Windows 2000 is no longer supported. (It went end-of-life in July 2010.)

 

 

 

NEW FEATURES

 

  • win_iconv has been updated: this version has a change in the behaviour with BOMs on UTF-16 and UTF-32 files – it removes BOMs when reading and adds them when writing. (This is consistent with Microsoft applications, but Unix versions of iconv usually ignore them.) 

     

  • Support for repository type win64.binary (used for 64-bit Windows binaries for R 2.11.x only) has been removed. 

     

  • The installers no longer put an ‘Uninstall’ item on the start menu (to conform to current Microsoft UI guidelines). 

     

  • Running R always sets the environment variable R_ARCH (as it does on a Unix-alike from the shell-script front-end). 

     

  • The defaults for options("browser") and options("pdfviewer") are now set from environment variables R_BROWSER and R_PDFVIEWER respectively (as on a Unix-alike). A value of "false" suppresses display (even if there is no false.exe present on the path). 

     

  • If options("install.lock") is set to TRUE, binary package installs are protected against failure similar to the way source package installs are protected. 

     

  • file.exists() and unlink() have more support for files > 2GB. 

     

  • The versions of R.exe in ‘R_HOME/bin/i386,x64/bin’ now support options such as R --vanilla CMD: there is no comparable interface for ‘Rcmd.exe’. 

     

  • A few more file operations will now work with >2GB files. 

     

  • The environment variable R_HOME in an R session now uses slash as the path separator (as it always has when set by Rcmd.exe). 

     

  • Rgui has a new menu item for the PDF ‘Sweave User Manual’.

 

 

 

DEPRECATED

 

  • zip.unpack() is deprecated: use unzip().

 

INSTALLATION

 

  • There is support for libjpeg-turbo via setting JPEGDIR to that value in ‘MkRules.local’. 

    Support for jpeg-6b has been removed.

     

  • The sources now work with libpng-1.5.1, jpegsrc.v8c (which are used in the CRAN builds) and tiff-4.0.0beta6 (CRAN builds use 3.9.1). It is possible that they no longer work with older versions than libpng-1.4.5.

 

 

 

BUG FIXES

 

  • Workaround for the incorrect values given by Windows’ casinh function on the branch cuts.
  • Bug fixes for drawing raster objects on windows(). The symptom was the occasional raster image not being drawn, especially when drawing multiple raster images in a single expression. Thanks to Michael Sumner for report and testing.
  • Printing extremely long string values could overflow the stack and cause the GUI to crash. (PR#14543)

Tonnes of changes!!

http://cran.at.r-project.org/src/base/NEWS

CHANGES IN R VERSION 2.13.0:

  SIGNIFICANT USER-VISIBLE CHANGES:

    • replicate() (by default) and vapply() (always) now return a
      higher-dimensional array instead of a matrix in the case where
      the inner function value is an array of dimension >= 2.

    • Printing and formatting of floating point numbers now uses the
      correct number of digits; previously the output was, in rare
      cases, off by a few digits. (See “scientific” entry below.)  This
      affects _many_ *.Rout.save checks in packages.
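
A minimal sketch of the replicate()/vapply() change described above. The helper make_matrix() is a hypothetical name used only for illustration; the dimensions shown in the comments are what the documented behaviour implies.

```r
# Each call to the (hypothetical) helper returns a 2 x 2 numeric matrix.
make_matrix <- function(i) matrix(as.numeric(i), nrow = 2, ncol = 2)

# vapply() needs a template for a single result: here a 2 x 2 matrix.
# The results are stacked into a 2 x 2 x 3 array.
v <- vapply(1:3, make_matrix, FUN.VALUE = matrix(0, 2, 2))
dim(v)

# replicate() now simplifies to an array by default ...
r <- replicate(3, make_matrix(1))
dim(r)

# ... while simplify = TRUE requests the older matrix layout, in which
# each 2 x 2 result is flattened into one column of length 4.
m <- replicate(3, make_matrix(1), simplify = TRUE)
dim(m)
```

Code that relied on the old matrix result from replicate() can pass simplify = TRUE explicitly, as in the last call.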

  NEW FEATURES:

    • normalizePath() has been moved to the base package (from utils):
      this is so it can be used by library() and friends.

      It now does tilde expansion.

      It gains new arguments winslash (to select the separator on
      Windows) and mustWork to control the action if a canonical path
      cannot be found.

    • The previously barely documented limit of 256 bytes on a symbol
      name has been raised to 10,000 bytes (a sanity check).  Long
      symbol names can sometimes occur when deparsing expressions (for
      example, in model.frame).

    • reformulate() gains an intercept argument.

    • cmdscale(add = FALSE) now uses the more common definition that
      there is a representation in n-1 or fewer dimensions, and only
      dimensions corresponding to positive eigenvalues are used.
      (Avoids confusion such as PR#14397.)

    • Names used by c(), unlist(), cbind() and rbind() are marked with
      an encoding when this can be ascertained.

    • R colours are now defined to refer to the sRGB color space.

      The PDF, PostScript, and Quartz graphics devices record this
      fact.  X11 (and Cairo) and Windows just assume that your screen
      conforms.

    • system.file() gains a mustWork argument (suggestion of Bill
      Dunlap).

    • new.env(hash = TRUE) is now the default.

    • list2env(envir = NULL) defaults to hashing (with a suitably sized
      environment) for lists of more than 100 elements.

    • text() gains a formula method.

    • IQR() now has a type argument which is passed to quantile().

    • as.vector(), as.double() etc duplicate less when they leave the
      mode unchanged but remove attributes.

      as.vector(mode = "any") no longer duplicates when it does not
      remove attributes.  This helps memory usage in matrix() and
      array().

      matrix() duplicates less if data is an atomic vector with
      attributes such as names (but no class).

      dim(x) <- NULL duplicates less if x has neither dimensions nor
      names (since this operation removes names and dimnames).

    • setRepositories() gains an addURLs argument.

    • chisq.test() now also returns a stdres component, for
      standardized residuals (which have unit variance, unlike the
      Pearson residuals).

    • write.table() and friends gain a fileEncoding argument, to
      simplify writing files for use on other OSes (e.g. a spreadsheet
      intended for Windows or Mac OS X Excel).

    • Assignment expressions of the form foo::bar(x) <- y and
      foo:::bar(x) <- y now work; the replacement functions used are
      foo::`bar<-` and foo:::`bar<-`.

    • Sys.getenv() gains a names argument so Sys.getenv(x, names =
      FALSE) can replace the common idiom of as.vector(Sys.getenv()).
      The default has been changed to not name a length-one result.

    • Lazy loading of environments now preserves attributes and locked
      status. (The locked status of bindings and active bindings is
      still not preserved; this may be addressed in the future.)

    • options("install.lock") may be set to FALSE so that
      install.packages() defaults to --no-lock installs, or (on
      Windows) to TRUE so that binary installs implement locking.

    • sort(partial = p) for large p now tries Shellsort if quicksort is
      not appropriate and so works for non-numeric atomic vectors.

    • sapply() gets a new option simplify = "array" which returns a
      “higher rank” array instead of just a matrix when FUN() returns a
      dim() length of two or more.

      replicate() has this option set by default, and vapply() now
      behaves that way internally.

    • aperm() becomes S3 generic and gets a table method which
      preserves the class.

    • merge() and as.hclust() methods for objects of class "dendrogram"
      are now provided.

    • as.POSIXlt.factor() now passes ... to the character method
      (suggestion of Joshua Ulrich).

    • The character method of as.POSIXlt() now tries to find a format
      that works for all non-NA inputs, not just the first one.

    • str() now has a method for class "Date" analogous to that for
      class "POSIXt".

    • New function file.link() to create hard links on those file
      systems (POSIX, NTFS but not FAT) that support them.

    • New Summary() group method for class "ordered" implements min(),
      max() and range() for ordered factors.

    • mostattributes<-() now consults the "dim" attribute and not the
      dim() function, making it more useful for objects (such as data
      frames) from classes with methods for dim().  It also uses
      attr<-() in preference to the generics name<-(), dim<-() and
      dimnames<-().  (Related to PR#14469.)

    • There is a new option "browserNLdisabled" to disable the use of
      an empty line (e.g. via the ‘Return’ key) as a synonym for c in
      browser() or n under debug().  (Wish of PR#14472.)

    • example() gains optional new arguments character.only and
      give.lines enabling programmatic exploration.

    • serialize() and unserialize() are no longer described as
      ‘experimental’.  The interface is now regarded as stable,
      although the serialization format may well change in future
      releases.  (serialize() has a new argument version which would
      allow the current format to be written if that happens.)

      New functions saveRDS() and readRDS() are public versions of the
      ‘internal’ functions .saveRDS() and .readRDS() made available for
      general use.  The dot-name versions remain available as several
      package authors have made use of them, despite the documentation.

      saveRDS() supports compress = "xz".

    • Many functions when called with a not-open connection will now
      ensure that the connection is left not-open in the event of
      error.  These include read.dcf(), dput(), dump(), load(),
      parse(), readBin(), readChar(), readLines(), save(), writeBin(),
      writeChar(), writeLines(), .readRDS(), .saveRDS() and
      tools::parse_Rd(), as well as functions calling these.

    • Public functions find.package() and path.package() replace the
      internal dot-name versions.

    • The default method for terms() now looks for a "terms" attribute
      if it does not find a "terms" component, and so works for model
      frames.

    • httpd() handlers receive an additional argument containing the
      full request headers as a raw vector (this can be used to parse
      cookies, multi-part forms etc.). The recommended full signature
      for handlers is therefore function(url, query, body, headers,
      ...).

    • file.edit() gains a fileEncoding argument to specify the encoding
      of the file(s).

    • The format of the HTML package listings has changed.  If there is
      more than one library tree, a table of links to libraries is
      provided at the top and bottom of the page.  Where a library
      contains more than 100 packages, an alphabetic index is given at
      the top of the section for that library.  (As a consequence,
      package names are now sorted case-insensitively whatever the
      locale.)

    • isSeekable() now returns FALSE on connections which have
      non-default encoding.  Although documented to record if ‘in
      principle’ the connection supports seeking, it seems safer to
      report FALSE when it may not work.

    • R CMD REMOVE and remove.packages() now remove file R.css when
      removing all remaining packages in a library tree.  (Related to
      the wish of PR#14475: note that this file is no longer
      installed.)

    • unzip() now has an unzip argument like zip.file.extract().  This
      allows an external unzip program to be used, which can be useful
      to access features supported by Info-ZIP's unzip version 6 which
      is now becoming more widely available.

    • There is a simple zip() function, as a wrapper for an external
      zip command.

    • bzfile() connections can now read from concatenated bzip2 files
      (including files written with bzfile(open = "a")) and files
      created by some other compressors (such as the example of
      PR#14479).

    • The primitive function c() is now of type BUILTIN.

    • plot(<dendrogram>, .., nodePar=*) now obeys an optional xpd
      specification (allowing clipping to be turned off completely).

    • nls(algorithm="port") now shares more code with nlminb(), and is
      more consistent with the other nls() algorithms in its return
      value.

    • xz has been updated to 5.0.1 (very minor bugfix release).

    • image() has gained a logical useRaster argument allowing it to
      use a bitmap raster for plotting a regular grid instead of
      polygons. This can be more efficient, but may not be supported by
      all devices. The default is FALSE.

    • list.files()/dir() gain a new argument include.dirs to include
      directories in the listing when recursive = TRUE.

    • New function list.dirs() lists all directories (even empty
      ones).

    • file.copy() now (by default) copies read/write/execute
      permissions on files, moderated by the current setting of
      Sys.umask().

    • Sys.umask() now accepts mode = NA and returns the current umask
      value (visibly) without changing it.

    • There is a ! method for classes "octmode" and "hexmode": this
      allows xor(a, b) to work if both a and b are from one of those
      classes.

    • as.raster() no longer fails for vectors or matrices containing
      NAs.

    • New hook "before.new.plot" allows functions to be run just before
      advancing the frame in plot.new, which is potentially useful for
      custom figure layout implementations.

    • Package tools has a new function compactPDF() to try to reduce
      the size of PDF files _via_ qpdf or gs.

    • tar() has a new argument extra_flags.

    • dotchart() accepts more general objects x such as 1D tables which
      can be coerced by as.numeric() to a numeric vector, with a
      warning since that might not be appropriate.

    • The previously internal function create.post() is now exported
      from utils, and the documentation for bug.report() and
      help.request() now refers to that for create.post().

      It has a new method = "mailto" on Unix-alikes similar to that on
      Windows: it invokes a default mailer via open (Mac OS X) or
      xdg-open or the default browser (elsewhere).

      The default for ccaddress is now getOption("ccaddress") which is
      by default unset: using the username as a mailing address
      nowadays rarely works as expected.

    • The default for options("mailer") is now "mailto" on all
      platforms.

    • unlink() now does tilde-expansion (like most other file
      functions).

    • file.rename() now allows vector arguments (of the same length).

    • The "glm" method for logLik() now returns an "nobs" attribute
      (which stats4::BIC() assumed it did).

      The "nls" method for logLik() gave incorrect results for zero
      weights.

    • There is a new generic function nobs() in package stats, to
      extract from model objects a suitable value for use in BIC
      calculations.  An S4 generic derived from it is defined in
      package stats4.

    • Code for S4 reference-class methods is now examined for possible
      errors in non-local assignments.

    • findClasses, getGeneric, findMethods and hasMethods are revised
      to deal consistently with the package= argument and be consistent
      with soft namespace policy for finding objects.

    • tools::Rdiff() now has the option to return not only the status
      but a character vector of observed differences (which are still
      by default sent to stdout).

    • The startup environment variables R_ENVIRON_USER, R_ENVIRON,
      R_PROFILE_USER and R_PROFILE are now treated more consistently.
      In all cases an empty value is considered to be set and will stop
      the default being used, and for the last two tilde expansion is
      performed on the file name.  (Note that setting an empty value is
      probably impossible on Windows.)

    • Using R --no-environ CMD, R --no-site-file CMD or R
      --no-init-file CMD sets environment variables so these settings
      are passed on to child R processes, notably those run by INSTALL,
      check and build. R --vanilla CMD sets these three options (but
      not --no-restore).

    • smooth.spline() is somewhat faster.  With cv=NA it allows some
      leverage computations to be skipped.

    • The internal (C) function scientific(), at the heart of R's
      format.info(x), format(x), print(x), etc, for numeric x, has been
      re-written in order to provide slightly more correct results,
      fixing PR#14491, notably in border cases including when digits >=
      16, thanks to substantial contributions (code and experiments)
      from Petr Savicky.  This affects a noticeable amount of numeric
      output from R.

    • A new function grepRaw() has been introduced for finding subsets
      of raw vectors. It supports both literal searches and regular
      expressions.

    • Package compiler is now provided as a standard package.  See
      ?compiler::compile for information on how to use the compiler.
      This package implements a byte code compiler for R: by default
      the compiler is not used in this release.  See the ‘R
      Installation and Administration Manual’ for how to compile the
      base and recommended packages.

    • Providing an exportPattern directive in a NAMESPACE file now
      causes classes to be exported according to the same pattern, for
      example the default from package.skeleton() to specify all names
      starting with a letter.  An explicit directive to
      exportClassPattern will still over-ride.

    • There is an additional marked encoding "bytes" for character
      strings.  This is intended to be used for non-ASCII strings which
      should be treated as a set of bytes, and never re-encoded as if
      they were in the encoding of the current locale: useBytes = TRUE
      is automatically selected in functions such as writeBin(),
      writeLines(), grep() and strsplit().

      Only a few character operations are supported (such as substr()).

      Printing, format() and cat() will represent non-ASCII bytes in
      such strings by a \xab escape.

    • The new function removeSource() removes the internally stored
      source from a function.

    • "srcref" attributes now include two additional line number
      values, recording the line numbers in the order they were parsed.

    • New functions have been added for source reference access:
      getSrcFilename(), getSrcDirectory(), getSrcLocation() and
      getSrcref().

    • Sys.chmod() has an extra argument use_umask which defaults to
      true and restricts the file mode by the current setting of umask.
      This means that all the R functions which manipulate
      file/directory permissions by default respect umask, notably R
      CMD INSTALL.

    • tempfile() has an extra argument fileext to create a temporary
      filename with a specified extension.  (Suggestion and initial
      implementation by Dirk Eddelbuettel.)

    • There are improvements in the way Sweave() and Stangle() handle
      non-ASCII vignette sources, especially in a UTF-8 locale: see
      ‘Writing R Extensions’ which now has a subsection on this topic.

    • factanal() now returns the rotation matrix if a rotation such as
      "promax" is used, and hence factor correlations are displayed.
      (Wish of PR#12754.)

    • The gctorture2() function provides a more refined interface to
      the GC torture process.  Environment variables R_GCTORTURE,
      R_GCTORTURE_WAIT, and R_GCTORTURE_INHIBIT_RELEASE can also be
      used to control the GC torture process.

    • file.copy(from, to) no longer regards it as an error to supply a
      zero-length from: it now simply does nothing.

    • rstandard.glm gains a type argument which can be used to request
      standardized Pearson residuals.

    • A start on a Turkish translation, thanks to Murat Alkan.

    • .libPaths() calls normalizePath(winslash = "/") on the paths:
      this helps (usually) present them in a user-friendly form and
      should detect duplicate paths accessed via different symbolic
      links.
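
A few of the additions listed above can be tried together in a short session. This is only a sketch: DEMO_A and DEMO_B are hypothetical environment variables set just for the example, and the built-in cars data set stands in for real data.

```r
# saveRDS()/readRDS() with tempfile(fileext = ): round-trip one object.
path <- tempfile(fileext = ".rds")   # temporary file name ending in ".rds"
saveRDS(head(cars), path, compress = "xz")
restored <- readRDS(path)
unlink(path)

# sapply(simplify = "array"): stack 2 x 2 results into a 2 x 2 x 3 array.
stacked <- sapply(1:3, function(i) diag(i, 2), simplify = "array")

# Sys.getenv(names = FALSE): a plain character vector without names.
Sys.setenv(DEMO_A = "1", DEMO_B = "2")   # hypothetical variables
values <- Sys.getenv(c("DEMO_A", "DEMO_B"), names = FALSE)
Sys.unsetenv(c("DEMO_A", "DEMO_B"))

# grepRaw(): offsets of a literal byte pattern within a raw vector.
offsets <- grepRaw("bc", charToRaw("abcabc"), fixed = TRUE, all = TRUE)

# nobs(): number of observations used by a fitted model.
fit <- lm(dist ~ speed, data = cars)
n_used <- nobs(fit)

# IQR(type = ): the type is passed on to quantile().
iqr6 <- IQR(cars$speed, type = 6)
```

Unlike save()/load(), saveRDS() stores a single object without its name, so the caller chooses the name on reading it back.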

  SWEAVE CHANGES:

    • Sweave() has options to produce PNG and JPEG figures, and to use
      a custom function to open a graphics device (see ?RweaveLatex).
      (Based in part on the contribution of PR#14418.)

    • The default for Sweave() is to produce only PDF figures (rather
      than both EPS and PDF).

    • Environment variable SWEAVE_OPTIONS can be used to supply
      defaults for existing or new options to be applied after the
      Sweave driver setup has been run.

    • The Sweave manual is now included as a vignette in the utils
      package.

    • Sweave() handles keep.source=TRUE much better: it could duplicate
      some lines and omit comments. (Reported by John Maindonald and
      others.)

  C-LEVEL FACILITIES:

    • Because they use a C99 interface which a C++ compiler is not
      required to support, Rvprintf and REvprintf are only defined by
      R_ext/Print.h in C++ code if the macro R_USE_C99_IN_CXX is
      defined when it is included.

    • pythag duplicated the C99 function hypot.  It is no longer
      provided, but is used as a substitute for hypot in the very
      unlikely event that the latter is not available.

    • R_inspect(obj) and R_inspect3(obj, deep, pvec) are (hidden)
      C-level entry points to the internal inspect function and can be
      used for C-level debugging (e.g., in conjunction with the p
      command in gdb).

    • Compiling R with --enable-strict-barrier now also enables
      additional checking for use of unprotected objects. In
      combination with gctorture() or gctorture2() and a C-level
      debugger this can be useful for tracking down memory protection
      issues.

  UTILITIES:

    • R CMD Rdiff is now implemented in R on Unix-alikes (as it has
      been on Windows since R 2.12.0).

    • R CMD build no longer does any cleaning in the supplied package
      directory: all the cleaning is done in the copy.

      It has a new option --install-args to pass arguments to R CMD
      INSTALL for --build (but not when installing to rebuild
      vignettes).

      There is a new option, --resave-data, to call
      tools::resaveRdaFiles() on the data directory, to compress
      tabular files (.tab, .csv etc) and to convert .R files to .rda
      files.  The default, --resave-data=gzip, is to do so in a way
      compatible even with years-old versions of R, but better
      compression is given by --resave-data=best, requiring R >=
      2.10.0.

      It now adds a datalist file for data directories of more than
      1Mb.

      Patterns in .Rbuildignore are now also matched against all
      directory names (including those of empty directories).

      There is a new option, --compact-vignettes, to try reducing the
      size of PDF files in the inst/doc directory.  Currently this
      tries qpdf: other options may be used in future.

      When re-building vignettes and an inst/doc/Makefile file is found,
      make clean is run if the makefile has a clean: target.

      After re-building vignettes the default clean-up operation will
      remove any directories (and not just files) created during the
      process: e.g. one package created a .R_cache directory.

      Empty directories are now removed unless the option
      --keep-empty-dirs is given (and a few packages do deliberately
      include empty directories).

      If there is a field BuildVignettes in the package DESCRIPTION
      file with a false value, re-building the vignettes is skipped.

    • R CMD check now also checks for filenames that are
      case-insensitive matches to Windows' reserved file names with
      extensions, such as nul.Rd, as these have caused problems on some
      Windows systems.

      It checks for inefficiently saved data/*.rda and data/*.RData
      files, and reports on those larger than 100Kb.  A more complete
      check (including of the type of compression, but potentially much
      slower) can be switched on by setting environment variable
      _R_CHECK_COMPACT_DATA2_ to TRUE.

      The types of files in the data directory are now checked, as
      packages are _still_ misusing it for non-R data files.

      It now extracts and runs the R code for each vignette in a
      separate directory and R process: this is done in the package's
      declared encoding.  Rather than call tools::checkVignettes(), it
      calls tools::buildVignettes() to see if the vignettes can be
      re-built as they would be by R CMD build.  Option --use-valgrind
      now applies only to these runs, and not when running code to
      rebuild the vignettes.  This version does a much better job of
      suppressing output from successful vignette tests.

      The 00check.log file is a more complete record of what is output
      to stdout: in particular it contains more details of the tests.

      It now checks all syntactically valid Rd usage entries, and warns
      about assignments (unless these give the usage of replacement
      functions).

      .tar.xz compressed tarballs are now allowed, if tar supports them
      (and setting environment variable TAR to internal ensures so on
      all platforms).

    • R CMD check now warns if it finds inst/doc/makefile, and R CMD
      build renames such a file to inst/doc/Makefile.

  INSTALLATION:

    • Installing R no longer tries to find perl, and R CMD no longer
      tries to substitute a full path for awk nor perl - this was a
      legacy from the days when they were used by R itself.  Because a
      couple of packages do use awk, it is set as the make (rather than
      environment) variable AWK.

    • make check will now fail if there are differences from the
      reference output when testing package examples and if environment
      variable R_STRICT_PACKAGE_CHECK is set to a true value.

    • The C99 double complex type is now required.

      The C99 complex trigonometric functions (such as csin) are not
      currently required (FreeBSD lacks most of them): substitutes are
      used if they are missing.

    • The C99 system call va_copy is now required.

    • If environment variable R_LD_LIBRARY_PATH is set during
      configuration (for example in config.site) it is used unchanged
      in file etc/ldpaths rather than being appended to.

    • configure looks for support for OpenMP and if found compiles R
      with appropriate flags and also makes them available for use in
      packages: see ‘Writing R Extensions’.

      This is currently experimental, and is only used in R with a
      single thread for colSums() and colMeans().  Expect it to be more
      widely used in later versions of R.

      This can be disabled by the --disable-openmp flag.

  PACKAGE INSTALLATION:

    • R CMD INSTALL --clean now removes copies of a src directory which
      are created when multiple sub-architectures are in use.
      (Following a comment from Berwin Turlach.)

    • File R.css is now installed on a per-package basis (in the
      package's html directory) rather than in each library tree, and
      this is used for all the HTML pages in the package.  This helps
      when installing packages with static HTML pages for use on a
      webserver.  It will also allow future versions of R to use
      different stylesheets for the packages they install.

    • A top-level file .Rinstignore in the package sources can list (in
      the same way as .Rbuildignore) files under inst that should not
      be installed.  (Why should there be any such files?  Because all
      the files needed to re-build vignettes need to be under inst/doc,
      but they may not need to be installed.)

    • R CMD INSTALL has a new option --compact-docs to compact any PDFs
      under the inst/doc directory.  Currently this uses qpdf, which
      must be installed (see ‘Writing R Extensions’).

    • There is a new option --lock which can be used to cancel the
      effect of --no-lock or --pkglock earlier on the command line.

    • Option --pkglock can now be used with more than one package, and
      is now the default if only one package is specified.

    • Argument lock of install.packages() can now be used for Mac binary
      installs as well as for Windows ones.  The value "pkglock" is now
      accepted, as well as TRUE and FALSE (the default).

    • There is a new option --no-clean-on-error for R CMD INSTALL to
      retain a partially installed package for forensic analysis.

    • Packages with names ending in . are not portable since Windows
      does not work correctly with such directory names.  This is now
      warned about in R CMD check, and will not be allowed in R 2.14.x.

    • The vignette indices are more comprehensive (in the style of
      browseVignettes()).

  DEPRECATED & DEFUNCT:

    • require(save = TRUE) is defunct, and use of the save argument is
      deprecated.

    • R CMD check --no-latex is defunct: use --no-manual instead.

    • R CMD Sd2Rd is defunct.

    • The gamma argument to hsv(), rainbow(), and rgb2hsv() is
      deprecated and no longer has any effect.

    • The previous options for R CMD build --binary (--auto-zip,
      --use-zip-data and --no-docs) are deprecated (or defunct): use
      the new option --install-args instead.

    • When a character value is used for the EXPR argument in switch(),
      only a single unnamed alternative value is now allowed.

    • The wrapper utils::link.html.help() is no longer available.

    • Zip-ing data sets in packages (and hence R CMD INSTALL options
      --use-zip-data and --auto-zip, as well as the ZipData: yes field
      in a DESCRIPTION file) is defunct.

      Installed packages with zip-ed data sets can still be used, but a
      warning that they should be re-installed will be given.

    • The ‘experimental’ alternative specification of a name space via
      .Export() etc is now defunct.

    • The option --unsafe to R CMD INSTALL is deprecated: use the
      identical option --no-lock instead.

    • The entry point pythag in Rmath.h is deprecated in favour of the
      C99 function hypot.  A wrapper for hypot is provided for R 2.13.x
      only.

    • Direct access to the "source" attribute of functions is
      deprecated; use deparse(fn, control="useSource") to access it,
      and removeSource(fn) to remove it.

    • R CMD build --binary is now formally deprecated: R CMD INSTALL
      --build has long been the preferred alternative.

    • Single-character package names are deprecated (and R is already
      disallowed to avoid confusion in Depends: fields).
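
The replacement for direct access to the "source" attribute, mentioned in the list above, can be sketched as follows. The function square and its source text are hypothetical; parsing with keep.source = TRUE is used here so that the example does not depend on the session's keep.source option.

```r
# Parse with keep.source = TRUE so the function carries source references.
src <- "function(x) {\n  # multiply x by itself\n  x * x\n}"
square <- eval(parse(text = src, keep.source = TRUE))

# Supported way to read the stored source (the comment is included).
with_source <- deparse(square, control = "useSource")

# Supported way to drop it: deparsing then regenerates the code from the
# function body, so the original comment is gone.
stripped <- removeSource(square)
without_source <- deparse(stripped, control = "useSource")
```

Both versions of the function still compute the same values; only the stored source text differs.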

  BUG FIXES:

    • drop.terms and the [ method for class "terms" no longer add back
      an intercept.  (Reported by Niels Hansen.)

    • aggregate preserves the class of a column (e.g. a date) under
      some circumstances where it discarded the class previously.

    • p.adjust() now always returns a vector result, as documented.  In
      previous versions it copied attributes (such as dimensions) from
      the p argument: now it only copies names.

    • On PDF and PostScript devices, a line width of zero was recorded
      verbatim and this caused problems for some viewers (a very thin
      line combined with a non-solid line dash pattern could also cause
      a problem).  On these devices, the line width is now limited at
      0.01 and for very thin lines with complex dash patterns the
      device may force the line dash pattern to be solid.  (Reported by
      Jari Oksanen.)

    • The str() method for class "POSIXt" now gives sensible output for
      0-length input.

    • The one- and two-argument complex maths functions failed to warn
      if NAs were generated (as their numeric analogues do).

    • Added .requireCachedGenerics to the dont.mind list for library()
      to avoid warnings about duplicates.

    • $<-.data.frame messed with the class attribute, breaking any S4
      subclass.  The S4 data.frame class now has its own $<- method,
      and turns dispatch on for this primitive.

    • Map() did not look up a character argument f in the correct
      frame, thanks to lazy evaluation.  (PR#14495)

    • file.copy() did not tilde-expand from and to when to was a
      directory.  (PR#14507)

    • It was possible (but very rare) for the loading test in R CMD
      INSTALL to crash a child R process and so leave around a lock
      directory and a partially installed package.  That test is now
      done in a separate process.

    • plot(<formula>, data=<matrix>,..) now works in more cases;
      similarly for points(), lines() and text().

    • edit.default() contained a manual dispatch for matrices (the
      "matrix" class didn't really exist when it was written).  This
      caused an infinite recursion in the no-GUI case and has now been
      removed.

    • data.frame(check.rows = TRUE) sometimes worked when it should
      have detected an error.  (PR#14530)

    • scan(sep= , strip.white=TRUE) sometimes stripped trailing spaces
      from within quoted strings.  (The real bug in PR#14522.)

    • The rank-correlation methods for cor() and cov() with use =
      "complete.obs" computed the ranks before removing missing values,
      whereas the documentation implied incomplete cases were removed
      first.  (PR#14488)

      They also failed for 1-row matrices.

    • The perpendicular adjustment used in placing text and expressions
      in the margins of plots was not scaled by par("mex"). (Part of
      PR#14532.)

    • Quartz Cocoa device now catches any Cocoa exceptions that occur
      during the creation of the device window to prevent crashes.  It
      also imposes a limit of 144 ft^2 on the area used by a window to
      catch user errors (unit misinterpretation) early.

    • The browser (invoked by debug(), browser() or otherwise) would
      display attributes such as "wholeSrcref" that were intended for
      internal use only.

    • R's internal filename completion now properly handles filenames
      with spaces in them even when the readline library is used.  This
      resolves PR#14452 provided the internal filename completion is
      used (e.g., by setting rc.settings(files = TRUE)).

    • Inside uniroot(f, ...), -Inf function values are now replaced by
      a maximally *negative* value.

    • rowsum() could silently over/underflow on integer inputs
      (reported by Bill Dunlap).

    • as.matrix() did not handle "dist" objects with zero rows.
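
Two of the fixes listed above are easy to check at the prompt. The following is only a small sketch with made-up numbers (the vectors and the matrix are illustrative assumptions, not taken from the bug reports): it checks that p.adjust() drops dimensions but keeps names, and that the rank-based cor() with use = "complete.obs" agrees with removing incomplete cases first.

```r
# p.adjust() on a matrix: the result should be a plain vector (no dim attribute)
p <- matrix(c(0.01, 0.02, 0.03, 0.04), nrow = 2)   # illustrative p-values
adj <- p.adjust(p, method = "bonferroni")
has_dim <- !is.null(dim(adj))

# names, on the other hand, are carried over
named_p <- c(a = 0.01, b = 0.04)
adj_names <- names(p.adjust(named_p, method = "holm"))

# Spearman correlation: incomplete cases are removed before ranking
x <- c(1, 2, NA, 4, 5, 7)
y <- c(2, 1, 9, 3, 6, 5)
ok <- complete.cases(x, y)
r_builtin <- cor(x, y, method = "spearman", use = "complete.obs")
r_manual  <- cor(x[ok], y[ok], method = "spearman")
same_result <- isTRUE(all.equal(r_builtin, r_manual))
```

On R 2.13.0 or later one would expect has_dim to be FALSE and same_result to be TRUE.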

CHANGES IN R VERSION 2.12.2 patched:

  NEW FEATURES:

    • max() and min() work harder to ensure that NA has precedence over
      NaN, so e.g. min(NaN, NA) is NA.  (This was not previously
      documented except for within a single numeric vector, where
      compiler optimizations often defeated the code.)
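
A quick way to see the precedence rule described above is to tell NA and NaN apart with is.nan(), since is.na() is TRUE for both. This is just an illustrative check, assuming a version of R that includes this change.

```r
# NA should win over NaN regardless of argument order
a <- min(NaN, NA_real_)
b <- max(NA_real_, NaN)

# is.na() is TRUE for both NA and NaN, so is.nan() is what distinguishes them
a_is_plain_na <- is.na(a) && !is.nan(a)
b_is_plain_na <- is.na(b) && !is.nan(b)
```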

  BUG FIXES:

    • A change to the C function R_tryEval had broken error messages in
      S4 method selection; the error message is now printed.

    • PDF output with a non-RGB color model used RGB for the line
      stroke color.  (PR#14511)

    • stats4::BIC() assumed without checking that an object of class
      "logLik" has an "nobs" attribute: glm() fits did not and so BIC()
      failed for them.

    • In some circumstances a one-sided mantelhaen.test() reported the
      p-value for the wrong tail.  (PR#14514)

    • Passing the invalid value lty = NULL to axis() sent an invalid
      value to the graphics device, and might cause the device to
      segfault.

    • Sweave() with concordance=TRUE could lead to invalid PDF files;
      Sweave.sty has been updated to avoid this.

    • Non-ASCII characters in the titles of help pages were not
      rendered properly in some locales, and could cause errors or
      warnings.

    • checkRd() gave a spurious error if the \href macro was used.

 

 

Using Views in R and comparing functions across multiple packages


R now has 2,923 available packages.

With so many packages, finding the right one and comparing functions that perform the same analytical task across different packages is tedious. It usually means searching manually (reading the PDF help files and vignettes of several packages) or sending an email to the R-help list.

CRAN Task Views are a better way of managing your analytical requirements: instead of wading through the full package list, you browse packages grouped by topic (see the Graphics view below).

CRAN Task Views allow you to browse packages by topic and provide tools to automatically install all packages for special areas of interest. Currently, 28 views are available. http://cran.r-project.org/web/views/

Bayesian Bayesian Inference
ChemPhys Chemometrics and Computational Physics
ClinicalTrials Clinical Trial Design, Monitoring, and Analysis
Cluster Cluster Analysis & Finite Mixture Models
Distributions Probability Distributions
Econometrics Computational Econometrics
Environmetrics Analysis of Ecological and Environmental Data
ExperimentalDesign Design of Experiments (DoE) & Analysis of Experimental Data
Finance Empirical Finance
Genetics Statistical Genetics
Graphics Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization
gR gRaphical Models in R
HighPerformanceComputing High-Performance and Parallel Computing with R
MachineLearning Machine Learning & Statistical Learning
MedicalImaging Medical Image Analysis
Multivariate Multivariate Statistics
NaturalLanguageProcessing Natural Language Processing
OfficialStatistics Official Statistics & Survey Methodology
Optimization Optimization and Mathematical Programming
Pharmacokinetics Analysis of Pharmacokinetic Data
Phylogenetics Phylogenetics, Especially Comparative Methods
Psychometrics Psychometric Models and Methods
ReproducibleResearch Reproducible Research
Robust Robust Statistical Methods
SocialSciences Statistics for the Social Sciences
Spatial Analysis of Spatial Data
Survival Survival Analysis
TimeSeries Time Series Analysis

To automatically install these views, the ctv package needs to be installed, e.g., via

install.packages("ctv")
library("ctv")


and then the views can be installed via install.views or update.views (which first assesses which of the packages are already installed and up-to-date), e.g.,

install.views("Econometrics")
update.views("Econometrics")

CRAN Task View: Graphic Displays & Dynamic Graphics & Graphic Devices & Visualization

Maintainer: Nicholas Lewin-Koh
Contact: nikko at hailmail.net
Version: 2009-10-28

R is rich with facilities for creating and developing interesting graphics. Base R contains functionality for many plot types including coplots, mosaic plots, biplots, and the list goes on. There are devices such as postscript, png, jpeg and pdf for outputting graphics as well as device drivers for all platforms running R. lattice and grid are supplied with R’s recommended packages and are included in every binary distribution. lattice is an R implementation of William Cleveland’s trellis graphics, while grid defines a much more flexible graphics environment than the base R graphics.

R’s base graphics are implemented in the same way as in the S3 system developed by Becker, Chambers, and Wilks. There is a static device, which is treated as a static canvas and objects are drawn on the device through R plotting commands. The device has a set of global parameters such as margins and layouts which can be manipulated by the user using par() commands. The R graphics engine does not maintain a user visible graphics list, and there is no system of double buffering, so objects cannot be easily edited without redrawing a whole plot. This situation may change in R 2.7.x, where developers are working on double buffering for R devices. Even so, the base R graphics can produce many plots with extremely fine graphics in many specialized instances.

One can quickly run into trouble with R’s base graphics system if one wants to design complex layouts where scaling is maintained properly on resizing, nested graphs are desired or more interactivity is needed. grid was designed by Paul Murrell to overcome some of these limitations and as a result packages like lattice, ggplot2, vcd or hexbin (on Bioconductor) use grid for the underlying primitives. When using plots designed with grid one needs to keep in mind that grid is based on a system of viewports and graphic objects. To add objects one needs to use grid commands, e.g., grid.polygon() rather than polygon(). Also grid maintains a stack of viewports from the device and one needs to make sure the desired viewport is at the top of the stack. There is a great deal of explanatory documentation included with grid as vignettes.
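
To make the viewport idea concrete, here is a minimal sketch (not part of the task view itself). The viewport name "inner", the sizes and the triangle coordinates are arbitrary choices for illustration, and pdf(NULL) is used only so that the example runs without opening a window or writing a file.

```r
library(grid)

pdf(NULL)        # throwaway device: nothing is written to disk
grid.newpage()

# Push a viewport covering the central 60% of the page onto the stack
vp <- viewport(x = 0.5, y = 0.5, width = 0.6, height = 0.6, name = "inner")
pushViewport(vp)

# Drawing now happens relative to the viewport at the top of the stack
grid.rect(gp = gpar(lty = "dashed"))
grid.polygon(x = c(0.1, 0.5, 0.9), y = c(0.1, 0.9, 0.1),
             gp = gpar(fill = "grey80"))   # grid.polygon(), not polygon()

top_name <- current.viewport()$name        # name of the viewport on top

popViewport()                              # return to the parent viewport
invisible(dev.off())
```

Anything drawn between pushViewport() and popViewport() is placed in the coordinate system of the pushed viewport, which is why it matters which viewport is on top of the stack when a grid command is issued.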

The graphics packages in R can be organized roughly into the following topics, which range from the more user oriented at the top to the more developer oriented at the bottom. The categories are not mutually exclusive but are for the convenience of presentation:

  • Plotting : Enhancements for specialized plots can be found in plotrix for polar plotting, vcd for categorical data, hexbin (on Bioconductor) for hexagon binning, gclus for ordering plots and gplots for some plotting enhancements. Some specialized graphs, like Chernoff faces, are implemented in aplpack, which also has a nice implementation of Tukey’s bag plot. For 3D plots, lattice, scatterplot3d and misc3d provide a selection of plots for different kinds of 3D plotting. scatterplot3d is based on R’s base graphics system, while misc3d is based on rgl. The package onion for visualizing quaternions and octonions is well suited to display 3D graphics based on derived meshes.
  • Graphic Applications : This area is not much different from the plotting section except that these packages have tools that may not be for display, but can aid in creating effective displays. Also included are packages with more esoteric plotting methods. For specific subject areas, like maps or clustering, the excellent task views contributed by other dedicated useRs are a good place to start.
    • Effect ordering : The gclus package focuses on the ordering of graphs to accentuate cluster structure or natural ordering in the data. While not for graphics directly cba and seriation have functions for creating 1 dimensional orderings from higher dimensional criteria. For ordering an array of displays, biclust can be useful.
    • Large Data Sets : Large data sets can present very different challenges from moderate and small datasets. Aside from overplotting, rendering 1,000,000 points can tax even modern GPUs. For univariate data, lvplot produces letter value boxplots which alleviate some of the problems that standard boxplots exhibit for large data sets. For bivariate data ash can produce a bivariate smoothed histogram very quickly, and hexbin, on Bioconductor, can bin bivariate data onto a hexagonal lattice, the advantage being that the irregular lines and orientation of hexagons do not create linear artifacts. For multivariate data, hexbin can be used to create a scatterplot matrix, combined with lattice. An alternative is to use scagnostics to produce a scatterplot matrix of “data about the data”, and look for interesting combinations of variables.
    • Trees and Graphs : ape and ade4 have functions for plotting phylogenetic trees, which can be used for plotting dendrograms from clustering procedures. While these packages produce decent graphics, they do not use sophisticated algorithms for node placement, so may not be useful for very large trees. igraph has the Reingold-Tilford algorithm implemented and is useful for plotting larger trees. diagram has facilities for flow diagrams and simple graphs. For more sophisticated graphs Rgraphviz and igraph have functions for plotting and layout, especially useful for representing large networks.
  • Graphics Systems : lattice is built on top of the grid graphics system and is an R implementation of William Cleveland’s trellis system for S-PLUS. lattice allows for building many types of plots with sophisticated layouts based on conditioning. ggplot2 is an R implementation of the system described in “A Grammar of Graphics” by Leland Wilkinson. Like lattice, ggplot2 (also built on top of grid) assists in trellis-like graphics, but allows for much more. Since it is built on the idea of a semantics for graphics there is much more emphasis on reshaping data, transformation, and assembling the elements of a plot.
  • Devices : Whereas grid is built on top of the R graphics engine, many in the R community have found the R graphics engine somewhat inflexible and have written separate device drivers that either emphasize interactivity or plotting in various graphics formats. R base supplies devices for PostScript, PDF, JPEG and other formats. Devices on CRAN include cairoDevice, which is a device based on libcairo, which can actually render to many device types. The cairo device is designed to work with RGtk2, which is an interface to the Gimp Tool Kit, similar to pyGTK2. GDD provides device drivers for several bitmap formats, including GIF and BMP. RSvgDevice is an SVG device driver and interfaces well with vector drawing programs, or R web development packages, such as Rpad. When SVG devices are used for web display, developers should be aware that Internet Explorer does not support SVG, but has its own standard. Trust Microsoft. rgl provides a device driver based on OpenGL, and is good for 3D and interactive development. Lastly, the Augsburg group supplies a set of packages that includes a Java-based device, JavaGD.
  • Colors : The package colorspace provides a set of functions for transforming between color spaces and mixcolor() for mixing colors within a color space. Based on the HCL colors provided in colorspace, vcd provides a set of functions for choosing color palettes suitable for coding categorical variables (rainbow_hcl()) and numerical information (sequential_hcl(), diverge_hcl()). Similar types of palettes are provided in RColorBrewer, and dichromat is focused on palettes for color-impaired viewers.
  • Interactive Graphics : There are several efforts to implement interactive graphics systems that interface well with R. In an interactive system the user can interactively query the graphics on the screen with the mouse, or a moveable brush to zoom, pan and query on the device as well as link with other views of the data. rggobi embeds the GGobi interactive graphics system within R, so that one can display a data frame or several in GGobi directly from R. The package has functions to support longitudinal data, and graphs using GGobi’s edge set functionality. The RoSuDA repository maintained and developed by the University of Augsburg group has two packages, iplots and iwidgets as well as their Java development environment including a Java device, JavaGD. Their interactive graphics tools contain functions for alpha blending, which produces darker shading around areas with more data. This is exceptionally useful for parallel coordinate plots where many lines can quickly obscure patterns. playwith has facilities for building interactive versions of R graphics using the cairoDevice and RGtk2. Lastly, the rgl package has mechanisms for interactive manipulation of plots, especially 3D rotations and surfaces.
  • Development : For development of specialized graphics packages in R, grid should probably be the first consideration for any new plot type. rgl has better tools for 3D graphics, since the device is interactive, though it can be slow. An alternative is to use Java and the Java device in the RoSuDA packages, though Java has its own drawbacks. For porting plotting code to grid, using the package gridBase presents a nice intermediate step to embed base graphics in grid graphics and vice versa.

R Commander Plugins-20 and growing!

R Commander Extensions: Enhancing a Statistical Graphical User Interface by extending menus to statistical packages

R Commander (see the paper by Prof. J. Fox at http://www.jstatsoft.org/v14/i09/paper ) is a well-known and established graphical user interface to the R analytical environment.
While the original GUI was created for a basic statistics course, the addition of extensions (or plug-ins, see http://www.r-project.org/doc/Rnews/Rnews_2007-3.pdf ) has greatly widened the possible use and scope of this software. Here we give a list of all known R Commander plug-ins and their uses, along with brief comments.

  1. DoE – http://cran.r-project.org/web/packages/RcmdrPlugin.DoE/RcmdrPlugin.DoE.pdf
  2. doex
  3. EHESsampling
  4. epack- http://cran.r-project.org/web/packages/RcmdrPlugin.epack/RcmdrPlugin.epack.pdf
  5. Export- http://cran.r-project.org/web/packages/RcmdrPlugin.Export/RcmdrPlugin.Export.pdf
  6. FactoMineR
  7. HH
  8. IPSUR
  9. MAc- http://cran.r-project.org/web/packages/RcmdrPlugin.MAc/RcmdrPlugin.MAc.pdf
  10. MAd
  11. orloca
  12. PT
  13. qcc- http://cran.r-project.org/web/packages/RcmdrPlugin.qcc/RcmdrPlugin.qcc.pdf and http://cran.r-project.org/web/packages/qcc/qcc.pdf
  14. qual
  15. SensoMineR
  16. SLC
  17. sos
  18. survival-http://cran.r-project.org/web/packages/RcmdrPlugin.survival/RcmdrPlugin.survival.pdf
  19. SurvivalT
  20. TeachingDemos

Note that the naming convention for the plug-ins above is always the prefix “RcmdrPlugin.” followed by one of the names above.
Also, a plug-in must already be installed locally to be visible in R Commander’s load-plugin list, and R Commander loads the plug-in only after restarting. Hence it is advisable to load all the R Commander plug-ins you need at the beginning of the analysis session.
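
Because of this naming convention, the full package names can be built from the short names. The sketch below is only an illustration: the selection of plug-ins is arbitrary, and the install step is left commented out because it needs a CRAN mirror and network access.

```r
# Short names as listed above (an arbitrary selection for illustration)
plugins <- c("DoE", "survival", "qcc", "epack", "Export", "MAc")

# Full package names follow the "RcmdrPlugin." prefix convention
pkgs <- paste("RcmdrPlugin.", plugins, sep = "")

# Which of them are not yet installed locally?
not_installed <- setdiff(pkgs, rownames(installed.packages()))

# Install the missing ones before starting R Commander:
# if (length(not_installed) > 0) install.packages(not_installed)
```

Installing everything up front avoids having to restart R Commander several times in the middle of a session.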

The notable plug-ins are:
1) DoE for Design of Experiments-
Full factorial designs, orthogonal main effects designs, regular and non-regular 2-level fractional
factorial designs, central composite and Box-Behnken designs, latin hypercube samples, and simple D-optimal designs can currently be generated from the GUI. Extensions to cover further latin hypercube designs as well as more advanced D-optimal designs (with blocking) are planned for the future.
2) Survival- This package provides an R Commander plug-in for the survival package, with dialogs for Cox models, parametric survival regression models, estimation of survival curves, and testing for differences in survival curves, along with data-management facilities and a variety of tests, diagnostics and graphs.
3) qcc- GUI for Shewhart quality control charts for continuous, attribute and count data; Cusum and EWMA charts; operating characteristic curves; process capability analysis; Pareto charts and cause-and-effect charts; multivariate control charts.
4) epack- An Rcmdr “plug-in” based on the time series functions. It also depends on packages such as tseries, abind, MASS, xts and forecast. It covers Log-Exceptions garch and the following models: Arima, garch, HoltWinters.
5) Export- The package helps users to graphically export Rcmdr output to LaTeX or HTML code,
via xtable() or Hmisc::latex(). The plug-in was originally intended to facilitate exporting Rcmdr
output to formats other than ASCII text and to provide R novices with an easy-to-use,
easy-to-access reference on exporting R objects to formats suited for printed output. The
package documentation contains several pointers on creating reports, either by using
conventional word processors or LaTeX/LyX.
6) MAc- This is an R-Commander plug-in for the MAc package (Meta-Analysis with
Correlations). This package enables the user to conduct a meta-analysis in a menu-driven,
graphical user interface environment (e.g., SPSS), while having the full statistical capabilities of
R and the MAc package. The MAc package itself contains a variety of useful functions for
conducting a research synthesis with correlational data. One of the unique features of the MAc
package is in its integration of user-friendly functions to complete the majority of statistical steps
involved in a meta-analysis with correlations. It uses recommended procedures as described in
The Handbook of Research Synthesis and Meta-Analysis (Cooper, Hedges, & Valentine, 2009).

Querying the help system with ??Rcmdrplugins returns the following entries, which can be quite overwhelming given that about 20 plug-ins are now available:

RcmdrPlugin.DoE::DoEGlossary
Glossary for DoE terminology as used in
RcmdrPlugin.DoE
RcmdrPlugin.DoE::Menu.linearModelDesign
RcmdrPlugin.DoE Linear Model Dialog for
experimental data
RcmdrPlugin.DoE::Menu.rsm
RcmdrPlugin.DoE response surface model Dialog
for experimental data
RcmdrPlugin.DoE::RcmdrPlugin.DoE-package
R-Commander plugin package that implements
design of experiments facilities from packages
DoE.base, FrF2 and DoE.wrapper into the
R-Commander
RcmdrPlugin.DoE::RcmdrPlugin.DoEUndocumentedFunctions
Functions used in menus
RcmdrPlugin.doex::ranblockAnova
Internal RcmdrPlugin.doex objects
RcmdrPlugin.doex::RcmdrPlugin.doex-package
Install the DOEX Rcmdr Plug-In
RcmdrPlugin.EHESsampling::OpenSampling1
Internal functions for menu system of
RcmdrPlugin.EHESsampling
RcmdrPlugin.EHESsampling::RcmdrPlugin.EHESsampling-package
Help with EHES sampling
RcmdrPlugin.Export::RcmdrPlugin.Export-package
Graphically export objects to LaTeX or HTML
RcmdrPlugin.FactoMineR::defmacro
Internal RcmdrPlugin.FactoMineR objects
RcmdrPlugin.FactoMineR::RcmdrPlugin.FactoMineR
Graphical User Interface for FactoMineR
RcmdrPlugin.IPSUR::IPSUR-package
An IPSUR Plugin for the R Commander
RcmdrPlugin.MAc::RcmdrPlugin.MAc-package
Meta-Analysis with Correlations (MAc) Rcmdr
Plug-in
RcmdrPlugin.MAd::RcmdrPlugin.MAd-package
Meta-Analysis with Mean Differences (MAd) Rcmdr
Plug-in
RcmdrPlugin.orloca::activeDataSetLocaP
RcmdrPlugin.orloca: A GUI for orloca-package
(internal functions)
RcmdrPlugin.orloca::RcmdrPlugin.orloca-package
RcmdrPlugin.orloca: A GUI for orloca-package
RcmdrPlugin.orloca::RcmdrPlugin.orloca.es
RcmdrPlugin.orloca.es: Una interfaz grafica
para el paquete orloca
RcmdrPlugin.qcc::RcmdrPlugin.qcc-package
Install the Demos Rcmdr Plug-In
RcmdrPlugin.qual::xbara
Internal RcmdrPlugin.qual objects
RcmdrPlugin.qual::RcmdrPlugin.qual-package
Install the quality Rcmdr Plug-In
RcmdrPlugin.SensoMineR::defmacro
Internal RcmdrPlugin.SensoMineR objects
RcmdrPlugin.SensoMineR::RcmdrPlugin.SensoMineR
Graphical User Interface for SensoMineR
RcmdrPlugin.SLC::Rcmdr.help.RcmdrPlugin.SLC
RcmdrPlugin.SLC: A GUI for slc-package
(internal functions)
RcmdrPlugin.SLC::RcmdrPlugin.SLC-package
RcmdrPlugin.SLC: A GUI for SLC R package
RcmdrPlugin.sos::RcmdrPlugin.sos-package
Efficiently search R Help pages
RcmdrPlugin.steepness::Rcmdr.help.RcmdrPlugin.steepness
RcmdrPlugin.steepness: A GUI for
steepness-package (internal functions)
RcmdrPlugin.steepness::RcmdrPlugin.steepness
RcmdrPlugin.steepness: A GUI for steepness R
package
RcmdrPlugin.survival::allVarsClusters
Internal RcmdrPlugin.survival Objects
RcmdrPlugin.survival::RcmdrPlugin.survival-package
Rcmdr Plug-In Package for the survival Package
RcmdrPlugin.TeachingDemos::RcmdrPlugin.TeachingDemos-package
Install the Demos Rcmdr Plug-In

 

GrapheR


GrapheR is a Graphical User Interface created for simple graphs.

Depends: R (>= 2.10.0), tcltk, mgcv
Description: GrapheR is a multiplatform user interface for drawing highly customizable graphs in R. It aims to be a valuable help to quickly draw publishable graphs without any knowledge of R commands. Six kinds of graphs are available: histogram, box-and-whisker plot, bar plot, pie chart, curve and scatter plot.
License: GPL-2
LazyLoad: yes
Packaged: 2011-01-24 17:47:17 UTC; Maxime
Repository: CRAN
Date/Publication: 2011-01-24 18:41:47

More information about GrapheR at CRAN

Advantages of using GrapheR

  • It is bilingual (English and French) and can import data from text and CSV files.
  • It is intended to let even non-users of R make simple types of graphs.
  • The user interface is quite cleanly designed. It is thus aimed at being a data visualization GUI, but at a more basic level than Deducer.
  • It is easy to rename axes and graph titles, as well as to use sliders for changing line thickness and color.

Disadvantages of using GrapheR

  • Lack of documentation or help. In particular, tooltips on mouseover would help for some options.
  • Some of the terms, like abscissa or ordinate axis, may not be easily understood by a business user.
  • Default values of color are quite plain (black font on white background).
  • It can flood the terminal with lots of repetitive warnings (although the warnings() function limits the display to 50).
  • Some axis names could be auto-suggested based on which variable is being chosen for that axis.
  • The package name GrapheR is close to that of Grapher, a graphing calculator application in Mac OS, which can hinder search engine results.

Using GrapheR

  • Data input: it can be customized for CSV and text files.
  • GrapheR gives information on loaded variables (numeric versus factors).
  • It asks you to choose the type of graph.
  • It then asks for the usual graph inputs. Note that colors can be customized, and the number of graphs per window can also be easily customized.
  • Graph is ready for publication



Chapman/Hall announces new series on R

R authors now get more choice and variety:
http://www.mail-archive.com/r-help@r-project.org/msg122965.html
We are pleased to announce the launch of a new series of books on R. 

Chapman & Hall/CRC: The R Series

Aims and Scope
This book series reflects the recent rapid growth in the development and 
application of R, the programming language and software environment for 
statistical computing and graphics. R is now widely used in academic research, 
education, and industry. It is constantly growing, with new versions of the 
core software released regularly and more than 2,600 packages available. It is 
difficult for the documentation to keep pace with the expansion of the 
software, and this vital book series provides a forum for the publication of 
books covering many aspects of the development and application of R.

The scope of the series is wide, covering three main threads:
• Applications of R to specific disciplines such as biology, epidemiology, 
genetics, engineering, finance, and the social sciences.
• Using R for the study of topics of statistical methodology, such as linear 
and mixed modeling, time series, Bayesian methods, and missing data.
• The development of R, including programming, building packages, and graphics.

The books will appeal to programmers and developers of R software, as well as 
applied statisticians and data analysts in many fields. The books will feature 
detailed worked examples and R code fully integrated into the text, ensuring 
their usefulness to researchers, practitioners and students.

Series Editors
John M. Chambers (Department of Statistics, Stanford University, USA; 
j...@stat.stanford.edu)
Torsten Hothorn (Institut für Statistik, Ludwig-Maximilians-Universität, 
München, Germany; torsten.hoth...@stat.uni-muenchen.de)
Duncan Temple Lang (Department of Statistics, University of California, Davis, 
USA; dun...@wald.ucdavis.edu)
Hadley Wickham (Department of Statistics, Rice University, Houston, Texas, USA; 
had...@rice.edu)

Call for Proposals
We are interested in books covering all aspects of the development and 
application of R software. If you have an idea for a book, please contact one 
of the series editors above or one of the Chapman & Hall/CRC statistics 
acquisitions editors below. Please provide brief details of topic, audience, 
aims and scope, and include an outline if possible.

We look forward to hearing from you.

Best regards,
Rob Calver (rob.cal...@informa.com)
David Grubbs (david.gru...@taylorandfrancis.com)
John Kimmel (john.kim...@taylorandfrancis.com)

 

Handling time and date in R


One of the most frustrating things I had to do while working as a financial business analyst was working with date-time formats in Base SAS. The syntax was simple enough, and SAS was quite good at handing queries to the Oracle database that the client was using, but remembering the different types of formats in the SAS language was a challenge (there was date9., date6, mmddyy, etc.).

Date and time variables are particularly important in the financial industry, as almost everything is a variable derived from time (which varies), while other inputs are mostly constants. This includes interest as well as late fees and finance fees.

In R, date and time are handled quite simply-

Use the strptime(x, format) function to convert a character string into a date-time object.

For example, if the variable dob is "01/04/1977", then the following will convert it into a date-time object:

z=strptime(dob,"%d/%m/%Y")

and if the same date is written as 01Apr1977, then

z=strptime(dob,"%d%b%Y")

 

does the same
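
Here is the same pair of conversions as a self-contained sketch. The tz = "UTC" argument is my addition, only to keep the result independent of the machine's time zone; note also that %b matches month names in the current locale, so the second call assumes an English locale.

```r
# Day/month/year with slashes
dob <- "01/04/1977"
z <- strptime(dob, "%d/%m/%Y", tz = "UTC")
z_class <- class(z)               # strptime() returns a POSIXlt date-time
z_iso <- format(z, "%Y-%m-%d")    # "1977-04-01"

# Same date with an abbreviated month name (locale-dependent)
dob2 <- "01Apr1977"
z2 <- strptime(dob2, "%d%b%Y", tz = "UTC")

# Convert to a plain Date when the time of day is not needed
d <- as.Date(z)
```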

When troubleshooting date and time conversions, remember to put the format codes (%d, %b, %m and %Y) in exactly the same order as the corresponding fields in the original string. If the string contains delimiters such as "-" or "/", enter them at exactly the same positions in the format argument of strptime.
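
A small illustration of what goes wrong when the format does not match the string (the example strings are made up, and tz = "UTC" is again just for reproducibility):

```r
# Wrong delimiter in the format: the conversion fails and gives NA
bad <- strptime("01-04-1977", "%d/%m/%Y", tz = "UTC")
bad_is_na <- is.na(bad)

# Matching delimiter: works
good <- strptime("01-04-1977", "%d-%m-%Y", tz = "UTC")
good_iso <- format(good, "%Y-%m-%d")        # "1977-04-01"

# Swapped %d and %m: no error, but a different date (4 January, not 1 April)
swapped <- strptime("01/04/1977", "%m/%d/%Y", tz = "UTC")
swapped_iso <- format(swapped, "%Y-%m-%d")  # "1977-01-04"
```

The last case is the dangerous one, since it fails silently, so it is worth checking a few converted values against the raw strings.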

Sys.time() gives you the current date-time, while the function difftime(time1, time2) gives you the interval between two date-times (say, if you have two columns of date-time variables).
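
As a rough sketch of both uses, with invented dates and a hypothetical data frame called loans (the column names due, paid and days_late are made up for illustration):

```r
# Interval between two date-times, in the units you ask for
start <- as.POSIXct("2011-04-01 09:00:00", tz = "UTC")
end   <- as.POSIXct("2011-04-15 17:30:00", tz = "UTC")
d_hours <- as.numeric(difftime(end, start, units = "hours"))   # 344.5
d_days  <- as.numeric(difftime(end, start, units = "days"))

# Two date columns in a data frame: days late per row
loans <- data.frame(
  due  = as.Date(c("2011-03-01", "2011-03-15")),
  paid = as.Date(c("2011-03-05", "2011-03-15"))
)
loans$days_late <- as.numeric(difftime(loans$paid, loans$due, units = "days"))

# Time elapsed since a past moment, using the current date-time
elapsed <- difftime(Sys.time(), start, units = "days")
```

Specifying units explicitly is a good habit, because otherwise difftime() picks the units automatically and the numbers are not comparable across rows or runs.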

 

What are the various format codes for date-time input?

%a
Abbreviated weekday name in the current locale. (Also matches full name on input.)
%A
Full weekday name in the current locale. (Also matches abbreviated name on input.)
%b
Abbreviated month name in the current locale. (Also matches full name on input.)
%B
Full month name in the current locale. (Also matches abbreviated name on input.)
%c
Date and time. Locale-specific on output, "%a %b %e %H:%M:%S %Y" on input.
%d
Day of the month as decimal number (01–31).
%H
Hours as decimal number (00–23).
%I
Hours as decimal number (01–12).
%j
Day of year as decimal number (001–366).
%m
Month as decimal number (01–12).
%M
Minute as decimal number (00–59).
%p
AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales.
%S
Second as decimal number (00–61), allowing for up to two leap-seconds (but POSIX-compliant implementations will ignore leap seconds).
%U
Week of the year as decimal number (00–53) using Sunday as the first day of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.
%w
Weekday as decimal number (0–6, Sunday is 0).
%W
Week of the year as decimal number (00–53) using Monday as the first day of week (and typically with the first Monday of the year as day 1 of week 1). The UK convention.
%x
Date. Locale-specific on output, "%y/%m/%d" on input.
%X
Time. Locale-specific on output, "%H:%M:%S" on input.
%y
Year without century (00–99). Values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 POSIX standard, but it does also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.
%Y
Year with century.
%z
Signed offset in hours and minutes from UTC, so -0800 is 8 hours behind UTC.
%Z
(output only.) Time zone as a character string (empty if not available).
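
The same codes work in the other direction with format(), which turns a date-time back into text. A brief sketch with an arbitrary timestamp (the %p output depends on the locale, so no result is shown for it):

```r
x <- as.POSIXct("2011-04-15 17:30:05", tz = "UTC")

iso_date <- format(x, "%Y-%m-%d")   # "2011-04-15"
clock    <- format(x, "%H:%M:%S")   # "17:30:05"
day_of_y <- format(x, "%j")         # "105"
weekday  <- format(x, "%w")         # "5" (a Friday; Sunday is 0)
twelve_h <- format(x, "%I:%M %p")   # 12-hour clock, locale-dependent

# Two-digit years on input: 00-68 become 20xx, 69-99 become 19xx
y68 <- format(strptime("01/04/68", "%d/%m/%y", tz = "UTC"), "%Y")   # "2068"
y69 <- format(strptime("01/04/69", "%d/%m/%y", tz = "UTC"), "%Y")   # "1969"
```

The two-digit year rule is a common source of surprises with dates of birth, so four-digit years (%Y) are the safer choice whenever the data allows it.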

Also read the helpful documentation (especially on time zones, leap seconds and time differences):
http://stat.ethz.ch/R-manual/R-patched/library/base/html/difftime.html
http://stat.ethz.ch/R-manual/R-patched/library/base/html/strptime.html
http://stat.ethz.ch/R-manual/R-patched/library/base/html/Ops.Date.html
http://stat.ethz.ch/R-manual/R-patched/library/base/html/Dates.html