Predictive Analytics World Conference –New York City and London, UK

Please use the following code to get a 15% discount on the 2-Day Conference Pass: AJAYNY11.

October 17-21, 2011 – New York City, NY (pawcon.com/nyc)
Nov 30 – Dec 1, 2011 – London, UK (pawcon.com/london)

Predictive Analytics World (pawcon.com) is the business-focused event for predictive analytics
professionals, managers and commercial practitioners, covering today’s commercial deployment of
predictive analytics, across industries and across software vendors. The conference delivers case
studies, expertise, and resources to achieve two objectives:

1) Bigger wins: Strengthen the business impact delivered by predictive analytics

2) Broader capabilities: Establish new opportunities with predictive analytics

Case Studies: How the Leading Enterprises Do It

Predictive Analytics World focuses on concrete examples of deployed predictive analytics. The leading
enterprises have signed up to tell their stories, so you can hear from the horse’s mouth precisely how
Fortune 500 analytics competitors and other top practitioners deploy predictive modeling, and what
kind of business impact it delivers.

PAW NEW YORK CITY 2011

PAW’s NYC program is the richest and most diverse yet, featuring over 40 sessions across three tracks
– including both X and Y tracks, and an “Expert/Practitioner” track — so you can witness how predictive
analytics is applied at major companies.

PAW NYC’s agenda covers hot topics and advanced methods such as ensemble models, social data,
search marketing, crowdsourcing, black-box trading, fraud detection, risk management, survey analysis,
and other innovative applications that benefit organizations in new and creative ways.

WORKSHOPS: PAW NYC also features five full-day pre- and post-conference workshops that
complement the core conference program. Workshop agendas include advanced predictive modeling
methods, hands-on training, an intro to R (the open source analytics system), and enterprise decision
management.

For more see http://www.predictiveanalyticsworld.com/newyork/2011/

PAW LONDON 2011

PAW London’s agenda covers hot topics and advanced methods such as risk management, uplift
(incremental lift) modeling, open source analytics, and crowdsourcing data mining. Case study
presentations cover campaign targeting, churn modeling, next-best-offer, selecting marketing channels,
global analytics deployment, email marketing, HR candidate search, and other innovative applications
that benefit organizations in new and creative ways.

Join PAW and access the best keynotes, sessions, workshops, exposition, expert panel, live demos,
networking coffee breaks, reception, birds-of-a-feather lunches, brand-name enterprise leaders, and
industry heavyweights in the business.

For more see http://www.predictiveanalyticsworld.com/london

CROSS-INDUSTRY APPLICATIONS

Predictive Analytics World is the only conference of its kind, delivering vendor-neutral sessions across
verticals such as banking, financial services, e-commerce, education, government, healthcare, high
technology, insurance, non-profits, publishing, social gaming, retail, and telecommunications.

And PAW covers the gamut of commercial applications of predictive analytics, including response
modeling, customer retention with churn modeling, product recommendations, fraud detection, online
marketing optimization, human resource decision-making, law enforcement, sales forecasting, and
credit scoring.

Why bring together such a wide range of endeavors? No matter how you use predictive analytics, the
story is the same: Predictively scoring customers optimizes business performance. Predictive analytics
initiatives across industries leverage the same core predictive modeling technology, share similar project
overhead and data requirements, and face common process challenges and analytical hurdles.

RAVE REVIEWS:

“Hands down, the best applied analytics conference I have ever attended. Great exposure to cutting-edge
predictive techniques and I was able to turn around and apply some of those learnings to my work
immediately. I’ve never been able to say that after any conference I’ve attended before!”

Jon Francis
Senior Statistician
T-Mobile

Read more: Articles and blog entries about PAW can be found at http://www.predictiveanalyticsworld.com/
pressroom.php

VENDORS. Meet the vendors and learn about their solutions, software, and services. Discover the best
predictive analytics vendors available to serve your needs – learn what they do and see how they
compare.

COLLEAGUES. Mingle, network and hang out with your best and brightest colleagues. Exchange
experiences over lunch, coffee breaks and the conference reception connecting with those professionals
who face the same challenges as you.

GET STARTED. If you’re new to predictive analytics, kicking off a new initiative, or exploring new ways
to position it at your organization, there’s no better place to get your bearings than Predictive Analytics
World. See what other companies are doing, witness vendor demos, participate in discussions with the
experts, network with your colleagues and weigh your options!

For more information:
http://www.predictiveanalyticsworld.com

View videos of PAW Washington DC, Oct 2010 — now available on-demand:
http://www.predictiveanalyticsworld.com/online-video.php

What is predictive analytics? See the Predictive Analytics Guide:
http://www.predictiveanalyticsworld.com/predictive_analytics.php

If you’d like our informative event updates, sign up at:
http://www.predictiveanalyticsworld.com/signup-us.php

To sign up for the PAW group on LinkedIn, see:
http://www.linkedin.com/e/gis/1005097

For inquiries e-mail regsupport@risingmedia.com or call (717) 798-3495.

Contest for SAS Users and Students

Here's a new contest for SAS users. The prizes are books, so students should be interested as well.

From http://www.sascommunity.org/mwiki/images/b/bc/PointsforprizesRules.pdf

HOW TO ENTER: To qualify for entry, go to the sasCommunity.org web site located at http://www.sascommunity.org/wiki/Main_Page
between April 11, 2011 and May 9, 2011 and either add or edit valid content as described herein to earn award points.
Creation of a first time profile on www.sascommunity.org will earn 1,000 points. For each valid article creation or edit, 100
points will be earned. Articles and subsequent edits should adhere to the sasCommunity.org terms of use as outlined on
http://www.sascommunity.org/wiki/sasCommunity:Terms_of_Use. All points’ accumulation will end at 5:00 PM GMT on
May 9, 2011 and only those points earned between 8:00 AM GMT on April 11, 2011 and 5:00 PM GMT on May 9, 2011
will be counted in this contest. Contest entries made through the Internet will be declared made by the registered user of
the sasCommunity.org profile account. Sponsor is not responsible for phone, technical, network, electronic, computer
hardware or software failures of any kind, misdirected, incomplete, garbled or delayed transmissions. Sponsor will not be
responsible for incorrect or inaccurate entry information, whether caused by entrants or by any of the equipment or
programming associated with or utilized in the contest.
ELIGIBILITY: The contest is open to all sasCommunity.org members 18 years of age or older on the start date of the
contest. Void where prohibited by law. Employees of the Sponsor (including immediate family members and/or those
living in the same household of each), members of the sasCommunity.org Advisory Board, SAS Global Users Group
Executive Board, their advertising, promotion and production agencies, the affiliated companies of each, and the
immediate family members of each are not eligible.

PRIZE: Three (3) prizes will be awarded based on total points accumulated during the contest as follows:
  1st Place: 3 SAS® Press books – not to exceed $250 in combined retail value;
  2nd Place: 2 SAS® Press books – not to exceed $150 in combined retail value; and
  3rd Place: 1 SAS® Press book – not to exceed $100 in retail value.

What’s New

http://www.sascommunity.org/wiki/Main_Page

Points for Prizes Contest
Win SAS books!
Contribute content or SAS code to sasCommunity.org for your chance to WIN! To qualify, simply add or edit articles between April 11, 2011 and May 9, 2011 (GMT). Creation of a first-time profile on sasCommunity.org gives you 1,000 points. For each valid article creation or edit, 100 points will be earned. The user with the most points collected during this time wins SAS Press Books!
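The arithmetic behind the leaderboard is simple, as a quick sketch shows (the entrant names and edit counts below are made up for illustration):

```python
# Toy tally of the contest scoring described above: 1,000 points for a
# first-time profile, plus 100 points per valid article creation or edit.
# The entrants and their edit counts are hypothetical.
def contest_points(new_profile, edits):
    return (1000 if new_profile else 0) + 100 * edits

entrants = {"new_user": contest_points(True, 12),   # 1000 + 1200 = 2200
            "veteran":  contest_points(False, 25)}  # 0 + 2500 = 2500
winner = max(entrants, key=entrants.get)
print(entrants, winner)
```

A brand-new profile is worth ten edits, so a steady editor can still out-score a newcomer.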

Become a sasCommunity Guru
Thanks for Contributing to sasCommunity.org!
New sasCommunity.org Point System
The sasCommunity support team has been hard at work adding new features and is pleased to announce a points system that recognizes each user's contributions to the site. Every time you contribute by creating a page, updating it, or just doing a little wiki gardening, you earn points. Earning points is automatic and simple – all you have to do is contribute! Creating your account starts you with 1,000 points, and all current users have been credited with points dating back to the site coming online in April 2007.

Augustus – a PMML model producer, consumer, and scoring engine


I just checked out this new software for making PMML models. It is called Augustus and is created by the Open Data Group (http://opendatagroup.com/), which is headed by Robert Grossman, who was the first proponent of using R on Amazon EC2.

Someone like Zementis (http://adapasupport.zementis.com/) could probably use this to further test, enhance, or benchmark it on EC2. They recently held a joint webinar with Revolution Analytics.

https://code.google.com/p/augustus/

Recent News

  • Augustus v 0.4.3.1 has been released
  • Added a guide (pdf) for including Augustus in the Windows System Properties.
  • Updated the install documentation.
  • Augustus 2010.II (Summer) release is available. This is v 0.4.2.0. More information is here.
  • Added performance discussion concerning the optional cyclic garbage collection.

See Recent News for more details and all recent news.

Augustus

Augustus is a PMML 4-compliant scoring engine that works with segmented models. Augustus is designed for use with statistical and data mining models. The new release provides Baseline, Tree and Naive-Bayes producers and consumers.

There is also a version for use with PMML 3 models. It is able to produce and consume models with 10,000s of segments and conforms to a PMML draft RFC for segmented models and ensembles of models. It supports Baseline, Regression, Tree and Naive-Bayes.

Augustus is written in Python and is freely available under the GNU General Public License, version 2.

See the page “Which version is right for me” for more details regarding the different versions.

PMML

Predictive Model Markup Language (PMML) is an XML mark up language to describe statistical and data mining models. PMML describes the inputs to data mining models, the transformations used to prepare data for data mining, and the parameters which define the models themselves. It is used for a wide variety of applications, including applications in finance, e-business, direct marketing, manufacturing, and defense. PMML is often used so that systems which create statistical and data mining models (“PMML Producers”) can easily inter-operate with systems which deploy PMML models for scoring or other operational purposes (“PMML Consumers”).
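As a minimal sketch of that producer/consumer split (the field name and coefficients below are invented, and this is only a PMML-like fragment, not a complete schema-valid PMML 4 document), a producer can emit the model as XML and a consumer can recover the inputs and parameters to score with:

```python
# Sketch: a "PMML producer" writes a model as XML; a "PMML consumer" reads
# the inputs and parameters back and scores with them. Field names and
# coefficients are illustrative only.
import xml.etree.ElementTree as ET

pmml = ET.Element("PMML", version="4.0")
dd = ET.SubElement(pmml, "DataDictionary")
ET.SubElement(dd, "DataField", name="x", optype="continuous", dataType="double")
model = ET.SubElement(pmml, "RegressionModel", functionName="regression")
table = ET.SubElement(model, "RegressionTable", intercept="1.5")
ET.SubElement(table, "NumericPredictor", name="x", coefficient="2.0")

doc = ET.tostring(pmml, encoding="unicode")  # what the producer hands off

# The consumer parses the document and reconstructs the scoring function.
root = ET.fromstring(doc)
fields = [f.get("name") for f in root.iter("DataField")]
intercept = float(root.find(".//RegressionTable").get("intercept"))
coef = float(root.find(".//NumericPredictor").get("coefficient"))

def score(x):
    return intercept + coef * x

print(fields, score(3.0))  # ['x'] 7.5
```

The point is that the producer and consumer share nothing but the XML document, which is exactly the interoperability PMML is designed for.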

Change Detection using Augustus

For information regarding using Augustus with Change Detection and Health and Status Monitoring, please see change-detection.

Open Data

Open Data Group provides management consulting services, outsourced analytical services, analytic staffing, and expert witnesses broadly related to data and analytics. It has experience with customer data, supplier data, financial and trading data, and data from internal business processes.

It has staff in Chicago and San Francisco and clients throughout the U.S. Open Data Group began operations in 2002.


Overview

The above example contains plots generated in R of scoring results from Augustus. Each point on the graph represents a use of the scoring engine and a chart is an aggregation of multiple Augustus runs. A Baseline (Change Detection) model was used to score data with multiple segments.

Typical Use

Augustus is typically used to construct models and score data with models. Augustus includes a dedicated application for creating, or producing, predictive models rendered as PMML-compliant files. Scoring is accomplished by consuming PMML-compliant files describing an appropriate model. Augustus provides a dedicated application for scoring data with four classes of models: Baseline (Change Detection) models, Tree models, Regression models, and Naive Bayes models. The typical model development and use cycle with Augustus is as follows:

  1. Identify suitable data with which to construct a new model.
  2. Provide a model schema which prescribes the requirements for the model.
  3. Run the Augustus producer to obtain a new model.
  4. Run the Augustus consumer on new data to effect scoring.

Separate consumer and producer applications are supplied for Baseline (Change Detection), Tree, Regression, and Naive Bayes models. The producer and consumer applications require configuration with XML-formatted files. The specification of the configuration files and model schema are detailed below. The consumers provide for some configurability of the output, but users will often add post-processing to render the output according to their needs. A variety of mechanisms exist for transmitting data, but users may need to provide their own preprocessing to accommodate their particular data source.
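Augustus itself is driven by PMML and XML configuration files, but the produce-then-consume cycle can be illustrated with a toy z-score baseline model in plain Python (the segment names, data, and threshold below are invented, and this API is not Augustus's):

```python
# Toy version of the cycle: a "producer" learns a per-segment baseline
# (mean and standard deviation) from training data; a "consumer" scores
# new observations against that baseline and flags changes.
from statistics import mean, stdev

def produce_baseline(training):
    """{segment: [values]} -> {segment: (mu, sigma)}"""
    return {seg: (mean(vals), stdev(vals)) for seg, vals in training.items()}

def consume(model, segment, value, threshold=3.0):
    """Return (z-score, change-detected?) for one observation."""
    mu, sigma = model[segment]
    z = (value - mu) / sigma
    return z, abs(z) > threshold

model = produce_baseline({"east": [10, 11, 9, 10], "west": [50, 52, 48, 50]})
print(consume(model, "east", 10.5))  # small deviation: no change flagged
print(consume(model, "west", 70))    # large deviation: change flagged
```

In Augustus the same two roles are played by the producer and consumer applications, with the model persisted as a segmented PMML file in between.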

In addition to the producer and consumer applications, Augustus is conceptually structured and provided with libraries which are relevant to the development and use of Predictive Models. Broadly speaking, these consist of components that address the use of PMML and components that are specific to Augustus.

Post Processing

Augustus can accommodate a post-processing step. While not necessary, it is often useful to:

  • Re-normalize the scoring results or perform an additional transformation.
  • Supplement the results with global metadata such as timestamps.
  • Format the results.
  • Select certain interesting values from the results.
  • Restructure the data for use with other applications.
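A post-processing pass of that kind might look like the following sketch (the record layout and field names are assumptions for illustration, not actual Augustus output):

```python
# Sketch of a post-processing step: select fields, re-normalize scores,
# and stamp each row with global metadata. Field names are hypothetical.
from datetime import datetime, timezone

def post_process(results, keep=("segment", "score")):
    total = sum(r["score"] for r in results) or 1.0
    out = []
    for r in results:
        row = {k: r[k] for k in keep}         # select interesting values
        row["score"] = r["score"] / total     # re-normalize
        row["ts"] = datetime.now(timezone.utc).isoformat()  # global metadata
        out.append(row)
    return out

scored = [{"segment": "east", "score": 2.0, "raw": [1, 2]},
          {"segment": "west", "score": 6.0, "raw": [5, 7]}]
for row in post_process(scored):
    print(row)
```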

Changes in R software

The newest version of R is now available for download – R 2.13.0 is ready!

http://cran.at.r-project.org/bin/windows/base/CHANGES.R-2.13.0.html

Windows-specific changes to R

CHANGES IN R VERSION 2.13.0

WINDOWS VERSION

  • Windows 2000 is no longer supported. (It went end-of-life in July 2010.)

NEW FEATURES

  • win_iconv has been updated: this version has a change in the behaviour with BOMs on UTF-16 and UTF-32 files – it removes BOMs when reading and adds them when writing. (This is consistent with Microsoft applications, but Unix versions of iconv usually ignore them.)

  • Support for repository type win64.binary (used for 64-bit Windows binaries for R 2.11.x only) has been removed.

  • The installers no longer put an ‘Uninstall’ item on the start menu (to conform to current Microsoft UI guidelines).

  • Running R always sets the environment variable R_ARCH (as it does on a Unix-alike from the shell-script front-end).

  • The defaults for options("browser") and options("pdfviewer") are now set from environment variables R_BROWSER and R_PDFVIEWER respectively (as on a Unix-alike). A value of "false" suppresses display (even if there is no false.exe present on the path).

  • If options("install.lock") is set to TRUE, binary package installs are protected against failure similar to the way source package installs are protected.

  • file.exists() and unlink() have more support for files > 2GB.

  • The versions of R.exe in ‘R_HOME/bin/i386,x64/bin’ now support options such as R --vanilla CMD: there is no comparable interface for ‘Rcmd.exe’.

  • A few more file operations will now work with >2GB files.

  • The environment variable R_HOME in an R session now uses slash as the path separator (as it always has when set by Rcmd.exe).

  • Rgui has a new menu item for the PDF ‘Sweave User Manual’.

DEPRECATED

  • zip.unpack() is deprecated: use unzip().

INSTALLATION

  • There is support for libjpeg-turbo via setting JPEGDIR to that value in ‘MkRules.local’. Support for jpeg-6b has been removed.

  • The sources now work with libpng-1.5.1, jpegsrc.v8c (which are used in the CRAN builds) and tiff-4.0.0beta6 (CRAN builds use 3.9.1). It is possible that they no longer work with older versions than libpng-1.4.5.

BUG FIXES

  • Workaround for the incorrect values given by Windows’ casinh function on the branch cuts.

  • Bug fixes for drawing raster objects on windows(). The symptom was the occasional raster image not being drawn, especially when drawing multiple raster images in a single expression. Thanks to Michael Sumner for report and testing.

  • Printing extremely long string values could overflow the stack and cause the GUI to crash. (PR#14543)

Tonnes of changes!!

http://cran.at.r-project.org/src/base/NEWS

CHANGES IN R VERSION 2.13.0:

  SIGNIFICANT USER-VISIBLE CHANGES:

    • replicate() (by default) and vapply() (always) now return a
      higher-dimensional array instead of a matrix in the case where
      the inner function value is an array of dimension >= 2.

    • Printing and formatting of floating point numbers is now using
      the correct number of digits, where it previously rarely differed
      by a few digits. (See “scientific” entry below.)  This affects
      _many_ *.Rout.save checks in packages.

  NEW FEATURES:

    • normalizePath() has been moved to the base package (from utils):
      this is so it can be used by library() and friends.

      It now does tilde expansion.

      It gains new arguments winslash (to select the separator on
      Windows) and mustWork to control the action if a canonical path
      cannot be found.

    • The previously barely documented limit of 256 bytes on a symbol
      name has been raised to 10,000 bytes (a sanity check).  Long
      symbol names can sometimes occur when deparsing expressions (for
      example, in model.frame).

    • reformulate() gains an intercept argument.

    • cmdscale(add = FALSE) now uses the more common definition that
      there is a representation in n-1 or less dimensions, and only
      dimensions corresponding to positive eigenvalues are used.
      (Avoids confusion such as PR#14397.)

    • Names used by c(), unlist(), cbind() and rbind() are marked with
      an encoding when this can be ascertained.

    • R colours are now defined to refer to the sRGB color space.

      The PDF, PostScript, and Quartz graphics devices record this
      fact.  X11 (and Cairo) and Windows just assume that your screen
      conforms.

    • system.file() gains a mustWork argument (suggestion of Bill
      Dunlap).

    • new.env(hash = TRUE) is now the default.

    • list2env(envir = NULL) defaults to hashing (with a suitably sized
      environment) for lists of more than 100 elements.

    • text() gains a formula method.

    • IQR() now has a type argument which is passed to quantile().

    • as.vector(), as.double() etc duplicate less when they leave the
      mode unchanged but remove attributes.

      as.vector(mode = "any") no longer duplicates when it does not
      remove attributes.  This helps memory usage in matrix() and
      array().

      matrix() duplicates less if data is an atomic vector with
      attributes such as names (but no class).

      dim(x) <- NULL duplicates less if x has neither dimensions nor
      names (since this operation removes names and dimnames).

    • setRepositories() gains an addURLs argument.

    • chisq.test() now also returns a stdres component, for
      standardized residuals (which have unit variance, unlike the
      Pearson residuals).

    • write.table() and friends gain a fileEncoding argument, to
      simplify writing files for use on other OSes (e.g. a spreadsheet
      intended for Windows or Mac OS X Excel).

    • Assignment expressions of the form foo::bar(x) <- y and
      foo:::bar(x) <- y now work; the replacement functions used are
      foo::`bar<-` and foo:::`bar<-`.

    • Sys.getenv() gains a names argument so Sys.getenv(x, names =
      FALSE) can replace the common idiom of as.vector(Sys.getenv()).
      The default has been changed to not name a length-one result.

    • Lazy loading of environments now preserves attributes and locked
      status. (The locked status of bindings and active bindings are
      still not preserved; this may be addressed in the future).

    • options("install.lock") may be set to FALSE so that
      install.packages() defaults to --no-lock installs, or (on
      Windows) to TRUE so that binary installs implement locking.

    • sort(partial = p) for large p now tries Shellsort if quicksort is
      not appropriate and so works for non-numeric atomic vectors.

    • sapply() gets a new option simplify = "array" which returns a
      “higher rank” array instead of just a matrix when FUN() returns a
      dim() length of two or more.

      replicate() has this option set by default, and vapply() now
      behaves that way internally.

    • aperm() becomes S3 generic and gets a table method which
      preserves the class.

    • merge() and as.hclust() methods for objects of class "dendrogram"
      are now provided.

    • as.POSIXlt.factor() now passes ... to the character method
      (suggestion of Joshua Ulrich).

    • The character method of as.POSIXlt() now tries to find a format
      that works for all non-NA inputs, not just the first one.

    • str() now has a method for class "Date" analogous to that for
      class "POSIXt".

    • New function file.link() to create hard links on those file
      systems (POSIX, NTFS but not FAT) that support them.

    • New Summary() group method for class "ordered" implements min(),
      max() and range() for ordered factors.

    • mostattributes<-() now consults the "dim" attribute and not the
      dim() function, making it more useful for objects (such as data
      frames) from classes with methods for dim().  It also uses
      attr<-() in preference to the generics name<-(), dim<-() and
      dimnames<-().  (Related to PR#14469.)

    • There is a new option "browserNLdisabled" to disable the use of
      an empty (e.g. via the ‘Return’ key) as a synonym for c in
      browser() or n under debug().  (Wish of PR#14472.)

    • example() gains optional new arguments character.only and
      give.lines enabling programmatic exploration.

    • serialize() and unserialize() are no longer described as
      ‘experimental’.  The interface is now regarded as stable,
      although the serialization format may well change in future
      releases.  (serialize() has a new argument version which would
      allow the current format to be written if that happens.)

      New functions saveRDS() and readRDS() are public versions of the
      ‘internal’ functions .saveRDS() and .readRDS() made available for
      general use.  The dot-name versions remain available as several
      package authors have made use of them, despite the documentation.

      saveRDS() supports compress = "xz".

    • Many functions when called with a not-open connection will now
      ensure that the connection is left not-open in the event of
      error.  These include read.dcf(), dput(), dump(), load(),
      parse(), readBin(), readChar(), readLines(), save(), writeBin(),
      writeChar(), writeLines(), .readRDS(), .saveRDS() and
      tools::parse_Rd(), as well as functions calling these.

    • Public functions find.package() and path.package() replace the
      internal dot-name versions.

    • The default method for terms() now looks for a "terms" attribute
      if it does not find a "terms" component, and so works for model
      frames.

    • httpd() handlers receive an additional argument containing the
      full request headers as a raw vector (this can be used to parse
      cookies, multi-part forms etc.). The recommended full signature
      for handlers is therefore function(url, query, body, headers,
      ...).

    • file.edit() gains a fileEncoding argument to specify the encoding
      of the file(s).

    • The format of the HTML package listings has changed.  If there is
      more than one library tree, a table of links to libraries is
      provided at the top and bottom of the page.  Where a library
      contains more than 100 packages, an alphabetic index is given at
      the top of the section for that library.  (As a consequence,
      package names are now sorted case-insensitively whatever the
      locale.)

    • isSeekable() now returns FALSE on connections which have
      non-default encoding.  Although documented to record if ‘in
      principle’ the connection supports seeking, it seems safer to
      report FALSE when it may not work.

    • R CMD REMOVE and remove.packages() now remove file R.css when
      removing all remaining packages in a library tree.  (Related to
      the wish of PR#14475: note that this file is no longer
      installed.)

    • unzip() now has a unzip argument like zip.file.extract().  This
      allows an external unzip program to be used, which can be useful
      to access features supported by Info-ZIP's unzip version 6 which
      is now becoming more widely available.

    • There is a simple zip() function, as wrapper for an external zip
      command.

    • bzfile() connections can now read from concatenated bzip2 files
      (including files written with bzfile(open = "a")) and files
      created by some other compressors (such as the example of
      PR#14479).

    • The primitive function c() is now of type BUILTIN.

    • plot(<dendrogram>, .., nodePar=*) now obeys an optional xpd
      specification (allowing clipping to be turned off completely).

    • nls(algorithm="port") now shares more code with nlminb(), and is
      more consistent with the other nls() algorithms in its return
      value.

    • xz has been updated to 5.0.1 (very minor bugfix release).

    • image() has gained a logical useRaster argument allowing it to
      use a bitmap raster for plotting a regular grid instead of
      polygons. This can be more efficient, but may not be supported by
      all devices. The default is FALSE.

    • list.files()/dir() gains a new argument include.dirs to include
      directories in the listing when recursive = TRUE.

    • New function list.dirs() lists all directories (even empty
      ones).

    • file.copy() now (by default) copies read/write/execute
      permissions on files, moderated by the current setting of
      Sys.umask().

    • Sys.umask() now accepts mode = NA and returns the current umask
      value (visibly) without changing it.

    • There is a ! method for classes "octmode" and "hexmode": this
      allows xor(a, b) to work if both a and b are from one of those
      classes.

    • as.raster() no longer fails for vectors or matrices containing
      NAs.

    • New hook "before.new.plot" allows functions to be run just before
      advancing the frame in plot.new, which is potentially useful for
      custom figure layout implementations.

    • Package tools has a new function compactPDF() to try to reduce
      the size of PDF files _via_ qpdf or gs.

    • tar() has a new argument extra_flags.

    • dotchart() accepts more general objects x such as 1D tables which
      can be coerced by as.numeric() to a numeric vector, with a
      warning since that might not be appropriate.

    • The previously internal function create.post() is now exported
      from utils, and the documentation for bug.report() and
      help.request() now refer to that for create.post().

      It has a new method = "mailto" on Unix-alikes similar to that on
      Windows: it invokes a default mailer via open (Mac OS X) or
      xdg-open or the default browser (elsewhere).

      The default for ccaddress is now getOption("ccaddress") which is
      by default unset: using the username as a mailing address
      nowadays rarely works as expected.

    • The default for options("mailer") is now "mailto" on all
      platforms.

    • unlink() now does tilde-expansion (like most other file
      functions).

    • file.rename() now allows vector arguments (of the same length).

    • The "glm" method for logLik() now returns an "nobs" attribute
      (which stats4::BIC() assumed it did).

      The "nls" method for logLik() gave incorrect results for zero
      weights.

    • There is a new generic function nobs() in package stats, to
      extract from model objects a suitable value for use in BIC
      calculations.  An S4 generic derived from it is defined in
      package stats4.

    • Code for S4 reference-class methods is now examined for possible
      errors in non-local assignments.

    • findClasses, getGeneric, findMethods and hasMethods are revised
      to deal consistently with the package= argument and be consistent
      with soft namespace policy for finding objects.

    • tools::Rdiff() now has the option to return not only the status
      but a character vector of observed differences (which are still
      by default sent to stdout).

    • The startup environment variables R_ENVIRON_USER, R_ENVIRON,
      R_PROFILE_USER and R_PROFILE are now treated more consistently.
      In all cases an empty value is considered to be set and will stop
      the default being used, and for the last two tilde expansion is
      performed on the file name.  (Note that setting an empty value is
      probably impossible on Windows.)

    • Using R --no-environ CMD, R --no-site-file CMD or R
      --no-init-file CMD sets environment variables so these settings
      are passed on to child R processes, notably those run by INSTALL,
      check and build. R --vanilla CMD sets these three options (but
      not --no-restore).

    • smooth.spline() is somewhat faster.  With cv=NA it allows some
      leverage computations to be skipped.

    • The internal (C) function scientific(), at the heart of R's
      format.info(x), format(x), print(x), etc, for numeric x, has been
      re-written in order to provide slightly more correct results,
      fixing PR#14491, notably in border cases including when digits >=
      16, thanks to substantial contributions (code and experiments)
      from Petr Savicky.  This affects a noticeable amount of numeric
      output from R.

    • A new function grepRaw() has been introduced for finding subsets
      of raw vectors. It supports both literal searches and regular
      expressions.

    • Package compiler is now provided as a standard package.  See
      ?compiler::compile for information on how to use the compiler.
      This package implements a byte code compiler for R: by default
      the compiler is not used in this release.  See the ‘R
      Installation and Administration Manual’ for how to compile the
      base and recommended packages.

    • Providing an exportPattern directive in a NAMESPACE file now
      causes classes to be exported according to the same pattern, for
      example the default from package.skeleton() to specify all names
      starting with a letter.  An explicit directive to
      exportClassPattern will still over-ride.

    • There is an additional marked encoding "bytes" for character
      strings.  This is intended to be used for non-ASCII strings which
      should be treated as a set of bytes, and never re-encoded as if
      they were in the encoding of the current locale: useBytes = TRUE
      is automatically selected in functions such as writeBin(),
      writeLines(), grep() and strsplit().

      Only a few character operations are supported (such as substr()).

      Printing, format() and cat() will represent non-ASCII bytes in
      such strings by a \xab escape.
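A small sketch of marking a string as "bytes" via Encoding<- (note that this NEWS item is for R 2.13.0; the behaviour shown here is as in later R versions too):

```r
# Mark a string as raw bytes so it is never re-encoded.
x <- "fa\xe7ade"          # contains the single byte 0xE7 (Latin-1 c-cedilla)
Encoding(x) <- "bytes"

Encoding(x)               # "bytes"
nchar(x, type = "bytes")  # 6 -- counted as bytes, not characters
print(x)                  # non-ASCII byte shown as a \xe7 escape
```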

    • The new function removeSource() removes the internally stored
      source from a function.
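For illustration (a sketch; in R 2.13.x the stored source lived in the "source" attribute, while later versions store a "srcref"), removing the stored source makes the function print from its deparsed code alone:

```r
# Keep source so comments are stored with the function, then strip it.
options(keep.source = TRUE)
f <- eval(parse(text = "function(x) {\n  x + 1  # a stored comment\n}",
                keep.source = TRUE))
g <- removeSource(f)  # same function, without the stored source

print(f)  # shows the comment, from the stored source
print(g)  # deparsed from the code alone; the comment is gone
```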

    • "srcref" attributes now include two additional line number
      values, recording the line numbers in the order they were parsed.

    • New functions have been added for source reference access:
      getSrcFilename(), getSrcDirectory(), getSrcLocation() and
      getSrcref().

    • Sys.chmod() has an extra argument use_umask which defaults to
      true and restricts the file mode by the current setting of umask.
      This means that all the R functions which manipulate
      file/directory permissions by default respect umask, notably R
      CMD INSTALL.

    • tempfile() has an extra argument fileext to create a temporary
      filename with a specified extension.  (Suggestion and initial
      implementation by Dirk Eddelbuettel.)
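A one-liner showing the new argument (the pattern name "plot" is just an illustrative choice):

```r
# Temporary file name with a guaranteed extension, e.g. for a PDF device.
tf <- tempfile(pattern = "plot", fileext = ".pdf")
basename(tf)  # e.g. "plot1a2b3c4d.pdf"
```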

      There are improvements in the way Sweave() and Stangle() handle
      non-ASCII vignette sources, especially in a UTF-8 locale: see
      ‘Writing R Extensions’ which now has a subsection on this topic.

    • factanal() now returns the rotation matrix if a rotation such as
      "promax" is used, and hence factor correlations are displayed.
      (Wish of PR#12754.)

    • The gctorture2() function provides a more refined interface to
      the GC torture process.  Environment variables R_GCTORTURE,
      R_GCTORTURE_WAIT, and R_GCTORTURE_INHIBIT_RELEASE can also be
      used to control the GC torture process.

    • file.copy(from, to) no longer regards it as an error to supply a
      zero-length from: it now simply does nothing.

    • rstandard.glm gains a type argument which can be used to request
      standardized Pearson residuals.
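A short sketch of the new type argument, using the Poisson regression example from ?glm:

```r
# Poisson example from ?glm
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

rstandard(fit)                    # standardized deviance residuals (default)
rstandard(fit, type = "pearson")  # standardized Pearson residuals
```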

    • A start on a Turkish translation, thanks to Murat Alkan.

    • .libPaths() calls normalizePath(winslash = "/") on the paths:
      this helps (usually) present them in a user-friendly form and
      should detect duplicate paths accessed via different symbolic
      links.

  SWEAVE CHANGES:

    • Sweave() has options to produce PNG and JPEG figures, and to use
      a custom function to open a graphics device (see ?RweaveLatex).
      (Based in part on the contribution of PR#14418.)

    • The default for Sweave() is to produce only PDF figures (rather
      than both EPS and PDF).

    • Environment variable SWEAVE_OPTIONS can be used to supply
      defaults for existing or new options to be applied after the
      Sweave driver setup has been run.

    • The Sweave manual is now included as a vignette in the utils
      package.

    • Sweave() handles keep.source=TRUE much better: it could duplicate
      some lines and omit comments. (Reported by John Maindonald and
      others.)

  C-LEVEL FACILITIES:

    • Because they use a C99 interface which a C++ compiler is not
      required to support, Rvprintf and REvprintf are only defined by
      R_ext/Print.h in C++ code if the macro R_USE_C99_IN_CXX is
      defined when it is included.

    • pythag duplicated the C99 function hypot.  It is no longer
      provided, but is used as a substitute for hypot in the very
      unlikely event that the latter is not available.

    • R_inspect(obj) and R_inspect3(obj, deep, pvec) are (hidden)
      C-level entry points to the internal inspect function and can be
      used for C-level debugging (e.g., in conjunction with the p
      command in gdb).

    • Compiling R with --enable-strict-barrier now also enables
      additional checking for use of unprotected objects. In
      combination with gctorture() or gctorture2() and a C-level
      debugger this can be useful for tracking down memory protection
      issues.

  UTILITIES:

    • R CMD Rdiff is now implemented in R on Unix-alikes (as it has
      been on Windows since R 2.12.0).

    • R CMD build no longer does any cleaning in the supplied package
      directory: all the cleaning is done in the copy.

      It has a new option --install-args to pass arguments to R CMD
      INSTALL for --build (but not when installing to rebuild
      vignettes).

      There is new option, --resave-data, to call
      tools::resaveRdaFiles() on the data directory, to compress
      tabular files (.tab, .csv etc) and to convert .R files to .rda
      files.  The default, --resave-data=gzip, is to do so in a way
      compatible even with years-old versions of R, but better
      compression is given by --resave-data=best, requiring R >=
      2.10.0.

      It now adds a datalist file for data directories of more than
      1Mb.

      Patterns in .Rbuildignore are now also matched against all
      directory names (including those of empty directories).

      There is a new option, --compact-vignettes, to try reducing the
      size of PDF files in the inst/doc directory.  Currently this
      tries qpdf: other options may be used in future.

      When re-building vignettes and an inst/doc/Makefile file is
      found,
      make clean is run if the makefile has a clean: target.

      After re-building vignettes the default clean-up operation will
      remove any directories (and not just files) created during the
      process: e.g. one package created a .R_cache directory.

      Empty directories are now removed unless the option
      --keep-empty-dirs is given (and a few packages do deliberately
      include empty directories).

      If there is a field BuildVignettes in the package DESCRIPTION
      file with a false value, re-building the vignettes is skipped.

    • R CMD check now also checks for filenames that are
      case-insensitive matches to Windows' reserved file names with
      extensions, such as nul.Rd, as these have caused problems on some
      Windows systems.

      It checks for inefficiently saved data/*.rda and data/*.RData
      files, and reports on those larger than 100Kb.  A more complete
      check (including of the type of compression, but potentially much
      slower) can be switched on by setting environment variable
      _R_CHECK_COMPACT_DATA2_ to TRUE.

      The types of files in the data directory are now checked, as
      packages are _still_ misusing it for non-R data files.

      It now extracts and runs the R code for each vignette in a
      separate directory and R process: this is done in the package's
      declared encoding.  Rather than call tools::checkVignettes(), it
      calls tools::buildVignettes() to see if the vignettes can be
      re-built as they would be by R CMD build.  Option --use-valgrind
      now applies only to these runs, and not when running code to
      rebuild the vignettes.  This version does a much better job of
      suppressing output from successful vignette tests.

      The 00check.log file is a more complete record of what is output
      to stdout: in particular it contains more details of the tests.

      It now checks all syntactically valid Rd usage entries, and warns
      about assignments (unless these give the usage of replacement
      functions).

      .tar.xz compressed tarballs are now allowed, if tar supports them
      (and setting environment variable TAR to internal ensures so on
      all platforms).

    • R CMD check now warns if it finds inst/doc/makefile, and R CMD
      build renames such a file to inst/doc/Makefile.

  INSTALLATION:

    • Installing R no longer tries to find perl, and R CMD no longer
      tries to substitute a full path for awk or perl - this was a
      legacy from the days when they were used by R itself.  Because a
      couple of packages do use awk, it is set as the make (rather than
      environment) variable AWK.

    • make check will now fail if there are differences from the
      reference output when testing package examples and if environment
      variable R_STRICT_PACKAGE_CHECK is set to a true value.

    • The C99 double complex type is now required.

      The C99 complex trigonometric functions (such as csin) are not
      currently required (FreeBSD lacks most of them): substitutes are
      used if they are missing.

    • The C99 system call va_copy is now required.

    • If environment variable R_LD_LIBRARY_PATH is set during
      configuration (for example in config.site) it is used unchanged
      in file etc/ldpaths rather than being appended to.

    • configure looks for support for OpenMP and if found compiles R
      with appropriate flags and also makes them available for use in
      packages: see ‘Writing R Extensions’.

      This is currently experimental, and is only used in R with a
      single thread for colSums() and colMeans().  Expect it to be more
      widely used in later versions of R.

      This can be disabled by the --disable-openmp flag.

  PACKAGE INSTALLATION:

    • R CMD INSTALL --clean now removes copies of a src directory which
      are created when multiple sub-architectures are in use.
      (Following a comment from Berwin Turlach.)

    • File R.css is now installed on a per-package basis (in the
      package's html directory) rather than in each library tree, and
      this is used for all the HTML pages in the package.  This helps
      when installing packages with static HTML pages for use on a
      webserver.  It will also allow future versions of R to use
      different stylesheets for the packages they install.

    • A top-level file .Rinstignore in the package sources can list (in
      the same way as .Rbuildignore) files under inst that should not
      be installed.  (Why should there be any such files?  Because all
      the files needed to re-build vignettes need to be under inst/doc,
      but they may not need to be installed.)

    • R CMD INSTALL has a new option --compact-docs to compact any PDFs
      under the inst/doc directory.  Currently this uses qpdf, which
      must be installed (see ‘Writing R Extensions’).

    • There is a new option --lock which can be used to cancel the
      effect of --no-lock or --pkglock earlier on the command line.

    • Option --pkglock can now be used with more than one package, and
      is now the default if only one package is specified.

    • Argument lock of install.packages() can now be used for Mac
      binary
      installs as well as for Windows ones.  The value "pkglock" is now
      accepted, as well as TRUE and FALSE (the default).

    • There is a new option --no-clean-on-error for R CMD INSTALL to
      retain a partially installed package for forensic analysis.

    • Packages with names ending in . are not portable since Windows
      does not work correctly with such directory names.  This is now
      warned about in R CMD check, and will not be allowed in R 2.14.x.

    • The vignette indices are more comprehensive (in the style of
      browseVignettes()).

  DEPRECATED & DEFUNCT:

    • require(save = TRUE) is defunct, and use of the save argument is
      deprecated.

    • R CMD check --no-latex is defunct: use --no-manual instead.

    • R CMD Sd2Rd is defunct.

    • The gamma argument to hsv(), rainbow(), and rgb2hsv() is
      deprecated and no longer has any effect.

    • The previous options for R CMD build --binary (--auto-zip,
      --use-zip-data and --no-docs) are deprecated (or defunct): use
      the new option --install-args instead.

    • When a character value is used for the EXPR argument in switch(),
      only a single unnamed alternative value is now allowed.
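A sketch of the form that remains allowed: named alternatives select by name, and at most one unnamed alternative acts as the default (the grade() helper is just an illustration):

```r
# With a character EXPR, at most one unnamed alternative (the default)
# is allowed alongside the named ones.
grade <- function(x)
  switch(x,
         a = "excellent",
         b = "good",
         "ungraded")  # single unnamed default

grade("a")  # "excellent"
grade("z")  # falls through to "ungraded"
```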

    • The wrapper utils::link.html.help() is no longer available.

    • Zip-ing data sets in packages (and hence R CMD INSTALL options
      --use-zip-data and --auto-zip, as well as the ZipData: yes field
      in a DESCRIPTION file) is defunct.

      Installed packages with zip-ed data sets can still be used, but a
      warning that they should be re-installed will be given.

    • The ‘experimental’ alternative specification of a name space via
      .Export() etc is now defunct.

    • The option --unsafe to R CMD INSTALL is deprecated: use the
      identical option --no-lock instead.

    • The entry point pythag in Rmath.h is deprecated in favour of the
      C99 function hypot.  A wrapper for hypot is provided for R 2.13.x
      only.

    • Direct access to the "source" attribute of functions is
      deprecated; use deparse(fn, control="useSource") to access it,
      and removeSource(fn) to remove it.

    • R CMD build --binary is now formally deprecated: R CMD INSTALL
      --build has long been the preferred alternative.

    • Single-character package names are deprecated (and R is already
      disallowed to avoid confusion in Depends: fields).

  BUG FIXES:

    • drop.terms and the [ method for class "terms" no longer add back
      an intercept.  (Reported by Niels Hansen.)

    • aggregate preserves the class of a column (e.g. a date) under
      some circumstances where it discarded the class previously.

    • p.adjust() now always returns a vector result, as documented.  In
      previous versions it copied attributes (such as dimensions) from
      the p argument: now it only copies names.

    • On PDF and PostScript devices, a line width of zero was recorded
      verbatim and this caused problems for some viewers (a very thin
      line combined with a non-solid line dash pattern could also cause
      a problem).  On these devices, the line width is now limited at
      0.01 and for very thin lines with complex dash patterns the
      device may force the line dash pattern to be solid.  (Reported by
      Jari Oksanen.)

    • The str() method for class "POSIXt" now gives sensible output for
      0-length input.

    • The one- and two-argument complex maths functions failed to warn
      if NAs were generated (as their numeric analogues do).

    • Added .requireCachedGenerics to the dont.mind list for library()
      to avoid warnings about duplicates.

    • $<-.data.frame messed with the class attribute, breaking any S4
      subclass.  The S4 data.frame class now has its own $<- method,
      and turns dispatch on for this primitive.

    • Map() did not look up a character argument f in the correct
      frame, thanks to lazy evaluation.  (PR#14495)

    • file.copy() did not tilde-expand from and to when to was a
      directory.  (PR#14507)

    • It was possible (but very rare) for the loading test in R CMD
      INSTALL to crash a child R process and so leave around a lock
      directory and a partially installed package.  That test is now
      done in a separate process.

    • plot(<formula>, data=<matrix>,..) now works in more cases;
      similarly for points(), lines() and text().

    • edit.default() contained a manual dispatch for matrices (the
      "matrix" class didn't really exist when it was written).  This
      caused an infinite recursion in the no-GUI case and has now been
      removed.

    • data.frame(check.rows = TRUE) sometimes worked when it should
      have detected an error.  (PR#14530)

    • scan(sep= , strip.white=TRUE) sometimes stripped trailing spaces
      from within quoted strings.  (The real bug in PR#14522.)

    • The rank-correlation methods for cor() and cov() with use =
      "complete.obs" computed the ranks before removing missing values,
      whereas the documentation implied incomplete cases were removed
      first.  (PR#14488)

      They also failed for 1-row matrices.

    • The perpendicular adjustment used in placing text and expressions
      in the margins of plots was not scaled by par("mex"). (Part of
      PR#14532.)

    • Quartz Cocoa device now catches any Cocoa exceptions that occur
      during the creation of the device window to prevent crashes.  It
      also imposes a limit of 144 ft^2 on the area used by a window to
      catch user errors (unit misinterpretation) early.

    • The browser (invoked by debug(), browser() or otherwise) would
      display attributes such as "wholeSrcref" that were intended for
      internal use only.

    • R's internal filename completion now properly handles filenames
      with spaces in them even when the readline library is used.  This
      resolves PR#14452 provided the internal filename completion is
      used (e.g., by setting rc.settings(files = TRUE)).

    • Inside uniroot(f, ...), -Inf function values are now replaced by
      a maximally *negative* value.

    • rowsum() could silently over/underflow on integer inputs
      (reported by Bill Dunlap).

    • as.matrix() did not handle "dist" objects with zero rows.

CHANGES IN R VERSION 2.12.2 patched:

  NEW FEATURES:

    • max() and min() work harder to ensure that NA has precedence over
      NaN, so e.g. min(NaN, NA) is NA.  (This was not previously
      documented except for within a single numeric vector, where
      compiler optimizations often defeated the code.)
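The documented precedence, in a two-line sketch:

```r
# NA has precedence over NaN in min()/max(), per the documented semantics.
min(NaN, NA)  # NA, not NaN
max(NA, NaN)  # NA
```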

  BUG FIXES:

    • A change to the C function R_tryEval had broken error messages in
      S4 method selection; the error message is now printed.

    • PDF output with a non-RGB color model used RGB for the line
      stroke color.  (PR#14511)

    • stats4::BIC() assumed without checking that an object of class
      "logLik" has an "nobs" attribute: glm() fits did not and so BIC()
      failed for them.

    • In some circumstances a one-sided mantelhaen.test() reported the
      p-value for the wrong tail.  (PR#14514)

    • Passing the invalid value lty = NULL to axis() sent an invalid
      value to the graphics device, and might cause the device to
      segfault.

    • Sweave() with concordance=TRUE could lead to invalid PDF files;
      Sweave.sty has been updated to avoid this.

    • Non-ASCII characters in the titles of help pages were not
      rendered properly in some locales, and could cause errors or
      warnings.

    • checkRd() gave a spurious error if the \href macro was used.

Oracle launches XBRL extension for financial domains

What is XBRL and how does it work?

http://www.xbrl.org/HowXBRLWorks/

How XBRL Works
XBRL is a member of the family of languages based on XML, or Extensible Markup Language, which is a standard for the electronic exchange of data between businesses and on the internet.  Under XML, identifying tags are applied to items of data so that they can be processed efficiently by computer software.

XBRL is a powerful and flexible version of XML which has been defined specifically to meet the requirements of business and financial information.  It enables unique identifying tags to be applied to items of financial data, such as ‘net profit’.  However, these are more than simple identifiers.  They provide a range of information about the item, such as whether it is a monetary item, percentage or fraction.  XBRL allows labels in any language to be applied to items, as well as accounting references or other subsidiary information.

XBRL can show how items are related to one another.  It can thus represent how they are calculated.  It can also identify whether they fall into particular groupings for organisational or presentational purposes.  Most importantly, XBRL is easily extensible, so companies and other organisations can adapt it to meet a variety of special requirements.

The rich and powerful structure of XBRL allows very efficient handling of business data by computer software.  It supports all the standard tasks involved in compiling, storing and using business data.  Such information can be converted into XBRL by suitable mapping processes or generated in XBRL by software.  It can then be searched, selected, exchanged or analysed by computer, or published for ordinary viewing.

also see

http://www.xbrl.org/Example1/

 

 

 

and from:

http://www.oracle.com/us/dm/xbrlextension-354972.html?msgid=3-3856862107

With more than 7,000 new U.S. companies facing extensible business reporting language (XBRL) filing mandates in 2011, Oracle has released a free XBRL extension on top of the latest release of Oracle Database.

Oracle’s XBRL extension leverages Oracle Database 11g Release 2 XML to manage the collection, validation, storage, and analysis of XBRL data. It enables organizations to create one or more back-end XBRL repositories based on Oracle Database, providing secure XBRL storage and query-ability with a set of XBRL-specific services.

In addition, the extension integrates easily with Oracle Business Intelligence Suite Enterprise Edition to provide analytics, plus interactive development environments (IDEs) and design tools for creating and editing XBRL taxonomies.

The Other Side of XBRL
“While the XBRL mandate continues to grow, the feedback we keep hearing from the ‘other side’ of XBRL—regulators, academics, financial analysts, and investors—is that they lack sufficient tools and historic data to leverage the full potential of XBRL,” says John O’Rourke, vice president of product marketing, Oracle.

However, O’Rourke says this is quickly changing as XBRL mandates enter their third year—and more and more companies have to comply. While the new extension should be attractive to organizations that produce XBRL filings, O’Rourke expects it will prove particularly valuable to regulators, stock exchanges, universities, and other organizations that need to collect, analyze, and disseminate XBRL-based filings.

Outsourcing, a Bolt-on Solution, or Integrated XBRL Tagging
Until recently, reporting organizations had to choose between expensive third-party outsourcing or manual, in-house tagging with bolt-on solutions—both of which introduce the possibility of error.

In response, Oracle launched Oracle Hyperion Disclosure Management, which provides an XBRL tagging solution that is integrated with the financial close and reporting process for fast and reliable XBRL report submission—without relying on third-party providers. The solution enables organizations to

  • Author regulatory filings in Microsoft Office and “hot link” them directly to financial reporting systems so they can be easily updated
  • Graphically perform XBRL tagging at several levels—within Microsoft Office, within EPM system reports, or in the data source metadata
  • Modify or extend XBRL taxonomies before the mapping process, as well as set up multiple taxonomies
  • Create and validate final XBRL instance documents before submission

 

Divorced Reality

if I could just shut my eyes tight
escape the world for a while I might
me and my divorced reality
enhanced by enfeebled diminished mortality

must come to terms with this news
having played my cards I accept and lose
surrender asunder to all those events
over whom i have no power to peruse

take a gun and shoot me in the head
watch me twitch until I am dead
perhaps that would gladden those estranged
historical affection now much deranged

melancholy I must reminiscence I will
to sleep perchance one more pop a pill

The long tail of the internet

On a whim, I took the all-time stats of my blog posts (more than 1,000 posts) and tried to plot their distribution.

Basically I copied and pasted all the data into a Google Docs spreadsheet and created dummy codes (like URL1, URL2 … URL500).

Next I  downloaded the….

I wasn't in the mood for downloading and uploading stuff, so I decided to use ggplot2 via Jeroen's application at http://www.stat.ucla.edu/~jeroen/

I used the mirror server that Dataspora provides as I have had latency issues with Jeroen’s website.

I got this error while trying to connect the Dataspora App to my Google spreadsheet

The page you have requested cannot be displayed. Another site was requesting access to your Google Account, but sent a malformed request. Please contact the site that you were trying to use when you received this message to inform them of the error. A detailed error message follows:

The site “http://dataspora.com” has not been registered.

Oh dear! Back to Jeroen’s/UCLA’s page.

http://rweb.stat.ucla.edu/ggplot2/

I get this warning but it still manages to log in

This website has not registered with Google to establish a secure connection for authorization requests. We recommend that you continue the process only if you trust the following destination:

http://rweb.stat.ucla.edu/R/googleLogin?domain=rweb.stat.ucla.edu

Wow, it works! That's cloud computing. So I wonder why Google and Amazon continue to ignore rApache and Jeroen's cloud app. Surely their Google Fusion Tables could always be improved or tweaked, not to mention the next-gen version of R, which will have its own server.

Pretty cool screenshot.

I get the following pretty graph. Hadley Wickham would be ashamed of me by now.

What went wrong? Well, one page has 36,000 views, and scale is the key to graphical coherence. So I redo it: delete the home page in the Google spreadsheet, re-import, re-plot. (I didn't know how to modify data in the cloud app; maybe we need a cloud plyr.) I redo it again, as I have another big outlier: the top 10 statistical GUIs article, which ironically has only 5 GUIs in it, but hush, don't tell the high-quality search engine.

So, again belatedly, I discover something called a layer in ggplot2.

The base graphics engine has really spoilt me into writing short functions for plots.

I give up; I rather prefer hist(). I go to my favorite GUI, Rattle, but it has some dating issues with the GTK+ DLL.

So I go to John Fox's simple GUI, R Commander. It is the best GUI if you use Occam's Razor, and I am using Occam's Chainsaw now.

I get the analysis I want in 12 seconds.


Summary: ggplot2 is more complicated than the base graphics engine.

The Deducer GUI is not as simple either.

R Commander is the best GUI because it retains simplicity.

Ignore the long tail of the internet only at your peril.

Almost two-thirds of my daily traffic of 400+ comes from old archived content. That is why Search Engine Optimization and keyword alerts are critical for any poor soul trying to write on a blog (which has neither journal-like prestige nor rewards).

If you make life easier for the search engine, it, being a fair chap, rewards you well.

Existing web traffic estimates like Comscore and Google Trends ignore this long tail.

Comments are welcome (the data is pasted below, 500 rows by 2 columns, if you can come up with a better analysis).

Since SAS has ignored web analytics and Google Analytics is hmm hmm, this could be an area of opportunity for R developers to create a web analytics package.

Title
Views
Home page 36,185
Top 10 Graphical User Interfaces in Statistical Software 8,264
Matlab-Mathematica-R and GPU Computing 2,166
Wealth = function (numeracy, memory recall) 2,162
The Top Statistical Softwares (GUI) 2,118
About DecisionStats 1,902
Libre Office 1,770
Using Facebook Analytics (Updated) 1,446
Windows Azure vs Amazon EC2 (and Google Storage) 1,386
Interview Hadley Wickham R Project Data Visualization Guru 1,204
Test drive a Chrome notebook. 1,201
Interview Professor John Fox Creator R Commander 1,190
Top ten RRReasons R is bad for you ? 1,178
SAS Institute files first lawsuit against WPS- Episode 1 1,131
R Package Creating 1,104
Interfaces to R 1,039
Using Red R- R with a Visual Interface 950
Google Maps – Jet Ski across Pacific Ocean 922
Norman Nie: R GUI and More 851
Not so AWkward after all: R GUI RKWard 805
Running R on Amazon EC2 786
Startups for Geeks 785
Creating a Blog Aggregator for free 749
Cloud Computing with R 676
Rapid Miner- R Extension 671
Parallel Programming using R in Windows 664
Revolution R for Linux 645
Red R 1.8- Pretty GUI 638
John Sall sets JMP 9 free to tango with R 601
Wordle.net 597
Funny Images from India 571
R is an epic fail or is it just overhyped 568
Great article on Notepad++ and R in R Journal 564
Certifications in Analytics and Business Intelligence 548
R Excel :Updated 542
Enterprise Linux rises rapidly:New Report 537
So which software is the best analytical software? Sigh- It depends 520
Funny Photo :It happens only In India 518
Creating 3D Graphs with Data in R 507
SPSS /PASW Certification – Free until Sept 15 497
Interview :Dr Graham Williams 476
GNU PSPP- The Open Source SPSS 474
Professors and Patches: For a Betterrrr R 467
Running R on Amazon EC2 :Windows 462
WPS response to SAS Lawsuit 458
R language on the GPU 450
KXEN and a Data Mining Survey 449
News on R Commercial Development -Rattle- R Data Mining Tool 449
WPS ( Alternative SAS Language Software) Pricing 447
Kill R? Wait a sec 445
SAS Institute lawsuit against WPS Episode 2 The Clone Wars 442
How to be a BAD blogger? 435
ROC Curve 431
Bulls ,Bears ,Tigers and Asses 424
Trrrouble in land of R…and Open Source Suggestions 422
Interview- BI Dashboards dMINE Sanjay Patel 417
Top Seven Reasons :Why Outsourcing is Bad for India 408
Interviews @Decisionstats 407
Running a R GUI,and parallel programming on Amazon EC2 394
Unbreakable Oracle Linux- and Unshakable-Libre Office- 393
IBM SPSS 19: Marketing Analytics and RFM 387
Analyzing SAS Institute-WPS Lawsuit 377
Hive Tutorial: Cloud Computing 377
R and Hadoop 374
Graphics Presentations 373
Sector/ Sphere – Faster than Hadoop/Mapreduce at Terasort 370
Benchmarking GNU R: DirkE’s view and a Ninja wishlist 363
Webfocus RStat: Pervasive BI using R 363
Open Source Business Intelligence: Pentaho and Jaspersoft 362
How to do Logistic Regression 362
CommeRcial R- Integration in software 359
So what’s new in R 2.12.0 357
Interview Michael J. A. Berry Data Miners, Inc 356
Data Mining through the Android 352
Newer version of Alternative SAS / WPS 2.4 launched 350
How to Analyze Wikileaks Data – R SPARQL 348
JMP 9 releasing on Oct 12 343
The R Online WikiBook 340
Hadley’s tutorials on R Visualization 340
Interview Tasso Argyros CTO Aster Data Systems 339
Parsing XML files easily 337
A Software Called Rattle 335
Which software do we buy? -It depends 329
Jim Goodnight on Open Source- and why he is right -sigh 328
SAS/Blades/Servers/ GPU Benchmarks 326
R Commander Plugins-20 and growing! 324
10 iPhone Apps you can actually use ( and dont have to pay for) 316
R Modeling with huge data 315
The Popularity of Data Analysis Software 315
Interview Donald Farmer Microsoft 307
Learning SAS for free 305
Comparing Base SAS and SPSS 304
Towards better Statistical Interfaces 302
Making NeW R 301
Using Code Snippets in Revolution R 300
R Apache – The next frontier of R Computing 298
Using JMP 9 and R together 297
Doing Time Series using a R GUI 295
Amazon announces Micro Instances for cloud computing 295
Top 5 Free Music Websites 295
Web R- Elastic R and RevoDeploy R 291
R for Stats : Updated 290
Heritage Health Prize- Data Mining Contest for 3mill USD 289
Google AppInventor -Android and Business Intelligence 281
Top R Interviews 278
An Introduction to Data Mining-online book 272
Interview Jim Davis SAS Institute 272
Economic: Indian Caste System -Simplification 271
Rattle Re-Introduced 271
KXEN – Automated Regression Modeling 267
Movie Review- Inglorious Basterds 267
Interview :Doug Savage ,Creator SavageChickens.com 261
IPSUR – A Free R Textbook 258
SAS with the GUI Enterprise Guide (Updated) 256
Trying out Google Prediction API from R 256
Segmenting Models : When and Why 253
Using R and Excel Together 253
R Oracle Data Mining 253
KNIME 253
Using PostgreSQL and MySQL databases in R 2.12 for Windows 250
Fighting Back -The Net, Social Media, Spam, Identity Theft, Terrorism 249
Libre Office (Beta) 3 Launched 248
India to make own DoS -citing cyber security 247
Interview Dominic Pouzin Data Applied 242
R releases new version R 2.9.2 240
SAS to launch SAS/IML with R ( updated) 239
Playing with Playwith- R Package for Interactive Data Visualizations 234
Predictive Analytics World Conference 231
Analytics and BI for small biz 231
Interview Jeanne Harris Co-Author -Analytics at Work and Competing with Analytics 230
Using R for Time Series in SAS 228
General Electric ‘s breach of the spirit and letter of integrity 227
Interview Luis Torgo Author Data Mining with R 222
Browser Based Model Creation 222
Interview James Dixon Pentaho 221
Thoughts on WPS, SAS , R 220
Choosing R for business – What to consider? 220
Buying SAS Institute More stats 219
Google: Prediction API and other cool stuff More stats 218
Interview : R For Stata Users More stats 216
Viva Libre Office More stats 216
Top 10 Games on Linux -sudo update More stats 214
When China overtook India- using DEDUCER More stats 214
KDD 2009 : Demos More stats 211
Interview Dean Abbott Abbott Analytics More stats 210
Statistically Speaking More stats 203
Data Visualization using Tableau More stats 203
SAS and JMP : Visual Data Discovery More stats 203
High Performance Computing and R More stats 200
Troubleshooting Rattle Installation- Data Mining R GUI More stats 194
Google Realtime Live Updates on Egypt Yemen Tunisia Jordan.. More stats 192
New Deal in Statistical Training More stats 191
Interview Ken O Connor Business Intelligence Consultant More stats 190
Karmic Koala versus Windows 7 More stats 189
Interview Shawn Kung Sr Director Aster Data More stats 189
Pun on Putin More stats 189
Towards better analytical software More stats 188
Dryad- Microsoft’s answer to MR More stats 188
Analyzing Indian – Chinese Relationships More stats 188
LibreOffice News and Google Musings More stats 186
Special Issue of JSS on R GUIs More stats 184
Using Google Docs for Web Scraping More stats 181
Using Reshape2 for transposing datasets in R More stats 180
IBM Buys Netezza More stats 180
Libreoffice 3.3 released More stats 180
Google moving on from MapReduce: rest of world still catching up More stats 179
Linux= Who did what and how much? More stats 176
Interview Carole Jesse Experienced Analytics Professional More stats 176
HIRE ME (175 views)
Test Drive a Google Chrome Notebook: Last Two Days left (174 views)
Q&A with David Smith, Revolution Analytics (174 views)
R, Ubuntu, RCmdr Updates (173 views)
Interview KNIME Fabian Dill (173 views)
Big Data and R: New Product Release by Revolution Analytics (173 views)
Automated Content Aggregation (173 views)
R or SAS — R and SAS? (170 views)
Graphs (169 views)
How to use Oracle for Data Mining (169 views)
Carolina and SAS (166 views)
Interview John Sall Founder JMP/SAS Institute (165 views)
Aster Data hires Quentin Gallivan as CEO (165 views)
Oracle for possible takeover of REvolution Computing (164 views)
The Best and Worst Graphs Ever (163 views)
Statistical Analysis with R- by John M Quick (163 views)
Growing Rapidly: Rapid Miner 4.5 (161 views)
SAP and BI on Demand (161 views)
Google Snappy (161 views)
Google Refine (161 views)
Scoring SAS and SPSS Models in the cloud (159 views)
Hey Professor, I am not a Monkey (157 views)
REVolution Computing fails to create a Revolution (156 views)
SAS Lawsuit against WPS- Application Dismissed (156 views)
KDNuggets Poll on SAS: Churn in Analytics Users (154 views)
SAS Early Days (154 views)
Interview James Taylor Decision Management Expert (Updated) (151 views)
Google Books Ngram Viewer (148 views)
Review – R for SAS and SPSS Users (148 views)
New R Journal Edition (146 views)
Here comes PySpread- 85,899,345 rows and 14,316,555 columns (145 views)
Interview Karl Rexer -Rexer Analytics (144 views)
Poem: The Extroverted Engineer (144 views)
Hearst DataMining Challenge (144 views)
This Is It (142 views)
Interview Timo Elliott SAP (141 views)
The Blind Side – Movie Review (141 views)
Data Mining Survey Results: Tools and Offshoring (140 views)
Going Deap: Algols in Python (140 views)
ADVERTISE (139 views)
Interview Jeff Bass, Bass Institute (Part 2) (139 views)
Interview Jim Harris Data Quality Expert OCDQ Blog (139 views)
Do Monkeys Pay for Sex? (138 views)
Privacy Browsing Extensions in Google Chrome (137 views)
China biggest threat to Indian Software in 5 years: Indian Tech CEO (136 views)
Software HIStory: Bass Institute Part 1 (135 views)
Grenier’s Theory for Competitiveness (134 views)
Interview Charlie Berger Oracle Data Mining (134 views)
Karmic Koala Ubuntu/Linux 9.2 Preview (133 views)
Analytics and Journals (133 views)
Using Code Editors in R (132 views)
Interview Stephanie McReynolds Director Product Marketing, AsterData (132 views)
Amcharts- Cool Charts Web Editor (130 views)
Mapreduce Book (128 views)
Interesting R competition at Reddit (127 views)
Color of Statistics (127 views)
Amazon goes free for users next month (127 views)
Interview Sarah Blow – Girly Geekdom Founder (126 views)
Social Network Analysis: Using R (126 views)
Interview Thomas C. Redman Author Data Driven (126 views)
Audio Interview Anne Milley, Part 1 (124 views)
Advanced Analytics on Multi-Terabyte Datasets- Conferences (123 views)
Geek Humour (123 views)
John M. Chambers Statistical Software Award – 2011 (122 views)
My friend -The Computer (120 views)
M2009 Interview Peter Pawlowski AsterData (118 views)
R Journal Dec 2010 and R for Business Analytics (118 views)
Top ten RRReasons R is bad for you? (116 views)
Interview Michael Zeller, CEO Zementis on PMML (115 views)
Fast R Graphics (114 views)
New Google Ad Planner (114 views)
Making Sense: Hadoop and MapReduce (114 views)
Using SAS/IML with R (114 views)
Facebook App by SAP Crystal Reports (113 views)
Whats behind that pretty SAS Blog? (113 views)
Interview Alison Bolen SAS.com (113 views)
Ajay @ arts (112 views)
My latest creation (112 views)
Indian Crabs – A story (112 views)
Open Source’s worst enemy is itself not Microsoft/SAS/SAP/Oracle (112 views)
Google Cloud Print -print documents from the internet (111 views)
WPS and SAS- A rah-rah comparison (110 views)
Facebook Gmail Killer Threatens to commit Hara Kari live on AOL Techcrunch if unsucessful (110 views)
Open Source and Software Strategy (109 views)
Windows Azure and Amazon Free offer (108 views)
R for Analytics is now live (108 views)
Open Source Compiler for SAS language/GNU -DAP (107 views)
Using Chromium/Chrome on Ubuntu Linux (107 views)
Interview John Moore CTO, Swimfish (106 views)
Nice BI Tutorials (106 views)
Creating Customized Packages in SAS Software (106 views)
Business Analytics Analyst Relations/Ethics/White Papers (105 views)
Web Crawling Automation (105 views)
The SAS-WPS Lawsuit- Preliminary Hearing (105 views)
Handling time and date in R (105 views)
KXEN Update (104 views)
MapReduce Analytics Apps- AsterData’s Developer Express Plugin (104 views)
+ 1 your website -updated (103 views)
Movie Review- Peepli Live (103 views)
Better Data Visualization in WordPress.com Stats (102 views)
Customizing your R software startup (102 views)
LibreOffice Beta 2 (Office Fork off Oracle) launches! (102 views)
KXEN Case Studies: Financial Sector (102 views)
Deleting Twitter, Facebook, LinkedIn- Accepting Life (102 views)
Google Street View shows Gladiators fighting (101 views)
Carole-Ann’s 2011 Predictions for Decision Management (101 views)
Amazon goes HPC and GPU: Dirk E to revise his R HPC book (101 views)
Happy Thanksgiving Id (101 views)
Interview Phil Rack WPS Consultant and Developer (100 views)
SPSS launches two more PASWs (99 views)
Interview David Smith REvolution Computing (99 views)
Data Mining with R (97 views)
Dataset too big for R? (97 views)
How Jesus saved my Butt (97 views)
Interview Evan Levy Baseline Consulting (97 views)
The Latest GUI for R- BioR (96 views)
WPS Version 2.5.1 Released – can still run SAS language/data and R (96 views)
SAS legal falls flat against WPS again: Technical Grounds (95 views)
World Programming System: 300 pounds for The power of SAS language (94 views)
KNIME and Zementis shake hands (93 views)
Interview Eric Siegel, Phd President Prediction Impact (93 views)
Interview Sarah Burnett BI Analyst, Ovum group (92 views)
Quantifying Analytics ROI (92 views)
PSPP – SPSS’s Open Source Counterpart (91 views)
PySpread Magic (91 views)
Interview SPSS Olivier Jouve (91 views)
Interesting Data Visualization: Friendwheels (91 views)
R on Windows HPC Server (90 views)
The declining market for Telecommunication Churn Models (90 views)
Getting Inside R (90 views)
The Big Data Summit Agenda (90 views)
Review: Clash of the Titans (89 views)
Red Hat worth 7.8 Billion now (89 views)
Movie Review: Rajneeti (Politics) (89 views)
3 Idiots: Insight to Indian Engineer Campus Life (89 views)
The Comic Water Games (aka Common Wealth Games) (88 views)
Computer Education grants from Google (88 views)
Challenges of Analyzing a dataset (with R) (87 views)
Input Data in R using the top 3 R GUI (86 views)
Complex Event Processing- SASE Language (85 views)
Interview with Anne Milley, SAS II (85 views)
Data Mining Presentation at M2009 by Dr Vincent Granville (85 views)
Brief Interview Timo Elliott (85 views)
Mapping Health Statistics at CDC.gov (85 views)
Amazon’s Turks Mturk.com (84 views)
Business Intelligence and Stat Computing: The White Man’s Last Stand (84 views)
Movie Review- Dabangg (84 views)
Movie Review: Sherlock Holmes (84 views)
SAS Data Mining 2009 Las Vegas (83 views)
Chinese Fortune Cookies (83 views)
SPSS and R (83 views)
Manjunath- A Batchmate on my mine (82 views)
Data Mining 2010: SAS Conference in Vegas (81 views)
DirkE and JD swoon about Shane’s MOM in Room 106 while writing R code (81 views)
SAS to R Challenge: Unique benchmarking (81 views)
S A S GOOD LIFE UNDER SIEGE – NYT (81 views)
Pentaho and R: working together (81 views)
Interview John F Moore CEO The Lab (80 views)
Ways to use both Windows and Linux together (80 views)
Brief Interview with James G Kobielus (80 views)
For R Writers- Inside R (79 views)
Using Ipod and Iphone with your Ubuntu Laptop (79 views)
Webcasts: Oracle Data Mining (79 views)
The Cloud OS is finally here or is it?: Karmic Koala (79 views)
Movie Review: Lafangey Parinday (Rouge Birds) (79 views)
SAS announcement in education initiatives (78 views)
Using R from within Python (78 views)
Event: Predictive analytics with R, PMML and ADAPA (78 views)
Interesting R and BI Web Event (78 views)
Bruno Aziza, Microsoft Global BI Lead joins PAW Keynote (77 views)
Common Analytical Tasks (77 views)
RWui: Creating R Web Interfaces on the go (77 views)
R Successor Language ‘Tea’ announced (76 views)
Learning SPSS for SAS users (76 views)
Protovis a graphical toolkit for visualization (76 views)
Interview Paul van Eikeren Inference for R (75 views)
Data Visualization: Central Banks (75 views)
Oracle Data Mining 11 G R2 (75 views)
Interview Peter J Thomas -Award Winning BI Expert (75 views)
Weak Security in Internet Databases for Statisticians (74 views)
Open Source Cartoon (74 views)
Top Ten Graphs for Business Analytics -Pie Charts (1/10) (74 views)
SAS Sentiment Analysis wins Award (74 views)
JMP Genomics 5 released (74 views)
Short Interview Jill Dyche (73 views)
Interview David Katz, Dataspora/David Katz Consulting (73 views)
PMML 4.0 (73 views)
Ponder This: IBM Research (72 views)
PAW Videos (71 views)
PASW 13: The preview (71 views)
Cisco SocialMiner (70 views)
Review-The Dark knight (70 views)
MapReduce Patent Granted (70 views)
Cloud Computing and GPU (and some stats softwares) (70 views)
IBM Business Analytics Forum (70 views)
And now- The Business Analytics Summit (70 views)
Creating an Anonymous Bot (69 views)
R and SAS in Twitter Land (69 views)
Interview: Richard Schultz, CEO REvolution Computing (69 views)
China -United States -The Third Opium War (68 views)
Quick-R and Statmethods.net (68 views)
R Node- and other Web Interfaces to R (68 views)
Life Mojo – A Health Startup (68 views)
Using Views in R and comparing functions across multiple packages (68 views)
Another R Tutorial (67 views)
Interview Karen Lopez Data Modeling Expert (67 views)
QGIS and R (66 views)
Christmas Carol: The Best Software (BI-Stats-Analytics) (66 views)
Software Lawsuits: Ergo (66 views)
STEM is cool (65 views)
Date Night (65 views)
More Advanced SAS Modeling Procs (65 views)
The Big Data Event- Why am I here? (65 views)
Interview Gary Cokins SAS Institute (65 views)
Browser based Music Creation (64 views)
Interview Steve Sarsfield Author The Data Governance Imperative (63 views)
GrapheR (63 views)
Google Web Intelligence (Beta) (61 views)
Data Mining 2009 Interviews- Terry Whitlock, BlueCross BlueShield of TN (60 views)
Audio Interviews -Dr. Colleen McCue National Security Expert (60 views)
Red R- A new beginning (59 views)
YouTube Features: Audio Swap, Mobile posts and Themes (59 views)
R for Predictive Modeling: Workshop (59 views)
KDD2009: Papers Research and Industrial (58 views)
Chapman/Hall announces new series on R (58 views)
Data Visualization and Politics (58 views)
T Shirts Design (58 views)
Jump to JMP: Using Data Analysis in a visual manner (58 views)
Aster Analytics and MapReduce.org (57 views)
OK Cupid Data Visualization- Flow Chart to your Heart (57 views)
R for SAS and SPSS Users (57 views)
Carbon Footprints in the snow (57 views)
Summer School on Uncertainty Quantification (57 views)
High Performance Computing within R: Tutorial (57 views)
Running Stats Softwares on Clouds (57 views)
Amazing Data Visualization- UN Counter Terrorism (56 views)
Cloud MapReduce (56 views)
Statistical Features in WPS (56 views)
An R Package only for SAS Users (56 views)
R is Ready for Business™ (55 views)
A Google App for Sales- ERPLY (55 views)
Rexer Analytics Annual Data Miner Survey (55 views)
Cartoons on R (55 views)
American Decline- Why outsourcing doesnt make sense (55 views)
Friday Cartoon Series- New (55 views)
What softwares do you plan to use/learn in the next one year? (54 views)
Great App for Online Sketching (54 views)
September Roundup by Revolution (54 views)
Using Firesheep on Campus, Caltrain and beyond (54 views)
Decisionstats Interview at Big Data Summit, AsterData (53 views)
Learning Hadoop (53 views)
The White Man’s Burden-Poem (53 views)
Curt Monash on Analytics with MapReduce (53 views)
To R or Not to R: Data Mining and CRM for Free (52 views)
Algorithms and Ads: No Free Lunches and Hill Climbing (52 views)
Interview: Roger Haddad, Founder of KXEN Automated Modeling Software (52 views)
Google and Me on Privacy and Openness (52 views)
MapReduce.org (52 views)
Why do bloggers blog? (52 views)
Live Streaming for Free: UStream (51 views)
Light Cycle of Tron review (51 views)
Lyx Releases 2 (51 views)
Interview – Anne Milley, SAS Part 1 (51 views)
SAS News (51 views)
KXEN EMEA User Conference 2010-Success in Business Analytics (51 views)
2011 Forecast-ying (51 views)
Kill Analytics (50 views)
Social Media Analysis Toolkit (50 views)
Multi State Models (50 views)
R and Cloud Computing (50 views)
Dataists shake up R community with a rocking contest (50 views)
Interview Anne Milley JMP (49 views)
Movie Review: Between the Folds (49 views)
Jokes in Economics (49 views)
Interview Ajay Ohri Decisionstats.com with DMR (49 views)
One more Y Tube Video (49 views)
Happy Diwali/Google Music (48 views)
SPSS Directions: Rexer Survey Results (48 views)
Redlining in Internet Access and notes on Regression Models (48 views)
Poem: A Poets Life (48 views)
Predictive Analytics World (48 views)
Interview- Phil Rack (48 views)
Building KXEN Models on Ubuntu (48 views)
New Year Resolution Presentation (48 views)
Adobe gulps Omniture (47 views)
SAS Modeling Procs (47 views)
Oracle Open World/RODM package (47 views)
KDNuggets Survey on R (47 views)
IBM and Revolution team to create new in-database R (47 views)
SAS Institute invests in R project (46 views)
Not just a Cloud (46 views)
New Version of R released: R 2.10.1 (46 views)
Review- Iron Man2 (46 views)
Online Analytics: Monte Carlo Simulation (45 views)
Predictive Forecasting in Commercial Applications (45 views)
The Race -by D.H Groberg (45 views)
SAS Scoring Accelerators (45 views)
IBM launches Smart Analytics Cloud (45 views)
Reactions to IBM-SPSS takeover (45 views)
Zementis partners with R Analytics Vendor- Revo (44 views)
A Missing Mandelbrot Who Dun It (44 views)
Downloading your Facebook Photos (44 views)
Android Tutorial (44 views)
The Mommy Track (44 views)
My First You Tube Video: Courtesy the competiton on VOLNIGHT by Univ of Tennessee (44 views)
Born in the USA? (43 views)
Interview Eric A. King President The Modeling Agency (43 views)
Interview Augusto Albeghi (Straycat) – Founder Straysoft (43 views)
Why Cloud? (43 views)
Innovative ways of Calculus: Gifting a comic set for Christmas (43 views)
To find the best chaat or paan shop (43 views)
Google unleashes Fusion Tables (42 views)
Using SAS and C/C++ together (42 views)
Whats new in the latest version of R (42 views)
Bollywood 101 (42 views)
Who will forecast for the forecasters? (42 views)
Learning R Easily: Two GUI’s (41 views)
Harvard DropOut Writes Open Letter- His Startup has 350m users (41 views)
BI Software (41 views)
How to read blogs in Indonesian and Chinese! (41 views)
Window to a Blue Cloud: Azure Pricing (41 views)
China bans Chinese Food for Googleplex (41 views)
SAS Program for Students (41 views)
The Year 2010 (40 views)
What do you want to know in data analytics? (40 views)
America’s Data Book: Census Abstract 2011 (40 views)
Big Data Management and Advanced Analytics (40 views)
AsterData partners with Tableau (40 views)
Using R from other Software (40 views)
SAS on Fraud (40 views)