Revolution #Rstats Webinar

David Smith of Revolution Analytics presents a nice webinar on the capabilities of Revolution R. If you are R-curious and wonder how the commercial version has matured, you may want to take a look.

Click below to view the executive webinar.

——————————————————————————————-

Revolution R Enterprise—presented by author and blogger David Smith:

Revolution R: 100% R and More
On-Demand Webinar

This Webinar covers how R users can upgrade to:

  • Multi-processor speed improvements and parallel processing
  • Productivity and debugging with an integrated development environment (IDE) for the R language
  • “Big Data” analysis, with out-of-memory storage of multi-gigabyte data sets
  • Web Services for R, to integrate R computations and graphics into third-party applications like Excel and BI dashboards
  • Expert technical support and consulting services for R

This webinar will be of value to current R users who want to learn more about the additional capabilities of Revolution R Enterprise to enhance the productivity, ease of use, and enterprise readiness of open source R. R users in academia will also find this webinar valuable: we will explain how all members of the academic community can obtain Revolution R Enterprise free of charge.

—————————————————————————————

Contact: 1-855-GET-REVO or via the online form.
info@revolutionanalytics.com | (650) 330-0553 | Twitter @RevolutionR

US-CERT Incident Reporting System

Here are some resources to use if your cyber assets have been breached. Note that the form does not use CAPTCHA at all.

US-CERT Incident Reporting System (their head Randy Vickers quit last week)

https://forms.us-cert.gov/report/

Using the US-CERT Incident Reporting System

In order for us to respond appropriately, please answer the questions as completely and accurately as possible. Questions that must be answered are labeled “Required”. As always, we will protect your sensitive information. This web site uses Secure Sockets Layer (SSL) to provide secure communications. Your browser must allow at least 40-bit encryption. This method of communication is much more secure than unencrypted email.

Interview Mike Boyarski Jaspersoft

Here is an interview with Mike Boyarski, Director of Product Marketing at Jaspersoft, which claims the largest BI community, with over 14 million downloads, nearly 230,000 registered members, over 175,000 production deployments, and 14,000 customers across 100 countries.

Ajay- Describe your career in science from Biology to marketing great software.
Mike- I studied Biology with the assumption I’d pursue a career in medicine. It took about 2 weeks during an internship at a Los Angeles hospital to determine I should do something else. I enjoyed learning about life science, but the whole health care environment was not for me. I was initially introduced to enterprise-level software while at Applied Materials within their Microcontamination group. I was able to assist with an internal application used to collect contamination data. I later joined Oracle to work on an Oracle Forms application used to automate the production of software kits (back when documentation and CDs had to be physically shipped to recognize revenue). This gave me hands-on experience with Oracle 7, web application servers, and the software development process.
I then transitioned to product management for various products including application servers, software appliances, and Oracle’s first-generation SaaS-based software infrastructure. In 2006, with the Siebel and PeopleSoft acquisitions underway, I moved on to Ingres to help re-invigorate their solid yet antiquated technology. This introduced me to commercial open source software and the broader Business Intelligence market. From Ingres I joined Jaspersoft, one of the first and most popular open source Business Intelligence vendors, serving as head of product marketing since mid-2009.
Ajay- Describe some of the new features in Jaspersoft 4.1 that help differentiate it from the rest of the crowd. What are the exciting product features we can expect from Jaspersoft over the next couple of years?
Mike- Jaspersoft 4.1 was an exciting release for our customers because we were able to extend the latest UI advancements in our ad hoc report designer to the data analysis environment. Now customers can use a unified, intuitive web-based interface to perform several powerful and interactive analytic functions across any data source, whether it's relational, non-relational, or a Big Data source.
The reality is that most (roughly 70%) of today's BI adoption is in the form of reports and dashboards. These tools are used to drive and measure an organization's business; however, data analysis presents the most strategic opportunity for companies because it can identify new opportunities, efficiencies, and competitive differentiation. As more data comes online, the difference between those companies that are successful and those that are not will likely be attributed to their ability to harness data analysis techniques to drive and improve business performance. Thus, with Jaspersoft 4.1 and our improved ad hoc reporting and analysis UI, we can effectively address a broader set of BI requirements for organizations of all sizes.
Ajay- What do you think is a good metric to measure the influence of an open source software product: revenue, number of downloads, or number of users? How does Jaspersoft do by these counts?
Mike- History has shown that open source software is successful as a “bottom-up” disrupter within IT or the developer market. Today, many new software projects and startup ventures are birthed on open source software, often initiated with little to no budget. As the organization achieves success with a particular project, the next initiative tends to be larger and more strategic, often displacing what was historically solved with a proprietary solution. These larger deployments strengthen the technology over time.
Thus, how proven and battle-tested an open source solution is, often measured via downloads, deployments, community size, and community activity, usually equates to its long-term success. Linux, Tomcat, and MySQL have plenty of statistics to model this lifecycle. This model is no different for open source BI.
The success to date of Jaspersoft is directly tied to its solid, proven technology and the vibrancy of the community. We proudly and openly claim to have the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries. Every day, 30,000 developers are using Jaspersoft to build BI applications. Behind Excel, it's hard to imagine a more widely used BI tool in the market. Jaspersoft could not reach these kinds of numbers with crippled or poorly architected software.
Ajay- What are your plans for leveraging cloud computing, mobile and tablet platforms, and for making Jaspersoft easier to use globally?

The Best of Google Plus Week 3- Top 1/0

 

While the funny GIFs continue in week 3, I find more and more people using it to post their blog articles, so it is another channel to create and spread content.

I am waiting for certain features-

  1. Importing my Orkut data seamlessly into Google Plus
  2. The Gaming Channel using Zynga- Open Social Games
  3. Hangout to have screen sharing as well as screen recording (or export-to-YouTube features)
  4. Better integration of Sparks based activity.
  5. Also if existing YouTube comments/fan communities can utilize G+ accounts too

Anyway, after all that violence and double talk, here is the best content in Week 3 as per my Google+ stream.
Special Mention-

Revolution Analytics Product Launches for #rstats in 2011

Revolution Analytics just launched a roadmap detailing their product plan for 2011.

 

In particular, I am excited about the upcoming new GUI, the Hadoop packages, the new K-means and data sort/merge using RevoScaleR for bigger datasets, and also the option to offer support for community packages like ggplot2, titled “More value in Community Version”.

#SAS 9.3 and #Rstats 2.13.1 Released

A bit early but the latest editions of both SAS and R were released last week.

SAS 9.3 is clearly a major release, with multiple enhancements to keep SAS relevant in enterprise software in the age of big data. There are also many enhancements specific to R, to JMP, and to partners like Teradata.

http://support.sas.com/software/93/index.html

Features

Data management

  • Enhanced manageability for improved performance
  • In-database processing (EL-T pushdown)
  • Enhanced performance for loading Oracle data
  • New ET-L transforms
  • Data access

Data quality

  • SAS® Data Integration Server includes DataFlux® Data Management Platform for enhanced data quality
  • Master Data Management (DataFlux® qMDM)
    • Provides support for master hub of trusted entity data.

Analytics

  • SAS® Enterprise Miner™
    • New survival analysis predicts when an event will happen, not just if it will happen.
    • New rate making capability for insurance predicts optimal insurance premium for individuals based on attributes known at application time.
    • Time Series Data Mining node (experimental) applies data mining techniques to transactional, time-stamped data.
    • Support Vector Machines node (experimental) provides a supervised machine learning method for prediction and classification.
  • SAS® Forecast Server
    • SAS Forecast Server is integrated with the SAP APO Demand Planning module to provide SAP users with access to a superior forecasting engine and automatic forecasting capabilities.
  • SAS® Model Manager
    • Seamless integration of R models with the ability to register and manage R models in SAS Model Manager.
    • Ability to perform champion/challenger side-by-side comparisons between SAS and R models to see which model performs best for a specific need.
  • SAS/OR® and SAS® Simulation Studio
    • Optimization
    • Simulation
      • Automatic input distribution fitting using JMP with SAS Simulation Studio.

Text analytics

  • SAS® Text Miner
  • SAS® Enterprise Content Categorization
  • SAS® Sentiment Analysis

Scalability and high-performance

  • SAS® Analytics Accelerator for Teradata (new product)
  • SAS® Grid Manager

And the latest from http://www.r-project.org/. I was a bit curious about the change in licensing for R (from GPL-2 to GPL-2 | GPL-3).

LICENCE:

No parts of R are now licensed solely under GPL-2. The licences for packages rpart and survival have been changed, which means that the licence terms for R as distributed are GPL-2 | GPL-3.


This is a maintenance release to consolidate various minor fixes to 2.13.0.
CHANGES IN R VERSION 2.13.1:

  NEW FEATURES:

    • iconv() no longer translates NA strings as "NA".

    • persp(box = TRUE) now warns if the surface extends outside the
      box (since occlusion for the box and axes is computed assuming
      the box is a bounding box). (PR#202.)

    • RShowDoc() can now display the licences shipped with R, e.g.
      RShowDoc("GPL-3").

    • New wrapper function showNonASCIIfile() in package tools.

    • nobs() now has a "mle" method in package stats4.

    • trace() now deals correctly with S4 reference classes and
      corresponding reference methods (e.g., $trace()) have been added.

    • xz has been updated to 5.0.3 (very minor bugfix release).

    • tools::compactPDF() gets more compression (usually a little,
      sometimes a lot) by using the compressed object streams of PDF
      1.5.

    • cairo_ps(onefile = TRUE) generates encapsulated EPS on platforms
      with cairo >= 1.6.

    • Binary reads (e.g. by readChar() and readBin()) are now supported
      on clipboard connections.  (Wish of PR#14593.)

    • as.POSIXlt.factor() now passes ... to the character method
      (suggestion of Joshua Ulrich).  [Intended for R 2.13.0 but
      accidentally removed before release.]

    • vector() and its wrappers such as integer() and double() now warn
      if called with a length argument of more than one element.  This
      helps track down user errors such as calling double(x) instead of
      as.double(x).

  INSTALLATION:

    • Building the vignette PDFs in packages grid and utils is now part
      of running make from an SVN checkout on a Unix-alike: a separate
      make vignettes step is no longer required.

      These vignettes are now made with keep.source = TRUE and hence
      will be laid out differently.

    • make install-strip failed under some configuration options.

    • Packages can customize non-standard installation of compiled code
      via a src/install.libs.R script. This allows packages that have
      architecture-specific binaries (beyond the package's shared
      objects/DLLs) to be installed in a multi-architecture setting.

  SWEAVE & VIGNETTES:

    • Sweave() and Stangle() gain an encoding argument to specify the
      encoding of the vignette sources if the latter do not contain a
      \usepackage[]{inputenc} statement specifying a single input
      encoding.

    • There is a new Sweave option figs.only = TRUE to run each figure
      chunk only for each selected graphics device, and not first using
      the default graphics device.  This will become the default in R
      2.14.0.

    • Sweave custom graphics devices can have a custom function
      foo.off() to shut them down.

    • Warnings are issued when non-portable filenames are found for
      graphics files (and chunks if split = TRUE).  Portable names are
      regarded as alphanumeric plus hyphen, underscore, plus and hash
      (periods cause problems with recognizing file extensions).

    • The Rtangle() driver has a new option show.line.nos which is by
      default false; if true it annotates code chunks with a comment
      giving the line number of the first line in the sources (the
      behaviour of R >= 2.12.0).

    • Package installation tangles the vignette sources: this step now
      converts the vignette sources from the vignette/package encoding
      to the current encoding, and records the encoding (if not ASCII)
      in a comment line at the top of the installed .R file.

  DEPRECATED AND DEFUNCT:

    • The internal functions .readRDS() and .saveRDS() are now
      deprecated in favour of the public functions readRDS() and
      saveRDS() introduced in R 2.13.0.

    • Switching off lazy-loading of code _via_ the LazyLoad field of
      the DESCRIPTION file is now deprecated.  In future all packages
      will be lazy-loaded.

    • The off-line help() types "postscript" and "ps" are deprecated.

  UTILITIES:

    • R CMD check on a multi-architecture installation now skips the
      user's .Renviron file for the architecture-specific tests (which
      do read the architecture-specific Renviron.site files).  This is
      consistent with single-architecture checks, which use
      --no-environ.

    • R CMD build now looks for DESCRIPTION fields BuildResaveData and
      BuildKeepEmpty for per-package overrides.  See ‘Writing R
      Extensions’.

  BUG FIXES:

    • plot.lm(which = 5) was intended to order factor levels in
      increasing order of mean standardized residual.  It ordered the
      factor labels correctly, but could plot the wrong group of
      residuals against the label.  (PR#14545)

    • mosaicplot() could clip the factor labels, and could overlap them
      with the cells if a non-default value of cex.axis was used.
      (Related to PR#14550.)

    • dataframe[[row,col]] now dispatches on [[ methods for the
      selected column (spotted by Bill Dunlap).

    • sort.int() would strip the class of an object, but leave its
      object bit set.  (Reported by Bill Dunlap.)

    • pbirthday() and qbirthday() did not implement the algorithm
      exactly as given in their reference and so were unnecessarily
      inaccurate.

      pbirthday() now solves the approximate formula analytically
      rather than using uniroot() on a discontinuous function.

      The description of the problem was inaccurate: the probability is
      a tail probability (‘2 _or more_ people share a birthday’).

    • Complex arithmetic sometimes warned incorrectly about producing
      NAs when there were NaNs in the input.

    • seek(origin = "current") incorrectly reported it was not
      implemented for a gzfile() connection.

    • c(), unlist(), cbind() and rbind() could silently overflow the
      maximum vector length and cause a segfault.  (PR#14571)

    • The fonts argument to X11(type = "Xlib") was being ignored.

    • Reading (e.g. with readBin()) from a raw connection was not
      advancing the pointer, so successive reads would read the same
      value.  (Spotted by Bill Dunlap.)

    • Parsed text containing embedded newlines was printed incorrectly
      by as.character.srcref().  (Reported by Hadley Wickham.)

    • decompose() used with a series of a non-integer number of periods
      returned a seasonal component shorter than the original series.
      (Reported by Rob Hyndman.)

    • fields = list() failed for setRefClass().  (Reported by Michael
      Lawrence.)

    • Reference classes could not redefine an inherited field which had
      class "ANY". (Reported by Janko Thyson.)

    • Methods that override previously loaded versions will now be
      installed and called.  (Reported by Iago Mosqueira.)

    • addmargins() called numeric(apos) rather than
      numeric(length(apos)).

    • The HTML help search sometimes produced bad links.  (PR#14608)

    • Command completion will no longer be broken if tail.default() is
      redefined by the user. (Problem reported by Henrik Bengtsson.)

    • LaTeX rendering of markup in titles of help pages has been
      improved; in particular, \eqn{} may be used there.

    • isClass() used its own namespace as the default of the where
      argument inadvertently.

    • Rd conversion to latex mis-handled multi-line titles (including
      cases where there was a blank line in the \title section).
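
As an aside on the pbirthday() fix listed above, which clarifies that the quantity is a tail probability (2 or more people share a birthday): here is a minimal sketch in Python of the exact classical calculation, assuming 365 equally likely birthdays. This is not the approximate formula that pbirthday() solves, and the function name shared_birthday_probability is my own, made up for illustration.

```python
from math import prod


def shared_birthday_probability(n: int, classes: int = 365) -> float:
    """Exact probability that at least two of n people share a birthday.

    Hypothetical helper for illustration; assumes `classes` equally
    likely birthdays. It is NOT the approximation used by R's pbirthday().
    """
    if n > classes:
        # Pigeonhole principle: a shared birthday is certain.
        return 1.0
    # P(all n birthdays distinct) = prod_{i=0}^{n-1} (classes - i) / classes
    p_all_distinct = prod((classes - i) / classes for i in range(n))
    # The tail probability is the complement of "all distinct".
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for people in (10, 23, 50):
        print(people, round(shared_birthday_probability(people), 4))
```

With 23 people the probability is already just above one half, which is the usual statement of the birthday paradox.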

Also see this interesting blog: Examples of tasks replicated in SAS and R