Google APIs

You can go to https://code.google.com/apis/console/b/0/

Unlike Android and other free offerings, these APIs are promising for revenue generation: some of them are unique to Google, and some are already offered on a pricing tier. There are 18 APIs in total, 3 of which have pricing, while the rest are in beta.

I am just listing all the APIs in one place.

Interview Mike Boyarski Jaspersoft

Here is an interview with Mike Boyarski, Director of Product Marketing at Jaspersoft, the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries.

Ajay- Describe your career in science from Biology to marketing great software.

Mike- I studied Biology with the assumption I’d pursue a career in medicine. It took about 2 weeks during an internship at a Los Angeles hospital to determine I should do something else. I enjoyed learning about life science, but the whole health care environment was not for me. I was initially introduced to enterprise-level software while at Applied Materials within their Microcontamination group. I was able to assist with an internal application used to collect contamination data. I later joined Oracle to work on an Oracle Forms application used to automate the production of software kits (back when documentation and CDs had to be physically shipped to recognize revenue). This gave me hands-on experience with Oracle 7, web application servers, and the software development process.

I then transitioned to product management for various products including application servers, software appliances, and Oracle’s first-generation SaaS-based software infrastructure. In 2006, with the Siebel and PeopleSoft acquisitions underway, I moved on to Ingres to help re-invigorate their solid yet antiquated technology. This introduced me to commercial open source software and the broader Business Intelligence market. From Ingres I joined Jaspersoft, one of the first and most popular open source Business Intelligence vendors, serving as head of product marketing since mid-2009.

Ajay- Describe some of the new features in Jaspersoft 4.1 that help differentiate it from the rest of the crowd. What are the exciting product features we can expect from Jaspersoft over the next couple of years?

Mike- Jaspersoft 4.1 was an exciting release for our customers because we were able to extend the latest UI advancements in our ad hoc report designer to the data analysis environment. Now customers can use a unified, intuitive web-based interface to perform several powerful and interactive analytic functions across any data source, whether it’s relational, non-relational, or a Big Data source.

The reality is that most (roughly 70%) of today’s BI adoption is in the form of reports and dashboards. These tools are used to drive and measure an organization’s business; however, data analysis presents the most strategic opportunity for companies because it can identify new opportunities, efficiencies, and competitive differentiation. As more data comes online, the difference between those companies that are successful and those that are not will likely be attributed to their ability to harness data analysis techniques to drive and improve business performance. Thus, with Jaspersoft 4.1 and our improved ad hoc reporting and analysis UI, we can effectively address a broader set of BI requirements for organizations of all sizes.

Ajay- What do you think is a good metric to measure the influence of an open source software product: is it revenue, number of downloads, or number of users? How does Jaspersoft do by these counts?

Mike- History has shown that open source software is successful as a “bottoms up” disrupter within IT or the developer market. Today, many new software projects and startup ventures are birthed on open source software, often initiated with little to no budget. As the organization achieves success with a particular project, the next initiative tends to be larger and more strategic, often displacing what was historically solved with a proprietary solution. These larger deployments strengthen the technology over time.

Thus, the more proven and battle-tested an open source solution is (often measured via downloads, deployments, community size, and community activity), the more likely its long-term success. Linux, Tomcat, and MySQL have plenty of statistics to model this lifecycle. This model is no different for open source BI.

The success to date of Jaspersoft is directly tied to its solid, proven technology and the vibrancy of the community. We proudly and openly claim to have the largest BI community with over 14 million downloads, nearly 230,000 registered members, representing over 175,000 production deployments, 14,000 customers, across 100 countries. Every day, 30,000 developers are using Jaspersoft to build BI applications. Behind Excel, it’s hard to imagine a more widely used BI tool in the market. Jaspersoft could not reach these kinds of numbers with crippled or poorly architected software.

Ajay- What are your plans for leveraging cloud computing, mobile and tablet platforms, and for making Jaspersoft easier to use globally?

Mike- The cloud…

Web Analytics Certifications by Google

Google has a whole list of certifications for people who want to be certified in analytics and internet-related advertising.


#SAS 9.3 and #Rstats 2.13.1 Released

A bit early, but the latest editions of both SAS and R were released last week.

SAS 9.3 is clearly a major release, with multiple enhancements to keep SAS relevant in enterprise software in the age of big data. There are also many enhancements specific to R, JMP, and partners such as Teradata.

http://support.sas.com/software/93/index.html

Features

Data management

Enhanced manageability for improved performance
In-database processing (EL-T pushdown)
Enhanced performance for loading Oracle data
New ET-L transforms
Data access

Data quality

SAS® Data Integration Server includes DataFlux® Data Management Platform for enhanced data quality
Master Data Management (DataFlux® qMDM)
- Provides support for master hub of trusted entity data.

Analytics

SAS® Enterprise Miner™
- New survival analysis predicts when an event will happen, not just if it will happen.
- New rate making capability for insurance predicts optimal insurance premium for individuals based on attributes known at application time.
- Time Series Data Mining node (experimental) applies data mining techniques to transactional, time-stamped data.
- Support Vector Machines node (experimental) provides a supervised machine learning method for prediction and classification.
SAS® Forecast Server
- SAS Forecast Server is integrated with the SAP APO Demand Planning module to provide SAP users with access to a superior forecasting engine and automatic forecasting capabilities.
SAS® Model Manager
- Seamless integration of R models with the ability to register and manage R models in SAS Model Manager.
- Ability to perform champion/challenger side-by-side comparisons between SAS and R models to see which model performs best for a specific need.
SAS/OR® and SAS® Simulation Studio
- Optimization
- Simulation
  - Automatic input distribution fitting using JMP with SAS Simulation Studio.

Text analytics

SAS® Text Miner
SAS® Enterprise Content Categorization
SAS® Sentiment Analysis

Scalability and high-performance

SAS® Analytics Accelerator for Teradata (new product)
SAS® Grid Manager

And the latest from http://www.r-project.org/: I was a bit curious about why the licensing for R has changed (from GPL-2 to GPL-2 | GPL-3).

http://www.gnu.org/licenses/gpl-2.0.html

and http://gplv3.fsf.org/

and http://www.gnu.org/licenses/quick-guide-gplv3.html

LICENCE:

• No parts of R are now licensed solely under GPL-2. The licences for packages rpart and survival have been changed, which means that the licence terms for R as distributed are GPL-2 | GPL-3.

https://stat.ethz.ch/pipermail/r-announce/2011/000541.html

This is a maintenance release to consolidate various minor fixes to 2.13.0.

CHANGES IN R VERSION 2.13.1:

  NEW FEATURES:

    • iconv() no longer translates NA strings as "NA".

    • persp(box = TRUE) now warns if the surface extends outside the
      box (since occlusion for the box and axes is computed assuming
      the box is a bounding box). (PR#202.)

    • RShowDoc() can now display the licences shipped with R, e.g.
      RShowDoc("GPL-3").

    • New wrapper function showNonASCIIfile() in package tools.

    • nobs() now has a "mle" method in package stats4.

    • trace() now deals correctly with S4 reference classes and
      corresponding reference methods (e.g., $trace()) have been added.

    • xz has been updated to 5.0.3 (very minor bugfix release).

    • tools::compactPDF() gets more compression (usually a little,
      sometimes a lot) by using the compressed object streams of PDF
      1.5.

    • cairo_ps(onefile = TRUE) generates encapsulated EPS on platforms
      with cairo >= 1.6.

    • Binary reads (e.g. by readChar() and readBin()) are now supported
      on clipboard connections.  (Wish of PR#14593.)

    • as.POSIXlt.factor() now passes ... to the character method
      (suggestion of Joshua Ulrich).  [Intended for R 2.13.0 but
      accidentally removed before release.]

    • vector() and its wrappers such as integer() and double() now warn
      if called with a length argument of more than one element.  This
      helps track down user errors such as calling double(x) instead of
      as.double(x).

  INSTALLATION:

    • Building the vignette PDFs in packages grid and utils is now part
      of running make from an SVN checkout on a Unix-alike: a separate
      make vignettes step is no longer required.

      These vignettes are now made with keep.source = TRUE and hence
      will be laid out differently.

    • make install-strip failed under some configuration options.

    • Packages can customize non-standard installation of compiled code
      via a src/install.libs.R script. This allows packages that have
      architecture-specific binaries (beyond the package's shared
      objects/DLLs) to be installed in a multi-architecture setting.

  SWEAVE & VIGNETTES:

    • Sweave() and Stangle() gain an encoding argument to specify the
      encoding of the vignette sources if the latter do not contain a
      \usepackage[]{inputenc} statement specifying a single input
      encoding.

    • There is a new Sweave option figs.only = TRUE to run each figure
      chunk only for each selected graphics device, and not first using
      the default graphics device.  This will become the default in R
      2.14.0.

    • Sweave custom graphics devices can have a custom function
      foo.off() to shut them down.

    • Warnings are issued when non-portable filenames are found for
      graphics files (and chunks if split = TRUE).  Portable names are
      regarded as alphanumeric plus hyphen, underscore, plus and hash
      (periods cause problems with recognizing file extensions).

    • The Rtangle() driver has a new option show.line.nos which is by
      default false; if true it annotates code chunks with a comment
      giving the line number of the first line in the sources (the
      behaviour of R >= 2.12.0).

    • Package installation tangles the vignette sources: this step now
      converts the vignette sources from the vignette/package encoding
      to the current encoding, and records the encoding (if not ASCII)
      in a comment line at the top of the installed .R file.

  DEPRECATED AND DEFUNCT:

    • The internal functions .readRDS() and .saveRDS() are now
      deprecated in favour of the public functions readRDS() and
      saveRDS() introduced in R 2.13.0.

    • Switching off lazy-loading of code _via_ the LazyLoad field of
      the DESCRIPTION file is now deprecated.  In future all packages
      will be lazy-loaded.

    • The off-line help() types "postscript" and "ps" are deprecated.

  UTILITIES:

    • R CMD check on a multi-architecture installation now skips the
      user's .Renviron file for the architecture-specific tests (which
      do read the architecture-specific Renviron.site files).  This is
      consistent with single-architecture checks, which use
      --no-environ.

    • R CMD build now looks for DESCRIPTION fields BuildResaveData and
      BuildKeepEmpty for per-package overrides.  See ‘Writing R
      Extensions’.

  BUG FIXES:

    • plot.lm(which = 5) was intended to order factor levels in
      increasing order of mean standardized residual.  It ordered the
      factor labels correctly, but could plot the wrong group of
      residuals against the label.  (PR#14545)

    • mosaicplot() could clip the factor labels, and could overlap them
      with the cells if a non-default value of cex.axis was used.
      (Related to PR#14550.)

    • dataframe[[row,col]] now dispatches on [[ methods for the
      selected column (spotted by Bill Dunlap).

    • sort.int() would strip the class of an object, but leave its
      object bit set.  (Reported by Bill Dunlap.)

    • pbirthday() and qbirthday() did not implement the algorithm
      exactly as given in their reference and so were unnecessarily
      inaccurate.

      pbirthday() now solves the approximate formula analytically
      rather than using uniroot() on a discontinuous function.

      The description of the problem was inaccurate: the probability is
      a tail probability (‘2 _or more_ people share a birthday’).

    • Complex arithmetic sometimes warned incorrectly about producing
      NAs when there were NaNs in the input.

    • seek(origin = "current") incorrectly reported it was not
      implemented for a gzfile() connection.

    • c(), unlist(), cbind() and rbind() could silently overflow the
      maximum vector length and cause a segfault.  (PR#14571)

    • The fonts argument to X11(type = "Xlib") was being ignored.

    • Reading (e.g. with readBin()) from a raw connection was not
      advancing the pointer, so successive reads would read the same
      value.  (Spotted by Bill Dunlap.)

    • Parsed text containing embedded newlines was printed incorrectly
      by as.character.srcref().  (Reported by Hadley Wickham.)

    • decompose() used with a series of a non-integer number of periods
      returned a seasonal component shorter than the original series.
      (Reported by Rob Hyndman.)

    • fields = list() failed for setRefClass().  (Reported by Michael
      Lawrence.)

    • Reference classes could not redefine an inherited field which had
      class "ANY". (Reported by Janko Thyson.)

    • Methods that override previously loaded versions will now be
      installed and called.  (Reported by Iago Mosqueira.)

    • addmargins() called numeric(apos) rather than
      numeric(length(apos)).

    • The HTML help search sometimes produced bad links.  (PR#14608)

    • Command completion will no longer be broken if tail.default() is
      redefined by the user. (Problem reported by Henrik Bengtsson.)

    • LaTeX rendering of markup in titles of help pages has been
      improved; in particular, \eqn{} may be used there.

    • isClass() used its own namespace as the default of the where
      argument inadvertently.

    • Rd conversion to latex mis-handled multi-line titles (including
      cases where there was a blank line in the \title section).
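The pbirthday() fix in the list above turns on the point that the quantity is a tail probability: the chance that 2 or more people share a birthday. As a rough illustration only, here is a minimal Python sketch of the exact calculation for that simplest case. It is not R's implementation (the release notes say pbirthday() solves an approximate formula), and the function name is hypothetical.

```python
from math import prod

def shared_birthday_probability(n, classes=365):
    """Exact probability that 2 or more of n people share a birthday.

    Assumes birthdays are uniform over `classes` equally likely days.
    """
    if n > classes:
        return 1.0  # pigeonhole: a shared birthday is certain
    # Probability that all n birthdays are distinct.
    all_distinct = prod((classes - i) / classes for i in range(n))
    # The tail probability is the complement of "all distinct".
    return 1.0 - all_distinct

p23 = shared_birthday_probability(23)
print(round(p23, 4))
```

With 23 people the probability is just over one half, which is the classic statement of the birthday problem.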

Also see this interesting blog

http://sas-and-r.blogspot.com/

Examples of tasks replicated in SAS and R

Introducing Radoop

That's right- this is Radoop, and it is

Hadoop meets RapidMiner = Radoop


http://prezi.com/dxx7m50le5hr/radoop-presentation-at-rcomm-2011/

Radoop presentation at RCOMM 2011 on Prezi

What about Hive and Mahout?

Hive is a data warehouse infrastructure built on top of Hadoop, i.e. it uses the distributed file system of Hadoop and the efficient access technologies. Hive was initially developed by Facebook and is now used and developed by many other companies for their distributed data warehouse.

Mahout is a machine learning library that already offers many scalable machine learning algorithms, also implemented on top of Hadoop and its map & reduce paradigm. Hence, Mahout is one of the first distributed data analytics frameworks making use of the power of Hadoop.

You will see below that both frameworks will be tightly integrated with RapidMiner.
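For readers new to the map & reduce paradigm mentioned above, here is a minimal single-machine sketch in Python. It only imitates the three stages (map, group by key, reduce) on a word count; it does not use Hadoop, Hive, Mahout or Radoop, and all function names are hypothetical.

```python
from collections import defaultdict
from functools import reduce

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)

def word_count(lines):
    # Map every line, then group by word, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["Hadoop meets RapidMiner", "Hadoop scales"])
print(counts)
```

In a real cluster the map and reduce calls run in parallel on different machines and the grouping step moves data across the network, which is where the scalability comes from.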

What can RapidMiner bring into the game?

Hadoop is great for large scale analytics, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but unless the analyst performs some nasty tricks, the data size is limited by the memory available. So we have the algorithms, the support for analytical process design, the user interface, and of course the community with a demand for large-scale analytics.

RapidMiner + Hadoop = Radoop

Radoop combines the strengths of RapidMiner and Hadoop. The result is a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop. The developers have closely integrated the highly optimized data analytics capabilities of Hive and Mahout, and the user-friendly interface of RapidMiner to form a powerful and easy-to-use data analytics solution for Hadoop.

Source-https://rapid-i.com/component/option,com_myblog/show,Big-data-analytics-made-easy-Radoop.html/Itemid,172/

and http://blog.radoop.eu/

Interesting? Sign up here- http://radoop.eu/z1sxe

Google Product Launches

So dear G launched a whole new set of Products. Some thoughts-

1) Join up the Social Invite List here – it is called Google Plus. We hope it doesn't end up like Buzz http://www.google.com/buzz or Orkut https://groups.google.com/group/opensocial-api/?pli=1 or Plus One http://www.google.com/webmasters/+1/button/ or Wave (email killer) http://googlewave.blogspot.com/

When the biggest cloud computing company in the world announces a phased rollout of a product, we wonder if they are really sure about launching it or were just in a hurry again.

Machine learning won't work with social, chaps. Well, not everything in social. And the Google Social Blog forgot to write about it http://googlesocialweb.blogspot.com/

Well anyways, even Google Finance’s automated announcements feed failed to pick up many of their own product launches (or it does so in an automated manner depending on which time period you choose - yes, still no social buttons up) http://www.google.com/finance?q=google

BACK TO GOOGLE PLUS

https://services.google.com/fb/forms/googleplus/

Thanks for stopping by. We’re still ironing out a few kinks in Google+, so it’s not quite ready for everyone to climb aboard. But, if you want, we’ll let you know the minute the doors are open for real. Cool? Cool.


2) Google Web Fonts- Great product, and hey http://googlewebfonts.blogspot.com/ when do you plan to monetize, uhm, web fonts? Now that would be awesome. Not even a single ad on those pages- not even for philanthropy. Or poor poets. Or even Google Book authors who self-publish. Sound of silence….

http://www.google.com/webfonts/v2

3) Google Analytics gets some groove back. I really want to see much better integration of Google Apps and Google Analytics and Google Desktop search. Ditto for the interface. Enterprise software uses different fonts than retail software, dude. More fries, http://analytics.blogspot.com/ ?

Feature 1- Custom Reports for metrics I can slice and dice on my own

Feature 2- Awesome analytics for In-Page Analytics (beta feature). Beta is boring if overused. Try Theta maybe?

Feature 3- Daily Automated Alerts for Unusual Server/Traffic Activity

Feature 4- Event Tracking is cool, esp for understanding social media impact

It is still too early for mobile (in terms of traffic) as well as tablet analytics (?)

4) Angry Birds is still the best feature in Chrome (but there are lots of others at http://chrome.blogspot.com/) and esp http://googlecode.blogspot.com/2011/06/working-with-chromes-file-browser.html

Try http://chrome.angrybirds.com/

There are ways to make software that are not evil. Very, very disappointed at the total lack of monetization of this Chrome app. Not even an ad for a T-shirt for me to buy. Sighs.

Funny thing- the product manager forgot to take off the Facebook Like button, or even add the +1 button or the Tweet this button.

Quo Vadis ?

5) What do you love?

http://www.wdyl.com/#

Analytics 2011 Conference

From http://www.sas.com/events/analytics/us/

The Analytics 2011 Conference Series combines the power of SAS’s M2010 Data Mining Conference and F2010 Business Forecasting Conference into one conference covering the latest trends and techniques in the field of analytics. Analytics 2011 Conference Series brings the brightest minds in the field of analytics together with hundreds of analytics practitioners. Join us as these leading conferences change names and locations. At Analytics 2011, you’ll learn through a series of case studies, technical presentations and hands-on training. If you are in the field of analytics, this is one conference you can’t afford to miss.

Conference Details

October 24-25, 2011
Grande Lakes Resort
Orlando, FL

Analytics 2011 topic areas include:

Data Mining
Forecasting
Text Analytics
Fraud Detection
Data Visualization
