KDNuggets Survey on R

From http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

A new poll/survey on actual usage of R in Data Mining

R has been steadily growing in popularity among data miners and analytic professionals.

In the KDnuggets 2010 Data Mining / Analytic Tools Poll, R was used by 30% of respondents.
In the 2010 Rexer Analytics Data Miner Survey, R was the most popular tool, used by 43% of the data miners.

Another aspect of a tool's usefulness is how much it helps with the entire data mining process, from data preparation and cleaning through modeling, evaluation, visualization and presentation (excluding deployment).

The new KDnuggets poll asks:
What part of your analytics / data mining work in the past 12 months was done in R?

http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

 

Interview Anne Milley JMP

Here is an interview with Anne Milley, a notable thought leader in the world of analytics. Anne is now Senior Director, Analytical Strategy in Product Marketing for JMP, the leading data visualization software from the SAS Institute.

Ajay-What do you think are the top 5 unique selling points of JMP compared to other statistical software in its category?

Anne-

JMP combines incredible analytic depth and breadth with interactive data visualization, creating a unique environment optimized for discovery and data-driven innovation.

With an extensible framework using JSL (JMP Scripting Language), and integration with SAS, R, and Excel, JMP becomes your analytic hub.

JMP is accessible to all kinds of users. A novice analyst can dig into an interactive report delivered by a custom JMP application. An engineer looking at his own data can use built-in JMP capabilities to discover patterns, and a developer can write code to extend JMP for herself or others.

State-of-the-art DOE capabilities make it easy for anyone to design and analyze efficient experiments to determine which adjustments will yield the greatest gains in quality or process improvement – before costly changes are made.

Not to mention, JMP products are exceptionally well designed and easy to use. See for yourself and check out the free trial at www.jmp.com.

Download a free 30-day trial of JMP.

Ajay- What are the challenges and opportunities of expanding JMP’s market share? Do you see JMP expanding its conferences globally to engage global audiences?

Anne-

We realized solid global growth in 2010. The release of JMP Pro and JMP Clinical last year along with continuing enhancements to the rest of the JMP family of products (JMP and JMP Genomics) should position us well for another good year.

With the growing interest in analytics as a means to sustained value creation, we have the opportunity to help people along their analytic journey – to get started, take the next step, or adopt new paradigms speeding their time to value. The challenge is doing that as fast as we would like.

We are hiring internationally to offer even more events, training and academic programs globally.

Ajay- What are the current and proposed educational and global academic initiatives of JMP? How can we see more JMP in universities across the world (say India, China, etc.)?

Anne-

We view colleges and universities both as critical incubators of future JMP users and as places where attitudes about data analysis and statistics are formed. We believe that a positive experience in learning statistics makes a person more likely to eventually want and need a product like JMP.

For most students – and particularly for those in applied disciplines of business, engineering and the sciences – the ability to make a statistics course relevant to their primary area of study fosters a positive experience. Fortunately, there is a trend in statistical education toward a more applied, data-driven approach, and JMP provides a very natural environment for both students and researchers.

Its user-friendly navigation, emphasis on data visualization and easy access to the analytics behind the graphics make JMP a compelling alternative to some of our more traditional competitors.

We’ve seen strong growth in the education markets in the last few years, and JMP is now used in nearly half of the top 200 universities in the US.

Internationally, we are at an earlier stage of market development, but we are currently working with both JMP and SAS country offices and their local academic programs to promote JMP. For example, we are working with members of the JMP China office and faculty at several universities in China to support the use of JMP in the development of a master’s curriculum in Applied Statistics there, touched on in this AMSTAT News article.

Ajay- What future trends do you see for 2011 in this market (say top 5)?

Anne-

Growing complexity of data (text, image, audio…) drives the need for more and better visualization and analysis capabilities to make sense of it all.

More “chief analytics officers” are making better use of analytic talent – people are the most important ingredient for success!

JMP has been on the vanguard of 64-bit development, and users are now catching up with us as 64-bit machines become more common.

Users should demand easy-to-use, exploratory and predictive modeling tools as well as robust tools to experiment and learn to help them make the best decisions on an ongoing basis.

All these factors and more fuel the need for the integration of flexible, extensible tools with popular analytic platforms.

Ajay-You enjoy organic gardening as a hobby. How do you think hobbies and unwind time help people be better professionals?

Anne-

I am lucky to work with so many people who view their work as a hobby. They have other interests too, though, some of which are work-related (statistics is relevant everywhere!). Organic gardening helps me put things in perspective and be present in the moment. More than work defines who you are. You can be passionate about your work as well as passionate about other things. I think it’s important to spend some leisure time in ways that bring you joy and contribute to your overall wellbeing and outlook.

Btw, nice interviews over the past several months—I hadn’t kept up, but will check it out more often!

Biography–  Source- http://www.sas.com/knowledge-exchange/business-analytics/biographies.html

    Anne Milley is Senior Director of Analytics Strategy at JMP Product Marketing at SAS. Her ties to SAS began with bank failure prediction at Federal Home Loan Bank Dallas and continued at 7-Eleven Inc. She has authored papers and served on committees for F2006, KDD, SIAM, A2010 and several years of SAS’ annual data mining conference. Milley is a contributing faculty member for the International Institute of Analytics. anne.milley@jmp.com

The Latest GUI for R- Bio7

Once more a spanking new shiny software –

Bio7 is an integrated development environment for ecological modelling based on the Rich-Client-Platform concept of the Java IDE Eclipse. The Bio7 platform contains several perspectives which arrange several views for a special purpose useful for the development and analysis of ecological models. One special perspective bundles a feature-rich GUI (Graphical User Interface) for the statistical software R.
For the bidirectional communication between Java and R the Rserve application is used (as a backend to evaluate R code and transfer data from and to Java).
The Bio7 R perspective (see figure below) is divided into an R-Shell view on the left side (conceptually the R side) and a Table view on the right side (conceptually the Java side).
Data can be imported to a spreadsheet, edited and then transferred to the R workspace. Vice versa data from R can be transferred to a sheet of the Table view and then exported e.g. to an Excel or OpenOffice file.
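
To make the "table to R workspace" idea concrete, here is a minimal sketch in Python of turning spreadsheet-style columns into an R expression that a backend such as Rserve could evaluate. This is not how Bio7 itself transfers data; the function names (r_literal, table_to_r_dataframe) are made up for illustration, and the sketch assumes column names are valid R identifiers and all numbers are finite.

```python
def r_literal(value):
    """Render one Python value as an R literal (assumes finite numbers)."""
    if value is None:
        return "NA"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    # Everything else is treated as text: escape backslashes and double quotes.
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def table_to_r_dataframe(name, columns):
    """Build an R assignment that recreates the table as a data.frame.

    `columns` maps a column name to its list of cell values.
    """
    parts = []
    for col_name, values in columns.items():
        vector = ", ".join(r_literal(v) for v in values)
        parts.append(f"{col_name} = c({vector})")
    return f"{name} <- data.frame({', '.join(parts)})"


if __name__ == "__main__":
    sheet = {"site": ["A", "B"], "count": [3, 5]}
    print(table_to_r_dataframe("sheet", sheet))
```

The resulting string is ordinary R code, so it could also be pasted into an R console to check that the table arrives as expected.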

and

General:

Built upon Eclipse 3.6.1.

Now works with the latest Java version! (Windows version bundled with the latest JRE release).

Removed the Soil perspective (soils can now be modeled with ImageJ in float precision). Active images can be displayed in the 3D discrete view (new example available).

Removed the database perspective and the plant layer. You can now build any discrete models without any plant layer.

Removed several controls in the Control view. Added the “Custom Controls” view. In addition ported the Swing component of the Time panel to Swt.

Deleted the avi to swf converter in the ImageJ menu.

Now patterns can be saved with opened Java editor source. If this file is reopened and dragged on Bio7 the pattern is loaded, the source is compiled and the setup method (if available) is executed. In this way model files can be used for presentations -> drag, setup and run. The save actions are located in the Spreadsheet view toolbar.

More options available to disable panel painting and recording of values (if not needed for speed!).

New Setup button in the toolbar of Bio7 to trigger a compiled setup method if available.

Removed the load and save pattern buttons from the toolbar of Bio7. Discrete patterns can now be stored with the available action in the spreadsheet view menu.

New P2 Update Manager available in Bio7.

Updated the Janino Compiler.

New HTML perspective added with a view which embeds the TinyMCE editor.

New options to disable painting operations for the discrete panels.

New option to explicitly enable scripts at startup (for a faster startup).

Quadgrid (Hexgrid):

Only states are now available which can be created in the “Spreadsheet” view menu easily. Patterns can be stored and restored as usual but are now stored in an *.exml file.

New method to transfer the quadgrid pattern as a matrix to R.

New method to transfer the population data of all quadgrid states to R.

ImageJ:

Update to the latest version (with additional fixes).

Fixed a bug to rename the image.

Thumbnail browser can now open images recursively (limited to 1000 pics), the magnifying glass can be disabled, too.

Plugins can be installed dynamically with a drag and drop operation on the ImageJ view or toolbar (as known from ImageJ).

Installed plugins now extend the plugin menu as submenus or subsubmenus (not finished yet!).

Plugins can now be created with the Java editor. New Bio7 Wizard available to create a plugin template.

Compiled Java files can be added to a *.jar file with a new action available in the Navigator view (if you right-click on the files in the Navigator). In this way ImageJ plugins can be packaged in a *.jar.

Floweditor:

Fixed a repaint bug in the debug mode of a flow (now correctly draws the active shape in the flow).

Resize with Ctrl+Scrollwheel works again.

Comments with more than one line work again.

New Test action to verify connections in a flow.

Debug mode now shows all executed Shapes.

Integrated more default tests (for the verification of a regular flow).

A mouse-click now deletes colored shapes in a flow (e.g. in debug mode).

Points panel:

Integrated (dynamic) Voronoi, Delaunay visualization (with area and clip to rectangle action).

Points coordinates can now be set in double precision.

Transfer of point coordinates to R now in double precision.
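
For readers who have not met Voronoi diagrams before: each point "owns" the region of the plane that is closer to it than to any other point, and the Delaunay triangulation is the dual graph that connects points whose regions touch. Below is a small sketch in plain Python that labels the cells of a grid by their nearest point, which gives a rasterized Voronoi diagram. It is only an illustration of the concept, not Bio7's implementation, and the function names are my own.

```python
import math


def nearest_site(point, sites):
    """Return the index of the site closest to `point` (Euclidean distance).

    Ties go to the site with the lowest index.
    """
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))


def discrete_voronoi(sites, width, height):
    """Label each cell of a width x height grid with its nearest site index.

    Cell (x, y) is represented by its centre (x + 0.5, y + 0.5).
    """
    return [
        [nearest_site((x + 0.5, y + 0.5), sites) for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    sites = [(1.0, 1.0), (3.0, 1.0)]  # coordinates kept as floats (double precision)
    for row in discrete_voronoi(sites, width=4, height=2):
        print(row)
```

Counting the cells per label gives a rough estimate of each region's area, similar in spirit to the area action mentioned above.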

Bio7 Table:

New import and export of Excel 2007 OOXML.

Row headers can now be resized with the mouse device.

R:

Updated R (2.12.1) and Rserve (0.6.3) to the latest version.

New help action in the R-Shell view.

New action to display help for R specific commands in the embedded Bio7 browser (which opens automatically).

New Key actions to copy the selected variable names to the expression dialog (c=concatenate (+), a=add (,)).

New action to transfer character or numeric vectors horizontally or vertically in an opened spread (Table view) at selection coordinates.

Spaces in the file path are now allowed under Windows if Rserve is started with a system shell or the RGUI is started (for the temp file, select a writable location in the Preferences dialog). This also works for the RGUI action.

Improved the search for the “Install packages” action (option “Case Sensitive” added).

API:

New API methods available!

And:

Many fixes since the last version!

 

Installation

Important information:

Certain firewall software can corrupt the Bio7 *.zip file (as well as other files).
Please ensure that you have downloaded a functioning copy of Bio7 1.5. In addition, it has been reported that certain antivirus software detects the bundled R software (on Windows) as malware. Often the R-specific “open.exe” is detected as malware. Please use a different scanner to make sure that the software is not infected if you have any doubts. For more details see:

http://r.789695.n4.nabble.com/trojan-at-current-development-version-td3244348.html

 

QGIS and R

QGIS is Quantum GIS: http://www.qgis.org/

Quantum GIS (QGIS) is a user friendly Open Source Geographic Information System (GIS) licensed under the GNU General Public License. QGIS is an official project of the Open Source Geospatial Foundation (OSGeo). It runs on Linux, Unix, Mac OS X, and Windows and supports numerous vector, raster, and database formats and functionalities.

Learn more about QGIS

Quantum GIS provides a continuously growing number of capabilities provided by core functions and plugins. You can visualize, manage, edit and analyse data, and compose printable maps.

You can also use QGIS and R together through Python (!!!)

http://www.qgis.org/wiki/HomeRange_plugin#Home-range_analyses_in_QGIS_using_R_through_Python
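
If you just want to see what "R through Python" can look like outside of QGIS, one simple route is to call the Rscript command-line tool from Python. The sketch below assumes R is installed and Rscript is on your PATH; it is not necessarily how the HomeRange plugin bridges the two, and the helper names are made up for this example.

```python
import shutil
import subprocess


def build_rscript_command(expression):
    """Return the argument list that asks Rscript to evaluate one R expression."""
    return ["Rscript", "-e", expression]


def run_r(expression):
    """Evaluate an R expression and return what R printed to stdout.

    Returns None when Rscript is not installed, so callers can degrade gracefully.
    """
    if shutil.which("Rscript") is None:
        return None
    result = subprocess.run(
        build_rscript_command(expression),
        capture_output=True,
        text=True,
        check=True,  # raise if R exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    output = run_r("cat(mean(c(1, 2, 3)))")
    print("Rscript not found" if output is None else output)
```

Starting a fresh R process per call is slow, so this suits one-off computations better than tight loops.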

Interesting app for webs (sometimes better suited than some R map packages)

https://plugins.qgis.org/plugins/HomeRange_plugin/

Based on a Google Summer of Code _

 Also

https://sites.google.com/site/eospansite/introqgis_r

and

HomeRange_plugin

http://hub.qgis.org/projects/quantum-gis/wiki/HomeRange_plugin

 

Also read-

http://blog.qgis.org/node/51

Related Articles-

R Graphs Resources

https://rforanalytics.wordpress.com/r-graphs-resources/

Using R from other Software

https://rforanalytics.wordpress.com/using-r-from-other-software/

and

Visualize NHL Play-by-Play using Tableau Public and R

http://brocktibert.wordpress.com/2011/02/13/visualize-nhl-play-by-play-using-tableau-public-and-r/

Protovis a graphical toolkit for visualization

I just found out about a new data visualization tool called Protovis http://vis.stanford.edu/protovis/ex/

Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction.

Protovis is free and open-source and is a Stanford project. It has been used in the web interface R Node (which I will talk about later)

http://squirelove.net/r-node/doku.php

Conventional

While Protovis is designed for custom visualization, it is still easy to create many standard chart types. These simpler examples serve as an introduction to the language, demonstrating key abstractions such as quantitative and ordinal scales, while hinting at more advanced features, including stack layout.

Custom

Many charting libraries provide stock chart designs, but offer only limited customization; Protovis excels at custom visualization design through a concise representation and precise control over graphical marks. These examples, including a few recreations of unusual historical designs, demonstrate the language’s expressiveness.


Try Protovis today 🙂 http://vis.stanford.edu/protovis/

It uses JavaScript and SVG for web-native visualizations; no plugin required (though you will need a modern web browser)! Although programming experience is helpful, Protovis is mostly declarative and designed to be learned by example.
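
To give a feel for what "marks with dynamic properties that encode data" means, here is a rough sketch of the idea in Python rather than JavaScript. Each property of a bar is a small function of the datum and its index, and the marks are written out as SVG. This is not the Protovis API, only an imitation of the concept; the function name and default sizes are my own.

```python
def bar_chart_svg(data, bar_width=20, gap=5, scale=80, chart_height=100):
    """Render `data` as SVG bars; every attribute is computed from (datum, index)."""
    properties = {
        "x": lambda d, i: i * (bar_width + gap),      # position encodes the index
        "y": lambda d, i: chart_height - d * scale,   # bars grow up from the bottom
        "width": lambda d, i: bar_width,              # constant property
        "height": lambda d, i: d * scale,             # height encodes the value
    }
    rects = []
    for i, d in enumerate(data):
        attrs = " ".join(f'{name}="{fn(d, i):g}"' for name, fn in properties.items())
        rects.append(f"<rect {attrs}/>")
    width = len(data) * (bar_width + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{chart_height}">'
        + "".join(rects)
        + "</svg>"
    )


if __name__ == "__main__":
    # Save the output to a .svg file and open it in a browser to see the bars.
    print(bar_chart_svg([1, 0.5, 0.75]))
```

Changing the chart means changing one property function rather than rewriting drawing code, which is the declarative flavour described above.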

GrapheR

GrapheR is a Graphical User Interface created for simple graphs.

Depends: R (>= 2.10.0), tcltk, mgcv
Description: GrapheR is a multiplatform user interface for drawing highly customizable graphs in R. It aims to be a valuable help to quickly draw publishable graphs without any knowledge of R commands. Six kinds of graphs are available: histogram, box-and-whisker plot, bar plot, pie chart, curve and scatter plot.
License: GPL-2
LazyLoad: yes
Packaged: 2011-01-24 17:47:17 UTC; Maxime
Repository: CRAN
Date/Publication: 2011-01-24 18:41:47

More information about GrapheR at CRAN

Advantages of using GrapheR

  • It is bilingual (English and French) and can import text and CSV files.
  • It is intended to let even non-users of R make simple types of graphs.
  • The user interface is quite cleanly designed. It is thus aimed at being a data visualization GUI, but at a more basic level than Deducer.
  • It is easy to rename axes and graph titles, and to use sliders for changing line thickness and color.

Disadvantages of using GrapheR

  • Lack of documentation or help. In particular, mouseover tips for some options should be added.
  • Some of the terms, like abscissa or ordinate axis, may not be easily understood by a business user.
  • Default values of color are quite plain (black font on white background).
  • Can flood terminal with lots of repetitive warnings (although use of warnings() function limits it to top 50)
  • Some axis names could be auto-suggested based on which variable is chosen for that axis.
  • Package name GrapheR refers to a graphical calculator in Mac OS – this can hinder search engine results

Using GrapheR

  • Data Input: it can be customized for CSV and text files.
  • GrapheR gives information on loaded variables (numeric versus Factors)
  • It asks you to choose the type of Graph 
  • It then asks for usual Graph Inputs (see below). Note colors can be customized (partial window). Also number of graphs per Window can be easily customized 
  • Graph is ready for publication