CommeRcial R- Integration in software

Some updates to R on the commercial side.

Revolution Computing has apparently been renamed Revolution Analytics. Hopefully this, along with the GUI development, will bring more focused attention to working with R in a mainstream office setting. I am still waiting for David Smith’s cheery hey-guys-we-changed-again blog post, though, either at the new site inside-r.org/ or at his old blog at blog.revolution-computing.com

They probably need to hire more people now. Curt Monash, noted all-things-data software guru, has the inside dope here

Techworld writes more here at http://www.techworld.com.au/article/345288/startup_wants_r_alternative_ibm_sas

The company’s software is priced “aggressively” versus IBM and SAS. A single supported workstation costs $2,000 for an annual subscription. Pricing for server-based licenses varies depending on the implementation.

But Revolution Analytics faces a tough challenge from those larger vendors, as well as the likes of XLSolutions, which offers R training and a competing software package, R-Plus.

SPSS, though, continues to integrate R solidly and also marches ahead with Python (which is likely to be the next generation in statistical programming if it keeps this up): http://insideout.spss.com/

With the release of Version 18 of IBM SPSS Statistics and the Developer product, easy-to-install versions of the Python and R materials are posted.  In particular, look for the R Essentials link on the main page or from the Plugins page.  It installs the R Plugin, the correct version of R, and a bunch of example R integrations as bundles.  It’s much easier to get going with this now.

Netezza, a business intelligence vendor, promises more integration and even a training class in R-based analytics here

“R Modeling for TwinFin i-Class

Objective
Learn how to use TwinFin i-Class for scaling up the R language.

Description
In this class, you’ll learn how to use R to create models using huge data and how to create R algorithms that exploit our asymmetric massively parallel processing (AMPP®) architecture. Netezza has seamlessly integrated with R to offload the heavy lifting of the computational processing on TwinFin i-Class. This results in higher performance and increased scalability for R. Sign up for this class to learn how to take advantage of TwinFin i-Class for your R modeling. Topics include:

  1. R CRAN package installation on TwinFin i-Class
  2. Creating models using R on TwinFin i-Class
  3. Creating R algorithms for TwinFin i-Class

Format
Hands-on classroom lecture, lab exercises, tour

Audience
Knowledgeable R users – modelers, analytic developers, data miners

Course Length
0.5 day: 12pm-4pm Wednesday, June 23 OR 8am-12pm Thursday, June 24 OR 1pm-5pm Thursday, June 24, 2010

Delivery
Enzee Universe 2010, Boston, MA

Student Prerequisites

  • Working knowledge of R and parallel computing
  • Have analytic, compute-intensive challenges
  • Understanding of data mining and analytics”

My favourite GUI in stats, JMP (also from SAS Institute), is going to deploy R integration as soon as this September. Read more here: http://www.sas.com/news/preleases/JMP-to-R-integrationSGF10.html

SAS/IML Studio is not lagging behind either:

The next release of SAS/IML will extend R integration to the server environment – enabling users to deploy results in batch mode and access R from SAS on additional platforms, such as UNIX and Linux.

I am rather happy to see one of the best GUIs integrating with one of the most innovative stats packages. It’s like two of your best friends getting married. (See the screenshots of the software.)

All in all, R as a platform is making good progress on all sides of the corporate software spectrum, which can only be good for R developers as well as users and students.

Top 10 Graphical User Interfaces in Statistical Software

Here is a list of the top 10 GUIs in statistical software. The criteria are:

  • User-friendliness: how easily a new user can begin to point, click and learn.
  • Cleanliness of the automated code or log generated.
  • Practical application in consulting and the corporate world.
  • Cost and ease of ownership (including purchase, installation, training, maintainability and renewal).
  • Aesthetics (or just plain pretty).

However, this list is not in order of ranking (as the beauty of a GUI lies in the eyes of the beholder). For a list of the top 10 GUIs for the R language only, please see:

https://rforanalytics.wordpress.com/graphical-user-interfaces-for-r/

This is a GUI-only list, so it excludes notable software that is driven from the command line or by submitting commands from a text editor, which can also be very powerful and user friendly.

1. JMP

While critics of SAS Institute often complain about the premium pricing of the basic model (especially AFTER the entry of another SAS language software, WPS, from http://www.teamwpc.co.uk/products/wps), they should try out JMP from http://jmp.com. It has a one-month free evaluation, is much less expensive, and the GUI makes it very easy to do basic statistical analysis and testing. It is surprisingly quick to learn (as it should be for well-designed interfaces), and it produces very good quality output graphics as well.

2. SPSS

The original GUI in this class of software, it has now expanded to a big portfolio of products. SPSS 18 is nice, with its increasing focus on Python, and SPSS was an early adopter of R-compatible interfaces. SPSS also offers a much more affordable solution, with a free evaluation. See especially http://www.spss.com/statistics/ and http://www.spss.com/software/modeling/modeler-pro/

The screenshot here is of SPSS Modeler.

3. WPS

While it offers an alternative to Base SAS and SAS/Access software, I really like the affordability (one-month free evaluation and overall lower cost, especially for multiple-CPU servers), the speed (on the desktop, but not on the IBM OS version) and the intuitive design as well as the extensibility of the Workbench. It may look like an integrated development environment and not a proper GUI, but with all the menu features it does qualify as a GUI in my opinion.

Climate Die Oxide (Updated)


Here is some room for thought in climate control negotiations.

1) What is the expected date by which the Himalayan glaciers will melt, affecting sacred rivers like the Ganges and causing floods in densely populated Asia? How would nation states that share resources like water react to disputes over dams, hydroelectricity and floods?

2) How would you count per capita CO2 emissions? Assume a factory in China emits 3 tonnes of CO2 every year but exports all its products to the USA on an Indian cargo ship. Travel contributes another 1 tonne of CO2, including air travel, visits etc.

As of now this will be counted as 3 tonnes for China, 1 tonne for India and X tonnes for the USA. What is wrong with these assumptions?
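
One way to make the question concrete is to compare two attribution rules on the numbers above. This is only a sketch: the two rule names and the choice to charge the shipping tonne either to the carrier or to the importer are assumptions made for illustration, not an official accounting method.

```python
# Tonnes of CO2 per year, taken from the factory example in question 2.
FACTORY_EMISSIONS = 3.0    # emitted in China while making the goods
TRANSPORT_EMISSIONS = 1.0  # emitted by the Indian cargo ship and related travel


def production_based():
    """Charge emissions to the country where they physically occur (assumed rule)."""
    return {"China": FACTORY_EMISSIONS, "India": TRANSPORT_EMISSIONS, "USA": 0.0}


def consumption_based():
    """Charge all emissions to the country that consumes the goods (assumed rule)."""
    return {"China": 0.0, "India": 0.0, "USA": FACTORY_EMISSIONS + TRANSPORT_EMISSIONS}


for name, rule in [("production", production_based), ("consumption", consumption_based)]:
    totals = rule()
    print(name, totals, "total =", sum(totals.values()))
```

Both rules account for the same total; they differ only in which country carries it, which is exactly where the per capita comparison becomes contentious.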

3) Some countries that used to be cold will get warmer. Will that lead to extra crops? Which countries will those be?

4) It took a world war to create fission. Will it take another world war over energy to create fusion? How much energy and how many resources would be needed for a dedicated Manhattan Project 2 whose results are shared with the world?

5) Most of the larger data sets of climate change observations are held in the Western Hemisphere by national labs, not under UN control OR INSPECTION. How safe is the data from fudging, or from infiltration by the intelligence agencies of those countries hoping to influence bargaining chips at the climate change table?

6) Are there last-resort military ways to change the climate during wars, like melting glaciers with thermal bombs or triggering earthquakes with seismically targeted explosions? How high-tech are these solutions, and which countries have them?

7) If the planet is running out of resources, why don’t we go to Mars? 🙂

Source

http://manyeyes.alphaworks.ibm.com/manyeyes/files/thumbnails/bb09d328-d863-11de-a602-000255111976.wm.png

Note that this is from 2006 data, so assume the 2009 CO2 figures are higher than this.

Data Source-

TN guys at ORNL at http://cdiac.ornl.gov/trends/emis/glo.html

Data Visualization: MANY EYES IBM

http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/2006-co2-emissions-by-country

Ponder This: IBM Research

Ponder This Challenge:

What is the minimal number, X, of yes/no questions needed to find the smallest (but more than 1*) divisor of a number between 2 and 166 (inclusive)?

We are asking for the exact answer in two cases:

In the worst case, i.e., what is the smallest number X for which we can guarantee finding it in no more than X questions?

On average, i.e., assuming that the number was chosen in uniform distribution from 2 to 166 and we want to minimize the expected number of questions.

* For example, the smallest divisor of 105 is 3, and of 103 is 103.

Update (11/05): You should find the exact divisor without knowing the number and answering “prime” is not a valid
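
For readers who want to explore the puzzle, here is a minimal Python sketch. It assumes that any yes/no question about the hidden number is allowed, so that a questioning strategy is simply a binary decision tree whose leaves are the possible divisors. Under that assumption the worst case is bounded below by the base-2 logarithm of the number of distinct answers, and a Huffman tree over the answer frequencies gives a natural candidate for the average case. The function names are my own, and this is not IBM’s official solution.

```python
import heapq
import math
from collections import Counter


def smallest_divisor(n):
    """Return the smallest divisor of n that is greater than 1."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n is prime, so it is its own smallest divisor


# How often each divisor is the answer for the numbers 2..166 (inclusive).
counts = Counter(smallest_divisor(n) for n in range(2, 167))

# Worst case: a binary decision tree needs at least ceil(log2(k)) levels
# to tell k distinct answers apart.
worst_case_bound = math.ceil(math.log2(len(counts)))


def huffman_expected_depth(weights):
    """Average leaf depth of a Huffman tree built over the given weights."""
    heap = list(weights)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b  # each merge pushes every leaf beneath it one level down
        heapq.heappush(heap, a + b)
    return cost / sum(weights)


average_case = huffman_expected_depth(list(counts.values()))

print("distinct answers:", len(counts))
print("worst-case lower bound:", worst_case_bound)
print("Huffman average:", average_case)
```

The `smallest_divisor` helper reproduces the two examples in the footnote (3 for 105, and 103 for 103), which is a quick sanity check before trusting the counts.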

Citation-

http://domino.research.ibm.com/Comm/wwwr_ponder.nsf/pages/index.html

A maths challenge by the boys in Blue above. Also, in employment news, the parent company of SPSS is opening a centre of advanced analytics right here in Washington, D.C.

WASHINGTON – 10 Nov 2009: IBM (NYSE: IBM) today announced the opening of the sixth in a network of analytics solution centers – this one dedicated to helping federal agencies and other public sector organizations extract actionable insights from their data.

The new IBM Analytics Solution Center in Washington, D.C., will draw on the expertise of more than 400 IBM professionals. These will include IBM researchers, experts in advanced software platforms, and consultants with deep industry knowledge in areas such as transportation, social services, public safety, customs and border management, revenue management, defense, logistics, healthcare and education. IBM also plans to add an additional 100 professionals, through retraining or new hiring, as demand grows.

Twitter Cloud and a note on Cloud Computing

That’s what I use Twitter for. If you have a Twitter account you can follow me here

http://twitter.com/decisionstats

A couple of weeks ago I accidentally deleted many followers using a Twitter app called Refollow. I was trying to clean up the list of people I follow and checked the wrong tick box, so please, if you feel I unfollowed you, it was a mistake. Seriously.

On cloud computing and Google: rumours ( 🙂 ) are emerging that Google’s push for cloud computing is meant to turn desktop computing into IBM-like mainframe computing. Except that there are too many players this time. Where are the Department of Justice and antitrust? Does Amazon qualify as being too big in cloud computing currently?

Or the rumours could be spread by competitors like Microsoft, Apple, Amazon etc. Geeks are like that sometimes.

Analytics and BI for small biz

I saw a story on Warren Buffett and Goldman Sachs creating a $500 million pool for small business owners.

  • The program will contribute $200 million to community colleges, universities and other institutions to provide small-business owners with practical business education.

  • Goldman Sachs repaid the $10 billion it was given last year under the taxpayer-funded Troubled Asset Relief Program, plus dividends. The firm continues to benefit from federal guarantees on about $21 billion of long-term debt.

  • Buffett, known as the “Oracle of Omaha” for his investing prowess, is the second-richest American. Berkshire, which invests in companies ranging from retailers to insurers, paid $5 billion in September 2008 to acquire preferred stock in Goldman Sachs that pays a 10 percent dividend. Berkshire, based in Omaha, Nebraska, also gained five-year warrants to buy $5 billion of common stock at $115 per share.

  • (NOTE: The current price of GS shares is $172. That is roughly a 50% profit on $5 billion, or about $2.5 billion, for Mr Buffett, but he is probably waiting for long-term capital gains tax rates to kick in before cashing in his patriotic “Buy American. I am” warrants; see his NYT op-ed at http://www.nytimes.com/2008/10/17/opinion/17buffett.html )
  • A better analysis of the above Bloomberg story was given on Bloomberg itself at http://www.bloomberg.com/apps/news?pid=20601039&sid=asjp51YPDwJU
  • A small thought: could smaller businesses gain from the efficiencies of programs like SPSS, SAS and R? Or would they be better off with customized GUIs linked to their POS data?
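
The back-of-the-envelope warrant arithmetic in the note above can be checked in a few lines of Python. This is a rough sketch using only the three figures quoted in this post; it ignores taxes, the time value of the warrants and the preferred dividend.

```python
# Figures quoted in the post; everything else is derived from them.
WARRANT_NOTIONAL = 5_000_000_000  # dollars of common stock covered by the warrants
STRIKE_PRICE = 115.0              # dollars per share
CURRENT_PRICE = 172.0             # dollars per share, as quoted in the note

shares = WARRANT_NOTIONAL / STRIKE_PRICE
paper_gain = shares * (CURRENT_PRICE - STRIKE_PRICE)
gain_pct = (CURRENT_PRICE / STRIKE_PRICE - 1) * 100

print(f"shares covered: {shares:,.0f}")
print(f"paper gain: ${paper_gain / 1e9:.2f} billion")
print(f"gain over strike: {gain_pct:.1f}%")
```

The result comes out just under 50%, or a little under $2.5 billion, which is where the rounded numbers in the note come from.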

Anyway, analytics for small businesses in inventory management and sales planning could help. Joe the Plumber could do with some ETS and regression models as well.
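
To give a flavour of how little machinery the simplest member of the exponential smoothing family needs, here is a sketch in plain Python. The weekly sales figures are made up purely for illustration, the smoothing weight of 0.3 is an arbitrary assumption, and a real ETS model would also handle trend and seasonality.

```python
def smooth(series, alpha=0.3):
    """Simple exponential smoothing: return the smoothed level after each point."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # seed the level with the first observation
    levels = []
    for actual in series:
        # Blend the newest observation with the previous level.
        level = alpha * actual + (1 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical weekly units sold by a small shop (illustrative numbers only).
weekly_units_sold = [120, 135, 128, 150, 142, 160]

levels = smooth(weekly_units_sold)
next_week_forecast = levels[-1]  # the last level is the forecast for next week
print(f"forecast for next week: {next_week_forecast:.1f} units")
```

A higher `alpha` makes the forecast react faster to recent weeks, while a lower one smooths out noise, which is the kind of knob a small business owner could tune against their own POS data.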

However, apart from Salesforce.com applications, this field seems to be totally vacant for analytics. What are IBM SPSS, SAS, or even other stats packages doing for small businesses, or even about developing Salesforce.com applications for their own equivalent software?

The market could be an interesting one to at least run a test in. Unless you don’t believe in test and control.

See below the IBM Cognos app by IBM itself and the third-party app by Pervasive for SAP integration.

Citation-

http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016YGYEA2

and

http://sites.force.com/appexchange/listingDetail?listingId=a0N300000016am1EAA

IBM launches Smart Analytics Cloud

From http://www-03.ibm.com/systems/z/solutions/cloud/smart.html: IBM, the parent of SPSS, announced a Smart Analytics Cloud.
