John Sall sets JMP 9 free to tango with R

 


 

John Sall, co-founder of SAS and creator of JMP, has released JMP 9, the latest blockbuster edition of his flagship software (JMP stands for John’s Macintosh Program).

To kill all birds with one software, it is integrated with both R and SAS, and the brochure frankly lists all the qualities. Why am I excited about JMP 9’s integration with R and SAS? It combines manipulation of bigger datasets (thanks to SAS) with R’s superb library of statistical packages and a great statistical GUI (JMP). This makes JMP the latest software, after SAS/IML, Rapid Miner, Knime and Oracle Data Miner, to showcase its R integration (without getting into the GPL compliance need for showing source code: it does not ship R, and advises you to just freely download R). I am sure Peter Dalgaard and Frank Harrell are overjoyed that R Base and the Hmisc package will be used from JMP by fellow statisticians and students. JMP, after all, is made in the neighboring state of North Carolina.

Best of all, the JMP 30-day trial is free, so no money is lost if you download JMP 9 (and no, they don’t ask for your credit card number, or do they?). They do have a huuuuuuge form to fill in before you download. Still, JMP 9 the software is more thoughtfully designed than the email-prospect-leads form, and the extra functionality in the free 30-day trial is worth it.

Also see “New Features in JMP 9” at http://www.jmp.com/software/jmp9/pdf/new_features.pdf

which has this regarding R.

Working with R

R is a programming language and software environment for statistical computing and graphics. JMP now  supports a set of JSL functions to access R. The JSL functions provide the following options:

• open and close a connection between JMP and R

• exchange data between JMP and R

• submit R code for execution

• display graphics produced by R

JMP and R each have their own sets of computational methods.

R has some methods that JMP does not have. Using JSL functions, you can connect to R and use these R computational methods from within JMP.

Textual output and error messages from R appear in the log window. R must be installed on the same computer as JMP.

JMP is not distributed with a copy of R. You can download R from the Comprehensive R Archive Network Web site: http://cran.r-project.org

Because JMP is supported as both a 32-bit and a 64-bit Windows application, you must install the corresponding 32-bit or 64-bit version of R.

For details, see the Scripting Guide book.

and the download trial page (search-optimized URL):

http://www.sas.com/apps/demosdownloads/jmptrial9_PROD__sysdep.jsp?packageID=000717&jmpflag=Y

In related news (“Richest man in North Carolina also ranks nationally”, charlotte.news14.com), Jim Goodnight is now just as rich as Mark Zuckerberg, creator of Facebook, though they are probably not making a movie about Jim yet (imagine a movie titled “The Statistical Software”; it does not have quite the same dude feel as “The Social Network”).

See John’s latest interview :

The People Behind the Software: John Sall

http://blogs.sas.com/jmp/index.php?/archives/352-The-People-Behind-the-Software-John-Sall.html

Interview John Sall Founder JMP/SAS Institute

https://decisionstats.com/2009/07/28/interview-john-sall-jmp/

SAS Early Days

https://decisionstats.com/2010/06/02/sas-early-days/

So which software is the best analytical software? Sigh- It depends

 


 

Here is the software matrix that I am trying to develop for analytical software- It should help as a tentative guide for software purchases- it’s independent so unbiased (hopefully)- and it will try and bring as much range or sensitivity as possible. The list (rather than matrix) is of the format-

Type of analysis-

  • Data Visualization (Reporting with Pivot Ability to aggregate, disaggregate)
  • Reporting without Pivot Ability
  • Regression -Logistic Regression for Propensity or Risk Models
  • Regression- Linear for Pricing Models
  • Hypothesis Testing
  • A/B Scenario Testing
  • Decision Trees (CART, CHAID)
  • Time Series Forecasting
  • Association Analysis
  • Factor Analysis
  • Survey (Questionnaires)
  • Clustering
  • Segmentation
  • Data Manipulation

Dataset Size-

  • small dataset (up to X MB)
  • big dataset (up to Y GB)
  • enterprise class production BigData datasets (no limit)

Pricing of Software that can be used-

Ease of using Software

  • GUI vs Non GUI
  • Software that requires little training
  • Software that requires extensive training

Installation, Customization, Maintainability (or Support) for Software

  • Installation Dependencies- Size- Hardware (costs and  efficiencies)
  • Customization provided for specific use
  • Support Channels (including approximate Turn Around Time)

Software

  • Software I have used personally
  • SAS (Base, Stat, Enterprise, Connect, ETS), WPS, KXEN, SPSS (Base, Trends), Revolution R, R, Rapid Miner, Knime, JMP, SQL Server, Rattle, R Commander, Deducer
  • Software I know by reputation- SAS Enterprise Miner etc etc
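
As a rough illustration of how the list above could be turned into something queryable, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SoftwareEntry` fields, the `shortlist` helper, and the three catalog rows ("Tool A", "Tool B", "Tool C") are hypothetical placeholders, not ratings of any real product.

```python
from dataclasses import dataclass

# Dataset size classes from the list above, ordered from smallest to largest
SIZE_ORDER = {"small": 0, "big": 1, "enterprise": 2}


@dataclass(frozen=True)
class SoftwareEntry:
    """One row of the (hypothetical) purchase matrix."""
    name: str
    analyses: frozenset          # types of analysis the tool supports
    max_dataset: str             # "small", "big", or "enterprise"
    has_gui: bool
    needs_extensive_training: bool
    annual_cost_usd: float


def shortlist(entries, analysis, dataset, budget_usd, gui_required=False):
    """Return names of tools meeting every constraint, cheapest first."""
    matches = [
        e for e in entries
        if analysis in e.analyses
        and SIZE_ORDER[e.max_dataset] >= SIZE_ORDER[dataset]
        and e.annual_cost_usd <= budget_usd
        and (e.has_gui or not gui_required)
    ]
    return [e.name for e in sorted(matches, key=lambda e: e.annual_cost_usd)]


# Placeholder rows only; the numbers are invented for the example
catalog = [
    SoftwareEntry("Tool A", frozenset({"clustering", "time series forecasting"}),
                  "enterprise", True, True, 10000.0),
    SoftwareEntry("Tool B", frozenset({"clustering", "decision trees"}),
                  "small", True, False, 0.0),
    SoftwareEntry("Tool C", frozenset({"clustering"}),
                  "big", False, True, 2000.0),
]

print(shortlist(catalog, "clustering", "big", budget_usd=5000.0))
```

Support channels and installation dependencies are left out to keep the sketch short; they would be extra fields on the same record.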

Are there any other parameters for judging software?  let me know at http://twitter.com/decisionstats

Which software do we buy? -It depends


Often I am asked by clients, friends and industry colleagues about the suitability or unsuitability of particular software for analytical needs. My answer is mostly-

It depends on-

1) Cost of Type 1 error in purchase decision versus Type 2 error in Purchase Decision. (forgive me if I mix up Type 1 with Type 2 error- I do have some weird childhood learning disabilities which crop up now and then)

Here I define Type 1 error as paying more for software when equivalent functionality was available at a lower price, or buying components you do not need, like SPSS Trends (when only SPSS Base is required) or SAS/ETS (when SAS/Stat alone would do).

The first kind of overpayment is of course possible because of free tools with GUIs like R, R Commander and Deducer (Rattle does have a $500 commercial version).

The emergence of software vendors like WPS (for SAS language aficionados), which offers functionality similar to Base SAS, as well as the increasing convergence of business analytics (read predictive analytics) and business intelligence (read reporting), has led to brand clutter in which every product promises to do everything, at all sorts of prices, though each has specific strengths and weaknesses. To add to this, there are comparatively fewer independent analysts covering business analytics than covering business intelligence.

2) Type 2 Error- This is the opportunity cost of delayed projects, delayed business models, or lower accuracy: the consequences of buying lower-priced software that had less functionality than you required.

To compound the magnitude of error 2, you are probably in some kind of vendor lock-in, your software budget is exhausted because you bought too much or inappropriate software and hardware, and you could still do with some added help in business analytics. The fear of making a business-critical error is a substantial reason why open source software has to work harder at proving itself competent. Writing great software is not enough; we need great marketing to sell it, and great customer support to sustain it.
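
One way to weigh the two errors against each other is a simple expected-cost comparison. The Python sketch below is only an illustration under stated assumptions: the `expected_cost` and `cheaper_option` helpers, the shortfall probabilities and all the dollar figures are hypothetical, and real purchase decisions involve more than one number per tool.

```python
def expected_cost(price, p_shortfall, shortfall_cost):
    """Price paid plus the probability-weighted cost of the tool falling short.

    price          -- licence and ownership cost (the Type 1 side: overpaying)
    p_shortfall    -- assumed chance the tool lacks functionality you need
    shortfall_cost -- assumed cost of delays or lost accuracy (the Type 2 side)
    """
    return price + p_shortfall * shortfall_cost


def cheaper_option(options):
    """Return the (name, expected cost) pair with the lowest expected cost."""
    costs = {name: expected_cost(*params) for name, params in options.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical figures, for illustration only: (price, p_shortfall, shortfall_cost)
options = {
    "premium suite": (50_000, 0.05, 200_000),
    "low-cost tool": (5_000, 0.40, 200_000),
}
best_name, best_cost = cheaper_option(options)
print(best_name, round(best_cost))
```

With these made-up inputs the expensive option wins because a shortfall is so costly. Drop the shortfall cost to 20,000 and the low-cost tool wins instead, which is the "it depends" in numbers.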

As business decisions are made under constraints of time, information and money, I will try to create a software purchase matrix based on my knowledge of the software (its known and unknown strengths and weaknesses), pricing (versus budgets), and ranges of data handling. I will add an optimum approach based on known constraints, with some flexibility for unknown operational constraints.

I will restrict this matrix to analytics software, though you could certainly extend it to other classes of enterprise software including big data databases, infrastructure and computing.

Noted Assumptions- 1) I am vendor neutral and do not suffer from subjective bias or affection for particular software (based on conferences, books, relationships,consulting etc)

2) All software has bugs, so all of it needs customer support.

3) Every software package has particular strengths and weaknesses in terms of functionality.

4) Cost includes total cost of ownership and opportunity cost of business analytics enabled decision.

5) All software marketing people will praise their own software- sometimes over-selling and mis-selling product bundles.

The software compared is SPSS, KXEN, R, SAS, WPS, Revolution R and SQL Server, along with the various flavors and sub-components within them. The optimized approach will take into account parallel programming, cloud computing, hardware costs, and dependent software costs.

To be continued-

 

 

 

 

India to make own DoS -citing cyber security

After writing code for the whole world, the Indian DoD (Department of Defense) has decided to start making its own operating system, citing cyber security. Presumably they know all about embedded code in chips and sneak kill-code routines in dependent packages in operating systems, and would not be using Linus Torvalds’s original kernel (maybe the website was hacked to insert a small call k function 😉).

As the ancient Chinese said: may you live in interesting times. Still, cyber wars are better than real wars, and the Stuxnet virus is a case study in how countries can kill enemy plans without indulging in last-century tactics.

Source-Manick Sorcar, The great Indian magician

http://www.manicksorcar.com/cartoon33.jpg

http://timesofindia.indiatimes.com/tech/news/software-services/Security-threat-DRDO-to-make-own-OS/articleshow/6719375.cms

BANGALORE: India would develop its own futuristic computer operating system to thwart attempts of cyber attacks and data theft and things of that nature, a top defence scientist said.

Dr V K Saraswat, Scientific Adviser to the Defence Minister, said the DRDO has just set up a software development centre each here and in Delhi, with the mandate to develop such a system. This “national effort” would be spearheaded by the Defence Research and Development Organisation (DRDO) in partnership with software companies in and around Bangalore, Hyderabad and Delhi as also academic institutions like Indian Institute of Science Bangalore and IIT Chennai, among others.

“There are many gaps in our software areas; particularly we don’t have our own operating system,” said  Saraswat, also Director General of DRDO and Secretary, Defence R & D. India currently uses operating systems developed by western countries.

Read more: Security threat: DRDO to make own OS – The Times of India http://timesofindia.indiatimes.com/tech/news/software-services/Security-threat-DRDO-to-make-own-OS/articleshow/6719375.cms#ixzz1227Y3oHg

 

Top ten RRReasons R is bad for you ?


R is the programming language whose home is www.r-project.org

R is bad for you because –

1) It is slower with bigger datasets than the SPSS language and the SAS language. If you use bigger datasets, then you should either consider more hardware, or try and wait for some of the ODBC connect packages.

2) It needs more time to learn than the SAS language. Much more time to learn how to do much more.

3) R programmers are paid less than SAS programmers. They prefer it that way. They equate the satisfaction of creating a package in development with a worldwide community with the satisfaction of using a package and earning much more money per hour.

4) It forces you to learn the exact details of what you are doing due to its object-oriented structure. Thus you either get no answer or get an exact answer. Your customer pays you by the hour, not by the correct answers.

5) You can not push a couple of buttons or refer to a list of top ten most commonly used commands to finish the project.

6) It is free. And open for all. It is socialism expressed in code. Some of the packages are built by university professors. It is free. Free is bad. Who pays the mortgages of the software programmers if all software is free? Who pays for the Friday picnics? Who pays for the Good Night cruises?

7) It is free. Your organization will not commend you for saving them money; they will question why you did not recommend this before. And why did you approve all those packages that expire in 2011? R is fReeeeee. Customers feel good while spending money. The more software budgets you approve, the higher your salary is. R thReatens all that.

8) It is impossible to install a package you do not need or want. There is no one calling you on the phone to consider one more package or solution. R can make you lonely.

9) R mostly uses the command line. The command line is from the Seventies. Or the Eighties. The GUIs R Commander and Rattle are there, but still…..

10) R forces you to learn new stuff by the month. You prefer to only earn by the month. Till the day your job gets offshored…

Written by an R user in the English language

( which fortunately was not copyrighted otherwise we would be paying Britain for each word)

Ajay- The above post was reprinted by personal request. It was written in Jan 2009, and may not be truly valid now. It is meant to be taken in good humor, not so seriously.

BI Software

Here is the brand new release from Jaspersoft at a groovy price of $9,000. Somebody stop these guys!

It’s a great company to watch for buyouts as well, given their expertise in REPORTING and their clientele, especially for anyone looking to improve their standing in both the open source world and reporting software branding.

From AOL owned Arrogantion’s site http://www.crunchbase.com/company/jaspersoft

 

Total $24.5M
Series D, 8/07 1
Scale Venture Partners
SAP Ventures
Doll Capital Management
Partech International
Morgenthaler Ventures
$12M
Unattributed, 12/08 2
Adams Street Partners
Red Hat
Morgenthaler Ventures
Doll Capital Management
Partech International

 

 

The news-

Announcing JasperReports Server Professional


Webinar: Introducing JasperReports Server Professional

Thursday October 14

In this live webinar, learn how a new solution from Jaspersoft combines the world’s favorite reporting server with powerful, mature report server functionality—for about 80% less.

  • Date: Thu, Oct 14
  • Time: 10:00 AM PDT
  • Duration: 60 minutes

The World’s Most Powerful and Affordable Reporting Server

Limited Time Introductory Offer: Starting from $9,000 (restrictions apply)

JasperReports Server is the recommended product for organizations requiring an affordable reporting solution for interactive, operational, and production-based reporting. Deployed as a standalone reporting server or integrated inside another application, JasperReports Server is a flexible, powerful, interactive reporting environment for small or large enterprises.

Powered by the world’s most popular reporting tools in JasperReports and iReport, developers and users can take advantage of more interactivity, security, and scheduling of their reports.

Key Benefits:

  • Affordable: Unlimited reports for unlimited users starting at $9,000
  • Powerful: Report scheduling and distribution to 1,000s of users on a single server
  • Flexible: Web service architecture simplifies application integration
  • Secure: Centralized repository authenticates report access
  • Interactive: Easy to interact, self-serve parameterized-based reports
  • Visual appeal: Flash-based charts and maps engage users and enhance applications
  • Open: Access to any data source including relational, XML, Hibernate, EJB, POJO, and custom

 

Speaking of videos, here is a great video on BI from good ol’ Tennessee: a 27-minute tutorial on BI for newbies.

 

The SEO mess on joining blog aggregators

 


 

If you are an analytics blogger whose writing is aggregated on an analytics community, read on. Here’s how blog aggregation communities can help you lose 30% of all future traffic in the long term, while giving you a short-term lift.

The problem is not created by blogging communities (like R-Bloggers, or PlanetR, or Smart Data Collective or AnalyticBridge or even BeyeBlogs).

It is created by the way Google Page Rank is structured. Given exactly the same content on two different web pages, Google will rank the copy on the site with the higher Page Rank above the other. This is counterintuitive and quite simple to rectify: the Google spider can just use the time stamp to work out where the article was published first (obviously on your blog, and only later on the aggregator).
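
The time-stamp idea can be sketched in a few lines of Python. This is only an illustration of the suggestion, not a description of how Google actually handles duplicates; the `pick_originals` helper, the example URLs and the dates are all hypothetical, and it assumes publication times are known and trustworthy.

```python
import hashlib
from collections import defaultdict
from datetime import datetime


def pick_originals(pages):
    """Group pages with the same text and return the earliest URL per group.

    pages is an iterable of (url, published_datetime, text) tuples.
    """
    groups = defaultdict(list)
    for url, published, text in pages:
        # Normalise case and whitespace so trivially different copies match
        normalised = " ".join(text.lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        groups[digest].append((published, url))
    # min() compares the timestamp first, so the earliest copy wins
    return sorted(min(copies)[1] for copies in groups.values())


# Hypothetical pages: the blog post went up three hours before the aggregator copy
pages = [
    ("https://aggregator.example/post", datetime(2010, 10, 1, 12, 0), "Same  article TEXT"),
    ("https://myblog.example/post", datetime(2010, 10, 1, 9, 0), "Same article text"),
    ("https://other.example/a", datetime(2010, 10, 2, 8, 0), "Different text"),
]
originals = pick_originals(pages)
print(originals)
```

Exact hashing only catches verbatim copies; an aggregator that trims or reformats the post would need some fuzzier matching.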

How bad is the mess? Well, joining ANY blog aggregator will give you an instant lift of 10-50% over your current traffic as similar bloggers try and read about you. However, in the long term you can lose the roughly 30% of your traffic that comes from search engines.

So do you opt out of blog aggregation? No. It’s an SEO mess, and it’s unfair to punish your blog aggregator; most of them are running on ad-supported sponsors, or on their own funds and dry fumes, to publish your content. Most of the aforementioned communities were created by excellent people I have interacted with heavily, and they are genuinely motivated to give readers an easy way to keep up with blogs. Especially Smart Data Collective, AnalyticBridge and R-bloggers, whose founders I have known personally.

You can do one thing: create manual summaries in the excerpt feature of your blog posts (it’s just below the post editor on the WordPress page), and switch your RSS feed to summary rather than full. It avoids losing keyword rank to other websites, it prevents the blog aggregator from gaining too much influence in keyword-related searches, and it keeps your whole ecosystem happy. Best of all, it helps readers of blog aggregators, since most of them use a summary on the front page anyway.

An additional thought on Google Page Rank, something I have sulked over but not spoken about for a long, long time. It ignores the value of the reader. If Bill Gates, Steve Jobs, and 500 CEOs from Fortune 500 companies read my blog but do not link to it, it will count daily traffic as 500. Probably it will give more weightage to Paris Hilton fans.

A suggestion, humbly: you can use IP address lookup of visitors to see if traffic is coming from corporate sources or retail sources. Clicky from GetClicky does this. Use it as feedback in Google Analytics as well as Google Trends.

And maybe PageRank needs to add quantity and quality of visitors as additional variables. Do an A/B test, guys, with some chi-square juice. It’s not quite Mad Men advertising, but it’s still good fun.
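
For anyone who wants the chi-square juice spelled out, here is a minimal Python sketch of a 2x2 test on an A/B experiment. The `chi_square_2x2` helper and the click counts are hypothetical. It uses the plain Pearson statistic with no continuity correction, and it assumes every expected cell count is greater than zero.

```python
import math


def chi_square_2x2(success_a, n_a, success_b, n_b):
    """Pearson chi-square test (1 degree of freedom) for two proportions.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    observed = [
        [success_a, n_a - success_a],
        [success_b, n_b - success_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [success_a + success_b, (n_a - success_a) + (n_b - success_b)]
    total = n_a + n_b

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical experiment: clicks out of 1,000 searches under each ranking variant
stat, p = chi_square_2x2(success_a=120, n_a=1000, success_b=150, n_b=1000)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

A small p-value says the two variants really do differ in click rate; whether the difference is worth acting on is a business call, not a statistical one.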

 


 

and the world is one big community as per xkcd