The $1 Bailout

Psst! Want to save your organization some money? Lead a shareholders' rally to accept a federal bailout of one dollar. That puts a cap of $500,000 on the CEO's salary. Easily the most profitable $1 your company ever borrowed.

Where would all the CEOs go? Japan, China, and India mostly pay less than this, unless you were the original owner of the company.

The top starting salary at my business school used to be $125,000 (that's for locations outside the US or UK). The top earner used to be resented by the whole batch as his name got splashed in the newspapers, prompting the rest of the parents to ask their children:

"Hey, why does he get so much while you don't?" (Now he won't.)

On a serious note: if so many companies are declaring losses that they will carry over on their books, they will actually get a tax benefit for the loss, as well as the bailout money at low rates of interest. Oh well, at least the CEO won't get rich (if you think that half a million dollars is not rich).

There is something wrong in a world where a tea stall owner (like in the movie Slumdog Millionaire)… Wait! There is something strange in a world where a tea stall owner makes more money than Citigroup and General Motors. Combined. But a thousand times less than their bosses.

Do you make more money than Bear Stearns and Lehman Brothers? Write in and share your perspectives.

Interview- Phil Rack

 

Phil Rack is the creator of the Bridge to R and the SAS Bridge to R, which enable both WPS and SAS software to connect to R. He is also a WPS reseller. WPS is a Base SAS equivalent that can read SAS code and SAS datasets, write SAS code, and create SAS datasets (and also has its own format), at the cost of $660 a license (almost one-tenth of a SAS Institute installation on network servers). Having worked in SAS language and analytics consulting for almost 26 years, Phil runs www.minequest.com besides running the SAS Consultants network, which mentors analytics consultants globally (I am an ex-member :))

Ajay- What has been your career journey? What advice would you give to someone entering a science career after high school?

Phil- I started out consulting full-time in 1983. I left an analytics job with McMillan-McGraw-Hill Publishing because I didn’t believe the company was investing in BI tools and training as it should. That was pretty early in terms of when BI was becoming important.

Many companies at that point saw BI as only two things:

(a) Ability to forecast sales and

(b) ad-hoc reporting with sums/totals and percentages.

It was obvious to me that I had to make a change to do the kind of work I wanted to do. In terms of training, I was formally trained as a demographer and did my graduate studies at Ohio State so I received a pretty good dose of quantitative subject matter as well as a unique perspective on the social implications of markets and geography. If I had to do it over again, I would probably take more course work in the subject area of the “Family.”  I’m always amazed how many times the work I do in banking and finance revolves around the family lifecycle.

 

Ajay- What has been the biggest project success you have seen in your consulting practice?

Phil- This goes back three years to a project where I was working on Basel II compliance with a commercial banking client that I just loved working with.

A few months into the engagement, they pulled me aside and asked me to put together an automotive portfolio stress test for them. This bank had very large loan exposures to the auto market for second and third tier suppliers to the Big Three as well as international auto manufacturers.

The Risk Management group and I sat down for a couple of days and pulled together a project plan and an outline of what we needed to be able to implement a dynamic Auto Risk Stress Test Model for this portfolio. The software used was SAS/Base and Excel, and the program allowed us to modify 50 to 60 parameters to model different scenarios. Altogether, it took perhaps three weeks to implement, and it was amazingly indicative of the fallout in the auto industry as well as foreshadowing some of the financial carnage in southeastern Michigan, such as lower property values and unemployment.
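The general shape of such a parameter-driven stress test can be sketched as follows. This is a hypothetical simplification, not the client's actual model: the parameters, the expected-loss formula (exposure × probability of default × loss given default), and the numbers are all illustrative.

```python
# Hypothetical sketch of a scenario-driven portfolio stress test: expected
# loss = exposure x probability of default (PD) x loss given default (LGD),
# with a scenario expressed as a multiplicative shock to the PDs.

def expected_loss(exposure, pd, lgd):
    return exposure * pd * lgd

def stress(portfolio, pd_shock=1.0):
    """Total expected loss after scaling each segment's PD by pd_shock."""
    return sum(expected_loss(e, min(pd * pd_shock, 1.0), lgd)
               for e, pd, lgd in portfolio)

# (exposure $, baseline PD, LGD) per loan segment -- made-up numbers
portfolio = [(1_000_000, 0.02, 0.45), (500_000, 0.05, 0.60)]

base = stress(portfolio)           # baseline scenario
downturn = stress(portfolio, 3.0)  # e.g. supplier PDs triple in a downturn
```

A real model with 50 to 60 parameters would shock exposures, LGDs, correlations, and macro drivers too; the point is that scenarios reduce to re-running one function with different inputs.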

 

Ajay- “It is not what you know, it is whom you know.” Please comment as a SAS consultant.

Phil- In terms of my business, 80-90% of the work I do is either based on prior work that I’ve done for that company or comes through referrals.  If you want to have a successful consulting career, you really have to pay attention to developing your network. I’ve taken advantage of social gatherings such as charity events and other social mixers to try to extend my network. I hand out a lot of business cards every year. Formal organizations exist here in Columbus, Ohio, such as TechColumbus.org, a dynamite organization that helps small tech businesses with networking, financing, access to different hardware platforms for testing, etc.  I have mixed emotions about the value of some social networks, however.

I see so many individuals on LinkedIn who have 5,000 connections that I have to wonder what it is these folks really do. Who has the time to read all the updates and postings from 5,000 people and still get work out the door? (Note from Ajay- I have 6,300 connections on LinkedIn. Ouch!)

 

Ajay- What motivated you to write the SAS to R and WPS to R bridges? (Which IS your favorite analytical tool, since you are active in all three?)

Phil- It started out as a “proof-of-concept” exercise and it just keeps growing. The WPS to R Bridge is a piece of software that I wrote originally for WPS users to access R from within the WPS Workbench. For those who are unfamiliar with WPS, it’s a SAS/Base alternative that is extremely compatible with your existing SAS/Base software, and your code is just plug-and-play. WPS doesn’t have the statistical capabilities of SAS such as SAS/STAT, ETS, OR, etc., so the idea was to write a bridge so that WPS users wouldn’t have to learn a new GUI/IDE to use R. The Bridge gives WPS users access to R graphics as well as any of the R statistical libraries, while keeping the advantage of the superior data handling of the SAS language. One of the new features is the ability of the WPS to R Bridge to run R programs in parallel. Depending on your hardware, you can easily run six to a dozen R programs simultaneously and collect the R listing and log files back into the WPS listing and log in the order you submitted the programs.
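The ordered-collection behavior Phil describes is the classic fan-out/fan-in pattern: results come back in submission order regardless of which job finishes first. A minimal sketch (in Python, standing in for the Bridge's actual mechanics, with a dummy `run_job` in place of launching an R program):

```python
# General pattern behind "submit jobs in parallel, collect logs in submission
# order": futures are gathered in the order they were created, not the order
# the underlying jobs happened to finish.
from concurrent.futures import ThreadPoolExecutor

def run_job(name):
    # stand-in for launching one R program and capturing its log/listing
    return f"log for {name}"

jobs = [f"program_{i}.R" for i in range(6)]

with ThreadPoolExecutor(max_workers=6) as pool:
    futures = [pool.submit(run_job, j) for j in jobs]  # submitted in order
    logs = [f.result() for f in futures]               # collected in same order
```

Iterating over the futures in their original list order is what guarantees the logs line up with the submission sequence.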

I did write a Bridge to R for SAS users but very few SAS users have expressed interest in it. I suppose that SAS users are happy enough paying the fat licensing fees to SAS that it just doesn’t matter to them. I have to say, my favorite tool at the moment is WPS. I find the interface/workbench to be so superior to what SAS has to offer that I now find myself writing code in WPS and then taking it over to SAS if that’s what the client requires.

 

Ajay- What do you think about internet-based delivery and social networking, including communities and lists, changing the software product cycle?

Phil- This somewhat goes back to question #3 in terms of communities. I think it has its value as a place to share your concerns and find answers to difficult programming issues. Now, Internet delivery and cloud computing I find very interesting. I think there are some strong advantages to using the cloud to provide services to your clients. If you look at the SAS pricing model, they really take it to you financially if you want to use your license to be a DSP (data service provider) or put your code on an intranet/internet. For some reason, SAS is just hostile when it comes to small and medium-sized businesses. Companies like World Programming, which licenses WPS, have a much more realistic idea of licensing, in that you can expose your WPS license to your intranet/internet and not have to pay 10x the fees that SAS charges. WPS doesn’t charge additional fees for DSPs either, and there are quite a few of them in the pharma domain.

Beyond the security challenges associated with cloud computing, I think SaaS offerings that provide analytical services, such as high-performance forecasting and name-and-address cleanup and verification, are ripe for the picking. One other issue I see with cloud computing arises when you have tens of gigs of data to move from your desktop or server to the cloud. The infrastructure just isn’t fast enough, or let’s say reasonably priced, for moving this amount of data to really scale well.

Ajay- How does MineQuest intend to influence the analytical software paradigm?

Phil- I think the role for MineQuest in the next few years is twofold.

We’ll keep offering services to banks and other financial service firms in the area of Operational Risk and SAS programming.

The other area is to help these large financial service companies realize that they can save millions of dollars by moving their SAS Server licenses to WPS. This also allows the smaller businesses who have steered away from SAS software because of cost to begin using WPS without taking such a big financial hit. I find it exciting to think how this will also open the job market for the thousands of SAS programmers already out there.

The BI battles are taking place on the desktop and Windows Servers and MineQuest has invested a lot of time and effort in creating macro libraries to help these organizations migrate their code to WPS and access R for advanced statistical capabilities.

We believe that the bread and butter software for almost any financial organization in the BI realm ultimately revolves around the SAS language for reporting, summarization and disbursement of data and we plan to continue to serve that market.

About MineQuest –

MineQuest has been providing SAS Consulting and Programming Services for more than 25 years. Our associates and employees are expert SAS programmers who specialize in the Banking and Financial Industries. Our staff has expertise in such areas as Market Analytics, ETL and Reporting Systems, Fraud Detection, and the Credit Risk and Operational Risk segments. Validating Operational Risk models using SAS, in support of the Basel II Capital Framework, is one of our specialties. We have real-world experience developing SAS software to test and validate Credit and Operational Risk systems such as Fair Isaac’s Blaze Advisor, one of our areas of subject matter expertise.

MineQuest, LLC

SAS & WPS Consulting and WPS Reseller

Tel: (614) 457-3714

Web: www.MineQuest.com

Blog: www.MineQuest.com/WordPress


(Ajay – The SAS language mainly uses PROCs and the DATA step for output and input. Base SAS is a product copyrighted by the SAS Institute (www.sas.com). The SAS Institute has been leading the analytics world since the 1970s. WPS is copyright of World Programming Company (WPC) (www.teamwpc.co.uk/products/wps).)


Interview – Michael Zeller, CEO, Zementis

As mentioned before, Zementis is at the forefront of using cloud computing (Amazon EC2) for open source analytics. Recently I came in contact with Michael Zeller for a business problem, and Mike, being the gentleman he is, not only helped me out but also agreed to an extensive and exclusive interview.


Ajay- What are the traditional rivals to the scoring solutions offered by you? How does ADAPA compare to each of them? Case study: assume I have 50,000 leads daily on a car-buying website. How would ADAPA help me in scoring the model (created, say, in KXEN, R, SAS, or SPSS)? What would my approximate cost advantages be if I intend to mail, say, the top 5 deciles every day?

Michael- Some of the traditional scoring solutions used today are based on SAS, on in-database scoring in Oracle or MS SQL Server, or very often even on custom code.  ADAPA is able to import models from all tools that support the PMML standard, so any of the above tools, open source or commercial, could serve as an excellent development environment.

The key differentiators for ADAPA are simple and focus on cost-effective deployment:

1) Open Standards – PMML & SOA:

Freedom to select best-of-breed development tools without being locked into a specific vendor;  integrate easily with other systems.

2) SaaS-based Cloud Computing:

Delivers a quantum leap in cost-effectiveness without compromising on scalability.

In your example, I assume that you’d be able to score your 50,000 leads in one hour using one ADAPA engine on Amazon.  Therefore, you could choose to either spend US$100,000 or more on hardware, software, maintenance, IT services, etc., write a project proposal, get it approved by management, and be ready to score your model in 6-12 months…

OR, you could use ADAPA at something around US$1-$2 per day for the scenario above and get started today!  To get my point across here, I am of course simplifying the scenario a little bit, but in essence these are your choices.

Sounds too good to be true?  We often get this response, so please feel free to contact us today [http://www.zementis.com/contact.htm] and we will be happy to show you how easy it can be to deploy predictive models with ADAPA!
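As a rough check on the arithmetic above, here is a sketch using only the interview's illustrative figures (about one machine-hour per day at US$1-2 per hour, versus a US$100,000 upfront build); none of these are published prices:

```python
# Back-of-envelope comparison of pay-per-use cloud scoring versus an
# upfront on-premise build, using the interview's illustrative numbers.

def cloud_cost(days, hours_per_day=1.0, rate_per_hour=2.0):
    """Machine-hours consumed times the hourly rate (upper-bound $2/hour)."""
    return days * hours_per_day * rate_per_hour

UPFRONT = 100_000  # ballpark for hardware, software, and IT services

one_year = cloud_cost(365)                # a year of daily scoring, in dollars
breakeven_days = UPFRONT / cloud_cost(1)  # days before cloud spend hits $100k
```

Even at the $2/hour upper bound, a year of daily one-hour scoring runs to a few hundred dollars, and the pay-per-use spend takes decades of daily runs to reach the upfront figure.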

 

Ajay- The ADAPA solution seems to save money on both hardware and software costs. Please comment. Also, have you done any benchmarking tests of a traditional scoring configuration versus ADAPA?

Michael-Absolutely, the ADAPA Predictive Analytics Edition [http://www.zementis.com/predictive_analytics_edition.htm] on Amazon’s cloud computing infrastructure (Amazon EC2) eliminates the upfront investment in hardware and software.  It is a true Software as a Service (SaaS) offering on Amazon EC2 [http://www.zementis.com/howtobuy.htm] whereby users only pay for the actual machine time starting at less than US$1 per machine hour.  The ADAPA SaaS model is extremely dynamic, e.g., a user is able to select an instance type most appropriate for the job at hand (small, large, x-large) or launch one or even 100 instances within minutes.

In addition to the above savings in hardware/software, ADAPA also cuts the time-to-market for new models (priceless!) which adds to business agility, something truly critical for the current economic climate.

Regarding a benchmark comparison, it really depends on what is most important to the business.  Business agility, time-to-market, open standards for integration, or pure scoring performance?  ADAPA addresses all of the above.  At its core, it is a highly scalable scoring engine which is able to process thousands of transactions per second.  To tackle even the largest problems, it is easy to scale ADAPA via more CPUs, clustering, or parallel execution on multiple independent instances. 

Need to score lots of data once a month, a job that would take 100 hours on one computer?  Simply launch 10 instances and complete the job in 10 hours overnight.  No extra software licenses, no extra hardware to buy: that’s capacity truly on-demand, whenever needed, and cost-effective.
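The scaling claim generalizes neatly, assuming the job splits cleanly across instances and ignoring data-transfer overhead: wall-clock time falls with the instance count while the metered cost stays flat. A sketch:

```python
# Embarrassingly parallel batch scoring: N instances cut wall-clock time by
# ~N while total machine-hours (and thus the pay-per-hour bill) stay constant.

def wall_hours(total_machine_hours, instances):
    """Elapsed time when the work splits evenly across instances."""
    return total_machine_hours / instances

def metered_cost(total_machine_hours, rate_per_hour=1.0):
    # the bill depends only on machine-hours consumed,
    # not on how they are spread across instances
    return total_machine_hours * rate_per_hour
```

So the "100 hours on one machine" job and the "10 hours on ten machines" job cost the same under per-hour metering; only the calendar time differs.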

Ajay- What has been your vision for Zementis? What exciting products are we going to see from it next?

Michael – Our vision at Zementis [http://www.zementis.com] has been to make it easier for users to leverage analytics.  The primary focus of our products is on the deployment side, i.e., how to integrate predictive models into the business process and leverage them in real-time.  The complexity of deployment and the cost associated with it has been the main hurdle for a more widespread adoption of predictive analytics. 

Adhering to open standards like the Predictive Model Markup Language (PMML) [http://www.dmg.org/] and SOA-based integration, our ADAPA engine [http://www.zementis.com/products.htm] paves the way for new use cases of predictive analytics — wherever a painless, fast production deployment of models is critical or where the cost of real-time scoring has been prohibitive to date.

We will continue to contribute to the R/PMML export package [http://www.zementis.com/pmml_exporters.htm] and extend our free PMML converter [http://www.zementis.com/pmml_converters.htm] to support the adoption of the standard.  We believe that the analytics industry will benefit from open standards and we are just beginning to grasp what data-driven decision technology can do for us.  Without giving away much of our roadmap, please stay tuned for more exciting products that will make it easier for businesses to leverage the power of predictive analytics!

Ajay- Any India or Asia specific plans for Zementis?

Michael-Zementis already serves customers in the Asia/Pacific region from its office in Hong Kong.  We expect rapid growth for predictive analytics in the region and we think our cost-effective SaaS solution on Amazon EC2 will be of great service to this market.  I could see various analytics outsourcing and consulting firms benefit from using ADAPA as their primary delivery mechanism to provide clients with predictive  models that are ready to be executed on-demand.

Ajay- What do you believe will be the biggest challenges for analytics in 2009? What are the biggest opportunities?

Michael-The biggest challenge for analytics will most likely be the reduction in technology spending in a deep, global recession.  At the same time, companies must take advantage of analytics to cut cost, optimize processes, and to become more competitive.  Therefore, the biggest opportunity for analytics will be in the SaaS field, enabling clients to employ analytics without upfront capital expenditures.

Ajay – What made you choose a career in science? Describe your journey so far. What would your advice be to young science graduates in these recessionary times?

Michael- As a physicist, my research focused on neural networks and intelligent systems.  Predictive analytics is a great way for me to stay close to science while applying such complex algorithms to solve real business problems.  Even in a recession, there is always a need for good people with the desire to excel in their profession.  Starting your career, I’d say the best way is to remain broad in expertise rather than being too specialized in one particular industry or proficient in a single analytics tool.  A good foundation of math and computer science, combined with curiosity about how to apply analytics to specific business problems, will provide opportunities, even in the current economic climate.

About Zementis

Zementis, Inc. is a software company focused on predictive analytics and advanced Enterprise Decision Management technology. We combine science and software to create superior business and industrial solutions for our clients. Our scientific expertise includes statistical algorithms, machine learning, neural networks, and intelligent systems, and our scientists have a proven record of producing effective predictive models to extract hidden patterns from a variety of data types. It is complemented by our product offering ADAPA®, a decision engine framework for real-time execution of predictive models and rules. For more information, please visit www.zementis.com.

Ajay- If you have a lot of data (GBs and GBs of it) and an existing model (in SAS, SPSS, or R) which you have converted to PMML, and it is time to choose between spending more money to upgrade your hardware or renew your software licenses, take a look instead at ADAPA from www.zementis.com and score models for as low as $1 per hour. Check it out (test and control!).

Do you have any additional queries for Michael? Use the comments page to ask…

SPSS and R

I rarely use SPSS now, but in college (www.iiml.ac.in) my marketing professors kind of ensured I was buried in it for weeks. Much later I did some ARIMA forecasting in SPSS for macroeconomic indicator prediction (details coming up).

 

However, the SPSS help list is a great one (SPSSX-L@LISTSERV.UGA.EDU), not just for staying in touch with SPSS but also with the latest statistical modeling techniques. Here is an extract from the list (www.listserv.uga.edu/archives/spssx-l.html) on using SPSS and R together.

 

Assuming version 16 or later, you need to install the R plug-in from Developer Central.  Then your R syntax can be run in the syntax window between

BEGIN PROGRAM R.

and

END PROGRAM R.

The output automatically appears in the SPSS Viewer with two cautions.  1) In version 16, R graphics are written to files and don’t appear in the Viewer.  Version 17 integrates the graphics directly.  2) When using R interactively, expression output appears in your console windows, e.g.,

summary(dta)

displays the summary statistics for a data frame, dta.  In non-interactive mode, which is what you are in when running BEGIN PROGRAM, you need to enclose the expression in a print function for it to display, e.g.,

print(summary(dta))

The documentation for the APIs that communicate between SPSS and R is installed along with the plug-in, and there are examples in the Data Management book linked on Developer Central (www.spss.com/devcentral).

You might also go through the PowerPoint article on Developer Central, "Programmability in SPSS Statistics 17", which you will find on the front page of the site.  It includes a detailed example of using the R Quantreg package in SPSS as an extension command.  There is also a download in the R section on creating an SPSS dialog box that generates an R program directly.  Look for Rboxplot – Creating an R Program from a Dialog.  This has a simple dialog box that generates code for an R boxplot along with an article that explains what is happening.
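Putting the pieces above together, the overall shape of an R block in the syntax window looks roughly like this. The data frame here is a hypothetical stand-in; in practice you would pull the active SPSS dataset through the plug-in API documented with the install:

```spss
BEGIN PROGRAM R.
# hypothetical data frame; in practice, fetch the active dataset
# via the R plug-in API installed with the integration package
dta <- data.frame(x = 1:10)
# non-interactive mode: wrap expressions in print()
# so the output reaches the SPSS Viewer
print(summary(dta))
END PROGRAM R.
```

Note the `print()` wrapper: as the extract explains, bare expressions that would echo to the console in interactive R produce no Viewer output inside BEGIN PROGRAM.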

 

Ajay’s 2 cents: SPSS treats R as an opportunity rather than a threat, partly because SPSS is much lower priced software, and it has been working, so far in vain, to displace SAS for some time now.

SAS (the company, not the language), as the market leader, has the most to lose due to:

  • its high market share (which it has maintained by aggressively pursuing legal action as well as investing generously in hosting conferences, papers, and research, and in keeping alumni and current employees happy and loyal);

  • premium pricing (which comes under greater pressure amid a general economic downturn among its preferred customers, especially banks and companies like Amazon and GE Money);

  • multi-pronged competition, with tacit support from bigger players waiting on the sidelines: IBM has an alliance with WPS, which is almost a de facto Base SAS clone since it can read SAS datasets and SAS code and output both, besides having its own Eclipse-based design for the Workbench; Microsoft is expanding data mining capabilities in SQL Server alongside initiatives like Microsoft Azure (an OS for cloud computing) and Microsoft Mesh; and open source players like R, KNIME, and RapidMiner are gaining commercial momentum due to better value for cost (zero);

  • and data and code portability between SAS, SPSS, and R, thanks to the PMML standard, which means switching barriers are getting lower. There are almost no switching barriers between Base SAS and WPS in my testing experience.

The coming market share battles between SAS, WPS, and R will be interesting for analysts and customers to watch, that is, if the current economic crisis doesn’t claim any of the companies or their clients first. Alliances, as well as community networking among users and developers, could be critical.

Still, innovation flows from the creative destruction of old ideas, mindsets, attitudes and, yes, even software.

SAS-L: Get Out the Vote

Voting is still on for SAS-L Rookie of the Year.

One of the earliest rookies nominated from India is, ahem, me, which is pretty rare.

Name                # of 2007   # of 2008
                      posts       posts
Ajay Ohri                0          351
Joe Matise               0          250
Karma Dorjetarap         4          219
Akshaya Nathilvar        0          169
Scott Bucher             0          154
Stefan Pohl              5          148
Richard Wright           0           79
Choy Junyu               0           73
Scott Raynaud            0           69
SAS 9 BI USER            0           55
Tom Smith                0           53
Jim Agnew                0           52
Josh Lee                 0           51

 

If you are on the SAS-L list, you can vote (one vote per person, please) at:
http://ires.ku.edu/~ipsr/SGF2009/saslbof.htm
Voting will end February 12th.

KNIME

Check out KNIME from www.knime.org if you are looking for modular data extensibility and the ability to do exploratory analysis. You can use it for data modeling using decision trees and extend it further. Thanks to Bob Schultz and the REvolution Computing guys, as well as Mike Zeller of www.zementis.com, for pointing me this way.

  • Yes, they (KNIME.org) have a commercial version as well as a free version.
  • No, they won’t charge you hidden costs, including training or learning time.
  • Yes, they do use PMML for porting data between platforms.
  • Best of all, it is a great website with video tutorials and segmented data downloads (German efficiency!), while the www.r-project.org website is functional but uses HTML 4.0 (which is from the seventies, or the eighties, or whatever).
  • No, they won’t charge you for it!

From the website –

Welcome!

KNIME, pronounced [naim], is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.

KNIME was developed (and will continue to be expanded) by the Chair for Bioinformatics and Information Mining at the University of Konstanz, Germany. The group headed by Michael Berthold is also using KNIME for teaching and research at the University. Quite a number of new data analysis methods developed at the chair are integrated in KNIME. Let us know if you are looking for something in particular, not all of those modules are part of the standard KNIME release just yet…

The KNIME base version already incorporates over 100 processing nodes for data I/O, preprocessing and cleansing, modelling, analysis and data mining, as well as various interactive views, such as scatter plots, parallel coordinates and others. It integrates all analysis modules of the well-known Weka data mining environment, and additional plugins allow R scripts to be run, offering access to a vast library of statistical routines.

KNIME is based on the Eclipse platform and, through its modular API, easily extensible.

[KNIME workbench screenshots]

Coming up- A technical comparison of KNIME with RapidMiner (http://rapid-i.com/content/view/26/82/), which is similar in both its free version and its commercial licensing. These are both data mining rather than predictive analytics products.

I wish someone would host both KNIME and RapidMiner on the cloud using the Ohri Framework (a joke which began on the SAS Consulting Group) on a 64-bit Windows OS, with remote-desktop-like functionality. Then I could just log in, upload the data, press a button, wait a second, and download the results.

Sigh!

(All screenshots in this post are acknowledged to www.knime.org.)