Interview Paul van Eikeren Inference for R

visit this

http://decisionstats.posterous.com/decisionstats-interview-paul-van-eikeren-infe

Here is an interview with Paul van Eikeren, President and CEO of Blue Reference, Inc. Paul heads up a startup company addressing the need of information workers to have easier-cheaper-faster access to high-end data mining, analysis and reporting capabilities from software like R, S-PLUS, MATLAB, SAS, SPSS, Python and Ruby. His recent product, Inference for R, has been making waves within the analytical fraternity across both R users and SAS users, especially given that it is quite well designed, has a great GUI, and is priced rather reasonably.

A few weeks ago, rumour had it that SAS Institute was buying out the Inference for R product (note the merger and acquisition question below).

Rather curious to know about this company, I happened to meet Ben Hincliffe at http://www.analyticbridge.com, a site which, with 5000 members, has the largest number of data analytics members and many business intelligence members as well. Ben, who recently authored a guest post for Sandro at Data Mining Blog, then put across my request for an interview to Paul, the CEO of Blue Reference. Existing products from Blue Reference include additional analytical packages such as Inference for MATLAB.

Paul is an extremely seasoned person with years in the analytical fraternity and a PhD from MIT. Here is Paul’s vision for his company and for analytics product development.

Ajay: Describe your career journey. What advice would you give to today’s young people about pursuing careers in science?

Paul: I have been blessed with an extremely productive and diversified career journey. After receiving undergraduate and graduate degrees in chemistry, I taught chemistry and carried out research as a college professor for 14 years. I spent the next 12 years heading R&D teams at three different startup companies focused on the application of novel processing technology to drug discovery and development. Using that wealth of acquired experience, I have had the good fortune to successfully co-found and develop, with my son Josh, two startup companies (IntelliChem and Blue Reference) directed at the use of informatics to drive more efficient and effective research, development, manufacturing and operations.

In my journey I have had the opportunity to counsel many young people regarding their career choices. I have offered two principal pieces of advice: one, for the right person, science represents an outstanding opportunity for a productive and satisfying career; and two, a science education provides an outstanding stepping stone to careers in other fields. A study disclosed in a recent Wall Street Journal article (Sarah E. Needleman, “Doing the Math to Find the Good Jobs,” 26 January 2009) revealed that mathematicians land the top spot in the new rankings of the best occupations. Science-linked occupations took 7 out of the top 20 spots.

These ratings suggest that the problem solving and innovation aspects of scientific occupations are much less stressful than other occupations, which leads to high job satisfaction. But does one have to be a genius to have a successful career in science? An interesting read on this subject is the book by Robert Weisberg (Creativity: Beyond the Myth of the Genius) in which he dispels the myth of genius being the result of a genetic gift. Weisberg argues, convincingly, that a genius exhibits three elements: (1) a basic intellectual capacity; (2) a high level of motivation/determination, which enables the genius to remain focused; and (3) immersion in their chosen field, typically represented by over 10,000 hours of study/practice/experience. It turns out that the latter element is the principal differentiator, and fortunately, it is something one has control over.

Ajay: Describe the journey that Blue Reference has made leading to its current product line, including Inference for R.

Paul: The Inference product suite represents a natural extension beyond the Electronic Laboratory Notebook (ELN) product we developed at our previous company, IntelliChem. ELNs are used by scientists and technicians to document research, experiments and procedures performed in a laboratory. The ELN is a fully electronic replacement of the paper notebook. IntelliChem (sold to Symyx in 2004) was a leader in deployment of ELNs at global pharmaceutical companies.

After seeing the successful adoption of ELNs in the laboratory, we saw an opportunity to improve upon the utility of ELN documents and the data contained therein. Essentially, we developed Inference to be a platform for enabling MS Office documents with powerful, flexible, and transparent analytic capabilities – what we call “dynamic documents” or “document mashups”. Executable code from high-level scripting languages like R, MATLAB, and .NET, is combined with data and explanatory text in the document canvas to transform it from a static record into an analytic application.

The pharmaceutical industry, in cooperation with the FDA, has begun to look at ways to implement quality by design (QbD) practices as an alternative to quality by end-testing. QbD comprises a systematic application of predictive analytics to the drug R&D process such that development timelines and costs are reduced while drug safety and efficacy are improved.

Statistical modeling and analysis play a key role in QbD as tools for identifying critical quality attributes and confining their variability to a specified design space. Dynamic documents fit nicely into this paradigm, and we’re currently using Inference as a platform to develop an enterprise solution for QbD. You can visit http://www.InferenceForQbD.com for more information about our QbD product.

Along the way, we recognized the need for Inference outside of the pharmaceutical industry. The Inference for R, Inference for MATLAB, and Inference for .NET versions are meant to serve users of these technical computing languages who have analysis, publishing, reporting, collaboration, and reproducible research needs that are best served by a document-centric environment. By using Microsoft Word, Excel and PowerPoint as the “front end,” we can serve the 500 million users that use Microsoft Office as their principal desktop productivity application.

Ajay: What is the pricing strategy for Inference for MATLAB and Inference for R, and how do you see the current recession as an opportunity for analytical products?

Paul: Our strategy is to reach out to the market of Microsoft Office users that would benefit from easy access to data mining and predictive analytics capabilities within their principal desktop productivity tool. Accordingly, we have offered the Inference product at the low price of $199 for a single-user, one-year subscription. Additionally, because it is implemented on top of an existing installation of Microsoft Office, the costs of training, support and maintenance are expected to be minimal.

[Screenshot: Create a simple user interface for your R application]

[Screenshot: R code directly in Excel to customize your analysis]

[Screenshot: Graphical output in an Excel tab]

Ajay: Your product seems to be a nice fit, where both open source and proprietary packages from Microsoft (.NET) work together to give the customer a nice solution. Do you believe it is possible that big companies and big open source communities can work together to create software rather than just be at loggerheads?

Paul: Absolutely. We’re seeing momentum build for open source analytic solutions as the economy impacts companies, both small and large. We saw this take place in the back office with implementation of Linux and Apache Web servers, and now we’re starting to see it in the front office. Smart IT teams are looking for creative ways to stretch their resources, forcing them to look beyond established, but expensive, software products.

We’ve encountered concrete evidence of this in the financial industry. Fresh on the heels of the credit crisis, investment banks and hedge funds have begun to realize that their risk models and supporting software infrastructure are inadequate. In response, quantitative finance and risk analysts are increasingly turning to the open source R statistical computing environment for improved predictive analytics.

R has a core group of devotees in academia that drive innovation, making it a comprehensive venue for development of leading-edge data analysis methods. In order to leverage these tools, banks need a way for R to play nicely with their existing personnel and IT infrastructure. This is where Inference for R produces real value. It transforms MS Office into a platform for the development, distribution, and maintenance of R-based quantitative tools, enabling production-level predictive analytics.

Commercial distributions of R address issues of scalability and support, which might otherwise be subjects of concern. For example, REvolution Computing distributes an optimized, validated and supported distribution of R, providing peace of mind to corporate IT. REvolution also offers Enterprise R, a distribution of R for 64-bit, high performance computing.

Ajay: Please name any successful customer testimonials for Inference for R.

Paul: We have been working with the director of quantitative analytics at a large international bank. He reported that he has successfully distributed R applications to his team of research analysts and portfolio managers based on Inference in Excel. Use of this strategy eliminated the need to code complex models in Visual Basic for Applications, which is time consuming and error prone.

Ajay: Also, are there any issues with licensing and IP when mixing open source code and proprietary code?

Paul: The licensing issues with open source R pertain to distributing R. There are no licensing restrictions in using R. Accordingly, we do not distribute R. Rather, our customers install R separately and Inference recognizes the installation.

Ajay: So R is free and I can get Open Office for free. What are the five specific uses where Inference for R can score an edge over this and make me pay for the solution?

Paul: R is free, and many R enthusiasts would argue that all you need for R is a Linux operating system like Ubuntu, a text editor such as Emacs, and R’s command line interface. For some highly-skilled R users this is sufficient; for the new and average R user this is a nightmare.

Many people think that the largest fraction of the cost of implementing new software is the cost of the license. In actuality, and especially in the corporate world, it is the cost of training, user support, software maintenance, and the costs of switching the user base to the new software. Free open source software does not help here. Hence there is a strong ROI argument to be made for building new software applications on top of existing systems that have worked well.

Additionally, successful implementation of open source software like R requires a baseline of integration with existing systems. The fact is that Microsoft operating systems dominate the business world, as does Microsoft Office. If one is serious about using R to address the analytic needs of big business, tight integration with these systems is imperative.

Ajay: Any plans for a web hosted SaaS version for Inference for R soon?

Paul: The natural progression of Inference for R to SaaS will coincide with the next release of Office (Office 2010 or Office 14), which we expect to be largely SaaS enabled.

Ajay: Name some alliances and close partners working with Blue Reference, and what we can expect from you in terms of product launches in 2009.

Paul: We have created a product development consortium in partnership with ‘top ten’ global pharmaceutical companies. The consortium is guiding the development of an enterprise solution for Quality by Design (QbD), using Inference for R as the platform.

We are working with several consulting firms specializing in IT solutions for specialized markets like risk management and predictive analytics.

We are also working with several technology partners who have complementary products and where integration of their products with Inference provides clear and significant value to customers.

Ajay: Any truth to the rumors of an acquisition by a BIG analytics company?

Paul: Our business strategy is centered on growth through partnerships with others. Acquisition is one means to execute that strategy.

Ajay: How do you see this particular product (for R) shaping up down the years.

Paul: R’s success can be attributed, in large part, to the support of its loyal open source community. Its enthusiastic use in academia bodes very well for its growth as a cutting-edge analytics tool. It is just a matter of time before commercial analytic solutions powered by R become de rigueur. We’re happy to be at the tip of the spear.

Ajay: Any Asia plans for Blue Reference, or are you still happy with the Oregon location? How do you plan to interact with graduate schools and academia for your products?

Paul: Although we don’t have a major private university in our backyard, Oregon State University has opened a campus here. And we’ve been in dialogue with the global academic community from day one. Over 100 academic institutions around the world use Inference through our academic licensing program. Inference is a great tool for preparing dynamic lessons and publishing reproducible research.

Our Central Oregon location is home to a growing high-tech sector that we’ve been a part of for decades. We’ve had success building large and profitable companies here. Bend attracts Silicon Valley types who come here for vacation and don’t want to leave – they just can’t seem to resist the quality of life and bountiful recreational opportunities that this area offers. It’s a good mix of work and play.

Biography

Paul van Eikeren is President and CEO of Blue Reference, Inc. He is responsible for guiding the strategic direction of the company through novel products and services development, partnerships and alliances in the realm of application of informatics to faster-cheaper-better research, development, manufacturing and operations. Van Eikeren is a successful serial entrepreneur, which includes the co-founding of IntelliChem with his son Josh and its ultimate sale to Symyx Technologies. He has headed up R&D at several startup companies focused on drug discovery and development including Sepracor Inc., Argonaut Technologies, Inc., and Bend Research, Inc. He served as Professor of Chemistry and Biochemistry at Harvey Mudd College of Science and Engineering. He is author/co-author and inventor/co-inventor in over 50 scientific articles and patents directed at the application of chemical, biochemical and computational technologies. Van Eikeren holds a BA degree in Chemistry from Columbia University and a PhD in Chemistry from MIT.

Ajay: To know more, I recommend checking out the free evaluation at http://inferenceforr.com/ especially if you need to rev up your MS Office installation with greater graphics and analytics juice.

More R please

some R news

0 The R Foundation Website. I guess the http://www.r-project.org team is busy prettifying the website before the annual R users conference kicks in (I was told it has the aesthetic visual appeal of a dead cat splattered on the autobahn, a very HTML 4.0 kind of retro look).

I can’t believe the R site and R core honchos find the following image the prettiest one to represent the graphical abilities of R.

The R core site has tremendous functionality and demand, though I wonder if they can just put up some ads and get some funding or a two-way research tie-up with Google. Google uses R extensively, can help with online methods as well, and is listed as a supporting organization at http://www.r-project.org/foundation/memberlist.html

The R archives are a collection of emails, and that’s not documentation at all – but

1 The REvolution R website, and particularly David Smith’s blog, is a great way to stay updated on R news: http://blog.revolution-computing.com/

I have covered REvolution R before, and they are truly impressive.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

It seems the domain name revolutioncomputing.com was squatted (by NC?), so that’s why the hyphenated web name. It is a very lucid website, though I do request them to put up more video/podcasts, and a Tweet this button would be great :))

and another more techie post here

http://blog.revolution-computing.com/2009/05/verifying-zipfs-powerdistribution-law-for-cities.html

Another great source is Twitter. It seems that R users on Twitter use the hashtag #rstats to search for R news and code, which should help R bloggers and, at a later date, users.

Click here to check it out

http://search.twitter.com/search?q=%23rstats

2 Some more R forums and sites

Forum for R Enterprise Users http://www.revolution-computing.com/forum

An R Tips Site http://onertipaday.blogspot.com/

The R Journal (yes, there is a journal for all hard-working R fans) http://journal.r-project.org/

R on LinkedIn http://www.linkedin.com/groups?about=&gid=77616

and the Analytic Bridge community group for R

http://www.analyticbridge.com/group/rprojectandotherfreesoftwaretools

3 Here is a terrific post by Robert Grossman at http://blog.rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

I liked the way he built the case for using R on Amazon EC2 (business case, not use case) and then proceeded to a step-by-step tutorial: a simple and powerful blog post. I hope R comes out with a standardized online R doc like that, a single-point searchable archive for code, something like the SAS online doc (which remains free for WPS users 😉), but the way the web is evolving it seems the present mishmash method will continue.

Here are the main steps to use R on a pre-configured AMI.

Set up.
The set up needs to be done just once.

1. Set up an Amazon Web Services (AWS) account by going to:

aws.amazon.com.

If you already have an Amazon account for buying books and other items from Amazon, then you can use this account also for AWS.
2. Login to the AWS console
3. Create a “key-pair” by clicking on the link “Key Pairs” in the Configuration section of the Navigation Menu on the left hand side of the AWS console page.
4. Click on the “Create Key Pair” button, about a quarter of the way down the page.
5. Name the key pair and save it to a working directory, say /home/rlg/work.

Launching the AMI. These steps are done whenever you want to launch a new AMI.

1. Login to the AWS console. Click on the Amazon EC2 tab.
2. Click the “AMIs” button under the “Images and Instances” section of the left navigation menu of the AWS console.
3. Enter “opendatagroup” in the search box and select the AMI labeled “opendatagroup/r-timeseries.manifest.xml”, which is AMI instance “ami-ea846283”.
4. Enter the number of instances to launch (1), the name of the key pair that you have previously created, and select “web server” for the security group. Click the launch button to launch the AMI. Be sure to terminate the AMI when you are done.
5. Wait until the status of the AMI is “running.” This usually takes about 5 minutes.

Accessing the AMI.

1. Get the public IP address of the new AMI. The easiest way to do this is to select the AMI by checking the box. This provides some additional information about the AMI at the bottom of the window. You can copy the IP address there.
2. Open a console window and cd to your working directory which contains the key-pair that you previously downloaded.
3. Type the command:
ssh -i testkp.pem -X root@ec2-67-202-44-197.compute-1.amazonaws.com

Here we assume that the name of the key-pair you created is “testkp.pem.” The flag “-X” starts a session that supports X11. If you don’t have X11 on your machine, you can still login and use R but the graphics in the example below won’t be displayed on your computer.
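
The login step above can be wrapped in a small shell sketch. It assumes the key file is testkp.pem in the current working directory and reuses the example hostname from the command above; your own instance will have a different public DNS name. The build_ssh_cmd helper is hypothetical and only prints the command, so nothing is contacted until you run the printed line yourself. The chmod is there because ssh normally refuses a private key file that other users can read.

```shell
#!/bin/sh
# Sketch only: assemble the ssh login command for the AMI.
# KEY and HOST are assumptions taken from the example above; replace them
# with your own key-pair file and your instance's public DNS name.
KEY="testkp.pem"
HOST="ec2-67-202-44-197.compute-1.amazonaws.com"

# build_ssh_cmd is a hypothetical helper: it prints the command instead
# of running it, so you can inspect it first.
#   $1 = key file, $2 = host, $3 = "x11" to request X11 forwarding
build_ssh_cmd() {
    if [ "$3" = "x11" ]; then
        printf 'ssh -i %s -X root@%s\n' "$1" "$2"
    else
        printf 'ssh -i %s root@%s\n' "$1" "$2"
    fi
}

# Restrict the key to owner read-only if it is present, since ssh
# rejects private keys with permissions that are too open.
if [ -f "$KEY" ]; then
    chmod 400 "$KEY"
fi

build_ssh_cmd "$KEY" "$HOST" x11
```

Leaving out the third argument prints the same command without -X, which is the variant to use if you do not have X11 on your machine.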

Using R on the AMI.

1. Change your directory and start R

#cd examples
#R
2. Test R by entering a R expression, such as:

> mean(1:100)
[1] 50.5
>
3. From within R, you can also source one of the example scripts to see some time series computations:

> source('NYSE.r')
4. After a minute or so, you should see a graph on your screen. After the graph is finished being drawn, you should see a prompt:

CR to continue

Enter a carriage return and you should see another graph. You will need to enter a carriage return 8 times to complete the script (you can also choose to break out of the script if you get bored with all the graphs).
5. When you are done, exit your R session with a control-D. Exit your ssh session with an “exit” and terminate your AMI from the Amazon AWS console. You can also choose to leave your AMI running (it is only a few dollars a day).

Acknowledgements: Steve Vejcik from Open Data Group wrote the R scripts and configured the AMI.

Ajay: Terrific R companies, blogs, tweets, research and sites, but do let me know your feedback. Just un-other R day.

saP or saS or sasR or saaS

Some pending news and posts: it appears that the company SAP is moving closer to major acquisitions. This includes launching more and more applications that are analytical in nature, as well as coming together in an alliance with hardware major Teradata. Teradata, of course, is a very close partner of SAS Institute. So could SAP and SAS and/or Teradata be moving closer to a major announcement on BI and BA merging?

The open source database movement with Hadoop is the one which can be the real game changer in the managed database industry, and AsterData is the company to watch here.

However, R with its modular extensions is a different paradigm in language development, and SAS no longer has the nimbleness or flexibility to create such apps. At the same time it has lost a fair deal of credibility among young academics (due to R) as well as cost-sensitive consumers (due to WPS).

The succession issue of Jim Goodnight continues to be the biggest problem for SAS Institute- Jim is not getting younger and his second line is not expected to be of the same class as the Sall/ Goodnight partnership. Of all the major companies in software, Jim Goodnight stood alone in remaining private and thus managed to escape distractions of share prices while building up the franchise. Surviving oil shocks, cold wars, three recessions Mr Goodnight has cared for his local community as well despite being active in SAS and fending off sustained attempts by open source languages.

An automatic partner for Mr Goodnight should have been Google or even Google Labs, with the Brin/Page duo being the top data miners (commercially) of this generation, as Sall/Goodnight were 30 years ago.

SAP may spend a lot of its cash but the supply chain paradigm is best served by SaaS and exemplified by Salesforce.com and Force.com developers.

As the ancient Chinese said- May you live in interesting times.

Buddypress for Analytical Buddies??

Let us assume there are top 100 analysts in the world, mostly using WordPress or Typepad or Blogger to make posts.

Managing them is quite a challenge.

What is the marketing ROI of analyst relationships for a Business Intelligence vendor? Curt Monash is the Aerosmith of Business Intelligence analysts, so he can tell it better.

How about a magical community where you just use their mostly Feedburner or Feedblitz RSS feeds to create a self-automated community?

Search Engine Optimization can be tricked by keeping that community website free from Google or other search engines (yes, it can be done).

Use numerical etc. as in LinkedIn to spur rivalry by shifting their page positions up and down, or by clicking repeatedly on some posts to manipulate their views on blog posts.

What would SAS pay to have all SAS analysts on one webpage? Or SPSS to have all SPSS analysts on one webpage?

Six months later, suddenly open the website to search engines, and the RSS feed has downloaded all the posts of all the top 50 analysts of the world. Google advertising won’t matter because, hey, we have a mega vendor sponsor, while individual bloggers/analysts have no collective strength now, as the community is too big.

So much blah blah-

What software would you use?

You can choose between

Ning.com (but it is mostly not blog-feed based)

or Wordframe.com (whose interface and name sound suspiciously like WordPress software)

Or you can choose a customized WordPress solution called BuddyPress.

Here is the software-

BuddyPress

BuddyPress will transform an installation of WordPress MU into a social network platform.

BuddyPress is a suite of WordPress plugins and themes, each adding a distinct new feature. BuddyPress contains all the features you’d expect from WordPress but aims to let members socially interact.

Note this was just a generic case study for making a case for open source based community software. Resemblance to anything is a matter of coincidence, except for Curt Monash of course.

The cost of customized WordPress software for communities is a big zero: it is free and open source, and thousands of plugins can be installed and maintained for it.

See an existing installation here

www.decisionstats.com/community

or at www.buddypress.org

Mergers and Acquisitions: Analyzing them

Valuation of future cash flows is an inexact science. Too often it relies on flat historical numbers (we grew by 5% last year so next year we will grow by 10%).

To add to the fun, there is the agency conflict: managers’ priorities (in terms of stock options encashment) are different from owners’ priorities.

These are some ways you can track companies for analysis-

1) Make a Google Alert on Company Name

2) Track if there is a sudden and sustained spike in activity. It may be that the company is on a road show seeking like-minded partners, investors or mergers.

3) Watch for a sudden drop in news alerts. It may mean radio silence, or the company may be in negotiations.

4) Watch how the company starts behaving with traditional antagonists.
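
Points 2 and 3 can be made a little more concrete with a rough shell sketch. It assumes you have tallied the alerts into weekly counts yourself; the "week count" input layout, the sample numbers and the 2x and 0.5x thresholds are all illustrative assumptions, not anything Google Alerts provides. Each week is compared with the average of the weeks before it and flagged as a spike or a drop. It looks at single weeks only, so checking whether a spike is sustained is left to the reader.

```shell
#!/bin/sh
# Sketch only: flag weeks where alert volume jumps or collapses
# relative to the running average of the earlier weeks.
# Assumed input on stdin: "<week-label> <alert-count>" per line.
flag_alert_changes() {
    awk '
        NR > 1 && sum > 0 {
            avg = sum / (NR - 1)              # mean of all earlier weeks
            if ($2 >= 2 * avg)        print $1, "spike"
            else if ($2 <= 0.5 * avg) print $1, "drop"
        }
        { sum += $2 }                         # add this week to the history
    '
}

# Made-up weekly counts, for illustration only.
printf '%s\n' "w1 10" "w2 12" "w3 11" "w4 30" "w5 3" | flag_alert_changes
```

With these made-up counts the sketch flags w4 as a spike and w5 as a drop. Real alert volumes are noisy, so the thresholds would need tuning per company.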

The easiest word thrown in the melee is ethics, copyright violations or payments delayed.

I am pasting an extract by a noted and renowned analyst in the business intelligence field-

Curt Monash

His Professional opinion on SAP

SAP’s NetWeaver Business Warehouse software will soon run natively on Teradata’s database for high-end data warehousing and BI (business intelligence), the vendors announced Monday.

SAP and its BusinessObjects BI subsidiary already had partnerships and product integrations with Teradata. But the vendors’ many joint customers have been clamoring for more, and native Business Warehouse support is the answer, said Tim Lang, vice president of product management for Business Objects.

SAP expects the new capability to enter beta testing in the fourth quarter of this year, with general availability in the first quarter of 2010, according to a spokesman.

Under the partnership, SAP will be handling first-line support, according to Lang. Pricing was not available.

The announcement drew a skeptical response from analyst Curt Monash of Monash Research, who questioned how deeply SAP will be committed to selling its customers on Teradata versus rival platforms.

“Business Objects has long been an extremely important partner for Teradata. But SAP’s most important DBMS partner is and will long be IBM, simply because [IBM] DB2 is not Oracle,” Monash said.”

Credit-

http://www.infoworld.com/d/data-management/sap-and-teradata-deepen-data-warehousing-ties-088

and here are some words from Curt Monash’s personal views on SAP

Typical nonsense from SAP

Below, essentially in its entirety, is an e-mail I just received from SAP, today, January 3. (Emphasis mine.)

Thank you for attending SAP’s 4th Annual Analyst Summit in Las Vegas. We hope you found the time to be valuable. To ensure that we continue meeting your informational needs, please take a few moments to complete our online survey by using the link below. We ask that you please complete the survey before December 20. We look forward to receiving your feedback.

What makes this typical piece of SAP over-organization particularly amusing is that I didn’t actually attend the event. I was planning to, but after considerable effort I think I finally made it clear to VP of Analyst Relations Don Bulmer that I was fed up with being lied to* by him and his colleagues. In connection with that, we came to a mutual agreement, as it were, that I wouldn’t go.

*and lied about

Obviously, administrative ineptitude and dishonesty are two very different matters, united only by the fact that they both are characteristics of SAP, particularly its analyst relations group. Having said that, I should hasten to add that there are plenty of people at SAP I still trust. If Peter Zencke or Lothar Schubert tells me something, I expect it to be true. And it’s not just Germans; I feel the same way about Dan Rosenberg or Andrew Cabanski-Dunning, to name just a couple non-German SAP guys.

But I have to say this: both SAP’s ethics and its internal business processes are sufficiently screwed up as to cast doubt on SAP’s qualifications to run the world’s best-run businesses.

Source:

http://www.monashreport.com/2007/01/03/sap-nonsense-ethics/

Journalism ethics of course makes sure that journalists don’t get remuneration, or have to compulsorily declare benefits openly. This is not true for online journalism as it is still evolving.

Curt Monash is the granddaddy of all Business Intelligence journalists. He has been doing this and has seen it all since 1981 (I was 4 years old then).

Almost incorruptible, and therefore much respected, his Monash Report remains closely watched.

Some techniques to thwart Business Intelligence journalists are of course the tactics of

1) Fear

2) Uncertainty

3) Doubt

by planting false leaks, or favoring more pliable journalists over the ones who ask difficult questions.

Another way is to use Search Engine Optimization so that Google search is rendered ineffective for people trying to find and read difficult journalists.

Why did I start this thread?

Well, it seems the Business Intelligence world is coming to a round of consolidations and mergers. So will the trend of mega vendors, first mentioned by M Fauscette here, lead to a trend of mega journalist agencies as well, like a Fox News for all business intelligence journalists to report and get a share of the booty?

The Business Intelligence companies have long viewed analyst relationships as an unnecessary and uncontrollable marketing channel which they would like to see evolve.

Television ratings can be manipulated for advertising; similarly, you can manipulate views, page views and clicks on a website for website advertisement. The catch is that Google Trends may just give you the actual picture, but you can lie low by choosing not to submit to or ping Google during the initial days, and then, when the website is big enough in terms of viewers or contributing bloggers, you can safely ping Google, as the momentum would be inertial in terms of getting bigger and bigger.

http://www.mfauscette.com/software_technology_partn/2009/05/the-emergence-of-the-mega-tech-vendor-economy.html

Here are some facts as per companies-

1) For SAS Institute

a) WPS is launching its Desktop software which enables SAS language users to migrate seamlessly at 1/10 th of the cost of SAS Base and SAS Stat. It will include Proc Reg and Proc Logistic in this and have a huge documentation.

b) R – open source software is increasingly powerful to manipulate data. SAS/IML tried offering a peace hand but they would need to reconcile with the GPL conditions for R- so if it is a plugin the source code is open and so on

c) Inference for R may be acquired by SAS to get a limited liability stake in an R based user platform.

d) Traditional rival SPSS (the two have duked it out in analytics for 40 years) has a much better GUI and has launched a revamped brand, PASW. They are no longer distracted by a lawsuit which curiously accused them of stock manipulation, of which they were found innocent.

e) Jim Goodnight has been dominating the industry since 1975 and has managed to stay private despite three recessions and huge inducements (a wise move given the mess in the markets in 2008). Who will lead SAS with as much wisdom after Jim is an open question. Jim refused Microsoft some years back and is still very much in command; despite being isolated in terms of industry alliances, he remains respected. Pressure on him to rush into a merger may just backfire.

f) The politics of envy: SAS is hated by many analytics people just as in some corners people hate America. It is because it is number 1, and has been there too long. Did you mention anti-trust investigations? Well, WPS is based out of the UK, and the European Union takes competition much more seriously.

g) Long time grudges – SAS is disliked despite its substantial R and D investments, the care it takes of its employees, and local community. Naturally people who are excluded or were excluded at some point of time have resentments.

h) SAS has ambitions in Business Intelligence, where curiously it is not that expensive and is actually more efficient than other players. The recent salvo fired by Jim Davis declared business analytics to be better than business intelligence, a remark much resented by the cricket-loving British journalist Peter J Thomas.

http://peterthomas.wordpress.com/category/business-intelligence/sas-bi-ba-controversy/

Intellectuals can carry huge grudges for decades (Newton and Leibniz), or me with people who delay my interviews.

Teradata

1) Teradata has been a big partner with both SAS and SAP. It has also been losing ground recently in the same scenario SAS will shortly face.

It was also spun off in 2007-8 by its parent company, NCR.

http://it.toolbox.com/blogs/infosphere/against-the-flow-ncr-unacquires-teradata-13842

So will SAS buy Teradata?

Will SAP buy Teradata?

Will SAS merge with Teradata and be acquired by SAP, while reaching a compromise with both WPS and the R Project?

Will SAS call the bluff, make sincere efforts to reconcile with the GPL and academic community, give away multiple SAS Base and SAS Stat licenses in colleges and universities (in Asia, India, China) by expanding their academic program globally, start offering more coverage to JMP at a reduced price, and make a trust for succession?

I don't know. All I know is I like writing code and poetry. Any code that gets the job done.

Any poem that I want to write ( see scribd books on the right)

R or SAS —– R and SAS ?

http://support.sas.com/rnd/app/studio/Rinterface2.html

R Interface Coming to SAS/IML Studio

While readers of the New York Times may have learned about R in recent weeks, it’s not news to many at SAS.

"R is a leading language for developing new statistical methods," said Bob Rodriguez, Senior Director of Statistical Development at SAS. "Our new PhD developers learned R in their graduate programs and are quite versed in it."

R is a matrix-based programming language that allows you to program statistical methods reasonably quickly. It’s open source software, and many add-on packages for R have emerged, providing statisticians with convenient access to new research. Many new statistical methods are first programmed in R.

While SAS is committed to providing the new statistical methodologies that the marketplace demands and will deliver new work more quickly with a recent decoupling of the analytical product releases from Base SAS, a commercial software vendor can only put out new work so fast. And never as fast as a professor and a grad student writing an academic implementation of brand-new methodology.

Both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers.

"We know a lot of our users have both R and SAS in their tool kit, and we decided to make it easier for them to access R by making it available in the SAS environment," said Rodriguez. "Our first interface to R will be in an upcoming version of SAS/IML Studio (currently known as SAS Stat Studio), scheduled for this summer."

The SAS/IML Studio interface allows you to integrate R functionality with IML or SAS programs. You can also exchange data between SAS and R as data sets or matrices.

"This is just the first step," said Radhika Kulkarni, Vice President of Advanced Analytics. "We are busy working on an R interface that can be surfaced in the SAS server or via other SAS clients. For example, users will be able to interface with R through the IML procedure, possibly as soon as the first part of 2010."

SAS/IML Studio is distributed with SAS/IML software. Stay tuned for details on availability.

 

This is not to be correlated with the recent announcement by Mr Gentleman, who co-created the R language, that if needed they will take legal action if the terms of R's open source (GPL) licensing are not respected.

It is a sad day for science when Gentleman professors are issuing mild legal threats just to make sure some pseudo-science people are satisfied in their intellectual hubris, even though they themselves derived R from the S language. Revolution Computing does not want to be like the commercial maker of S Plus, so they are supporting this legal position. It is a sad day when lawyers have to enforce code sharing. Maybe the R Project should start updating their website, which looks like a wreck across the autobahn. Maybe Jim should visit the R users conference so the R Core team can see his horns.

Newton feuded with Leibniz, and in the later years of his life was tasked with enforcing the currency at the Royal Mint, which he did rigorously. Good for the world's currency, bad for science.

SAS commits $70 million to Cloud Computing

From the official SAS website

http://www.sas.com/news/preleases/CCF2009.html

SAS to build $70 million cloud computing facility

New cloud computing facility will support needed data-intensive customer solutions

CARY, NC (Mar. 19, 2009) SAS, the leader in business analytics software and services, announces today it is building a 38,000-square-foot cloud computing facility to provide the additional data-handling capacity needed to expand SAS OnDemand offerings and hosted solutions.

As the need for hosted solutions grows, new research and development jobs will be generated at SAS' Cary, N.C., world headquarters, where the majority of R&D employees (more than 1,400) are located.

"This project is proof that, despite the down economy, SAS continues to grow and innovate," said Jim Goodnight, CEO of SAS. "The growing demand by our customers for hosted solutions has given us this opportunity to invest even further in North Carolina and the Cary community."

In keeping with SAS' commitment to protecting the environment, the facility will be built to Leadership in Energy and Environmental Design (LEED) standards for water and energy conservation. The sustainable construction methods encourage recycling of materials, similar to the Executive Briefing Center under construction on the Cary campus. SAS' first LEED building, SAS Canada's headquarters in Toronto, opened in April 2006.

In keeping with LEED standards, about 60 percent of the project's construction and equipment spending will be in North Carolina. Approximately 1,000 people will be involved in its design and construction.

The facility will include two 10,000-square-foot server farms. Server Farm 1 is anticipated to be on-line mid-2010 and support growth for three to five years. Server Farm 2 will be constructed as a shell and will be populated with mechanical and electrical infrastructure once Server Farm 1 reaches 80 percent capacity. The facility will be built on SAS' Cary campus.

Apparently SAS Institute believes in creating jobs (and thousands of them) during the recession! Jim clearly is in top intellectual shape despite his, err, vintage. Imagine: with just a browser you could be crunching billions of bytes of data while sitting on a beach in Goa! Thankfully they did not believe the hot air that McKinsey put out on cloud computing (read here http://smartdatacollective.com/Home/17942 )