Software HIStory: Bass Institute Part 1

or How SAS Institute needs to take more seriously the competition from WPS (a SAS-language compiler) in alliance with IBM, from R (open-source predictive analytics with tremendous academic support), and the financial pressure from Microsoft and SAP.

On the weekend, I ran into Jeff Bass, owner of BASS Institute. BASS Institute provided a SAS-like compiler in the 1980s. It was very light compared to the then-clunky SAS (which shipped on multiple floppies) and sold many copies. The company ran out of money when the shift to PCs happened and SAS Institute managed to get there first.

Today the shift is to cloud computing, and though SAS has invested $70 million in it, it still continues to support Microsoft by not supporting, or even offering financial incentives for, customers to use Ubuntu Linux server and Ubuntu Linux desktop. For academic students it charges $25 per Windows license, thus helping to sell many more copies of Windows Vista. Why does it not give an Ubuntu Linux version free to students? Why does SAS Institute continue to give the online documentation free to people who use its language, undercutting itself? More importantly, why does SAS charge less money for excellent software in the BI space? It has some of the best and cheapest BI software and the most expensive desktop software. Why does SAS Institute not support Hadoop and Map/Reduce database systems instead of focusing on Oracle and Teradata relationships and feelings?

Anyways, back to Jeff Bass. This is part 1 of the interview.

Ajay- Jeff, tell us all about the BASS Institute?

Jeff-

The BASS system has been off the market for about 20 years and is an example of old, command-line, DOS-based software that has been far surpassed by modern products – including SAS for the PC platform.  It was fun providing a “SAS like” language for people on PCs – running MS-DOS – but I scrapped the product when PC SAS became a reasonably usable product and PCs got enough memory and hard disk space.
 
BASS was a SAS “work alike”…it would run many (but certainly not all) SAS programs with few modifications.  It required a DOS PC with 640K of RAM and a hard disk with 1MB of available space.  We used to demo it on a Toshiba laptop with NO hard disk and only a floppy drive.  It was a true compiler that parsed the data / proc step input code and generated 8086 assembly language that went through mild optimization, and then executed.
 
I no longer have the source code…it was saved to an ancient Irwin RS-232 tape drive onto tapes that no longer exist…it is fun how technology has moved on in 20 years!  The BASS system was written in Microsoft Pascal and the code for the compiler was similar to the code that would be generated by the Unix YACC “compiler compiler” when fed the syntax of the SAS data step language.  BASS included the “DATA Step” and the most basic PROCS, like MEANS, FREQ, REG, TTEST, PRINT, SORT and others.  Parts of the system were written in 8086 assembler (I have to smile when I remember that).  If I was to recreate it today, I would probably use YACC and have it produce R source code…but that is an idea I am never likely to spend any time on.
 
We sold quite a few copies of the software and BASS Institute, Incorporated was a going concern until PC SAS became debugged and reliable.  Then there was no point in continuing it.  But I think it would be fun for someone to write a modern open source version of a SAS compiler (the data step and basic procs were developed in the public domain at NC State University before Sall and Goodnight took the company private, so as long as no copyrighted code was used in any way, an open source compiler would probably be legal).
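As a toy illustration of the YACC-to-R idea Jeff mentions, here is a minimal sketch in Python (not YACC) that translates one trivial SAS DATA step pattern into R source. The parsing is deliberately naive, handles only a single derived variable, and every name in the example is hypothetical; a real compiler would use a full grammar.

```python
import re

def sas_datastep_to_r(sas_code: str) -> str:
    """Translate a trivial SAS DATA step of the form
         data out; set in; newvar = expression; run;
       into equivalent R source.  Toy sketch only -- a real compiler
       would use a full grammar (e.g. generated by YACC/bison)."""
    m = re.search(
        r"data\s+(\w+);\s*set\s+(\w+);\s*(\w+)\s*=\s*([^;]+);\s*run;",
        sas_code,
        flags=re.IGNORECASE,
    )
    if not m:
        raise ValueError("unsupported DATA step")
    out, src, var, expr = m.groups()
    # SAS uses ** for exponentiation; R uses ^
    expr_r = expr.strip().replace("**", "^")
    # Copy the source data frame, then add the derived column.
    return f"{out} <- {src}\n{out}${var} <- with({src}, {expr_r})"

print(sas_datastep_to_r("data out; set in; bmi = wt / ht**2; run;"))
```

Each DATA step statement maps fairly directly onto a data-frame operation, which is why the "compile to R" idea is plausible even if, as Jeff says, nobody is likely to spend time on it.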
 
I still use SAS (my company has an enterprise license), but only very rarely.  I use R more often and am a big fan of free software (sometimes called open source software, but I like the Free Software Foundation’s distinction at fsf.org).  I appreciated your recommendation of the book “R for SAS and SPSS Users” on your website.  I bought it for my Kindle immediately upon reading about it there.

I no longer work in the software world; I’m a reimbursement and health policy director for the biotech firm Amgen, where I have worked since 1990 or so.  I also serve on the boards of a couple of non-profit organizations in the health care field.


Ajay- Any comments on WPS?

Jeff- I’m glad WPS is out there.  I think alternatives help keep the SAS folks aware that they have to care about competition, at least a little 😉

(Note from Ajay-

You can see more on WPS at http://www.teamwpc.co.uk/home

and on SAS at http://www.sas.com/ )


Interview – Anne Milley, SAS Part 1

Anne Milley has been a part of SAS Institute’s core strategy team.

She was in the news recently with an article by the legendary Ashlee Vance in the Bits blog of The New York Times: http://bits.blogs.nytimes.com/2009/02/16/sas-warms-to-open-source-one-letter-at-a-time/

In the article, Ms. Milley said, “I think it addresses a niche market for high-end data analysts that want free, readily available code. We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.”

To her credit, Ms. Milley addressed some of the critical comments head-on in a subsequent blog post.

This sparked my curiosity to know Anne and her perspective beyond a single-line quote, so here is an interview. This is part 1 of the interview.

Ajay- Describe your career journey, both outside of and within SAS Institute. What advice would you give to young high school students to pursue careers in science? Do you think careers in science are as rewarding as other careers?

Anne-

Originally, I wanted to major in international business to leverage my German (which is now waning from lack of use!).  I found the marketing and management classes at the time provided little practical value and happily ended up switching to the college of social science in the economics department, where I was challenged with several quantitative courses and encouraged to always have an analytical perspective.  In school, I was exposed to BASIC, SPSS, SHAZAM, and SAS.  Once I began my thesis (bank failure prediction models and the term structure of interest rates) and started working, it was SAS that served as the best software investment, both in banking (Federal Home Loan Bank of Dallas) and in retail (7-Eleven Corp.).  After 5+ years in Dallas, my husband wanted to move back to New England and SAS happened to be opening an office at the time.  From there, I enjoyed a few years as a pre-sales technical consultant, many years in analytical product management, and most recently in product marketing.  All the while, it has been a great motivating factor to work with so many talented people focused on solving problems, revealing opportunities and doing things better, both within and outside of SAS.

For high school and college students, I urge them to invest in studying some math and science, no matter the career they’re pursuing.  Whether they are interested in banking/finance, medicine and the life sciences, engineering or other fields, courses that will help them explore and analyze data, and come up with new approaches, new solutions, and new advances based on a more scientific approach will pay off.

Course work in statistics, operations research, computer science and other fields will help hone skills for today’s data- and analytics-driven world.  One example of this idea in action:  North Carolina State University’s (NCSU) Institute for Advanced Analytics is seeing a huge increase in interest.  Its first graduating class last year saw higher average salaries than other graduate programs and multiple job offers per graduate.  Why?  Because there is still a huge demand for graduates with the ability to manipulate and analyze data in order to make better, more informed decisions.  I personally think careers in math and science are especially rewarding, but we need many diverse skills to make the world go round :o)

Ajay- Big corporations versus startups. Where do you think the balance lies between being big in terms of stability and size and being swift and nimble in terms of speed of product rollouts? What are the advantages and disadvantages of being a big corporation in a fast-changing technology field?

Anne-

Ever a balancing act, with continuous learning along the way.  The advantage of being big (and privately held) is that you can be more long-term-oriented.  The challenge with fast-changing technology is to know where to best invest.  While others may go to market faster with new capabilities, we seek to provide superior implementations (we invest in R (Research) AND D (Development)), making capabilities available on a number of platforms.

In today’s economy, I think the big vs. small comparison is becoming less and less relevant.  Big corporations need to be agile and innovative, like their smaller rivals.  And small- to medium-sized businesses (SMBs) need to use the same techniques and technologies as the big boys.

First, on the big side, I’ll use an example with which I’m very familiar:  At SAS, a company founded more than 30 years ago as an entrepreneurial venture, we’ve certainly changed over the decades.  SAS started out in a small office with a handful of people.  It’s now a global company with hundreds of offices and thousands of employees around the world.  Yet one thing has not changed for SAS in all this time:  a laser-like focus on the customer.  This has been the key to SAS’s success and uninterrupted growth.  Not really a secret sauce, just a simple yet profound approach: listen carefully to your customers and their changing needs, and innovate, develop and adapt based on these needs.

Of course, being large has its advantages:  we have more ideas from more people, and creativity and innovation know no borders.  From Sydney to Warsaw, São Paulo to Singapore, Shanghai to Heidelberg, SAS employees work closely with customers to meet their business needs today and in the future.

SAS provides the stability and proven success that businesses look for, particularly in troubled economic times.  Being large and privately held enables SAS to grow when others are cutting back, and to continue to invest in R&D at a high rate: 22% of revenues in 2008.

Yet with our annual subscription licensing model, SAS cannot rest on its laurels.  Each year, customers vote with their checkbooks:  if SAS provided them with business benefits, results and a positive ROI, they renew; if not, they can walk away.  Happily for SAS, the overwhelming majority of customers keep coming back.  But the licensing model keeps SAS on its toes, customer-focused, and always listening and innovating based on customer feedback.

As for SMBs, they are rapidly adopting the technologies used by large companies, such as business analytics, to compete in the global economy.  Two examples of this:

BGF Industries is a manufacturer of high-tech fabrics used in jet fighters, bullet-proof vests, movie-theater screens and surfboards, based in Greensboro, NC. BGF turned to SAS business analytics to help it deal with foreign competition.  BGF created a cost-effective, easy-to-use early-warning system that helps it track quality and productivity.  Per BGF, data is now available in minutes instead of hours.  And in the business world, this speed can be the difference between success and failure.  Per Bobby Hull, a BGF systems analyst: “The early-warning system we built with SAS allowed us to go from nothing to everything.  SAS allows us to focus away from clerical tasks to focus on the quality and process side of the job. Because of SAS, we’re never more than three clicks away from finding an answer.”

For Los Angeles-based The Wine House, installing a SAS-powered inventory-management system helped it discover nearly $400,000 in lost inventory sitting on warehouse shelves.  For an SMB with annual sales of $20 million, that was a major find.  Business analytics helps it compete with major retail and grocery chains.  Per Bill Knight, owner of The Wine House: “The first day the SAS application was live, we identified approximately 1,000 cases of wine that had not moved in over a year. That’s significant cash tied up in inventory.  We had a huge sale to blow it out, and just in time, because in today’s economy, we would be choking on that inventory.”

So regardless of size, businesses must remain agile, listen to their customers, and use technologies like business analytics to make sense of and derive value from their data, whether on the quality of surfboard covers or the number of cases of Oregon Pinot Noir in stock.

Ajay- SAS Institute has been the de facto leader in both market volume share and market value share in the field of data analytics. What are some of the factors you think have contributed to this enduring success? What have been the principal challengers over the years? (Any comments on the challenge from the SAS-language software WPS, please?)

Anne- At SAS, we seek to provide a complete environment for analytics: from data collection, data manipulation, data exploration and data analysis to deployment of results and the means to manage that whole process.  Competition comes in many forms and it pushes us to keep delivering value.  For me, one thing that sets SAS apart from other vendors is that we care so deeply about the quality of results.  Our Technical Support, Education and consulting services organizations really do partner with customers to help them achieve the best results.  That kind of commitment is deep in the DNA of SAS culture.

The good thing about competition is that it forces you to re-examine your value proposition and rethink your business strategy.  Customers value attributes of their analytics infrastructure in varying degrees: speed, quality, support, flexibility, ease of migration, backward and forward compatibility, etc.  Often there are options to trump any one or a subset of these, and when that aligns with a customer’s priorities of what they value, they will vote with their pocketbooks.  For some customers with tight batch-processing windows, speed trumps everything.  In tests conducted by Merrill Consultants, an MXG program running on WPS runs significantly longer, consumes more CPU time and requires more memory than the same MXG program hosted on its native SAS platform.

While it’s easy to get caught up in fast-changing technology, one also has to consider history.  Some programming languages come and go; others have stood the test of time.  Even the use of different flavors of analysis ebbs and flows.  For instance, when data mining was all the rage almost a decade ago, many asked the very good question, “Why so much excitement about analyzing so much opportunistic data when design of experiments offers so much more?”  Finally, experimental design is being more readily adopted in areas like marketing.

At the end of the day, innovation is the only sustainable competitive advantage.  As noted above in question 2, SAS has remained firmly committed to customer-driven innovation.  And SAS has stuck to its knitting with respect to analytics.  A while back, SAS used to stand for Statistical Analysis System. If not literally, then philosophically, Analytics remains our middle name.

(Ajay- to be continued)

As mentioned before, Zementis is at the forefront of using cloud computing (Amazon EC2) for open-source analytics. Recently I came in contact with Michael Zeller for a business problem, and Mike, being the gentleman he is, not only helped me out but also agreed to an extensive and exclusive interview. (!)


Ajay- What are the traditional rivals to the scoring solutions offered by you? How does ADAPA compare to each of them? Case study: assume I have 50,000 leads daily on a car-buying website. How would ADAPA help me in scoring the model (created, say, by KXEN, R, SAS, or SPSS)? What would my approximate cost advantages be if I intend to mail, say, the top 5 deciles every day?

Michael- Some of the traditional scoring solutions used today are based on SAS, in-database scoring like Oracle, MS SQL Server, or very often even custom code.  ADAPA is able to import the models from all tools that support the PMML standard, so any of the above tools, open source or commercial, could serve as an excellent development environment.

The key differentiators for ADAPA are simple and focus on cost-effective deployment:

1) Open Standards – PMML & SOA:

Freedom to select best-of-breed development tools without being locked into a specific vendor;  integrate easily with other systems.

2) SaaS-based Cloud Computing:

Delivers a quantum leap in cost-effectiveness without compromising on scalability.

In your example, I assume that you’d be able to score your 50,000 leads in one hour using one ADAPA engine on Amazon.  Therefore, you could choose to either spend US$100,000 or more on hardware, software, maintenance, IT services, etc., write a project proposal, get it approved by management, and be ready to score your model in 6-12 months…

OR, you could use ADAPA at something around US$1-$2 per day for the scenario above and get started today!  To get my point across here, I am of course simplifying the scenario a little bit, but in essence these are your choices.

Sounds too good to be true?  We often get this response, so please feel free to contact us today [http://www.zementis.com/contact.htm] and we will be happy to show you how easy it can be to deploy predictive models with ADAPA!
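To make the decile arithmetic from the question concrete, here is an illustrative Python sketch; the lead count matches the example, but the scores are invented (in practice they would come from the deployed model, e.g. via PMML). It ranks 50,000 scored leads and keeps the top 5 deciles for the mailing.

```python
import random

random.seed(42)

# Hypothetical: 50,000 daily leads, each with a model score in [0, 1].
leads = [(f"lead_{i}", random.random()) for i in range(50_000)]

# Rank by score, best first, and split into 10 equal deciles.
ranked = sorted(leads, key=lambda x: x[1], reverse=True)
decile_size = len(ranked) // 10

# Mailing the top 5 deciles means contacting the best-scoring half.
mail_list = ranked[: 5 * decile_size]

print(len(mail_list))  # 25000 leads go into the mailing
```

The cost advantage Michael describes comes from running exactly this kind of batch job on rented machine time instead of dedicated hardware.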

 

Ajay- The ADAPA solution seems to save money on both hardware and software costs. Please comment. Also, have you done any benchmarking tests comparing a traditional scoring configuration versus ADAPA?

Michael-Absolutely, the ADAPA Predictive Analytics Edition [http://www.zementis.com/predictive_analytics_edition.htm] on Amazon’s cloud computing infrastructure (Amazon EC2) eliminates the upfront investment in hardware and software.  It is a true Software as a Service (SaaS) offering on Amazon EC2 [http://www.zementis.com/howtobuy.htm] whereby users only pay for the actual machine time starting at less than US$1 per machine hour.  The ADAPA SaaS model is extremely dynamic, e.g., a user is able to select an instance type most appropriate for the job at hand (small, large, x-large) or launch one or even 100 instances within minutes.

In addition to the above savings in hardware/software, ADAPA also cuts the time-to-market for new models (priceless!) which adds to business agility, something truly critical for the current economic climate.

Regarding a benchmark comparison, it really depends on what is most important to the business.  Business agility, time-to-market, open standards for integration, or pure scoring performance?  ADAPA addresses all of the above.  At its core, it is a highly scalable scoring engine which is able to process thousands of transactions per second.  To tackle even the largest problems, it is easy to scale ADAPA via more CPUs, clustering, or parallel execution on multiple independent instances. 

Need to score lots of data once a month which would take 100 hours on one computer?  Simply launch 10 instances and complete the job in 10 hours over night.  No extra software licenses, no extra hardware to buy — that’s capacity truly on-demand, whenever needed, and cost-effective.
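The scaling arithmetic above is simple enough to sketch. This small Python helper (the US$1/machine-hour rate is taken from the interview; everything else is hypothetical) shows why 10 instances finish a 100-hour job overnight at an unchanged total cost:

```python
import math

def wall_clock_and_cost(total_compute_hours, n_instances, rate_per_hour):
    """On-demand scaling: total cost stays the same, wall-clock time shrinks."""
    wall_hours = math.ceil(total_compute_hours / n_instances)
    cost = total_compute_hours * rate_per_hour  # pay only for machine time used
    return wall_hours, cost

# 100 hours of scoring at a US$1 per machine-hour rate:
print(wall_clock_and_cost(100, 1, 1.0))   # (100, 100.0) -- one machine, ~4 days
print(wall_clock_and_cost(100, 10, 1.0))  # (10, 100.0)  -- ten machines, overnight
```

The design point is that with per-hour billing there is no cost penalty for parallelism, which is what makes "capacity on demand" attractive for monthly batch scoring.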

Ajay- What has been your vision for Zementis? What exciting products are we going to see from it next?

Michael – Our vision at Zementis [http://www.zementis.com] has been to make it easier for users to leverage analytics.  The primary focus of our products is on the deployment side, i.e., how to integrate predictive models into the business process and leverage them in real-time.  The complexity of deployment and the cost associated with it has been the main hurdle for a more widespread adoption of predictive analytics. 

Adhering to open standards like the Predictive Model Markup Language (PMML) [http://www.dmg.org/] and SOA-based integration, our ADAPA engine [http://www.zementis.com/products.htm] paves the way for new use cases of predictive analytics — wherever a painless, fast production deployment of models is critical or where the cost of real-time scoring has been prohibitive to date.

We will continue to contribute to the R/PMML export package [http://www.zementis.com/pmml_exporters.htm] and extend our free PMML converter [http://www.zementis.com/pmml_converters.htm] to support the adoption of the standard.  We believe that the analytics industry will benefit from open standards and we are just beginning to grasp what data-driven decision technology can do for us.  Without giving away much of our roadmap, please stay tuned for more exciting products that will make it easier for businesses to leverage the power of predictive analytics!

Ajay- Any India- or Asia-specific plans for Zementis?

Michael-Zementis already serves customers in the Asia/Pacific region from its office in Hong Kong.  We expect rapid growth for predictive analytics in the region and we think our cost-effective SaaS solution on Amazon EC2 will be of great service to this market.  I could see various analytics outsourcing and consulting firms benefit from using ADAPA as their primary delivery mechanism to provide clients with predictive  models that are ready to be executed on-demand.

Ajay- What do you believe will be the biggest challenges for analytics in 2009? What are the biggest opportunities?

Michael-The biggest challenge for analytics will most likely be the reduction in technology spending in a deep, global recession.  At the same time, companies must take advantage of analytics to cut cost, optimize processes, and to become more competitive.  Therefore, the biggest opportunity for analytics will be in the SaaS field, enabling clients to employ analytics without upfront capital expenditures.

Ajay- What made you choose a career in science? Describe your journey so far. What would your advice be to young science graduates in these recessionary times?

Michael- As a physicist, my research focused on neural networks and intelligent systems.  Predictive analytics is a great way for me to stay close to science while applying such complex algorithms to solve real business problems.  Even in a recession, there is always a need for good people with the desire to excel in their profession.  Starting your career, I’d say the best way is to remain broad in expertise rather than being too specialized in one particular industry or proficient in a single analytics tool.  A good foundation of math and computer science, combined with curiosity about how to apply analytics to specific business problems, will provide opportunities, even in the current economic climate.

About Zementis

Zementis, Inc. is a software company focused on predictive analytics and advanced Enterprise Decision Management technology. We combine science and software to create superior business and industrial solutions for our clients. Our scientific expertise includes statistical algorithms, machine learning, neural networks, and intelligent systems, and our scientists have a proven record in producing effective predictive models to extract hidden patterns from a variety of data types. It is complemented by our product offering ADAPA, a decision engine framework for real-time execution of predictive models and rules. For more information, please visit www.zementis.com.

Ajay- If you have a lot of data (GBs and GBs) and an existing model (in SAS, SPSS, or R) which you converted to PMML, and it is time for you to choose between spending more money to upgrade your hardware or renew your software licenses, then instead take a look at ADAPA from www.zementis.com and score models for as low as $1 per hour. Check it out (test and control!)

Do you have any additional queries from Michael ? Use the comments page to ask.

Modified Ohri Framework

 

Some time back, I had created a framework for data mining through on-demand cloud computing. This is the next version. It is free for all to use, with only authorship credit back to me.
 
It tries to do away with the fixed server and desktop costs AND the fixed software costs of the software used for data mining, statistics and analytics, which carries huge per-CPU annual license fees.

 

The modified Ohri Framework tries to mash together the following:

 

0) HTTPS rather than HTTP

1) Encryption and Compression Software for data transfer (like PGP)

2) Open source stats package like R running in the cloud (like Amazon EC2 or RightScale with Hadoop)

3) GUI to make it easy to use (like Rattle GUI and PMML Package)

4) A Data Mining Open Source Package (like Rapid Miner or Splunk)

5) RIA Graphics (like Silverlight )

6) Secure Output to cloud computing devices (like Google Docs)

7) Billing priced at simple cost plus X% (where simple cost can be, say, 0.85 cents per instance hour or more depending on usage, and X should not be more than 15%)

8) Open source sharing of all code to ensure community sandboxing
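The billing model in item 7 can be sketched in a few lines of Python; the rates below are placeholders for illustration, not real prices:

```python
def cost_plus_bill(instance_hours, base_rate_per_hour, markup_pct):
    """Item 7: bill = simple cost plus X%, where X must not exceed 15%."""
    if not 0 <= markup_pct <= 15:
        raise ValueError("markup X should not be more than 15%")
    simple_cost = instance_hours * base_rate_per_hour
    return round(simple_cost * (1 + markup_pct / 100), 4)

# Hypothetical: 100 instance-hours at a $0.0085/hour base rate, 15% markup.
print(cost_plus_bill(100, 0.0085, 15))  # 0.9775
```

Keeping the markup capped and the base cost transparent is what makes the framework "simple cost plus", as opposed to the opaque per-CPU annual license fees it is meant to replace.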

 

The intention is to remove the fixed computing costs of servers and desktops, moving to normal PCs (Ubuntu Linux) with browser (Firefox or Internet Explorer) access to secure data mining on demand.

On-tap, on-demand mining for anyone in the world, without the big license purchases/renewals (software expenses) or big hardware purchases (which become obsolete in 2-3 years).