Interview – Anne Milley, SAS Part 1

Anne Milley has been a part of SAS Institute’s core strategy team.

She was in the news recently in an article by the legendary Ashlee Vance on the Bits blog of The New York Times: http://bits.blogs.nytimes.com/2009/02/16/sas-warms-to-open-source-one-letter-at-a-time/

In the article,  Ms. Milley said, “I think it addresses a niche market for high-end data analysts that want free, readily available code. We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.”

To her credit, Ms. Milley addressed some of the critical comments head-on in a subsequent blog post.

This sparked my curiosity to know Anne and her perspective beyond a single-line quote, so here is an interview. This is part 1 of the interview.

Ajay - Describe your career journey, both outside and inside SAS Institute. What advice would you give to young high school students to pursue careers in science? Do you think careers in science are as rewarding as other careers?

Anne-

Originally, I wanted to major in international business to leverage my German (which is now waning from lack of use!).  I found the marketing and management classes at the time provided little practical value and happily ended up switching to the college of social science in the economics department, where I was challenged with several quantitative courses and encouraged to always have an analytical perspective.  In school, I was exposed to BASIC, SPSS, SHAZAM, and SAS.  Once I began my thesis (bank failure prediction models and the term structure of interest rates) and started working, it was SAS that served as the best software investment, both in banking (Federal Home Loan Bank of Dallas) and in retail (7-Eleven Corp.).  After 5+ years in Dallas, my husband wanted to move back to New England and SAS happened to be opening an office at the time.  From there, I enjoyed a few years as a pre-sales technical consultant, many years in analytical product management, and most recently in product marketing.  All the while, it has been a great motivating factor to work with so many talented people focused on solving problems, revealing opportunities and doing things better—both within and outside of SAS.

For high school and college students, I urge them to invest in studying some math and science, no matter the career they’re pursuing.  Whether they are interested in banking/finance, medicine and the life sciences, engineering or other fields, courses that will help them explore and analyze data, and come up with new approaches, new solutions, new advances based on a more scientific approach will pay off.

Course work in statistics, operations research, computer science and others will help hone skills for today’s data- and analytics-driven world.  One example of this idea in action:  North Carolina State University’s (NCSU) Institute for Advanced Analytics is seeing a huge increase in interest.  Its first graduating class last year saw higher average salaries than other graduate programs and multiple job offers per graduate.  Why?  Because there is still a huge demand for graduates with the ability to manipulate and analyze data in order to make better, more informed decisions.  I personally think careers in math and science are especially rewarding, but we need many diverse skills to make the world go round :o)

Ajay- Big corporations versus startups. Where do you think the balance lies between being big, in terms of stability and size, and being swift and nimble, in terms of speed of product roll-outs? What are the advantages and disadvantages of being a big corporation in a fast-changing technology field?

Anne-

Ever a balancing act, with continuous learning along the way.  The advantage of being big (and privately held) is that you can be more long-term-oriented.  The challenge with fast-changing technology is to know where to best invest.  While others may go to market faster with new capabilities, we seek to provide superior implementations (we invest in ‘R’ (Research) AND ‘D’ (Development)), making capabilities available on a number of platforms.

In today’s economy, I think the big vs. small comparison is becoming less and less relevant.  Big corporations need to be agile and innovative, like their smaller rivals.  And small- to medium-sized businesses (SMBs) need to use the same techniques and technologies as the “big boys.”

First, on the big side, I’ll use an example with which I’m very familiar:  At SAS, a company founded more than 30 years ago as an entrepreneurial venture, we’ve certainly changed over the decades.  SAS started out in a small office with a handful of people.  It’s now a global company with hundreds of offices and thousands of employees around the world.  Yet one thing has not changed for SAS in all this time:  a laser-like focus on the customer.  This has been the key to SAS’ success and uninterrupted growth. Not really a “secret sauce.” Just a simple yet profound approach: listen carefully to your customers and their changing needs, and innovate, develop and adapt based on these needs.

Of course, being large has its advantages:  we have more ideas from more people, and creativity and innovation know no borders.  From Sydney to Warsaw, São Paulo to Singapore, Shanghai to Heidelberg, SAS employees work closely with customers to meet their business needs today and in the future.

SAS provides the stability and proven success that businesses look for, particularly in troubled economic times.  Being large and privately held enables SAS to grow when others are cutting back, and continue to invest in R&D at a high rate – 22% of revenues in 2008.

Yet with our annual subscription licensing model, SAS cannot rest on its laurels.  Each year, customers vote with their checkbooks:  if SAS provided them with business benefits, results and a positive ROI, they renew; if not, they can walk away.  Happily for SAS, the overwhelming majority of customers keep coming back.  But the licensing model keeps SAS on its toes, customer-focused, and always listening and innovating based on customer feedback.

As for SMBs, they are rapidly adopting the technologies used by large companies – such as business analytics – to compete in the global economy.  Two examples of this:

BGF Industries is a manufacturer of high-tech fabrics used in jet fighters, bullet-proof vests, movie-theater screens and surfboards, based in Greensboro, NC. BGF turned to SAS business analytics to help it deal with foreign competition.  BGF created a cost-effective, easy-to-use early-warning system that helps it track quality and productivity.  Per BGF, data is now available in minutes instead of hours.  And in the business world, this speed can be the difference between success and failure.  Per Bobby Hull, a BGF systems analyst: “The early-warning system we built with SAS allowed us to go from nothing to everything.  SAS allows us to focus away from clerical tasks to focus on the quality and process side of the job. Because of SAS, we’re never more than three clicks away from finding an answer.”

For Los Angeles-based The Wine House, installing a SAS-powered inventory-management system helped it discover nearly $400,000 in “lost” inventory sitting on warehouse shelves.  For an SMB with annual sales of $20 million, that was a major find.  Business analytics helps it to compete with major retail and grocery chains.  Per Bill Knight, owner of The Wine House: “The first day the SAS application was live, we identified approximately 1,000 cases of wine that had not moved in over a year. That’s significant cash tied up in inventory.  We had a huge sale to blow it out, and just in time, because in today’s economy, we would be choking on that inventory.”

So regardless of size, businesses must remain agile, listen to their customers, and use technologies like business analytics to make sense of and derive value from their data – whether on the quality of surfboard covers or the number of cases of Oregon Pinot Noir in stock.

Ajay- SAS Institute has been the de facto leader in both market volume share and market value share in the field of data analytics. What are some of the factors that you think have contributed to this enduring success? What have been the principal challengers over the years? (Any comments on the challenge from the SAS language software WPS, please?)

Anne-

At SAS, we seek to provide a complete environment for analytics: data collection, data manipulation, data exploration, data analysis and deployment of results, plus the means to manage that whole process.  Competition comes in many forms and it pushes us to keep delivering value.  For me, one thing that sets SAS apart from other vendors is that we care so deeply about the quality of results.  Our Technical Support, Education and consulting services organizations really do partner with customers to help them achieve the best results.  That kind of commitment is deep in the DNA of SAS’ culture.

The good thing about competition is that it forces you to re-examine your value proposition and rethink your business strategy.  Customers value attributes of their analytics infrastructure in varying degrees— speed, quality, support, flexibility, ease of migration, backward and forward compatibility, etc.  Often there are options to trump any one or a subset of these and when that aligns with the customers’ priorities of what they value, they will vote with their pocketbooks.  For some customers with tight batch-processing windows, speed trumps everything.  In tests conducted by Merrill Consultants, an MXG program running on WPS runs significantly longer, consumes more CPU time and requires more memory than the same MXG program hosted on its native SAS platform.

While it’s easy to get caught up in fast-changing technology, one has to also consider history.  Some programming languages come and go; others have stood the test of time.  Even the use of different flavors of analysis ebbs and flows.  For instance, when data mining was all the rage almost a decade ago, many asked the very good question, “Why so much excitement about analyzing so much opportunistic data when design of experiments offers so much more?”  Finally, experimental design is being more readily adopted in areas like marketing.

At the end of the day, innovation is the only sustainable competitive advantage.  As noted above in question 2, SAS has remained firmly committed to customer-driven innovation.  And SAS has “stuck to its knitting” with respect to analytics.  A while back, SAS used to stand for “Statistical Analysis System.” If not literally, then philosophically, Analytics remains our middle name.

(Ajay- to be continued)

 

Continuing with the Slumdog Millionaire celebrations in India, we have an interview here at DecisionStats with an up-and-coming, intense Indian film maker. Nitin Dash is a creative film maker based in India. He has created movies like the science fiction movie “Formula 69” and short videos like “500” (see below). He gave up a corporate career after 5 years of corporate experience, and after studying at the renowned Indian Institute of Management, Lucknow, to pursue his creative side. Here, in a candid interview, Nitin discusses the things that motivate him and passes on some tips for home movie making. Coming from a person who has 200,000-plus views on his YouTube video, it is nifty and useful advice.

 

Ajay – What has been your educational and career journey so far?

Nitin- Finished my MBA from IIML in 2000. Worked for 5 years in the corporate media sector. Did a short 6-month course in filmmaking in New York in 2005. Started my own film production company, Filmkaar Productions: www.filmkaar.com

Ajay – What inspires your art? What are the key things that made you decide to take a leap into movie direction from the corporate world?

Nitin – I find inspiration in the people around me: a common man, his life, and the simple conflicts and challenges that make it interesting.
I felt that the corporate world was stifling my creativity and the work was very operational and mundane. I had some friends from the Jamia mass comm. school. We got together and started making short films over the weekend. After a few months I decided that film making is my calling in life.

Ajay – I have a Sony hand camera and I would like to be Steven Spielberg while shooting my son’s videos. Comment please. Give me 5 bullet points or tips.

Nitin –  The following links would help you.

www.video101course.com
www.cybercollege.com

Learn Windows Movie Maker. It is very simple to learn and easy to use, and it comes already installed on PCs with Windows XP / Vista.

Ajay- How effective do you think viral marketing is? What paradigm changes do you think Web 2.0, blogs and YouTube have brought about in the traditional content business?

Nitin- Viral marketing is very effective, but the content has to be right. Getting good content that goes viral is very difficult and most of the time unpredictable. Web 2.0 has given people access to express their creativity and share it with the world.

Ajay- What do you think about the casting couch as a director? 1 line comment please.

Nitin- It’s an unofficial term for networking in the film industry.

Ajay- What has been your most successful movie or short film? Which short movie do you like the most and why?

Nitin- http://www.youtube.com/watch?v=35u0J4p26Fg
A magical tale about a young boy who finds a solution for Global Warming from a monk in the mountains. I like the simplicity of the story and the beautiful location it is shot in.

Ajay- Well, there is no telling when Nitin will bring home an Oscar, but you can preview a short 2-minute video by him. It’s called 500 and describes how different people would spend Rs 500 if they found it. Simple and powerful on how money moves us in different ways, and on the social disparities in India shining.

 

Filmkaar Productions has been set up to promote thought-provoking cinema: to create entertaining films that are socially relevant, and engaging films that can transform minds. Our mission is to make ‘Extraordinary films with Ordinary people’. Visit them at www.filmkaar.com

Interview- Endre Domiczi

Here is an interview with a client and partner of mine, Mr. Endre Domiczi of Sevana Oy (www.sevana.fi).

Sevana is a Finland-based company that creates excellent software and analytics products; its latest release is an automated audio quality product. Earlier releases include a shopping cart analyzer which does wonderful automated market basket analysis.

Ajay – What has been your career journey so far? What advice would you give to a fresh science graduate entering the market in today’s recession?

Endre – About my career journey 

After receiving an MSc in Electronic Engineering, my first job was maintenance of the Soviet "clone" of an IBM/360 computer (I still remember some of the Russian-language terminology). While doing postgraduate studies (I got something that today would be called a Tech.Lic. in Data Communication) I was offered a job by one of the professors in a research institute. Through the research institute I got a chance to work on a nuclear power plant simulator in Finland as a Hungarian expat (important, because Chernobyl happened in the meantime).

I specified and implemented the mainframe side of the communication between a VAX/VMS mainframe and several PDPs. (I’m still proud that later on someone who saw my part of the system, written in 1986, said that it was object-oriented, even though the language was Fortran 🙂)

One of the jobs I enjoyed most was at Fiskars Power electronics. I could design the hardware and write all the software for a microcontroller-based intelligent display of a UPS (uninterruptible (or unpredictable?) power supply), which communicated with the UPS via the power line (around 1988-89).

Then 6 years at Nokia and 5 years at Nokia Research Center, where I got more familiar with object-orientation. A brief stop at Rational, followed by lecturing at the Helsinki Technical University for about 3 years (concurrent programming; UML-related topics). Somewhere in the meantime a (or rather THE) company was founded, where I still work.

Here is the answer to the "advice" part

My advice would be – if we were speaking of a bright graduate – that his decision to start establishing contacts with potential employers during his studies and to lay down the foundations of his professional network was very wise, and now he should start using his contacts.

Finding a good position on the labor market, or a place on the IT market with a product or idea, involves a certain amount of luck but also planning and conscious self-management. The sooner career starters realize this, the better.

Ajay – What are the key things that you have worked with in terms of technologies?

Endre- In my opinion it’s always a matter of people rather than anything else, because people create technologies and people use technologies.

I believe that the key technologies we worked with are the way our company is organized and managed, the way our employees approach working with us and, of course, the state-of-the-art products (no matter what actual technology we have in mind: C, .NET, Delphi, PHP, Java, etc.) which our employees develop for our customers.

Two major examples are our existing product, which provides automated audio quality measurement and analysis, and the tool to mine and manage association rules in high data volumes, which we expect to release in Q1 2009. Both are unique on the market in terms of technology and science as well as functionality.
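For readers new to association rules, the two basic measures behind market basket analysis, support and confidence, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the baskets, item names, thresholds and function names are all invented for the example and say nothing about how Sevana's product works.

```python
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), baskets) / base

def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """List single-item rules lhs -> rhs that clear both thresholds."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        # Each pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, baskets)
            c = confidence({lhs}, {rhs}, baskets)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

if __name__ == "__main__":
    for lhs, rhs, s, c in pair_rules(BASKETS):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Checking every pair like this is fine for a handful of items; tools built for high data volumes typically prune candidate itemsets rather than enumerating all of them.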

Ajay- What is the most creative product that has been released or is going to be released by your company?

Endre- I would mention the same two analytical products:

Automated audio/voice quality estimation is already released, and we are searching for and negotiating with companies to partner on its dissemination and its integration into voice quality and quality-of-service test solutions.

All information about the scientific approach, technology, tests and benefits is available from our web site (www.sevana.fi), partly freely and partly under NDA. We also have high hopes for the association rules mining system, which we are developing while trying to take into account the needs of statisticians and marketing/sales analysts as well as typical demands in various industries: retail, wholesale, maintenance. I would like to give special thanks to Mr. Ajay Ohri, whom we consulted about the features of such a product and its market applications and demand. (Ajay- Pleasure is mine)

Ajay- Outsourcing has taken off really well in Poland and Romania. What are the best-known success stories of outsourcing that you can tell of?

Endre- Well, outsourcing may have different faces – it can be a big success or a big failure, or even a failure with the face of success. I believe that a success story for software outsourcing is any company that has established a well-operating and profitable business in any country where doing software outsourcing makes sense.

I also believe that we have a good concept for software outsourcing projects as well, providing onshore software development at offshore prices in Finland.

We have our own know-how in order to make it possible.

Ajay- What do you think about the open source versus proprietary software debate? What is the scenario in your local market (across parts of the country) regarding this?

Endre- Open source gives freedom to the “evolution” of applications and services.

It can spare you from reinventing the wheel. I forgot the source, but some famous computer scientist said something like: if programmers read more they would have to write less (code). One can argue that with open source one doesn’t easily find a bug fix if her/his problem is not "mainstream".

However, even in proprietary software the vendor has priorities (often market-driven) and if your wallet is not thick enough and you are at the end of the list you’ll have to wait. And fixing, making a workaround, on your own is much more difficult.

Ajay – What are the intellectual property rights conditions as well as language facilities for Russian software companies ? What is the best way to contact local Russian companies for a software contract.

Endre- Protecting intellectual property rights is a real issue in Russia, and the government and business are putting a lot of effort into improving the situation. However, I believe that the same challenges can be found in any other country: if your IPRs are violated, for instance by your outsourcing company, would you really be able to afford a court trial? I am sure not every company would be able to afford it, no matter where the IPR violation takes place: in Russia, Romania, Poland or India.

I think the best way is to try to contact individuals first, because in Russia, for instance, there are a lot of highly qualified people who would rather try to establish their own business than try to be highly recognized by local outsourcing companies. We’ll be happy to assist in providing connections to Russian software companies and individuals.

 

Disclaimer- Ajay- I advise Sevana on Web 2.0 initiatives. See more on their products at http://wordpress.sevana.fi/ and http://sevana.fi

Interview- Phil Rack

 

Phil Rack is the creator of the Bridge to R for WPS and the Bridge to R for SAS, which enable both WPS and SAS software to connect to R. He is also a WPS reseller. WPS is a Base SAS equivalent that can take in SAS code and SAS datasets, write SAS code, and create SAS datasets (and also create its own format), at a cost of $660 a license (almost one tenth of the cost of a SAS Institute installation on network servers). Having worked in SAS language and analytics consulting for almost 26 years, Phil runs www.minequest.com besides running the SAS Consultants network that mentors analytics consultants globally (I am an ex-member :))

Ajay- What has been your career journey. What advice would you give to someone entering a science career after high school?

Phil- I started out consulting full-time in 1983. I left an analytics job with McMillan-McGraw-Hill Publishing because I didn’t believe the company was investing in BI tools and training as it should. That was pretty early in terms of when BI was becoming important.

Many companies at that point saw BI as only two things.

(a) Ability to forecast sales and

(b) ad-hoc reporting with sums/totals and percentages.

It was obvious to me that I had to make a change to do the kind of work I wanted to do. In terms of training, I was formally trained as a demographer and did my graduate studies at Ohio State so I received a pretty good dose of quantitative subject matter as well as a unique perspective on the social implications of markets and geography. If I had to do it over again, I would probably take more course work in the subject area of the “Family.”  I’m always amazed how many times the work I do in banking and finance revolves around the family lifecycle.

 

Ajay- What has been the biggest project success you have seen in your consulting practice?

Phil- This goes back three years to a project where I was working on Basel II compliance with a commercial banking client that I just loved working with.

A few months into the engagement, they pulled me aside and asked me to put together an automotive portfolio stress test for them. This bank had very large loan exposures to the auto market for second and third tier suppliers to the Big Three as well as international auto manufacturers.

The Risk Management group and I sat down for a couple of days and pulled together a project plan and an outline of what we needed to be able to implement a dynamic Auto Risk Stress Test Model for this portfolio. The software used was SAS/Base and Excel, and the program allowed us to modify 50 to 60 parameters to model different scenarios. Altogether, it took perhaps three weeks to implement, and it was amazingly indicative of the fallout in the auto industry, as well as foreshadowing some of the financial carnage in Southeastern Michigan, such as lower property values and unemployment.

 

Ajay- “It is not what you know, it is whom you know.” Comment please, as a SAS consultant.

Phil- In terms of my business, 80-90% of the work I do is either based on prior work that I’ve done for that company or through referrals.  If you want to have a successful consulting career, you really have to pay attention to developing your network. I’ve taken advantage of social gatherings such as charity events and other social mixers to try to extend my network. I hand out a lot of business cards every year. Formal organizations exist here in Columbus, Ohio such as TechColumbus.org that is a dynamite organization that helps small tech businesses in the area of networking, financing, access to different hardware platforms for testing, etc…  I have mixed emotions about the value of some social networks however.

I see so many individuals on LinkedIn that have 5,000 connections that I have to wonder what it is these folks really do. Who has the time to read all the updates and postings for 5,000 people and still be able to get work out the door? (Note from Ajay- I have 6,300 connections on LinkedIn. Ouch!!)

 

Ajay- What motivated you to write the SAS to R and WPS to R bridges? (Which IS your favorite analytical tool, since you are active in all three?)

Phil- It started out as a “proof-of-concept” exercise and it just keeps growing. The WPS to R Bridge is a piece of software that I wrote originally for WPS users to access R from within the WPS Workbench. For those who are unfamiliar with WPS, it’s a SAS/Base alternative that is extremely compatible with your existing SAS/Base software and your code is just plug-and-play. WPS doesn’t have the statistical capabilities of SAS such as SAS/STAT, ETS, OR, etc., so the idea was to write a bridge so that WPS users wouldn’t have to learn a new GUI/IDE to use R. The Bridge gives WPS users access to R graphics as well as any of the R statistical libraries, while keeping the advantage of the superior data handling of the SAS language. One of the new features is the ability of the WPS to R Bridge to run R programs in parallel. Depending on your hardware, you can easily run six to a dozen R programs simultaneously and collect the R listing and log files back into the WPS listing and log in the order you submitted the programs.

I did write a Bridge to R for SAS users but very few SAS users have expressed interest in it. I suppose that SAS users are happy enough paying the fat licensing fees to SAS that it just doesn’t matter to them. I have to say, my favorite tool at the moment is WPS. I find the interface/workbench to be so superior to what SAS has to offer that I now find myself writing code in WPS and then taking it over to SAS if that’s what the client requires.
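The pattern Phil describes, running several programs at once while still collecting their logs in submission order, can be sketched in Python. This is an illustration of the general idea only, not the Bridge's actual implementation: `run_job`, the job names and the sleep times are hypothetical stand-ins for launching an R program and capturing its log.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_job(name, seconds):
    """Hypothetical stand-in for one R program: wait, then return its log text."""
    time.sleep(seconds)
    return f"[{name}] finished"

def run_in_parallel(jobs, max_workers=6):
    """Run all jobs concurrently and return their logs in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the jobs overlap in time.
        futures = [pool.submit(run_job, name, seconds) for name, seconds in jobs]
        # Reading results in list order preserves submission order,
        # regardless of which job actually finished first.
        return [future.result() for future in futures]

if __name__ == "__main__":
    jobs = [("model_a", 0.3), ("model_b", 0.1), ("model_c", 0.2)]
    for line in run_in_parallel(jobs):
        print(line)
```

Here `model_b` finishes first, yet its log is still printed second, because results are read back in the order the jobs were submitted.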

 

Ajay- What do you think about internet based delivery and social networking including communities and lists changing the software product cycle?

Phil- This somewhat goes back to question #3 in terms of communities. I think it has its value as a place to share your concerns and find answers to difficult programming issues. Now, Internet delivery and cloud computing I find very interesting. I think there are some strong advantages to using the cloud to provide services to your clients. If you look at the SAS pricing model, they really take it to you financially if you want to use your license to be a DSP (data service provider) or put your code on an intranet/internet. For some reason, SAS is just hostile when it comes to small and medium-sized businesses. Companies like World Programming, who license WPS, have a much more realistic idea of licensing in that you can expose your WPS license to your intranet/internet and not have to pay 10x the fees that SAS charges. WPS doesn’t charge additional fees for those who are DSPs either, and there are quite a few of them in the Pharma domain.

Beyond the security challenges associated with cloud computing, I think SaaS offerings that provide analytical services, such as high-performance forecasting and name and address cleanup and verification, are ripe for the picking. One other issue I see with cloud computing is when you have tens of gigs of data that you have to move from your desktop or server to the cloud. The infrastructure just isn’t fast enough, or let’s say reasonably priced enough, for moving this amount of data to really scale well.

Ajay- How does MineQuest intend to influence the analytical software paradigm?

Phil- I think the role for MineQuest in the next few years is twofold.

We’ll keep offering services to banks and other financial service firms in the area of Operational Risk and SAS programming.

The other area is to help these large financial service companies realize that they can save millions of dollars by moving their SAS Server licenses to WPS. This also allows the smaller businesses who have steered away from SAS software because of cost to begin using WPS and not take such a big financial hit. I find it exciting to think how this will also open the job market for the thousands of SAS programmers out there already.

The BI battles are taking place on the desktop and Windows Servers and MineQuest has invested a lot of time and effort in creating macro libraries to help these organizations migrate their code to WPS and access R for advanced statistical capabilities.

We believe that the bread and butter software for almost any financial organization in the BI realm ultimately revolves around the SAS language for reporting, summarization and disbursement of data and we plan to continue to serve that market.

About MineQuest –

MineQuest has been providing SAS Consulting and Programming Services for more than 25 years. Our associates and employees are expert SAS programmers and specialize in the Banking and Financial Industries. Our staff has expertise in such areas as Market Analytics, ETL and Reporting Systems, Fraud Detection, and Credit Risk and Operational Risk segments. Validating Operational Risk models using SAS, in support of the Basel II Capital Framework is one of our specialties. We have real world experience developing SAS software to test and validate Credit and Operational Risk Systems like Fair Isaac’s Blaze Advisor which is one of our areas with subject matter expertise.

MineQuest, LLC

SAS & WPS Consulting and WPS Reseller

Tel: (614) 457-3714

Web: www.MineQuest.com

Blog: www.MineQuest.com/WordPress


( Ajay –

The SAS language mainly uses Procs and the Data step for output and input. Base SAS is a product copyrighted by the SAS Institute (www.sas.com). SAS Institute has been leading the analytics world since the 70’s. WPS is copyright of World Programming Company (WPC) (www.teamwpc.co.uk/products/wps) )

Interview – Michael Zeller, CEO, Zementis

As mentioned before, Zementis is at the forefront of using Cloud Computing (Amazon EC2) for open source analytics. Recently I came in contact with Michael Zeller for a business problem, and Mike, being the gentleman he is, not only helped me out but also agreed to an extensive and exclusive interview (!).


Ajay- What are the traditional rivals to the scoring solutions you offer? How does ADAPA compare to each of them? Case study: assume I have 50,000 leads daily on a car-buying website. How would ADAPA help me in scoring the model (created, say, by KXEN, R, SAS or SPSS)? What would my approximate cost advantages be if I intend to mail, say, the top 5 deciles every day?

Michael- Some of the traditional scoring solutions used today are based on SAS, in-database scoring like Oracle, MS SQL Server, or very often even custom code.  ADAPA is able to import the models from all tools that support the PMML standard, so any of the above tools, open source or commercial, could serve as an excellent development environment.

The key differentiators for ADAPA are simple and focus on cost-effective deployment:

1) Open Standards – PMML & SOA:

Freedom to select best-of-breed development tools without being locked into a specific vendor;  integrate easily with other systems.

2) SaaS-based Cloud Computing:

Delivers a quantum leap in cost-effectiveness without compromising on scalability.

In your example, I assume that you’d be able to score your 50,000 leads in one hour using one ADAPA engine on Amazon.  Therefore, you could choose to either spend US$100,000 or more on hardware, software, maintenance, IT services, etc., write a project proposal, get it approved by management, and be ready to score your model in 6-12 months…

OR, you could use ADAPA at something around US$1-$2 per day for the scenario above and get started today!  To get my point across here, I am of course simplifying the scenario a little bit, but in essence these are your choices.
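To make the arithmetic in this answer concrete, here is a minimal Python sketch. It uses only the round figures quoted above (US$100,000 upfront versus roughly US$1-$2 per day); the one-year horizon and the function names are my own assumptions for illustration, not anything supplied by Zementis, and real pricing will differ.

```python
# Figures quoted in the interview; the one-year horizon is an added assumption.
UPFRONT_COST_USD = 100_000            # "US$100,000 or more" for the in-house route
CLOUD_COST_PER_DAY_USD = (1.0, 2.0)   # "around US$1-$2 per day" on Amazon EC2
DAYS_PER_YEAR = 365                   # assumed horizon, scoring every day


def yearly_cloud_cost(cost_per_day, days=DAYS_PER_YEAR):
    """Total pay-as-you-go cost over the assumed horizon."""
    return cost_per_day * days


def days_to_match_upfront(cost_per_day, upfront=UPFRONT_COST_USD):
    """Days of daily scoring before cloud spend equals the upfront figure."""
    return upfront / cost_per_day


if __name__ == "__main__":
    for rate in CLOUD_COST_PER_DAY_USD:
        print(f"US${rate:.0f}/day: US${yearly_cloud_cost(rate):,.0f} per year, "
              f"{days_to_match_upfront(rate):,.0f} days to reach US${UPFRONT_COST_USD:,}")
```

At the quoted rates a year of daily scoring comes to US$365 to US$730, which is the gap Michael is pointing at, with the caveat he gives himself that the scenario is simplified.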

Sounds too good to be true?  We often get this response, so please feel free to contact us today [http://www.zementis.com/contact.htm] and we will be happy to show you how easy it can be to deploy predictive models with ADAPA!

 

Ajay- The ADAPA solution seems to save money on both hardware and software costs. Please comment. Also, have you run any benchmarking tests comparing a traditional scoring configuration with ADAPA?

Michael-Absolutely, the ADAPA Predictive Analytics Edition [http://www.zementis.com/predictive_analytics_edition.htm] on Amazon’s cloud computing infrastructure (Amazon EC2) eliminates the upfront investment in hardware and software.  It is a true Software as a Service (SaaS) offering on Amazon EC2 [http://www.zementis.com/howtobuy.htm] whereby users only pay for the actual machine time starting at less than US$1 per machine hour.  The ADAPA SaaS model is extremely dynamic, e.g., a user is able to select an instance type most appropriate for the job at hand (small, large, x-large) or launch one or even 100 instances within minutes.

In addition to the above savings in hardware/software, ADAPA also cuts the time-to-market for new models (priceless!) which adds to business agility, something truly critical for the current economic climate.

Regarding a benchmark comparison, it really depends on what is most important to the business.  Business agility, time-to-market, open standards for integration, or pure scoring performance?  ADAPA addresses all of the above.  At its core, it is a highly scalable scoring engine which is able to process thousands of transactions per second.  To tackle even the largest problems, it is easy to scale ADAPA via more CPUs, clustering, or parallel execution on multiple independent instances. 

Need to score lots of data once a month, which would take 100 hours on one computer?  Simply launch 10 instances and complete the job in 10 hours overnight.  No extra software licenses, no extra hardware to buy: that’s capacity truly on demand, whenever needed, and cost-effective.
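The scaling claim above is simple division, and a short Python sketch shows why the bill barely changes when the job is spread out. The assumptions here are mine: the job splits evenly across instances with no coordination overhead, and each instance is billed per started hour.

```python
import math


def wall_clock_hours(total_machine_hours, instances):
    """Elapsed time, assuming the job splits evenly with no overhead."""
    return total_machine_hours / instances


def billed_machine_hours(total_machine_hours, instances):
    """Machine-hours paid for, assuming billing per started hour per instance."""
    return instances * math.ceil(total_machine_hours / instances)


if __name__ == "__main__":
    # The interview's example: a 100-hour job spread over 10 instances.
    print(wall_clock_hours(100, 10), "hours elapsed")
    print(billed_machine_hours(100, 10), "machine-hours billed")
```

Under those assumptions the 100-hour job finishes in 10 hours on 10 instances while still consuming 100 machine-hours, so the elapsed time drops without the machine-time cost going up.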

Ajay- What has been your vision for Zementis? What exciting products are we going to see from it next?

Michael – Our vision at Zementis [http://www.zementis.com] has been to make it easier for users to leverage analytics.  The primary focus of our products is on the deployment side, i.e., how to integrate predictive models into the business process and leverage them in real-time.  The complexity of deployment and the cost associated with it have been the main hurdles to a more widespread adoption of predictive analytics.

Adhering to open standards like the Predictive Model Markup Language (PMML) [http://www.dmg.org/] and SOA-based integration, our ADAPA engine [http://www.zementis.com/products.htm] paves the way for new use cases of predictive analytics — wherever a painless, fast production deployment of models is critical or where the cost of real-time scoring has been prohibitive to date.
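For readers who have not seen PMML, here is a minimal Python sketch of the idea of exchanging a model as a document and scoring it elsewhere. Everything in it is an assumption for illustration: the PMML fragment is hand-written and simplified (a linear regression with made-up field names and coefficients, using the PMML 4.1 namespace), and the parsing code uses only the Python standard library. It is not ADAPA's API and not the output of any particular exporter.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified PMML (hypothetical fields and coefficients).
PMML_DOC = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="toy example"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.25"/>
      <NumericPredictor name="income" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_linear_model(pmml_text):
    """Read the intercept and numeric coefficients from the regression table."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(record, intercept, coefficients):
    """Apply the linear model to one record (a dict of field name to value)."""
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())


if __name__ == "__main__":
    intercept, coefficients = load_linear_model(PMML_DOC)
    print(score({"age": 40.0, "income": 10.0}, intercept, coefficients))
```

The point of the sketch is only that the model travels as data: whichever tool wrote the document, the scoring side needs nothing from that tool to apply it.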

We will continue to contribute to the R/PMML export package [http://www.zementis.com/pmml_exporters.htm] and extend our free PMML converter [http://www.zementis.com/pmml_converters.htm] to support the adoption of the standard.  We believe that the analytics industry will benefit from open standards and we are just beginning to grasp what data-driven decision technology can do for us.  Without giving away much of our roadmap, please stay tuned for more exciting products that will make it easier for businesses to leverage the power of predictive analytics!

Ajay- Does Zementis have any India- or Asia-specific plans?

Michael-Zementis already serves customers in the Asia/Pacific region from its office in Hong Kong.  We expect rapid growth for predictive analytics in the region and we think our cost-effective SaaS solution on Amazon EC2 will be of great service to this market.  I could see various analytics outsourcing and consulting firms benefit from using ADAPA as their primary delivery mechanism to provide clients with predictive  models that are ready to be executed on-demand.

Ajay- What do you believe will be the biggest challenges for analytics in 2009? What are the biggest opportunities?

Michael-The biggest challenge for analytics will most likely be the reduction in technology spending in a deep, global recession.  At the same time, companies must take advantage of analytics to cut cost, optimize processes, and to become more competitive.  Therefore, the biggest opportunity for analytics will be in the SaaS field, enabling clients to employ analytics without upfront capital expenditures.

Ajay – What made you choose a career in science? Describe your journey so far. What would your advice be to young science graduates in these recessionary times?

Michael- As a physicist, my research focused on neural networks and intelligent systems.  Predictive analytics is a great way for me to stay close to science while applying such complex algorithms to solve real business problems.  Even in a recession, there is always a need for good people with the desire to excel in their profession.  Starting your career, I’d say the best way is to remain broad in expertise rather than being too specialized on one particular industry or proficient in a single analytics tool.  A good foundation of math and computer science, combined with curiosity in how to apply analytics to specific business problems will provide opportunities, even in the current economic climate.

About Zementis

Zementis, Inc. is a software company focused on predictive analytics and advanced Enterprise Decision Management technology. We combine science and software to create superior business and industrial solutions for our clients. Our scientific expertise includes statistical algorithms, machine learning, neural networks, and intelligent systems and our scientists have a proven record in producing effective predictive models to extract hidden patterns from a variety of data types. It is complemented by our product offering ADAPA®, a decision engine framework for real-time execution of predictive models and rules. For more information please visit www.zementis.com

Ajay- If you have a lot of data (GBs and GBs) and an existing model (in SAS, SPSS, or R) that you have converted to PMML, and it is time to choose between spending more money on upgrading your hardware and renewing your software licenses, take a look instead at ADAPA from www.zementis.com and score models for as low as $1 per hour. Check it out (test and control!!)

Do you have any additional queries for Michael? Use the comments page to ask.

Updated-R for SAS and SPSS Users

Updated – I finally got my hardback copy of R for SAS and SPSS Users. Digital copies are one thing, but a paper book is really beautiful. I had written an article on R (with some mild sarcasm about some other software that is mildly more expensive) at Smart Data Collective. That article got around 711 views (my website got X00 hits that day, which is a personal best, ehmm 🙂).

It also inspired Sandro, a terrific data miner and PhD from Switzerland, to write an article called 5 reasons R is good for you, which can be accessed here http://smartdatacollective.com/Home/15756 and http://dataminingresearch.blogspot.com/2009/01/top-5-reasons-r-is-good-for-you.html

The story of how I wrote that Top Ten R article is also amusing. It is mentioned by Jerry, who creates terrific communities for content, all extremely digital and informative, and is readable here – http://www.socialmediatoday.com/SMC/67268

Now, the reason I originally became involved with R was that I couldn’t afford SAS and SPSS on my own computer after years of getting companies to pick up the tab. A question on the R help list led me to Bob Muenchen, who had written a short guidebook on R for SAS and SPSS users and was then finishing his book. The following interview is interesting given that it was done almost 3-4 months back, yet some themes and events seemed to recur exactly as Bob mentioned them. I still bounce between Bob’s book and the Rattle guide for R programming, but I am getting there!

Note- Robert Muenchen (pronounced Min’-chen) is the author of the famous R for SAS and SPSS Users, and his book is an extensive tutorial for anyone wanting to learn SAS, SPSS, or R, or even to migrate from one platform to another. In an exclusive interview, Bob agreed to answer some questions on the book and on students planning to enter science careers.

What made you write the R For SAS and SPSS users?

The book-

A few years ago, all my colleagues seemed to be suddenly talking about R. Had I tried it? What did I think? Wasn’t it amazing? I searched around for a review and found an article by Patrick Burns, "R Relative to Statistics Packages" which is posted on the UCLA site (http://www.ats.ucla.edu/stat/technicalreports/). That article pointed out the many advantages of R and in it Burns claimed that knowing a standard statistics package interfered with learning R. That article really got my interest up. Pat’s article was a rejoinder to "Strategically using General Purpose Statistics Packages: A Look at Stata, SAS and SPSS" by Michael Mitchell, then the manager of statistical consulting at UCLA (it’s at that same site). In it he said little about R, other than that he had "enormous difficulties" learning it and that he had especially found the documentation lacking.

I dove in and started learning R. It was incredibly hard work, most of which was caused by my expectations of how I thought it ought to work. I did have a lot to "unlearn" but once I figured a certain step out, I could see that explaining it to another SAS or SPSS user would be relatively easy. I started keeping notes on these differences for myself initially. I finally posted them on the Internet as the first version of R for SAS and SPSS Users. It was only 80 pages and much of its explanation was in the form of extensive R program comments. I provided 27 example programs, each done in SAS, SPSS and R. A person could see how they differed, topic by topic. When a person ran the sections of the R programs and read all the comments, he or she would learn how R worked.

A web page counter on that document showed it was getting about 10,000 hits a month. That translates into about 300 users, paging back and forth through the document. An editor from Springer emailed me to ask if I could make it a book. I said it might be 150 pages when I wrote out the prose to replace all the comments. It turned out to be 480 pages!

What are the salient points in this book ?

The main point is that having R taught to you using terms you already know will make R much easier to learn. SAS and SPSS concepts are used in the body of the book as well as the table of contents, the index and even the glossary. For example, the table of contents has an entry for "Value Labels or Formats" even though R uses neither of those terms as SPSS and SAS do, respectively. The index alone took over 80 hours to compile because it is important for people to be able to look up things like "length" as both a SAS statement and as an R function. The glossary defines R terms using SAS/SPSS jargon and then again using proper R definitions.

SAS and SPSS each have five main parts: 1) commands to read and manage data, 2) procedures for statistics & graphics, 3) output management systems that allow you to use output as input to other analyses, 4) a macro language to automate the above steps and finally 5) a matrix language to help you extend the packages. All five of these parts use different statements and rules that do not apply to the others. Due to the complexity of all this, many SAS and SPSS users never get past the first two parts.

R instead has all these functions unified into a common single structure. That makes it much more flexible and powerful. This claim may seem to be a matter of opinion, but the evidence to back it up comes from the companies themselves. The developers at SAS Institute and SPSS Inc. don’t write their procedures in their own languages; R developers do.

How do you think R will impact the statistical software vendors?

With more statistical procedures than any other package, and its free price, some people think R will put many of the proprietary vendors out of business. R is a tsunami coming at the vendors and how they respond will determine their future. Take SPSS Inc. for example. They have written an excellent interface to R that lets you transfer your data back and forth, letting you run R functions in the middle of your SPSS programs. I show how to use it in my book. Starting with SPSS 17, you can also add R functions to the SPSS menus. This is particularly important because most SPSS users prefer to use menus. The company itself is adding menus to R functions, letting them rapidly expand SPSS’ capabilities at very little expense. They saw the R tsunami coming and they hopped on a surfboard to make the most of it. I think this attitude will help them thrive in the future.

SAS Institute so far has been ignoring R. That means if you need to use an analytic method that is only available in R, you must learn much more R than an SPSS user would. Once you have done that, you might be much more likely to switch over completely to R. Colleagues inside SAS Institute tell me they are debating whether they should follow SPSS’ lead and write a link to R. This has already been done by MineQuest, LLC (see http://www.minequest.com/Products.html ) with their amusingly named "A bridge to R" product (playing off "A Bridge Too Far").

Statistica is officially supporting R. You can read about the details at (http://www.statsoft.com/industries/Rlanguage.htm) . Statacorp has not supported R in Stata yet, although a user, Roger Newson, has written an R interface to it (http://ideas.repec.org/c/boc/bocode/s456847.html).

The company with the most to lose is the maker of S-PLUS. That was Insightful Corp. until they were recently bought out by Tibco. Since R is an implementation of the S language, S-PLUS could be hit pretty hard. On the other hand, they do have functions that handle "big data" so there is a chance that people will develop programs in R, run out of memory and then end up porting them to S-PLUS. S-PLUS also has a more comprehensive graphical user interface than R does, giving them an advantage. However, XL-Solutions Corp. has their new R-PLUS version that adds a slick GUI to R (http://www.experience-rplus.com/). There could be a rocky road ahead for S-PLUS. IBM faced a similar dilemma when computing hardware started becoming a commodity. They prospered by making up the difference with service income. Perhaps Tibco can too.

Do you have special discounts for students?

My original version of R for SAS and SPSS Users is still online at http://RforSASandSPSSusers.com so students can get it there for free. The book version has a small market that is mostly students so pricing was set with that in mind.

What made you choose a career in science, and what have been the reasons for your success in it?

I started out as an accounting major. I was lucky enough to have had two years of bookkeeping in high school, and I worked part-time in the accounting department of ServiceMaster Industries for several years. I got to fill in for whoever was on vacation, so I got a broad range of accounting experience. I also got my first experience with statistics by helping the auditors. We took a stratified sample of transactions: with the transactions divided into segments by their value, we sampled a greater proportion as the value increased. For the most expensive transactions, we examined them all. My job was to be the "gofer" who collected all the invoices, checks, etc. to prove that the transactions were real. For a kid in high school, that was great fun!
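(Ajay – The audit sampling scheme Bob describes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the value cut-offs, the sampling fractions, and the record layout are hypothetical and are not the ones the auditors used. The only parts taken from his description are that the sampled proportion rises with transaction value and that the most expensive transactions are all examined.)

```python
import random

# Hypothetical strata: (upper bound of transaction value, fraction to sample).
STRATA = [
    (100.0, 0.01),
    (1_000.0, 0.05),
    (10_000.0, 0.25),
    (float("inf"), 1.0),   # the most expensive transactions are all examined
]


def sampling_fraction(value):
    """Return the sampling fraction for the stratum that contains `value`."""
    for upper_bound, fraction in STRATA:
        if value < upper_bound:
            return fraction
    return 1.0


def stratified_sample(transactions, seed=0):
    """Sample each value stratum at its own rate; higher value, higher rate."""
    rng = random.Random(seed)
    by_fraction = {}
    for txn in transactions:
        by_fraction.setdefault(sampling_fraction(txn["value"]), []).append(txn)

    sample = []
    for fraction, group in by_fraction.items():
        if fraction >= 1.0:
            k = len(group)                            # take the whole stratum
        else:
            k = max(1, round(fraction * len(group)))  # at least one per stratum
        sample.extend(rng.sample(group, k))
    return sample


if __name__ == "__main__":
    txns = [{"id": i, "value": 50.0} for i in range(200)]
    txns += [{"id": 200 + i, "value": 50_000.0} for i in range(5)]
    picked = stratified_sample(txns)
    print(len(picked), "transactions selected for checking")
```

Taking at least one transaction from every non-empty stratum is a design choice of this sketch, so that small segments are never skipped entirely.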

By the time I was a freshman at Bradley University, I became excited by three new areas: mathematics, computing and psychology. I got to work in a lab at the Peoria Addictions Research Institute, studying addiction in rats and the parts of the brain that were involved. I wrote a simple stat package in FORTRAN to analyze data. After getting my B.A. in psychology, I worked on a PhD in Educational Psychology at Arizona State University. I loved that field and did well, but the job market for professors in that field was horrible at the time. So I transferred to a PhD program in Industrial/Organizational Psychology at The University of Tennessee. It turned out that I did not really care for that area at all, and I spent much of my time studying computing and calculus. My assistantship was with the Department of Statistics. By the time my first year was up, I transferred to statistics. At the time the department lacked a PhD program, so after four years of grad school I stopped with an M.S. in Statistics and got a job as a computing consultant helping people with their SAS, SPSS and STATGRAPHICS programs. Later I was able to expand that role, creating a full-fledged statistical consulting center in partnership with the Department of Statistics. Ongoing funding cuts have been chipping away at that concept though.

What made me a success? I love my job! I get to work with a lot of smart scientists and their grad students, expanding scientific knowledge. What could be better?

Science is boring, and not a well-paying career compared to being a lawyer or having a sales job. People think you are a nerd. Please comment based on your experiences.

Science is constantly making new discoveries. That’s not boring! An area that most people can relate to is medicine. When we finish a study that shows a new treatment is better than an old one, our efforts will help thousands of people. In one study we compared a new, very expensive anti-nausea drug to an old one that was quite cheap. The pharmaceutical company claimed the new drug was better of course, but our study showed that it was not. That ended up helping to control health care costs that we all see escalating rapidly.

Another study found for the first time, a measure that could predict how well a hearing aid would help a person. Now, it’s easy to measure a hearing aid and see that it is doing what it is supposed to do, but a huge proportion of people who buy them don’t like them and stop wearing them after a brief period. Scientists tried for decades to predict which people would not be good candidates for hearing aids. A very sharp scientist at UT, Anna Nabelek, came up with the concept of Acceptable Noise Level. We measured how much background noise people were willing to tolerate before trying a hearing aid. That allowed us to develop a model that could predict well for the first time if someone should bother spending up to $5,000 for hearing aids. For retired people on a fixed income, that was an important finding. An audiology journal devoted an entire issue to the work.

It’s true that you can make more money in many other fields. But the excitement of discovery and the feeling that I’m helping to extend science are very satisfying and well worth the lower salary. Plus, having a job in science means you will never have a chance to get bored!

What is your view on Rice University’s initiatives to create open source textbooks at http://cnx.org/ .

I think this is a really good idea. One of my favorite statistics books is Statnotes: Topics in Multivariate Analysis, by G David Garson. You can read it for free at http://www2.chass.ncsu.edu/garson/pa765/statnote.htm .

Universities pay professors to spend their time doing research, which must be published to get credit. So why not pay professors to write text books too? There have been probably hundreds of introductory books in every imaginable field. They cannot all make it in the marketplace so when they drop out of publication, why not make them available for free? I still have my old Introductory Statistics textbook from 30 years ago and the material is still good. It may be missing a few modern things like boxplots, but it would not take much effort to bring it up to date.

I’m also a huge fan of Project Gutenberg (http://www.archive.org/details/gutenberg). That is a collection of over 20,000 books, articles, etc. available there for free download. My wife does volunteer project management and post-processing with Distributed Proofreaders (http://www.pgdp.net/) which supplies books for Gutenberg.

What are your views on students uploading scanned copies of books to torrent sharing web sites because of expensive books?

The cost of textbooks has gotten out of hand. I think students should pressure universities and professors to consider cheaper alternatives. However, scanning books and putting them up on web sites isn’t sharing, it’s stealing. I put in most of my weekends and nights for 2 ½ years on a book that will be lucky to sell a few thousand copies. That works out to pennies per hour. Seeing it scanned in would be quite depressing.

When is the book coming out? What is taking so long?

We ran into problems when the book was translated from Microsoft Word to LaTeX. The translator program did not anticipate that an index would already be in place. That resulted in 2-3 errors per page. We’re working through that and should finally get it printed in early October.

Biography

Robert A. Muenchen is a consulting statistician with 28 years of experience. He is currently the manager of the Statistical Consulting Center at the University of Tennessee. He holds a B.A. in Psychology and an M.S. in Statistics. Bob has conducted research for a variety of public and private organizations and has assisted on more than 1,000 graduate theses and dissertations. He has coauthored over 40 articles published in scientific journals and conference proceedings. Bob has served on the advisory boards of SPSS Inc., the Statistical Graphics Corporation and PC Week Magazine. His suggested improvements have been incorporated into SAS, SPSS, JMP, STATGRAPHICS and several R packages. His research interests include statistical computing, data graphics and visualization, text analysis, data mining, psychometrics and resampling.

Ajay-He is also a very modest and great human being.

http://www.amazon.com/SAS-SPSS-Users-Statistics-Computing/dp/0387094172/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1217456813&sr=8-1