What is Cloud Computing

Here is a nice video on what cloud computing is. It was created by Joyent (http://joyent.com/). It explains cloud computing from several perspectives without getting into jargon, technical gab or semantics.

Enjoy!

It is also available at http://www.youtube.com/watch?v=6PNuQHUiV3Q

Cloud Nine

I got a note saying my entry below has been accepted for the Cloud Slam 09 webinar.

If you want to hear me speak on cloud computing, please mark your calendars for April 21.

Here is the link to the April 21 webinar:

http://tr.im/ixF5

Abstract.

The cloud computing paradigm offers developing countries unparalleled access to computing resources, in terms of both storage and processing power. The use of predictive analytics and data mining has hitherto been restricted to an elite set of universities and organizations willing to invest tens of thousands in annual license fees to software companies like SAS, SPSS, Oracle and SAP, and even more in network and server hardware costs to companies like HP, Dell and IBM.

Every two or three years the hardware needed to be upgraded, putting the total cost of ownership of predictive analytics, data-driven decision making and resource planning well out of reach of a major part of the planet’s population. Copyright infringements and intellectual property violations further helped create a divide between advanced computing and those who needed it the most.

Now, thanks to open source software, software as a service and cloud-hosted processing, even a relatively underfunded Indian, Asian or African university, government office or small and medium enterprise can avail of the cost savings that predictive analytics brings. This in turn will lead to a new era of resource-optimized decision making, one which benefits all companies that offer the flexibility of cloud-hosted applications to hitherto closed markets.

Which reminds me, I have to prepare the presentation as well. I will post the slides and the full article here on this blog too.

Speaking of webinars, here is one I am helping with, which showcases technological methods, including CRM and BI, to help manage cost challenges and marketing ops in the recession. It's at 11 a.m. EST on April 16, 2009.

http://tr.im/isf

Using Web 2.0 for Analytics 2.0

Here is a great video tutorial on YouTube by Zementis, creator of ADAPA, the cloud scoring engine for next-gen predictive analytics. You can watch it at this URL:

http://www.youtube.com/watch?v=8hNqxqrdXLI

A few weeks back, I was working with the ADAPA engine on a consulting gig, and Ron Ramos, the head of sales, mentioned that although they have extensive documentation, they were planning a video tutorial on YouTube as well.

Beats a PDF every time, doesn't it!

I wonder why companies continue to spend huge, and I mean huge, amounts on white papers and PDFs when they could offer much better customer support using a bit of audio, video and even Twitter support.

Surprisingly, this is true even for companies working at the cutting edge of other technologies, despite the essentially free availability of these tools.

I mean, if companies can spend huge amounts on predictive solutions for big, big datasets, why can't they offer some solutions or apps for the web and social media? (An exception is KXEN, of course, with its new Social Network Analysis module.)

Imagine a future like this:

  • Hello SAS, my code won't run, blah blah blah.
  • SAS Support on Twitter: Okay, do this.

or

  • Hello SPSS, where can I find some stuff on Python? I got lost on the website.
  • SPSS Support on Skype/Twitter: Dude, do this, click this link!

It is much better than endless rounds of email and aggravation. As for the list server method, well, users should try www.twitter.com for user groups.

KNIME and Zementis shake hands

Two very good and very customer-centric (and open source) companies shook hands on a strategic partnership today:

KNIME (www.knime.org) and Zementis (www.zementis.com).

Decision Stats has been covering these companies, and both products are amazingly good. They sync very well thanks to their support for the PMML standard, and they lower costs considerably for the consumer (see http://www.decisionstats.com/2009/02/knime/ and http://www.decisionstats.com/2009/02/interview-michael-zeller-ceozementis/).

KNIME has both a free personal license and a commercial license, and it works very well with R thanks to PMML (an initiative of the Data Mining Group, www.dmg.org).

See http://www.knime.org/blog/export-and-convert-r-models-pmml-within-knime

The following example R script learns a decision tree on the Iris data and exports it both as PMML and as an R model that the R Predictor node understands:

# load the library for learning a tree model
library(rpart)
# load the pmml export library
library(pmml)
# inside the KNIME R node, R holds the input table (here, the Iris data);
# use the class column as the predicted column to build the decision tree
dt <- rpart(class ~ ., data = R)
# export to PMML
r_pmml <- pmml(dt)
# write the PMML model to an export file
write(toString(r_pmml), file = "C:/R.pmml")
# provide the native R model at the out-port
R <- dt

Zementis brings the total cost of ownership, and the total pain, of scoring models down to something close to US$1 per hour with its proprietary ADAPA engine.

As mentioned before, Zementis is at the forefront of using cloud computing (Amazon EC2) for open source analytics. Recently I came in contact with Michael Zeller about a business problem, and Mike, being the gentleman he is, not only helped me out but also agreed to an extensive and exclusive interview (!).

Ajay- What are the traditional rivals to the scoring solutions you offer? How does ADAPA compare to each of them? As a case study, assume I have 50,000 leads daily on a car-buying website. How would ADAPA help me score them with a model created, say, in KXEN, R, SAS or SPSS? What would my approximate cost advantages be if I intend to mail, say, the top 5 deciles every day?

Michael- Some of the traditional scoring solutions used today are based on SAS, in-database scoring like Oracle, MS SQL Server, or very often even custom code.  ADAPA is able to import the models from all tools that support the PMML standard, so any of the above tools, open source or commercial, could serve as an excellent development environment.

The key differentiators for ADAPA are simple and focus on cost-effective deployment:

1) Open Standards – PMML & SOA:

Freedom to select best-of-breed development tools without being locked into a specific vendor;  integrate easily with other systems.

2) SaaS-based Cloud Computing:

Delivers a quantum leap in cost-effectiveness without compromising on scalability.

In your example, I assume that you’d be able to score your 50,000 leads in one hour using one ADAPA engine on Amazon.  Therefore, you could choose to either spend US$100,000 or more on hardware, software, maintenance, IT services, etc., write a project proposal, get it approved by management, and be ready to score your model in 6-12 months

OR, you could use ADAPA at something around US$1-$2 per day for the scenario above and get started today!  To get my point across here, I am of course simplifying the scenario a little bit, but in essence these are your choices.
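The arithmetic behind this comparison can be sketched roughly in Python. The figures are the ones quoted above (US$100,000 or more up front, versus about one machine hour a day at roughly US$1 to US$2); the function name, the variable names and the one-year horizon are assumptions made purely for illustration, not a Zementis or Amazon pricing formula.

```python
# Back-of-the-envelope comparison using the figures quoted in the interview.
# All names are illustrative; this is not a pricing calculator.

def cloud_scoring_cost(hours_per_day: float, rate_per_hour: float, days: int) -> float:
    """Pay-per-use cost: machine hours per day x hourly rate x number of days."""
    return hours_per_day * rate_per_hour * days

UPFRONT_COST = 100_000.0   # "US$100,000 or more" for the in-house route
HOURS_PER_DAY = 1.0        # 50,000 leads scored in about one machine hour
DAYS = 365                 # assumption: one year of daily scoring

low = cloud_scoring_cost(HOURS_PER_DAY, 1.0, DAYS)    # at US$1 per day
high = cloud_scoring_cost(HOURS_PER_DAY, 2.0, DAYS)   # at US$2 per day

print(f"Cloud scoring for a year: US${low:,.0f} to US${high:,.0f}")
print(f"In-house up-front spend:  US${UPFRONT_COST:,.0f} or more")
```

Even over a full year of daily scoring, the pay-per-use figure stays in the hundreds of dollars under these simplified assumptions, which is the gap the answer is pointing at.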

Sounds too good to be true?  We often get this response, so please feel free to contact us today [http://www.zementis.com/contact.htm] and we will be happy to show you how easy it can be to deploy predictive models with ADAPA!

Ajay- The ADAPA solution seems to save money on both hardware and software costs. Please comment. Also, have you done any benchmarking tests of a traditional scoring configuration versus ADAPA?

Michael-Absolutely, the ADAPA Predictive Analytics Edition [http://www.zementis.com/predictive_analytics_edition.htm] on Amazon’s cloud computing infrastructure (Amazon EC2) eliminates the upfront investment in hardware and software.  It is a true Software as a Service (SaaS) offering on Amazon EC2 [http://www.zementis.com/howtobuy.htm] whereby users only pay for the actual machine time starting at less than US$1 per machine hour.  The ADAPA SaaS model is extremely dynamic, e.g., a user is able to select an instance type most appropriate for the job at hand (small, large, x-large) or launch one or even 100 instances within minutes.

In addition to the above savings in hardware/software, ADAPA also cuts the time-to-market for new models (priceless!) which adds to business agility, something truly critical for the current economic climate.

Regarding a benchmark comparison, it really depends on what is most important to the business.  Business agility, time-to-market, open standards for integration, or pure scoring performance?  ADAPA addresses all of the above.  At its core, it is a highly scalable scoring engine which is able to process thousands of transactions per second.  To tackle even the largest problems, it is easy to scale ADAPA via more CPUs, clustering, or parallel execution on multiple independent instances. 

Need to score lots of data once a month which would take 100 hours on one computer?  Simply launch 10 instances and complete the job in 10 hours overnight.  No extra software licenses, no extra hardware to buy. That’s capacity truly on demand, whenever needed, and cost-effective.
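The 100-hour example can be sketched in a few lines of Python. This is a simplification: it assumes the job splits evenly across instances with no start-up or coordination overhead, and it uses an hourly rate of US$1 as a placeholder. The function name is hypothetical.

```python
# Sketch: splitting a batch scoring job across identical cloud instances.
# Assumes perfect, even parallelism and a flat per-machine-hour rate.

def parallel_plan(total_hours: float, instances: int, rate_per_hour: float = 1.0):
    """Return (wall-clock hours, total cost) for a job split across instances."""
    if instances < 1:
        raise ValueError("need at least one instance")
    wall_clock = total_hours / instances            # each instance takes an equal share
    cost = wall_clock * instances * rate_per_hour   # billed per machine hour
    return wall_clock, cost

for n in (1, 10, 100):
    hours, cost = parallel_plan(100, n)
    print(f"{n:>3} instance(s): {hours:g} h wall clock, US${cost:,.2f} total")
```

Under these assumptions the total bill is the same however many instances are launched, and only the waiting time shrinks. Real billing (for example, charging per started hour) could shift the numbers slightly.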

Ajay- What has been your vision for Zementis? What exciting products are we going to see from it next?

Michael – Our vision at Zementis [http://www.zementis.com] has been to make it easier for users to leverage analytics.  The primary focus of our products is on the deployment side, i.e., how to integrate predictive models into the business process and leverage them in real-time.  The complexity of deployment and the cost associated with it has been the main hurdle for a more widespread adoption of predictive analytics. 

Adhering to open standards like the Predictive Model Markup Language (PMML) [http://www.dmg.org/] and SOA-based integration, our ADAPA engine [http://www.zementis.com/products.htm] paves the way for new use cases of predictive analytics — wherever a painless, fast production deployment of models is critical or where the cost of real-time scoring has been prohibitive to date.

We will continue to contribute to the R/PMML export package [http://www.zementis.com/pmml_exporters.htm] and extend our free PMML converter [http://www.zementis.com/pmml_converters.htm] to support the adoption of the standard.  We believe that the analytics industry will benefit from open standards and we are just beginning to grasp what data-driven decision technology can do for us.  Without giving away much of our roadmap, please stay tuned for more exciting products that will make it easier for businesses to leverage the power of predictive analytics!

Ajay- Any India- or Asia-specific plans for Zementis?

Michael-Zementis already serves customers in the Asia/Pacific region from its office in Hong Kong.  We expect rapid growth for predictive analytics in the region and we think our cost-effective SaaS solution on Amazon EC2 will be of great service to this market.  I could see various analytics outsourcing and consulting firms benefit from using ADAPA as their primary delivery mechanism to provide clients with predictive  models that are ready to be executed on-demand.

Ajay- What do you believe will be the biggest challenges for analytics in 2009? What are the biggest opportunities?

Michael-The biggest challenge for analytics will most likely be the reduction in technology spending in a deep, global recession.  At the same time, companies must take advantage of analytics to cut cost, optimize processes, and to become more competitive.  Therefore, the biggest opportunity for analytics will be in the SaaS field, enabling clients to employ analytics without upfront capital expenditures.

Ajay – What made you choose a career in science? Describe your journey so far. What would your advice be to young science graduates in these recessionary times?

Michael- As a physicist, my research focused on neural networks and intelligent systems.  Predictive analytics is a great way for me to stay close to science while applying such complex algorithms to solve real business problems.  Even in a recession, there is always a need for good people with the desire to excel in their profession.  Starting your career, I’d say the best way is to remain broad in expertise rather than being too specialized on one particular industry or proficient in a single analytics tool.  A good foundation of math and computer science, combined with curiosity in how to apply analytics to specific business problems will provide opportunities, even in the current economic climate.

About Zementis

Zementis, Inc. is a software company focused on predictive analytics and advanced Enterprise Decision Management technology. We combine science and software to create superior business and industrial solutions for our clients. Our scientific expertise includes statistical algorithms, machine learning, neural networks, and intelligent systems, and our scientists have a proven record in producing effective predictive models to extract hidden patterns from a variety of data types. It is complemented by our product offering ADAPA, a decision engine framework for real-time execution of predictive models and rules. For more information please visit www.zementis.com

Ajay- If you have a lot of data (GBs and GBs) and an existing model (in SAS, SPSS or R) that you have converted to PMML, and the time has come to spend more money upgrading your hardware and renewing your software licenses, then take a look at ADAPA from www.zementis.com instead and score models for as low as US$1 per hour. Check it out (test and control!).

Do you have any additional questions for Michael? Use the comments page to ask.

Interview: Richard Schultz, CEO, REvolution Computing

Here is an interview with the CEO of REvolution Computing, Richard Schultz. Mr. Schultz offers his perspectives on open source, predictive analytics and cloud computing, as well as his vision for R Commercial.

Note from Ajay- As I blogged previously, commercial establishments now have the option to use R commercially with a full service contract and all the guarantees they expect and get from existing analytics software vendors.

Ajay- Linux has not really succeeded in capturing the Windows desktop operating system market. What are the technical and business reasons you think R will succeed in the analytics desktop software market?

Richard- To start, Linux was never really targeted at the Windows desktop market, but rather at unseating proprietary Unix deployments (particularly in finance), which it did quite successfully.  This is a similar trend to what we’re seeing in the R world – it’s not that R is generally replacing Excel, for instance.  In addition, with the large and growing base of both users and contributors, the vibrancy of the R community has taken on a life of its own.

As to R and Windows, two things are worth noting:

1. Microsoft has moved rapidly to embrace R and REvolution for that matter.

2. Windows is still the predominant operating system in large commercial enterprises. Because we deploy R on multiprocessors, which are now common on all computers including those pre-loaded with Windows, REvolution R is very much at home in Windows, Mac, and Linux environments alike.

Ajay- What are the biggest challenges for REvolution Computing in explaining R Pro to users of traditional statistics software? What are the biggest advantages?

Richard- The biggest challenge is getting the word out that there now exist validated and supported R products designed for commercial use. But that’s changing rapidly, as your own interest in REvolution Computing demonstrates. Our biggest advantages are several:

1. we are focused on building a close and collegial relationship with the open source R community;

2. our company has a deep history in super computing and parallelization;

3. with, by Intel’s estimate, over 1 million R users and growing, there is a large community eager to adopt our products as its members advance their careers in the business and research worlds.

Ajay- Which software do you think will be affected the most by R’s spread across colleges and companies? What do you believe their strategies to compete will be?


Richard – I want to be politic here. Let me say that the programming software likely most affected by the rise of R is probably proprietary.

We see many opportunities to partner and leverage the strengths of REvolution’s products specifically – high performance, handling of large data, validation, IDE / user interface.

Ajay- How do you intend to incorporate cloud computing and the Software as a Service model for R Pro? When, if at all, do you think it will be possible for a person to simply upload a zipped CSV file, work on a remote cloud computer for analytics and forecasting, and just pay for the hired software, hardware and bandwidth?

Richard – We were thinking of something based on the Ohri framework.  ;-). ( Ajay- Touché!)

In fact, we have deployed, and are deploying cloud-based REvolution R for clients, and it’s something we expect to evolve as those technologies evolve.


Ajay- Asian countries have huge demand for analytics and are more price-conscious about software. What would your strategy be to sell in Asia, China and India?

Richard – Open source can be a tremendous win for users in Asia / China / India.  The upfront costs are low, the technology is leading-edge, and there is a distribution network for support.  REvolution has partners, and is continuing to build its partner network to be able to reach these markets.  We expect to accelerate our efforts in these regions toward the end of 2009.

Ajay- What has been the story of your career so far? What prompted you to join/start REvolution Computing? What advice would you give to young science graduates in today’s recession?

Richard – My own background is in computer science, business… and music. Through school I held various positions at IBM, and after graduate school, I worked at Dun & Bradstreet in a product management role and developed a taste for entrepreneurship. I’ve started two companies so far, MetaServer, a business intelligence middleware company that catered to the insurance industry, and REvolution Computing. Today, MetaServer is part of Oracle. And I continue to play music – guitar and piano. One of these days we’ll get a REvolution Computing band together.

My advice to young science graduates is the same recession or no: follow your enthusiasms; find a passion outside of work like playing music; master open source programming languages because that is the future and the future is here.

About Richard Schultz, Chief Executive Officer, REvolution Computing

Richard guides REvolution’s long-range business strategy and leads the company’s teams on a daily basis. His experience developing and growing Business Intelligence software companies includes founding and leading Metaserver, Inc., now a part of Oracle, from inception to sale. Richard has been named Innovator of the Year by Business New Haven; served on the board of the Connecticut Venture Group; and been the keynote speaker for CIO Forum and other technology industry events.  A graduate of Washington University with degrees in Computer Science, Business and Music, Richard also holds a Master's degree in Computer Science from the State University of New York at Stony Brook and has held senior positions at Dun and Bradstreet and IBM.

Ajay- REvolution Computing has been a leader in this field, and going by the latest product launch... well, you can try it yourself and see at http://www.revolution-computing.com