Modified Ohri Framework


Some time back, I created a framework for data mining through on-demand cloud computing. This is the next version. It is free for all to use, with only authorship credit back to me.
It tries to do away with fixed server and desktop costs AND fixed software costs for the software used in data mining, statistics and analytics, which carries huge annual license fees charged per CPU.


The modified Ohri Framework tries to mash up the following:


0) HTTPS rather than HTTP

1) Encryption and Compression Software for data transfer (like PGP)

2) Open source stats package like R on a cloud computer (like Amazon EC2 or RightScale with Hadoop)

3) GUI to make it easy to use (like the Rattle GUI and the PMML package)

4) An open source data mining package (like RapidMiner or Splunk)

5) RIA graphics (like Silverlight)

6) Secure Output to cloud computing devices (like Google Docs)

7) Billing priced at simple cost plus X % (where simple cost can be around 0.85 cents per instance hour or more, depending on usage, and X should not be more than 15 %)

8) Open source sharing of all code to ensure community sandboxing
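
The cost-plus billing in item 7 can be sketched roughly as below. This is only an illustration, not a real billing system: the function name billed_amount, the example rate of 0.10 per instance hour and the 100 hours of usage are all hypothetical values picked for the arithmetic. The only rule taken from the framework is that X should not exceed 15 %.

```python
from decimal import Decimal

# Cap on the markup X, taken from item 7 of the framework.
MAX_MARKUP_PERCENT = Decimal("15")


def billed_amount(instance_hours, cost_per_hour, markup_percent):
    """Return simple cost plus X % for the given usage.

    instance_hours and cost_per_hour are hypothetical inputs; the real
    per-hour rate depends on the cloud provider and the instance type.
    """
    markup = Decimal(str(markup_percent))
    if markup < 0 or markup > MAX_MARKUP_PERCENT:
        raise ValueError("markup X must be between 0 and 15 percent")
    # Simple cost: what the cloud provider charges for the usage.
    simple_cost = Decimal(str(instance_hours)) * Decimal(str(cost_per_hour))
    # Cost plus X %.
    return simple_cost * (1 + markup / 100)


if __name__ == "__main__":
    # Hypothetical usage: 100 instance hours at a made-up rate of 0.10 per hour.
    print(billed_amount(100, "0.10", 15))
```

Decimal is used instead of float so that money amounts are not distorted by binary rounding. A markup above the cap is rejected outright rather than silently clipped, so an overcharge cannot slip through unnoticed.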


The intention is to move away from the fixed computing costs of servers and desktops to normal PCs (Ubuntu Linux) with browser access (Firefox or Internet Explorer) to secure data mining on demand.

On-tap, on-demand mining for anyone in the world, without going for big license purchases and renewals (software expenses) or big hardware purchases (which become obsolete in 2-3 years).



The Ohri Framework – Data Mining on Demand

The Ohri Framework tries to create an economical alternative to proprietary data mining software by giving more value to the customer and utilizing the open source statistical package R, with the Rattle GUI, hosted in a cloud computing environment.

It is based on the following assumptions:

1) R is relatively inefficient at processing bigger files on the same desktop configuration compared with other software like SAS.

2) R has a steep learning curve, hence the need for the Rattle GUI.

3) The enhanced need for computing resources for R is best solved using an on-demand cloud computing environment. This enables R to scale up to whatever processing power it needs. Mainstream data mining software is charged by server CPU count and is much more expensive due to software costs alone.

