Linux Counter- Use Linux so be counted

Here’s a nice website at

http://counter.li.org/

You can basically spend 2 minutes and register yourself publicly/anonymously/or your machine

and some fun at http://counter.li.org/estimates.php

Carole-Ann’s 2011 Predictions for Decision Management

Carole-Ann’s 2011 Predictions for Decision Management

For Ajay Ohri on DecisionStats.com

What were the top 5 events in 2010 in your field?
  1. Maturity: the Decision Management space was made up of technology vendors, big and small, that typically focused on one or two aspects of this discipline.  Over the past few years, we have seen a lot of consolidation in the industry – first with Business Intelligence (BI) then Business Process Management (BPM) and lately in Business Rules Management (BRM) and Advanced Analytics.  As a result the giant Platform vendors have helped create visibility for this discipline.  Lots of tiny clues finally bubbled up in 2010 to attest of the increasing activity around Decision Management.  For example, more products than ever were named Decision Manager; companies advertised for Decision Managers as a job title in their job section; most people understand what I do when I am introduced in a social setting!
  2. Boredom: unfortunately, as the industry matures, inevitably innovation slows down…  At the main BRMS shows we heard here and there complaints that the technology was stalling.  We heard it from vendors like Red Hat (Drools) and we heard it from bored end-users hoping for some excitement at Business Rules Forum’s vendor panel.  They sadly did not get it
  3. Scrum: I am not thinking about the methodology there!  If you have ever seen a rugby game, you can probably understand why this is the term that comes to mind when I look at the messy & confusing technology landscape.  Feet blindly try to kick the ball out while superhuman forces are moving randomly the whole pack – or so it felt when I played!  Business Users in search of Business Solutions are facing more and more technology choices that feel like comparing apples to oranges.  There is value in all of them and each one addresses a specific aspect of Decision Management but I regret that the industry did not simplify the picture in 2010.  On the contrary!  Many buzzwords were created or at least made popular last year, creating even more confusion on a muddy field.  A few examples: Social CRM, Collaborative Decision Making, Adaptive Case Management, etc.  Don’t take me wrong, I *do* like the technologies.  I sympathize with the decision maker that is trying to pick the right solution though.
  4. Information: Analytics have been used for years of course but the volume of data surrounding us has been growing to unparalleled levels.  We can blame or thank (depending on our perspective) Social Media for that.  Sites like Facebook and LinkedIn have made it possible and easy to publish relevant (as well as fluffy) information in real-time.  As we all started to get the hang of it and potentially over-publish, technology evolved to enable the storage, correlation and analysis of humongous volumes of data that we could not dream of before.  25 billion tweets were posted in 2010.  Every month, over 30 billion pieces of data are shared on Facebook alone.  This is not just about vanity and marketing though.  This data can be leveraged for the greater good.  Carlos pointed to some fascinating facts about catastrophic event response team getting organized thanks to crowd-sourced information.  We are also seeing, in the Decision management world, more and more applicability for those very technology that have been developed for the needs of Big Data – I’ll name for example Hadoop that Carlos (yet again) discussed in his talks at Rules Fest end of 2009 and 2010.
  5. Self-Organization: it may be a side effect of the Social Media movement but I must admit that I was impressed by the success of self-organizing initiatives.  Granted, this last trend has nothing to do with Decision Management per se but I think it is a great evolution worth noting.  Let me point to a couple of examples.  I usually attend traditional conferences and tradeshows in which the content can be good but is sometimes terrible.  I was pleasantly surprised by the professionalism and attendance at *un-conferences* such as P-Camp (P stands for Product – an event for Product Managers).  When you think about it, it is already difficult to get a show together when people are dedicated to the tasks.  How crazy is it to have volunteers set one up with no budget and no agenda?  Well, people simply show up to do their part and everyone has fun voting on-site for what seems the most appealing content at the time.  Crowdsourcing applied to shows: it works!  Similar experience with meetups or tweetups.  I also enjoyed attending some impromptu Twitter jam sessions on a given topic.  Social Media is certainly helping people reach out and get together in person or virtually and that is wonderful!

A segment of a social network
Image via Wikipedia

What are the top three trends you see in 2011?

  1. Performance:  I might be cheating here.   I was very bullish about predicting much progress for 2010 in the area of Performance Management in your Decision Management initiatives.  I believe that progress was made but Carlos did not give me full credit for the right prediction…  Okay, I am a little optimistic on timeline…  I admit it…  If it did not fully happen in 2010, can I predict it again in 2011?  I think that companies want to better track their business performance in order to correct the trajectory of course but also to improve their projections.  I see that it is turning into reality already here and there.  I expect it to become a trend in 2011!
  2. Insight: Big Data being available all around us with new technologies and algorithms will continue to propagate in 2011 leading to more widely spread Analytics capabilities.  The buzz at Analytics shows on Social Network Analysis (SNA) is a sign that there is interest in those kinds of things.  There is tremendous information that can be leveraged for smart decision-making.  I think there will be more of that in 2011 as initiatives launches in 2010 will mature into material results.
    5 Ways to Cultivate an Active Social Network
    Image by Intersection Consulting via Flickr
  3. Collaboration:  Social Media for the Enterprise is a discipline in the making.  Social Media was initially seen for the most part as a Marketing channel.  Over the years, companies have started experimenting with external communities and ideation capabilities with moderate success.  The few strategic initiatives started in 2010 by “old fashion” companies seem to be an indication that we are past the early adopters.  This discipline may very well materialize in 2011 as a core capability, well, or at least a new trend.  I believe that capabilities such Chatter, offered by Salesforce, will transform (slowly) how people interact in the workplace and leverage the volumes of social data captured in LinkedIn and other Social Media sites.  Collaboration is of course a topic of interest for me personally.  I even signed up for Kare Anderson’s collaboration collaboration site – yes, twice the word “collaboration”: it is really about collaborating on collaboration techniques.  Even though collaboration does not require Social Media, this medium offers perspectives not available until now.

Brief Bio-

Carole-Ann is a renowned guru in the Decision Management space. She created the vision for Decision Management that is widely adopted now in the industry. Her claim to fame is the strategy and direction of Blaze Advisor, the then-leading BRMS product, while she also managed all the Decision Management tools at FICO (business rules, predictive analytics and optimization). She has a vision for Decision Management both as a technology and a discipline that can revolutionize the way corporations do business, and will never get tired of painting that vision for her audience. She speaks often at Industry conferences and has conducted university classes in France and Washington DC.

Leveraging her Masters degree in Applied Mathematics / Computer Science from a “Grande Ecole” in France, she started her career building advanced systems using all kinds of technologies — expert systems, rules, optimization, dashboarding and cubes, web search, and beta version of database replication – as well as conducting strategic consulting gigs around change management.

She now tweets as @CMatignon, blogs at blog.sparklinglogic.com and interacts at community.sparklinglogic.com.

She started her career building advanced systems using all kinds of technologies — expert systems, rules, optimization, dashboarding and cubes, web search, and beta version of database replication.  At Cleversys (acquired by Kurt Salmon & Associates), she also conducted strategic consulting gigs mostly around change management.

While playing with advanced software components, she found a passion for technology and joined ILOG (acquired by IBM).  She developed a growing interest in Optimization as well as Business Rules.  At ILOG, she coined the term BRMS while brainstorming with her Sales counterpart.  She led the Presales organization for Telecom in the Americas up until 2000 when she joined Blaze Software (acquired by Brokat Technologies, HNC Software and finally FICO).

Her 360-degree experience allowed her to gain appreciation for all aspects of a software company, giving her a unique perspective on the business.  Her technical background kept her very much in touch with technology as she advanced.

She also became addicted to Twitter in the process.  She is active on all kinds of social media, always looking for new digital experience!

Outside of work, Carole-Ann loves spending time with her two boys.  They grow fruits in their Northern California home and cook all together in the French tradition.

profile on LinkedIn

TwitterFollow me on Twitter

Filtering to Gain Social Network Value
Image by Intersection Consulting via Flickr
Social Networks Hype Cycle
Image by fredcavazza via Flickr

The Year 2010

Nokia N800 internet tablet, with open source s...
Image via Wikipedia

My annual traffic to this blog was almost 99,000 . Add in additional views on networking sites plus the 400 plus RSS readers- so I can say traffic was 1,20,000 for 2010. Nice. Thanks for reading and hope it was worth your time. (this is a long post and will take almost 440 secs to read but the summary is just given)

My intent is either to inform you, give something useful or atleast something interesting.

see below-

Jan Feb Mar Apr May Jun
2010 6,311 4,701 4,922 5,463 6,493 4,271
Jul Aug Sep Oct Nov Dec Total
5,041 5,403 17,913 16,430 11,723 10,096 98,767

 

 

Sandro Saita from http://www.dataminingblog.com/ just named me for an award on his blog (but my surname is ohRi , Sandro left me without an R- What would I be without R :)) ).

Aw! I am touched. Google for “Data Mining Blog” and Sandro is the best that it is in data mining writing.

DMR People Award 2010
There are a lot of active people in the field of data mining. You can discuss with them on forums. You can read their blogs. You can also meet them in events such as PAW or KDD. Among the people I follow on a regular basis, I have elected:

Ajay Ori

He has been very active in 2010, especially on his blog . Good work Ajay and continue sharing your experience with us!”

What did I write in 2010- stuff.

What did you read on this blog- well thats the top posts list.

2009-12-31 to Today

Title Views
Home page More stats 21,150
Top 10 Graphical User Interfaces in Statistical Software More stats 6,237
Wealth = function (numeracy, memory recall) More stats 2,014
Matlab-Mathematica-R and GPU Computing More stats 1,946
The Top Statistical Softwares (GUI) More stats 1,405
About DecisionStats More stats 1,352
Using Facebook Analytics (Updated) More stats 1,313
Test drive a Chrome notebook. More stats 1,170
Top ten RRReasons R is bad for you ? More stats 1,157
Libre Office More stats 1,151
Interview Hadley Wickham R Project Data Visualization Guru More stats 1,007
Using Red R- R with a Visual Interface More stats 854
SAS Institute files first lawsuit against WPS- Episode 1 More stats 790
Interview Professor John Fox Creator R Commander More stats 764
R Package Creating More stats 754
Windows Azure vs Amazon EC2 (and Google Storage) More stats 726
Norman Nie: R GUI and More More stats 716
Startups for Geeks More stats 682
Google Maps – Jet Ski across Pacific Ocean More stats 670
Not so AWkward after all: R GUI RKWard More stats 579
Red R 1.8- Pretty GUI More stats 570
Parallel Programming using R in Windows More stats 569
R is an epic fail or is it just overhyped More stats 559
Enterprise Linux rises rapidly:New Report More stats 537
Rapid Miner- R Extension More stats 518
Creating a Blog Aggregator for free More stats 504
So which software is the best analytical software? Sigh- It depends More stats 473
Revolution R for Linux More stats 465
John Sall sets JMP 9 free to tango with R More stats 460

So how do people come here –

well I guess I owe Tal G for almost 9000 views ( incidentally I withdrew posting my blog from R- Bloggers and Analyticbridge blogs – due to SEO keyword reasons and some spam I was getting see (below))

http://r-bloggers.com is still the CAT’s whiskers and I read it  a lot.

I still dont know who linked my blog to a free sex movie site with 400 views but I have a few suspects.

2009-12-31 to Today

Referrer Views
r-bloggers.com 9,131
Reddit 3,829
rattle.togaware.com 1,500
Twitter 1,254
Google Reader 1,215
linkedin.com 717
freesexmovie.irwanaf.com 422
analyticbridge.com 341
Google 327
coolavenues.com 322
Facebook 317
kdnuggets.com 298
dataminingblog.com 278
en.wordpress.com 185
google.co.in 151
xianblog.wordpress.com 130
inside-r.org 124
decisionstats.com 119
ifreestores.com 117
bits.blogs.nytimes.com 108

Still reading this post- gosh let me sell you some advertising. It is only $100 a month (yes its a recession)

Advertisers are treated on First in -Last out (FILO)

I have been told I am obsessed with SEO , but I dont care much for search engines apart from Google, and yes SEO is an interesting science (they should really re name it GEO or Google Engine Optimization)

Apparently Hadley Wickham and Donald Farmer are big keywords for me so I should be more respectful I guess.

Search Terms for 365 days ending 2010-12-31 (Summarized)

2009-12-31 to Today

Search Views
libre office 925
facebook analytics 798
test drive a chrome notebook 467
test drive a chrome notebook. 215
r gui 203
data mining 163
wps sas lawsuit 158
wordle.net 133
wps sas 123
google maps jet ski 123
test drive chrome notebook 96
sas wps 89
sas wps lawsuit 85
chrome notebook test drive 83
decision stats 83
best statistics software 74
hadley wickham 72
google maps jetski 72
libreoffice 70
doug savage 65
hive tutorial 58
funny india 56
spss certification 52
donald farmer microsoft 51
best statistical software 49

What about outgoing links? Apparently I need to find a way to ask Google to pay me for the free advertising I gave their chrome notebook launch. But since their search engine and browser is free to me, guess we are even steven.

Clicks for 365 days ending 2010-12-31 (Summarized)

2009-12-31 to Today

URL Clicks
rattle.togaware.com 378
facebook.com/Decisionstats 355
rapid-i.com/content/view/182/196 319
services.google.com/fb/forms/cr48basic 313
red-r.org 228
decisionstats.wordpress.com/2010/05/07/the-top-statistical-softwares-gui 199
teamwpc.co.uk/products/wps 162
r4stats.com/popularity 148
r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects 138
socserv.mcmaster.ca/jfox/Misc/Rcmdr 138
spss.com/certification 116
learnr.wordpress.com 114
dudeofdata.com/decisionstats 108
r-project.org 107
documentfoundation.org/faq 104
goo.gl/maps/UISY 100
inside-r.org/download 96
en.wikibooks.org/wiki/R_Programming 92
nytimes.com/external/readwriteweb/2010/12/07/07readwriteweb-report-google-offering-chrome-notebook-test-11919.html 92
sourceforge.net/apps/mediawiki/rkward/index.php?title=Main_Page 92
analyticdroid.togaware.com 88
yeroon.net/ggplot2 87

so in 2010,

SAS remained top daddy in business analytics,

R made revolutionary strides in terms of new packages,

JMP  launched a new version,

SPSS got integrated with Cognos,

Oracle sued Google and did build a great Data Mining GUI,

Libre Office gave you a non Oracle Open office ( or open even more office)

2011 looks like  a fun year. Have safe partying .

Light Cycle of Tron review

comiccon2010-6814.jpg
Image by YGX via Flickr

I really enjoyed the Light Cycle race in Tron- so instead of naming this the Tron Legacy Review- I call this Light cycle review.

The movie is a geek must check it out- and the mix of music, models,cars, and lights can be heady at first. The younger Jeff Bridges looks like a BeoWolf, and his son is ok. But Olivia Wilde is nice- and the cars and bikes are superb. If you like playing video games then check out the free game at http://armagetronad.net/downloads.php Its called Armagedtron.

And boy the 80s was a great time for pop music and video games.

How to Analyze Wikileaks Data – R SPARQL

Logo for R
Image via Wikipedia

Drew Conway- one of the very very few Project R voices I used to respect until recently. declared on his blog http://www.drewconway.com/zia/

Why I Will Not Analyze The New WikiLeaks Data

and followed it up with how HE analyzed the post announcing the non-analysis.

“If you have not visited the site in a week or so you will have missed my previous post on analyzing WikiLeaks data, which from the traffic and 35 Comments and 255 Reactions was at least somewhat controversial. Given this rare spotlight I thought it would be fun to use the infochimps API to map out the geo-location of everyone that visited the blog post over the last few days. Unfortunately, after nearly two years with the same web hosting service, only today did I realize that I was not capturing daily log files for my domain”

Anyways – non American users of R Project can analyze the Wikileaks data using the R SPARQL package I would advise American friends not to use this approach or attempt to analyze any data because technically the data is still classified and it’s possession is illegal (which is the reason Federal employees and organizations receiving federal funds have advised not to use this or any WikiLeaks dataset)

https://code.google.com/p/r-sparql/

Overview

R is a programming language designed for statistics.

R Sparql allows you to run SPARQL Queries inside R and store it as a R data frame.

The main objective is to allow the integration of Ontologies with Statistics.

It requires Java and rJava installed.

Example (in R console):

> library(sparql)> data <- query("SPARQL query>","RDF file or remote SPARQL Endpoint")

and the data in a remote SPARQL  http://www.ckan.net/package/cablegate

SPARQL is an easy language to pick  up, but dammit I am not supposed to blog on my vacations.

http://code.google.com/p/r-sparql/wiki/GettingStarted

Getting Started

1. Installation

1.1 Make sure Java is installed and is the default JVM:

$ sudo apt-get install sun-java6-bin sun-java6-jre sun-java6-jdk$ sudo update-java-alternatives -s java-6-sun

1.2 Configure R to use the correct version of Java

$ sudo R CMD javareconf

1.3 Install the rJava library

$ R> install.packages("rJava")> q()

1.4 Download and install the sparql library

Download: http://code.google.com/p/r-sparql/downloads/list

$ R CMD INSTALL sparql-0.1-X.tar.gz

2. Executing a SPARQL query

2.1 Start R

#Load the librarylibrary(sparql)#Run the queryresult <- query("SELECT ... ", "http://...")#Print the resultprint(result)

3. Examples

3.1 The Query can be a string or a local file:

query("SELECT ?date ?number ?season WHERE {  ... }", "local-file.rdf")
query("my-query.rq", "local-file.rdf")

The package will detect if my-query.rq exists and will load it from the file.

3.3 The uri can be a file or an url (for remote queries):

query("SELECT ... ","local-file.db")
query("SELECT ... ","http://dbpedia.org/sparql")

3.4 Get some examples here: http://code.google.com/p/r-sparql/downloads/list

SPARQL Tutorial-

http://openjena.org/ARQ/Tutorial/index.html

Also read-

http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/

and from the favorite blog of Project R- Also known as NY Times

http://bits.blogs.nytimes.com/2010/11/15/sorting-through-the-government-data-explosion/?twt=nytimesbits

In May 2009, the Obama administration started putting raw 
government data on the Web. 
It started with 47 data sets. Today, there are more than
 270,000 government data sets, spanning every imaginable 
category from public health to foreign aid.

Statistical Analysis with R- by John M Quick

I was asked to be a techie reviewe for John M Quick’s new R book “Statistical Analysis with R” from Packt Publishing some months ago-(very much to my surprise I confess)-

I agreed- and technical reviewer work does take time- its like being a mid wife and there is whole team trying to get the book to birth.

Statistical Analysis with R- is a Beginner’s Guide so has nice screenshots, simple case studies, and quizzes to check recall of student/ reader. I remember struggling with the official “beginner’s guide to R” so this one is different in that it presents a story of a Chinese Army and how to use R to plan resources to fight the battle. It’s recommended especially for undergraduate courses- R need not be an elitist language- and given my experience with Asian programming acumen – I am sure it is a matter of time before high schools in India teach basic R in final years ( I learnt quite a shit load of quantum physics as compulsory topics in Indian high schools- but I guess we didnt have Jersey Shore things to do)

Congrats to author Mr John M Quick- he is doing his educational Phd from ASU- and I am sure both he and his approach to making education simple informative and fun will go places.

Only bad thing- The Name Statistical Analysis with R has atleast three other books , but I guess Google will catch up to it.

This book is here-https://www.packtpub.com/statistical-analysis-with-r-beginners-guide/book

For R Writers- Inside R

A composite of the GNU logo and the OSI logo, ...
Image via Wikipedia

Hurray I am on Inside -R

http://www.inside-r.org/blogs/2010/11/04/r-apache-next-frontier-r-computing

Thats blog post number 1 there.

Basically Inside R is a go-to site for tips, tricks, packages, as well as blog posts. It thus enhances R Bloggers – but also adds in other multiple features as well.

It is an excellent place for R beginners and learning R. Also it is moderated ( so you wont get the flashy jhing bhang stuff- just your R.

What I really liked is the Pretty R functionality for turning R code -its nifty for color coding R code for use of posting in your blog, journal or article

and when you are there drop them a line for their excellent R support for events (like Pizza, sponsorship) and nifty R packages (doSNOW, foreach, RevoScaler, RevoDeployR) and how much open core makes them look silly?

Come on Revolution- share the open code for RevoScaler package- did you notice any sales dip when you open sourced the other packages? (cue to David Smith to roll his eyes again)

Anyway- all that is part of the R family fun 🙂

Do check http://www.inside-r.org/pretty-r