Stimulating Conversation

Stimulating conversation is the bait

Stimulating conversation is the bait

Lure the curious monkey to his zoo like fate


Come stimulate the conversation for a while

Amuse us O Exotic one, with your pungent style.

We are all egalitarian, at least we have to pretend

This is the American South, don’t you comprehend.

Stay quiet and keep shut, do your job, move your nut


Our patience is as deep as the color in our skin,

Go ahead and slave for us, lest we begin

There are trees in Tennessee, tall enough to hang you

Curiosity killed the cat- it will noose the monkey too.

(Inspired by a real-life incident)


Analyzing Monkeys

I once promised a reader, a long time back, that I would not get into politics, but something unexpected hit me like a big truck.

At what point do you decide your boss is a racist? How do you tell the difference between jokes and racial insults?

Another interesting analysis

Citation Emerald

Interview Augusto Albeghi (Straycat) —Founder Straysoft

An interview with Augusto (StrayCat), a Startup Entrepreneur with an interesting technology StraySoft.

Ajay- Describe your career as a BI consultant.

Straycat- I’m an aerospace engineer who had to turn to IT right after graduation because of the Italian aerospace industry crisis in the first half of the 90’s. My first job was with the company now called Accenture, as a simple developer. I was part of a large project for a large US food corporation.

We built an enterprise-level reporting and budgeting system based on what was later to become Hyperion. After that I had various experiences, always as an IT professional, always focusing on BI or related subjects. I worked for the Milan Airport Authority, the L’Oréal group and a couple of local software houses. Now I’m a project manager at a large Italian consulting firm but, most of all, I’m a bootstrapping entrepreneur.

Ajay- How do you think we can teach BI at an early stage to young students?

Straycat- I think that the main problem resides in the naïve university approach toward business data analysis. Collecting data is considered trivial compared to other related subjects.

Data availability is often taken for granted, and equations are then written on top of the data. Needless to say, it is not trivial at all, and there is an entire class of problems that students are not aware of.

A few lessons spent focusing on data quality, aggregations, measure definitions etc. are enough to create the necessary awareness of the problem. It’s no longer cool to claim ignorance of the subject!

Ajay- Describe the most challenging project you ever did. Name a project which led to the biggest dollar impact.

Straycat- About three years ago we signed a contract with a large fashion firm here in Italy to reengineer their entire business intelligence setup. It has been a project ranging from sales to production, from accounting to human resources.

It impacted almost a thousand users in six different time zones. The main challenge we had to tackle was the fragmentation of their legacy BI systems, which produced different jargon and practices across the corporation. We changed the database and the presentation layer, built a modern datawarehouse, and worked relentlessly on change management.

I can’t disclose figures but the new unified system shed light on some bad practices, revealed inefficiencies and provided a whole new set of analytics that increased market awareness.

Ajay- Describe your start-up StraySoft and what it is hoping to accomplish.

Straycat-

StraySoft is a small and fresh startup devoted to building Business Intelligence applications.

It produces Viney@rd, an Excel/SQL Server-based spreadsheet automation and BI tool.

I have personal reasons for embarking on such a project, but the kick-off came from a sudden realization. Despite the terrific level of sophistication provided by current BI tools, the one thing each and every user wants is to have data in MS Excel.

This is simply a fact: users get data, rework it and build Excel reports. It’s not a matter of features; people feel in full control only when they have an Excel file.

Why? Because Excel is able to address a single cell, and the figures within can be adjusted at will and saved in a familiar place like C: .

So, the original idea (two and a half years ago) was to create a tool to refresh a complex layout without disrupting it, i.e. a tool which could place query results into individual cells.

This can be done with Excel alone, but it’s far too difficult even for an advanced user.
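To make the idea of refreshing a layout without disrupting it concrete, here is a minimal Python sketch. It is not Viney@rd's actual implementation: the dictionary standing in for a worksheet, the `bindings` map and the function name `refresh_layout` are all assumptions invented for illustration.

```python
# Hypothetical sketch: a worksheet is modelled as a dict of cell address -> value.
# None of these names come from Viney@rd; they are assumptions for illustration.

def refresh_layout(sheet, bindings, query_results):
    """Write each bound query value into its own cell.

    Cells that are not bound to a query field (titles, notes, manual
    adjustments) are left exactly as the user arranged them.
    """
    refreshed = dict(sheet)  # copy, so the original layout is not modified
    for cell, field in bindings.items():
        if field in query_results:
            refreshed[cell] = query_results[field]
    return refreshed


# A small hand-built layout with a title, two data cells and a free-text note.
sheet = {"A1": "Sales report", "B3": 0, "B4": 0, "D7": "Notes: check with Anna"}

# Which cell receives which query field (assumed to be chosen by the user).
bindings = {"B3": "sales_north", "B4": "sales_south"}

# Made-up query output.
query_results = {"sales_north": 1200, "sales_south": 950}

refreshed = refresh_layout(sheet, bindings, query_results)
```

Only the two bound cells change; the title and the note survive the refresh, which is the property the interview describes.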

Viney@rd features this but I soon realized that, if I wanted to go down this path, I had to tackle a second issue: the data provided by the systems are never the data required by the user. I’m not talking about bad analysis or wrong KPIs; even if the architects did everything fine, the human brain works according to categories that often are not saved within a database.

Example: you are a salesman and you have 4 customers who make up 75% of your business. Plus you have 40 customers who make up the other 25% of your business.

Question: “How many customers do you have?” Reply: “I have 4: customers A, B, C and D. A is bla bla bla, B is bla bla bla, C etc. etc. Oh, by the way, I have some others but they are marginal.”

The salesman needs every kind of information about the 4, and just a few hints about the rest; any detailed information about the rest is perceived as clutter. He needs a screen with 4 ultra-detailed sheets, one per customer, not a customer ABC report with 44 rows.

So far, nothing revolutionary. What is revolutionary is that the user himself must be able to tag the 4 main customers according to his own perception of each customer’s importance.

If one of the small customers is going to place a large order, then it must become important as well and should immediately take fifth place, to be automatically demoted when the opportunity expires.

The point is that these rules are defined heuristically by the human brain and have so many exceptions that they can be handled only by a human brain. This consideration led to implementing the unique feature of letting users change their data directly from an Excel table.
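The salesman example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Customer` class, the `key_account` flag and `split_views` are hypothetical names, and the even split of revenue among customers is invented to match the 75%/25% figures.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    revenue_share: float       # fraction of the salesman's business
    key_account: bool = False  # set by the user, not computed from the data


def split_views(customers):
    """Return names needing a detailed sheet plus one summary for the rest."""
    detailed = [c.name for c in customers if c.key_account]
    marginal = [c for c in customers if not c.key_account]
    summary = {
        "count": len(marginal),
        "revenue_share": round(sum(c.revenue_share for c in marginal), 4),
    }
    return detailed, summary


# 4 key customers at 18.75% each (75%) and 40 small ones at 0.625% each (25%).
customers = [Customer(n, 0.1875, key_account=True) for n in "ABCD"]
customers += [Customer(f"small-{i}", 0.00625) for i in range(40)]

detailed, summary = split_views(customers)

# A small customer is about to place a large order: the user promotes it...
customers[4].key_account = True
promoted, _ = split_views(customers)

# ...and demotes it again once the opportunity expires.
customers[4].key_account = False
```

The design point is that the flag lives alongside the data but is owned by the user, so the detailed view follows the salesman's judgement rather than a fixed revenue threshold.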

The Viney@rd database is easy to feed using traditional techniques, but Excel sheet data can be saved in it as well. This gives the best of both worlds: a central repository for “conventional” data, so no more “spreadsheet hell” nightmares, plus the ability to classify and adjust the data while still working in Excel.

This approach has limits, specifically when we talk about large amounts of data, I’m the first to admit it, but I still think that it’s the one thing that can popularize BI among business users.

When large vendors embrace this, I’ll remind them of this interview! :o) Viney@rd is now in its infancy but already implements these two core features.

There are a lot of things to do, and many features to add to take it to a full corporate level, but I enjoy the process so much that I can’t stop working on it!

I’ve been asked “What if I buy from you and you go belly up next year?”. My reply is that you must shoot me to stop me from working on it! I still have a long list of features to implement and I’m not going to dismiss the fun!

For example, did you ever notice that people think naturally in terms of information streams ….?

I’ll consider myself successful when three conditions are met:

a) I have a body of satisfied users who have had their working lives improved by my products

b) I make a living out of StraySoft together with the employees, once I have some

c) people think of me as the Business Intelligence “enfant terrible”.

Ajay- What do you do in your spare time?

StrayCat- Sorry? What’s spare time? Jokes apart, I devote time to my wife, who’s really supportive in this effort. Late at night, before falling asleep, I usually read for half an hour: I’m passionate about history. But the events I really never miss are the Italian National Rugby Union Team matches.

Ajay- Why do you tweet using the name Stray Cat?

Augusto– I named the company StraySoft after the adoption of a stray cat; the full story is told here http://www.straysoft.com/dblog/articolo.asp?id=30.

The twitter name came as natural as naming the company. I know that someone may find it awkward but I feel like going upstream on that! Secondly, I want to keep my consulting activity and StraySoft totally separate for a matter of convenience. I did not, and will never propose my product to my consulting customers.

Ajay- What visible trends in Business Intelligence do you foresee for the next two to three years?

Augusto- The #1 trend is that all the main vendors (excluding Microsoft, which already did) finally realized that there’s a midrange market which needs BI more than ever.

What they’re doing wrong is targeting this segment with the same enterprise class tools which miss the few key features required by this market.

The #2 trend is the rise of workgroup BI and the new dignity given to informal analysis. This is a whole new approach that I do not share completely, but I admit it has its strengths.

The #3 trend is at the opposite side of the spectrum; unconventional databases (columnar stores, appliances etc.) are becoming increasingly popular to manage very large amounts of data.

There are two fake trends: Clouds and SaaS. They’ll get a share of the market but will not become, in the foreseeable future, the reference architecture. Thank you again for giving me a voice. All the best. Augusto Albeghi

Ajay- To learn more about Augusto’s startup and Viney@rd, please see www.straysoft.com

R and SAS- Together again at PAWS

Two of my favorite speakers (though maybe not each other’s favorites) are speaking at PAWS:

Anne Milley from SAS and David Smith from REvolution Computing. Also a great author and writer, Stephen Baker, author of The Numerati (that mathematical equivalent of The Godfather). More events at the link below.

Hmmmm- I hope they attend each other’s sessions just to keep up, but is that asking too much?

Citation-http://www.predictiveanalyticsworld.com/dc/2009/agenda.php#day1-22

7:30pm-10:00pm
useR Meeting
Room: Magnolia
– Sponsored by  Please join the group at www.meetup.com/R-users-DC/

R is an open source programming language for statistical computing, data analysis, and graphical visualization. R has an estimated one million users worldwide, and its user base is growing. While most commonly used within academia, in fields such as computational biology and applied statistics, it is gaining currency in commercial areas such as quantitative finance and business intelligence.

Among R’s strengths as a language are its powerful built-in tools for inferential statistics, its compact modeling syntax, its data visualization capabilities, and its ease of connectivity with persistent data stores (from databases to flatfiles).

In addition, R is open source in nature and extensible via add-on “packages”, allowing it to keep up with the leading edge in academic research.

For all its strengths, though, R has an admittedly steep learning curve; the first steps towards learning and using R can be challenging.

This DC R Users Group is dedicated to bringing together area practitioners of R to exchange knowledge, inspire new users, and spur the adoption of R for innovative research and commercial applications.


Wednesday October 21, 2009

8:00am-9:00am
Registration & Continental Breakfast


9:00am-9:50am
Keynote
Room: Magnolia
Opportunities and Pitfalls:
What the World Does and Doesn’t Want from Predictive Analytics

Mathematicians and statisticians are churning through mountains of data in their efforts to model and predict human behavior. The goal is to optimize every function possible, from sales and marketing to the enterprise itself. These Numerati are guided by the two dominant models of the late 20th century, the modeling of financial markets and of industrial systems. How do humans fit into these systems? And what will their response be when the analytic systems appear to misunderstand them or invade their privacy?

Stephen Baker joins PAW to directly address the Numerati. In his keynote presentation, Mr. Baker will guide us toward the untapped goldmines where predictive analytics will be embraced and thrive, and teach us to anticipate and maneuver around two central pitfalls: Consumer misperception of us, and our inadvertent mistreatment of them.

Moderator: Eric Siegel, Program Chair, Predictive Analytics World

Speaker: Stephen Baker, BusinessWeek – author, The Numerati


9:50am-10:10am
Platinum Sponsor Presentation
Room: Magnolia
Strength in Numbers: ACE!

As more organizations are beginning their analytical journey or reinvigorating their existing efforts, Analytic Centers of Excellence (ACEs) are helping them along the way. The interest in ACEs is growing across industries as organizations seek better ways to tap into their analytic infrastructure-most importantly, scarce high-end analytic expertise to improve results. We will highlight valuable best practices for achieving greater analytic bandwidth realizing more and better evidence-based decisions.

Moderator: Eric Siegel, Program Chair, Predictive Analytics World

Speaker: Anne Milley, Senior Director of Tech. Product Marketing, SAS

Red R- A new beginning

Check out an interesting new interface to R.

Note: I haven’t tested it yet but plan to do so shortly, as I am using Ubuntu 9 almost exclusively nowadays.

R fans who are not quite overjoyed with the wonderful beauty and charm of the traditional R GUI may want to give it a try.

Citation-

http://code.google.com/p/r-orange/

Note- This website does not assume responsibility for any software glitches, as R comes with no warranty, unlike other software that comes loaded with both a warranty and then bug-fix patches.


Losing a Million Bucks: Netflix Prize Interview

I (and pseudo-geeks collectively across the world) lost a potential million dollars when the following team won the Netflix Prize. In disgust, I just renewed my Netflix subscription and noticed a 10% increase in the way I liked them.

Jokes apart, here is an excerpt (perhaps one of the few ever) of an interview with the Netflix winners done by the great Eric Siegel, PhD.

Eric is conference chair of the Predictive Analytics World conference (a King Arthur’s round table of all the shining knights of the data analytics world).

Citation-http://www.predictiveanalyticsworld.com/layman-netflix-leader.php

[ES] With no relevant background in statistics — let alone product recommendations specifically — what capabilities or background did make your success possible? Do you consider yourselves mathematicians, or at least strong with math?

[MC] I am certainly not a mathematician – I have engineering level skill. I consider Martin Piotte to have an exceptional mathematical mind (he participated successfully in international math contests when he was a student) even though he never formally studied in that field. In the end, the mathematics used in this contest seem very complex, but are really rather simple. Compared to what most people think, this was more of an engineering contest than a mathematical contest [See Martin’s response below for elaboration on this central point. -Ed]. Also, I think that having a perhaps less in-depth but wider array of skills and knowledge helped us.

[ES] You’ve said, when first getting started, you learned many core strategies/techniques from the Netflix Prize discussion board. Did you do much reading or research elsewhere to ramp up?

[MC] Having started late in the competition, the forum was a good starting point as many avenues had already been explored and links had been posted to many interesting papers. In the end though, reading and getting a good understanding of the actual research papers was a very important step. The forum was also a place where people proposed new (sometimes far fetched) ideas; these ideas often inspired us to come up with our own creative innovations.

PAWS is a great place to meet, greet and do business, and though it is only 5 hours away, I have too much homework to do and grade while at the University of Tennessee (for now).

Here is a very interesting poll that they are running. It is good to see conferences take feedback in such a transparent manner.


A comment on OffShoring

A comment on offshoring was posted by a reader. I am re-posting it in its entirety.

When you use the phrase “labor shortage” or “skills shortage” you’re speaking in a sentence fragment.  What you actually mean to say is:  “There is a labor shortage at the salary level I’m willing to pay.”  That statement is the correct phrase; the complete sentence and the intellectually honest statement.

Employers speak about shortages as though they represent some absolute, readily identifiable lack of desirable services. Price is rarely accorded its proper importance in their discussion.

If you start raising wages and improving working conditions, and continue doing so, you’ll solve your shortage and will have people lining up around the block to work for you even if you need to have huge piles of steaming manure hand-scooped on a blazing summer afternoon.

Re:  Shortage caused by employees retiring out of the workforce:  With the majority of retirement accounts down about 50% or more, most people entering retirement age are working well into their sunset years.  So, you won’t be getting a worker shortage anytime soon due to retirees exiting the workforce.

Okay, fine. Some specialized jobs require training and/or certification. Again, the solution is higher wages and improved benefits. People will self-fund their re-education so that they can enter the industry in a work-ready state. The attractive wages, working conditions and career prospects of technology during the 1980’s and 1990’s were a prime example of people’s willingness to self-fund their own career re-education.

There is never enough of any good or service to satisfy all wants or desires. A buyer, or employer, must give up something to get something. They must pay the market price and forego whatever else they could have for the same price. The forces of supply and demand determine these prices, and the price of a skilled workman is no exception. The buyer can take it or leave it. However, those who choose to leave it (because of lack of funds or personal preference) must not cry shortage. The good is available at the market price. All goods and services are scarce, but scarcity and shortages are by no means synonymous. Scarcity is a regrettable and unavoidable fact.

Shortages are purely a function of price. The only way in which a shortage has existed, or ever will exist, is in cases where the “going price” has been held below the market-clearing price.
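The reader's argument can be illustrated with a small worked example in Python. The linear supply and demand curves below, and every number in them, are invented purely for illustration; they are assumptions and not data from the comment.

```python
# Hypothetical labor market with made-up linear curves (assumptions only).

def workers_demanded(wage):
    """Employers want fewer workers as the hourly wage rises."""
    return max(0.0, 200 - 2 * wage)


def workers_supplied(wage):
    """More people offer to work as the hourly wage rises."""
    return max(0.0, 3 * wage - 50)


def shortage(wage):
    """Unfilled positions at a given wage; zero at or above market clearing."""
    return max(0.0, workers_demanded(wage) - workers_supplied(wage))


# The curves cross where 200 - 2w = 3w - 50, i.e. at a wage of 50.
below_market = shortage(40)  # 120 demanded, 70 supplied
at_market = shortage(50)     # 100 demanded, 100 supplied
above_market = shortage(60)  # 80 demanded, 130 supplied: a surplus instead
```

In this toy market a "shortage" of 50 workers appears only when the offered wage is held at 40, below the clearing wage of 50, which is the commenter's point about the going price.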