Enterprise Linux rises rapidly: New Report

Tux, as originally drawn by Larry Ewing
Image via Wikipedia

A new report from the Linux Foundation finds significant growth trends in enterprise usage of Linux, which should be welcome news to software companies that have shipped Linux versions of their software, to service providers offering Linux-based consulting (note: less competition, lower overheads), and to application creators.

Key Findings from the Report
• 79.4 percent of companies are adding more Linux relative to other operating systems in the next five years.

• More people are reporting that their Linux deployments are migrations from Windows than any other platform, including Unix migrations. 66 percent of users surveyed say that their Linux deployments are brand new (“Greenfield”) deployments.

• Among the early adopters who are operating in cloud environments, 70.3 percent use Linux as their primary platform, while only 18.3 percent use Windows.

• 60.2 percent of respondents say they will use Linux for more mission-critical workloads over the next 12 months.

• 86.5 percent of respondents report that Linux is improving and 58.4 percent say their CIOs see Linux as more strategic to the organization as compared to three years ago.

• Drivers for Linux adoption extend beyond cost: technical superiority is the primary driver, followed by cost and then security.

• The growth in Linux, as demonstrated by this report, is leading companies to increasingly seek Linux IT professionals, with 38.3 percent of respondents citing a lack of Linux talent as one of their main concerns related to the platform.

• Users participate in Linux development in three primary ways: testing and submitting bugs (37.5 percent), working with vendors (30.7 percent) and participating in The Linux Foundation activities (26.0 percent).


Interview: Michael J. A. Berry, Data Miners, Inc.

Here is an interview with noted data mining practitioner Michael Berry, author of seminal books in data mining and a noted trainer and consultant.

Ajay- Your famous book “Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management” came out in 2004, and an update is planned for 2011. What are the various new data mining techniques, and their applications, that you intend to cover in that book?

Michael- Each time we do a revision, it feels like writing a whole new book. The first edition came out in 1997 and it is hard to believe how much the world has changed since then. I’m currently spending most of my time in the on-line retailing world. The things I worry about today–improving recommendations for cross-sell and up-sell, and search engine optimization–wouldn’t have even made sense to me back then. And the data sizes that are routine today were beyond the capacity of the most powerful supercomputers of the nineties. But, if possible, Gordon and I have changed even more than the data mining landscape. What has changed us is experience. We learned an awful lot between the first and second editions, and I think we’ve learned even more between the second and third.

One consequence is that we now have to discipline ourselves to avoid making the book too heavy to lift. For the first edition, we could write everything we knew (and arguably, a bit more!); now we have to remind ourselves that our intended audience is still the same–intelligent laymen with a practical interest in getting more information out of data. Not statisticians. Not computer scientists. Not academic researchers. Although we welcome all readers, we are primarily writing for someone who works in a marketing department and has a title with the word “analyst” or “analytics” in it. We have relaxed our “no equations” rule slightly for cases when the equations really do make things easier to explain, but the core explanations are still in words and pictures.

The third edition completes a transition that was already happening in the second edition. We have fully embraced standard statistical modeling techniques as full-fledged components of the data miner’s toolkit. In the first edition, it seemed important to make a distinction between old, dull, statistics, and new, cool, data mining. By the second edition, we realized that didn’t really make sense, but remnants of that attitude persisted. The third edition rectifies this. There is a chapter on statistical modeling techniques that explains linear and logistic regression, naive Bayes models, and more. There is also a brand new chapter on text mining, a curious omission from previous editions.

There is also a lot more material on data preparation. Three whole chapters are devoted to various aspects of data preparation. The first focuses on creating customer signatures. The second is focused on using derived variables to bring information to the surface, and the third deals with data reduction techniques such as principal components. Since this is where we spend the greatest part of our time in our work, it seemed important to spend more time on these subjects in the book as well.
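The derived-variables and data-reduction steps described above can be sketched in a few lines. This is a minimal illustration, not an example from the book: the customer signature below is invented, the column choices (spend, orders, recency, returns) are my assumptions, and the principal components are computed directly with an SVD.

```python
import numpy as np

# Hypothetical customer signatures: one row per customer with raw
# behavioural measures (columns are illustrative, not from the book):
# total_spend, num_orders, days_since_last_order, num_returns
raw = np.array([
    [500.0, 10,  12, 1],
    [120.0,  2,  90, 0],
    [980.0, 25,   3, 4],
    [ 60.0,  1, 200, 0],
    [430.0,  8,  30, 2],
])

# Derived variables "bring information to the surface":
avg_order_value = raw[:, 0] / raw[:, 1]   # spend per order
return_rate     = raw[:, 3] / raw[:, 1]   # returns per order
features = np.column_stack([raw, avg_order_value, return_rate])

# Data reduction with principal components: standardise each column,
# then project onto the top-k directions of greatest variance via SVD.
z = (features - features.mean(axis=0)) / features.std(axis=0)
U, s, Vt = np.linalg.svd(z, full_matrices=False)
k = 2
components = z @ Vt[:k].T                 # each customer reduced to 2 numbers

explained = (s**2 / (s**2).sum())[:k].sum()
print(f"{explained:.0%} of variance kept in {k} components")
```

The point of the derived variables is that a ratio like spend-per-order is directly meaningful to a model in a way the two raw columns are not; the components then compress the widened signature back down.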

Some of the chapters have been beefed up a bit. The neural network chapter now includes radial basis functions in addition to multi-layer perceptrons. The clustering chapter has been split into two chapters to accommodate new material on soft clustering, self-organizing maps, and more. The survival analysis chapter is much improved and includes material on some of our recent applications of survival analysis methods to forecasting. The genetic algorithms chapter now includes a discussion of swarm intelligence.
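As a rough illustration of the survival-analysis material mentioned above, here is a minimal Kaplan-Meier estimator in plain Python. The (tenure, churned) observations are invented; `churned=False` marks a censored customer who was still active when last observed.

```python
# Invented illustration data: tenure in months, and whether the
# customer churned (False = censored, still active when observed).
observations = [
    (2, True), (3, True), (3, False), (5, True),
    (6, False), (8, True), (10, False), (12, False),
]

def kaplan_meier(obs):
    """Return [(time, survival_probability)] at each churn event time."""
    obs = sorted(obs)
    n_at_risk = len(obs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(obs):
        t = obs[i][0]
        # events and total observations leaving the risk set at time t
        deaths = sum(1 for tt, e in obs if tt == t and e)
        removed = sum(1 for tt, _ in obs if tt == t)
        if deaths:
            surv *= (n_at_risk - deaths) / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
        i += removed
    return curve

for t, s in kaplan_meier(observations):
    print(f"month {t:>2}: P(survive) = {s:.3f}")
```

The estimator handles censoring correctly, which is the reason survival methods suit subscription data: customers who have not yet churned still contribute information about how long tenures last.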

Ajay- Describe your early career and how you came into data mining as a profession. What do you think of the various universities now offering an MS in Analytics? How do you balance your own teaching experience with your consulting projects at Data Miners?

Michael- I fell into data mining quite by accident. I guess I always had a latent interest in the topic. As a high school and college student, I was a fan of Martin Gardner’s mathematical games in Scientific American. One of my favorite things he wrote about was a game called New Eleusis, in which one player, God, makes up a rule to govern how cards can be played (“an even card must be followed by a red card”, say) and the other players have to figure out the rule by watching which plays are allowed by God and which ones are rejected. Just for my own amusement, I wrote a computer program to play the game and presented it at the IJCAI conference in, I think, 1981.

That paper became a chapter in a book on computer game playing–so my first book was about finding patterns in data. Aside from that, my interest in finding patterns in data lay dormant for years. At Thinking Machines, I was in the compiler group. In particular, I was responsible for the run-time system of the first Fortran Compiler for the CM-2 and I represented Thinking Machines at the Fortran 8X (later Fortran-90) standards meetings.

What changed my direction was that Thinking Machines got an export license to sell our first machine overseas. The machine went to a research lab just outside of Paris. The Connection Machine was so hard to program that if you bought one, you got an applications engineer to go along with it. None of the applications engineers wanted to go live in Paris for a few months, but I did.

Paris was a lot of fun, and so, I discovered, was actually working on applications. When I came back to the States, I stuck with that applied focus, and my next assignment was to spend a couple of years at Epsilon (then a subsidiary of American Express) working on a database marketing system that stored all the “records of charge” for American Express card members. The purpose of the system was to pick ads to go in the billing envelope. I also worked on some more general-purpose data mining software for the CM-5.

When Thinking Machines folded, I had the opportunity to open a Cambridge office for a Virginia-based consulting company called MRJ that had been a major channel for placing Connection Machines in various government agencies. The new group at MRJ was focused on data mining applications in the commercial market. At least, that was the idea. It turned out that they were more interested in data warehousing projects, so after a while we parted company.

That led to the formation of Data Miners. My two partners in Data Miners, Gordon Linoff and Brij Masand, share the Thinking Machines background.

To tell the truth, I really don’t know much about the university programs in data mining that have started to crop up. I’ve visited the one at NC State, but not any of the others.

I myself teach a class in “Marketing Analytics” at the Carroll School of Management at Boston College. It is an elective part of the MBA program there. I also teach short classes for corporations on their sites and at various conferences.

Ajay- At the previous Predictive Analytics World, you gave a session on forecasting and predicting subscriber levels (http://www.predictiveanalyticsworld.com/dc/2009/agenda.php#day2-6).

It seems the inability to forecast is a problem many, many companies face today. What do you think are the top five principles of business forecasting that companies need to follow?

Michael- I don’t think I can come up with five. Our approach to forecasting is essentially simulation. We try to model the underlying processes and then turn the crank to see what happens. If there is a principle behind that, I guess it is to approach a forecast from the bottom up rather than treating aggregate numbers as a time series.
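The bottom-up, simulation-style forecasting described above can be sketched as a toy Monte Carlo model: rather than fitting a time series to the aggregate subscriber count, simulate the underlying sign-up and churn processes and read the aggregate off the runs. All rates and counts here are invented for illustration.

```python
import random

def simulate_subscribers(start=2_000, months=12,
                         new_per_month=150, monthly_churn=0.05,
                         n_runs=100, seed=42):
    """Bottom-up subscriber forecast: simulate each month's churn and
    sign-ups repeatedly, then summarise the distribution of outcomes."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        subs = start
        for _ in range(months):
            # each current subscriber independently churns this month
            churned = sum(1 for _ in range(subs)
                          if rng.random() < monthly_churn)
            subs = subs - churned + new_per_month
        totals.append(subs)
    totals.sort()
    # median and a rough 5th/95th percentile band (n_runs = 100)
    return totals[len(totals) // 2], totals[4], totals[-5]

median, low, high = simulate_subscribers()
print(f"12-month forecast: ~{median} subscribers ({low}-{high})")
```

Because the forecast comes from the process model, changing an assumption (say, halving churn) answers a "what if" question directly, which a curve fitted to the aggregate series cannot do.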

Ajay- You often partner your talks with SAS Institute, and your blog at http://blog.data-miners.com/ sometimes contains SAS code as well. What particular features of the SAS software do you like? Do you use just Enterprise Miner, or other modules as well for survival analysis or forecasting?

Michael- Our first data mining class used SGI’s Mineset for the hands-on examples. Later we developed versions using Clementine, Quadstone, and SAS Enterprise Miner. Then, market forces took hold. We don’t market our classes ourselves, we depend on others to market them and then share in the revenue.

SAS turned out to be much better at marketing our classes than the other companies, so over time we stopped updating the other versions. An odd thing about our relationship with SAS is that it is only with the education group. They let us use Enterprise Miner to develop course materials, but we are explicitly forbidden to use it in our consulting work. As a consequence, we don’t use it much outside of the classroom.

Ajay- Also, is there any other software you use (apart from SQL and J)?

Michael- We try to fit in with whatever environment our client has set up. That almost always is SQL-based (Teradata, Oracle, SQL Server, . . .). Often SAS Stat is also available and sometimes Enterprise Miner.

We run into SPSS, Statistica, Angoss, and other tools as well. We tend to work in big data environments so we’ve also had occasion to use Ab Initio and, more recently, Hadoop. I expect to be seeing more of that.


Together with his colleague, Gordon Linoff, Michael Berry is author of some of the most widely read and respected books on data mining. These best sellers in the field have been translated into many languages. Michael is an active practitioner of data mining. His books reflect many years of practical, hands-on experience down in the data mines.


Data Mining Techniques for Marketing, Sales and Customer Relationship Management

by Michael J. A. Berry and Gordon S. Linoff
copyright 2004 by John Wiley & Sons


Mining the Web

by Michael J.A. Berry and Gordon S. Linoff
copyright 2002 by John Wiley & Sons
ISBN 0-471-41609-6

Non-English editions available in Traditional Chinese and Simplified Chinese

This book looks at the new opportunities and challenges for data mining that have been created by the web. The book demonstrates how to apply data mining to specific types of online businesses, such as auction sites, B2B trading exchanges, click-and-mortar retailers, subscription sites, and online retailers of digital content.

Mastering Data Mining

by Michael J.A. Berry and Gordon S. Linoff
copyright 2000 by John Wiley & Sons
ISBN 0-471-33123-6

Non-English editions available in Japanese, Italian, Traditional Chinese, and Simplified Chinese

A case study-based guide to applying data mining techniques for solving practical business problems. These “warts and all” case studies are drawn directly from consulting engagements performed by the authors.

A data mining educator as well as a consultant, Michael is in demand as a keynote speaker and seminar leader in the area of data mining generally and the application of data mining to customer relationship management in particular.

Prior to founding Data Miners in December 1997, Michael spent 8 years at Thinking Machines Corporation. There he specialized in the application of massively parallel supercomputing techniques to business and marketing applications, including one of the largest database marketing systems of the time.

Interview: Neil Raden, Founder of Hired Brains Inc.

Here is an interview with one terrific person who has always inspired my writing (or at least my attempts to write) on data and systems. Neil Raden is a giant in the publishing and consulting space for business intelligence, analytics, and decision management. In a nice interview, Neil talks of his passion for his work, his prolific authoring of white papers, his seminal work with James Taylor, and how he sees the BI space evolving.

The history of BI pretty much follows the history of computing platforms. First we had time-sharing, then mainframes, then minis, then client-server vs. PC, then a number of passes at distributed computing, such as CORBA, then SOA and now the cloud. - Neil Raden

Ajay- Describe your career in math and technology and your current activities. How would you explain what you do for a living to a group of high school students who are wondering whether or not to take up mathematical and technical subjects?

Neil- I didn’t earn a dime at the career I was meant for, consulting, until I was 33 years old. So I would tell college students not to be in such a hurry to corner themselves into a career. It may take a while to figure out what you really want.

Though I went to college to study theatre, within a few weeks I was inspired by a math professor and switched my major. From that point on, it was pads of paper and sharp pencils. I was totally in my own head with math. I never took a statistics course, or even differential equations, because I was consumed by discrete math (graph theory too), topology and logic and later game theory/economics.

When I went looking for a job in 1974, in the midst of a deep recession, I was confronted with the stark reality (in New York) that I could be a COBOL programmer or an actuary. I chose the latter. Working at AIG in New York in the 70’s was pretty exciting. We broke new ground in commercial property and casualty insurance and reinsurance every week. I was part of a small R&D group under the chief actuary, who reported directly to Maurice Greenberg, the legendary (but now maligned) inventor of AIG, and I loved the work.

I had to go back and teach myself probability and statistics to get through the exams, but ultimately, two kids and one on the way in NYC on one not-so-great salary was a deal-breaker. I left AIG and joined a software company doing modeling and prediction. The rest, as they say, is history. I formed my own consulting company in 1985 and I’m still at it.

To me, consulting isn’t something you do between jobs or a title you get because you implement software for clients. Consulting is a craft, it’s a career and it is rather easy to do but very difficult to learn. I work very hard to teach this to people who work for me. It’s about commitment, hard work and, most of all, ethics and being authentic with your client.

Ajay- Writing books is a lonely yet rewarding work. Could you briefly elucidate on your recent book, Smart (Enough) Systems?

Neil- I have to credit my partner, James Taylor, with the concept for the book. He was working at Fair Isaac (now FICO) at the time and this was exactly what he was doing there. It was a little tangential to my work, but when James approached me, he said he wanted a partner who was proficient in the data integration and analytics aspects of EDM (Enterprise Decision Management).

James made it pretty easy because

1) he is very prolific and 2) he took most of my comments and integrated them without argument.

I’d say I was pretty lucky and it went very well. I don’t know if I’ll ever write another book. I suppose I won’t know until the idea hits me. I’m sure it will be more difficult doing it on my own.

Ajay- What are the various stages that you have seen the BI industry go through? What are the next few years going to bring us?

What is your wishlist of changes the industry could make for better customer ROI?

Neil- The history of BI pretty much follows the history of computing platforms. First we had time-sharing, then mainframes, then minis, then client-server vs. PC, then a number of passes at distributed computing, such as CORBA, then SOA and now the cloud. But while the locus of BI storage, computing and presentation has changed, its focus changes very slowly.

Historically, there have been two major subject areas in BI: finance and sales/marketing. All of the other subject areas still rest on the periphery.

Complex Event Processing (CEP), for example, is making a lot of noise lately, but not much implementation. Visualization is here to stay. When the BI app and the Web app are the same, BI will be everywhere, but it will be a sort of Pyrrhic victory because it won’t be recognized as such. Now you can take all of this with a grain of salt because I don’t really follow the industry per se; I’m more interested in how my clients can apply the technology to get the results they need.

Ajay- There is a lot of buzz about predictive analytics lately. Do you think it will have a noticeable impact or is it just the latest thing?

Neil- There are only so many people who understand quantitative methods, and that number isn’t going to grow very much. This puts a damper on PA (Predictive Analytics), because no manager is going to act on the recommendations of a black box without an articulate quant who can explain the methodology and the limits of its precision.

That isn’t a bad thing, and those who practice in predictive analytics will prosper.

On the other hand, I believe there will be an expansion of the use of generic PA models that have been vetted in practice. The FICO score is a good example, and the ability to develop and implement these applications (it’s much easier now thanks to PA software and computing environments in general) should allow for a nice market to develop around them. This is especially true with decision automation systems, like logistics, material handling, credit authorization, etc.

Ajay- What were your most interesting projects as an implementer? Most rewarding?

Neil- Most interesting: I was the Chairman of an Advisory Board at Sandia National Laboratories for a few years. Our goal was to encourage the lab to adopt more modern and effective information management tools for their dual purpose of

1) designing and manufacturing nuclear weapons (frightening isn’t it?) and

2) certification of nuclear waste repositories.

I was able to work with scientists, physicists, engineers, geologists and computer scientists, all from backgrounds very different from those I normally engaged. The problems were monumental.

Most rewarding: We developed a data warehouse to capture the daily sales of products at the most detailed level for a cosmetics company. They had never had this information before, because their retail presence consisted of counters in hundreds of department stores. Thus they were able for the first time to truly understand the “sell through” of their products. Beyond just allowing a better understanding of the flow, they could tailor their promotions and, not much later, implement a continuous replenishment system.

The president of the company came to the launch and explained how we had allowed the company to do things it had never done before which would change it for the better. You don’t get those accolades from the CEO very often.

Ajay- You’ve written forty white papers. That’s a lot. What impact do you think they’ve had?

Neil- I couldn’t tell you. I don’t track downloads, and my website doesn’t even require registration. I don’t see them quoted or cited very often, but then, people don’t quote or cite others’ work in this field very often anyway. I can say that I have many repeat customers among the vendors, so they must be deriving some value from them.

Ajay- What are your views on creating a community for the top 100 BI analysts in the world – a bit like a Reuters or a partnership firm? How pleased do you think BI vendors would be by this?

Neil- I was actually involved in an effort like this about a dozen years ago, called BI Alliance. Doug Hackney and I started it, and we had about a dozen BI luminaries in the organization. I’ll try to remember some: Sid Adelman, David Marco, Richard Winter, David Foote, Herb Edelstein.

You could only join if you were an independent or the head of your own firm.

It was a useful marketing tool, as we were able to 1) share references and 2) staff projects. But it sort of lost its momentum after a few years.

But a few hundred BI analysts? Are there that many?? LOL I don’t know how the vendors would react, but I sort of doubt this sort of organization would have any kind of clout – too many divergent opinions.

Ajay- Do you think the work you do matters?

Neil- It certainly has an economic impact on my family! LOL I don’t know; I hope it does, and proportionate to my income versus the size of the industry, yes, I guess it does. Not necessarily directly, though.

A company in Dayton or Macon doesn’t make a decision because I said so, but I think I do influence some analysts and vendors, and to the extent I influence them, then I guess I do. I limit my analysis to my clients. If they think this work matters, then it does.


Neil Raden, consultant, analyst and author, is followed by technology providers, consultants and even other analysts. His knowledge of analytical applications is the result of thirty years of intensive work. He is the founder of Hired Brains, a research and advisory firm in Santa Barbara, CA, offering research and analysis services to technology providers as well as providing consulting and implementation services. Mr. Raden began his career as a casualty actuary with AIG before moving into software engineering and consulting in the application of analytics in fields as diverse as health care, nuclear waste management and cosmetics marketing. His blog can be found at intelligententerprise.com/experts/raden/. He is the author of dozens of articles and white papers, has contributed to numerous books, and is the co-author of “Smart (Enough) Systems” (Prentice Hall, 2007) with James Taylor. nraden@hiredbrains.com

Alternatively, you can just follow Neil Raden on Twitter at neilraden.