Interview SPSS Olivier Jouve

SPSS recently launched a major series of products in its text mining and data mining portfolio and rebranded its data mining line under the PASW name. In an exclusive and extensive interview, Olivier Jouve, Vice President, Corporate Development at SPSS Inc., talks about careers in science, the recent launches, SPSS's open source support for R, cloud computing and business intelligence.

Ajay: Describe your career in Science. Are careers in science less lucrative than careers in business development? What advice would you give to people re-skilling in the current recession on learning analytical skills?

Olivier: I have a Master of Science in Geophysics and a Master of Science in Computer Science, both from Paris VI University. I have always tried to combine science and business development in my career, as I like to experience all aspects – from idea to concept to business plan to funding to development to marketing to sales.

There was a study published earlier this year that said two of the three best jobs are related to math and statistics. This is reinforced by three societal forces that are converging: better uses of mathematics to drive decision making, the tremendous growth and storage of data, and, especially in this economy, the ability to deliver ROI. With more and more commercial and government organizations realizing the value of Predictive Analytics to solve business problems, being equipped with analytical skills can only enhance your career and provide job security.

Ajay: So SPSS has launched new products within its Predictive Analytics Software (PASW) portfolio – Modeler 13 and Text Analytics 13. Is this old wine in a new bottle? What is new in technical terms? What is new in terms of customers looking to mine textual information?

Olivier: Our two new products — PASW Modeler 13 (formerly Clementine) and PASW Text Analytics 13 (formerly Text Mining for Clementine) — extend and automate the power of data mining and text analytics to the business user, while significantly enhancing the productivity, flexibility and performance of the expert analyst.

The PASW Modeler 13 data mining workbench has new and enhanced functionality that quickly takes users through the entire data mining process – from data access and preparation to model deployment. Some of the newest features include Automated Data Preparation, which conditions data in a single step by automatically detecting and correcting quality errors; Auto Cluster, which gives users a simple way to determine the best cluster algorithm for a particular data set; and full integration with PASW Statistics (formerly SPSS Statistics).
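To make the idea behind a feature like Auto Cluster concrete, here is a minimal, hypothetical sketch of what choosing the "best" clustering algorithm for a data set can look like. It is a generic analogue using scikit-learn and a silhouette score, not SPSS's implementation.

```python
# Illustrative only: a generic analogue of "auto cluster" model selection,
# not SPSS's implementation. Assumes scikit-learn is installed.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=42)

candidates = {
    "kmeans_3": KMeans(n_clusters=3, n_init=10, random_state=42),
    "kmeans_4": KMeans(n_clusters=4, n_init=10, random_state=42),
    "agglomerative_4": AgglomerativeClustering(n_clusters=4),
}

scores = {}
for name, model in candidates.items():
    labels = model.fit_predict(X)               # cluster assignments
    scores[name] = silhouette_score(X, labels)  # higher is better

best = max(scores, key=scores.get)
print(scores, "best:", best)
```

The tool's value is simply that a business user never has to write this loop, but the underlying idea is the same: try several algorithms and keep the one that scores best on the data at hand.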

With PASW Text Analytics 13, SPSS provides the most complete view of the customer through the combined analysis of text, web and survey data.   While other companies only provide the text component, SPSS couples text with existing structured data, permitting more accurate results and better predictive modeling. The new version includes pre-built categories for satisfaction surveys, advanced natural language processing techniques, and it supports more than 30 different languages.

Ajay: SPSS has supported open source platforms – Python and R – before it became fashionable to do so. How has this helped your company?

Olivier: Open source software helps the democratization of the analytics movement and SPSS is keen on supporting that democratization while welcoming open source users (and their creativity) into the analytics framework.

Ajay: What are the differences and similarities between Text Analytics and Search Engines? Can we mix the two as well using APIs?

Olivier: Search Engines are fundamentally top-down in that you know what you are looking for when launching a query. However, Text Analytics is bottom-up, uncovering hidden patterns, relationships and trends locked in unstructured data – including call center notes, open-ended survey responses, blogs and social networks. Now businesses have a way of pulling key concepts and extracting customer sentiments, such as emotional responses, preferences and opinions, and grouping them into categories.

For instance, a call center manager will have a hard time using a search engine over millions of call center notes to find out why customers are unhappy and churn. What would the query be? But by using Text Analytics, that same call center manager will discover the main reasons why customers are unhappy and be able to predict whether they are going to churn.

Ajay: Why is Text Analytics so important?  How will companies use it now and into the future?
Olivier: Actually, the question you should ask is, "Why is unstructured data so important?" Today, more than ever, people love to share their opinions — through the estimated 183 billion emails sent, the 1.6 million blog posts, millions of inquiries captured in call center notes, and thousands of comments on diverse social networking sites and community message boards. And let's not forget all the data that flows through Twitter. Companies today would be short-sighted to ignore what their customers are saying about their products and services, in their own words. Those opinions – likes and dislikes – are essential nuggets and bear much more insight than demographic or transactional data when it comes to reducing customer churn, improving satisfaction, fighting crime, detecting fraud and increasing marketing campaign results.

Ajay: How is SPSS venturing into cloud computing and SaaS?

Olivier: SPSS was at the origin of the PMML standard, which allows organizations to provision their computing power in a very flexible manner – just like provisioning computing power through cloud computing. SPSS strongly believes in the benefits of a cloud computing environment, which is why all of our applications are designed with Service Oriented Architecture components. This enables SPSS to be flexible enough to meet the demands of the market as they change with respect to delivery mode. We are currently analyzing business and technical issues related to SPSS technologies in the cloud, such as the scoring and delivery of analytics. In regards to SaaS, we currently offer hosted services for our PASW Data Collection (formerly Dimensions) survey research suite of products.

Ajay: Do you think business intelligence is an overused term? Why do you think BI and Predictive Analytics failed in mortgage delinquency forecasting and reporting despite the financial sector being a big spender on BI tools?

Olivier: There is a big difference between business intelligence (BI) and Predictive Analytics. Traditional BI technologies focus on what's happening now or what's happened in the past by primarily using financial or product data. For organizations to take the most effective action, they need to know and plan for what may happen in the future by using people data – and that's harnessed through Predictive Analytics.

Another way to look at it: Predictive Analytics covers the entire capture, predict and act continuum – from the use of survey research software to capture customer feedback (attitudinal data), to creating models to predict customer behaviors, to acting on the results to improve business processes. Predictive Analytics, unlike BI, provides the secret ingredient and answers the question, "What will the customer do next?"

That being said, financial institutions didn't need to use Predictive Analytics to see that some lenders sold mortgages to unqualified individuals likely to default. Predictive Analytics is an incredible application used to detect fraud, waste and abuse. Companies in the financial services industry can focus on mitigating their overall risk by creating better predictive models that not only encompass richer data sets, but also better rules-based automation.

Ajay: What do people do at SPSS to have fun when they are not making complex mathematical algorithms?
Olivier: SPSS employees love our casual, friendly atmosphere, our professional and talented colleagues, and our cool, cutting-edge technology. The fun part comes from doing meaningful work with great people, across different groups and geographies. Of course, being French, I have ensured that my colleagues are fully educated on the best wine and cuisine. And being based in Chicago, there is always a spirited baseball debate between the Cubs and White Sox. However, I have yet to convince anyone that rugby is a better sport.

Biography

Olivier Jouve is Vice President, Corporate Development, at SPSS Inc. He is responsible for defining SPSS's strategic direction and growth opportunities through internal development, mergers and acquisitions, and tactical alliances. A pioneer in the field of data and text mining for the last 20 years, he created the foundation of the Text Analytics technology for analyzing customer interactions at SPSS. Jouve is a successful serial entrepreneur and has been published internationally in the areas of analytical CRM, text mining, search engines, competitive intelligence and knowledge management.

terrific Tr.im trims Tweet time

Okay, the title of this post was a bad attempt at a haiku. But the tr.im plugin for Firefox is incredible and helps you tweet interesting reading in a matter of seconds. More importantly, it shows you the analytics behind how many actual users went to that particular tr.im URL. While tr.im is yet another URL shortening service like tinyurl.com and bit.ly, what makes it stand out in a terrific manner are the following innovations –

1) User friendly Firefox Plugin that can be downloaded from https://addons.mozilla.org/en-US/firefox/addon/10232/

See the screenshot of the tr.im panel, which conveniently opens on the left. The statistics can be seen in the separate window (note the TwitterFox application, which is also open on the right – that is a separate application).

2) Analytics for tracking the locations of people who click on the URL and whether they were human or a bot.

3) Seamless Twitter integration even for multiple accounts

So it seems like you will run out of excuses to run away from Twitter soon, and all the additional social network data being generated could really help the next generation of response and online propensity models.

Tr.im that!!

(Screenshot: the tr.im panel and statistics window)

KXEN – Automated Regression Modeling

I have used KXEN many times for building and testing propensity models. The regression modeling feature of KXEN is awesome in the sense that it makes models very easy to build and deliver.

The KXEN module K2R is responsible for this and uses robust regression. A word on the basic mathematical theory behind KXEN's automated modeling – the technique is called Structural Risk Minimization. You can read more on the basic mathematical technique at http://www.svms.org/srm/. The following is an extract from the same source.

Structural risk minimization (SRM) (Vapnik and Chervonenkis, 1974) is an inductive principle for model selection used for learning from finite training data sets. It describes a general model of capacity control and provides a trade-off between hypothesis space complexity (the VC dimension of approximating functions) and the quality of fitting the training data (empirical error). The procedure is outlined below.

  1. Using a priori knowledge of the domain, choose a class of functions, such as polynomials of degree n, neural networks having n hidden layer neurons, a set of splines with n nodes or fuzzy logic models having n rules.
  2. Divide the class of functions into a hierarchy of nested subsets in order of increasing complexity. For example, polynomials of increasing degree.
  3. Perform empirical risk minimization on each subset (this is essentially parameter selection).
  4. Select the model in the series whose sum of empirical risk and VC confidence is minimal.
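To make step 4 concrete: SRM picks the model that minimizes an upper bound on the true risk rather than the empirical error alone. One commonly quoted form of the VC bound (following Vapnik), which holds with probability 1 − η, is roughly

R(f) \le R_{\mathrm{emp}}(f) + \sqrt{\frac{h\left(\ln(2n/h) + 1\right) - \ln(\eta/4)}{n}}

where R_emp(f) is the empirical error on the n training examples and h is the VC dimension of the function subset under consideration. The square-root term is the "VC confidence", which grows with model complexity, so a richer subset only wins if it reduces the empirical error by more than it adds in confidence penalty.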

Sewell (2006) notes that SVMs use the spirit of the SRM principle.

"Structural risk minimization (SRM) (Vapnik 1995) uses a set of models ordered in terms of their complexities. An example is polynomials of increasing order. The complexity is generally given by the number of free parameters. VC dimension is another measure of model complexity. In equation 4.37, we can have a set of decreasing λi to get a set of models ordered in increasing complexity. Model selection by SRM then corresponds to finding the model simplest in terms of order and best in terms of empirical error on the data."
– Alpaydin (2004), pages 80-81

Now back to the automated regression modeling.

Robust Regression

K2R is a universal solution for Classification, Regression, and Attribute Importance. It enables the prediction of behaviors (nominal targets) or quantities (continuous targets).

Unlike traditional regression algorithms, K2R can safely handle a very high number of input attributes (over 10,000) in an automated fashion. K2R provides indicators and graphs to ensure that the quality and robustness of trained models can be easily assessed. K2R graphically displays attribute importance, which gives the relative importance of each attribute for explaining a given business question. At the same time it gives a clear indication of which attributes either contain no relevant information or are redundant with other attributes.

Benefits: The business value of a data mining project is increased by either training more models or completing the project faster. The ability to train more models allows a larger number of scenarios to be tested at a higher level of granularity. For example, if a direct marketing campaign benefits from separate models trained per region, per customer segment and per month, the automation of K2R allows all of these models to be trained and safely deployed using the same amount of resources as traditional tools, or fewer.

What: K2R is a regression algorithm that allows building models to predict categories or continuous variables.

Why: Traditionally, building robust predictive models required a lot of time and expertise, which prevented companies from using data mining as part of their everyday business decisions. K2R makes it easy to build and deploy predictive models in a fraction of the time it takes using classical statistical tools.

How: K2R maps a set of descriptive attributes (model inputs) to target attributes (model outputs). It uses an algorithm patented by KXEN, which is a derivation of the principle described by V. Vapnik as "Structural Risk Minimization." Instead of looking for the best performance on a known dataset, K2R automatically finds the best compromise between quality and robustness. The resulting models are expressed as a polynomial expression of the inputs. The only element specified by the user is the polynomial degree. To improve modeling speed, K2R can also build multi-target models.
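As a rough illustration of that workflow (not KXEN's patented algorithm, just a generic analogue with assumed, synthetic data), here is a sketch that fits a regularized polynomial model of a user-chosen degree and then reports a quality indicator on training data and a robustness indicator on held-out data, loosely in the spirit of the Ki and Kr indicators described below.

```python
# Illustrative analogue only - not KXEN's patented algorithm. It mimics the
# workflow: fit a regularized polynomial of user-chosen degree, then report
# quality on training data and robustness on held-out data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))                       # 20 descriptive attributes
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=5000)     # target attribute

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

degree = 2  # the only element the user specifies in the K2R description
model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=1.0))
model.fit(X_train, y_train)

quality = r2_score(y_train, model.predict(X_train))    # rough "Ki"-style indicator
robustness = r2_score(y_test, model.predict(X_test))   # rough "Kr"-style indicator
print(f"quality={quality:.3f} robustness={robustness:.3f}")
```

The point of the automation is that the degree is the only knob: regularization and the quality/robustness trade-off are handled for you rather than hand-tuned.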

Benefits for the business user: K2R allows the business user to easily build and understand advanced predictive models without statistical knowledge. A model can be created in a matter of minutes. Two performance indicators describe model quality (Ki) and model reliability, i.e. the ability to produce similar results on new data (Kr).

K2R graphically displays the individual variable contribution to the model, which helps to select the most important variables explaining a given business question. At the same time it avoids focusing on data that contains no information.

Models can be applied directly in a simulation mode on a single input dataset, predicting the score for an individual business question in real time.

Benefits for the Data Mining expert: K2R frees time for Data Mining professionals to apply their expertise in areas where they add more value instead of spending several days to tune a model. K2R produces results within minutes (less than 15 seconds on a laptop with 50,000 lines and 20 variables).

Here is a case study from the company itself.

Marketing campaign usage scenario

* Send a “Test mailing” to 5000 customers to offer them a new product,
* Collect the results of your test mailing to build a “Training” data set that associates things you know about customers prior to the mailing with the answers to your business question
* Train a model to “predict” the Yes/No answer
* Check the quality and robustness of your model (Ki, Kr)
* Apply the model to the 1,000,000 other customers in your database: this model associates each individual customer with a probability for answering Yes. Because you are using a robust model, the sum of probabilities is a good indicator of how many people will answer yes to this mail
* Send your mailing only to those customers with a high probability to respond positively, or use our built-in profit curves to optimize your return on the campaign
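Here is a minimal sketch of what the scenario above might look like in code, using scikit-learn rather than KXEN; the file names, column names and the 0.2 probability cutoff are hypothetical, and the features are assumed to be numeric already. The dealer evaluation scenario below would follow the same pattern with a regressor, summing predicted sales instead of probabilities.

```python
# A minimal sketch of the test-mailing scenario, using scikit-learn rather
# than KXEN. File names, column names and the 0.2 cutoff are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Training data: the 5,000 test-mailing customers with a Yes/No response column.
train = pd.read_csv("test_mailing_results.csv")
X_train = train.drop(columns=["responded"])
y_train = train["responded"]

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score the remaining 1,000,000 customers on the same attributes.
everyone = pd.read_csv("customer_base.csv")
p_yes = model.predict_proba(everyone[X_train.columns])[:, 1]

# The sum of probabilities estimates how many people will answer yes.
print("Expected responders:", round(p_yes.sum()))

# Mail only the customers most likely to respond positively.
mail_list = everyone[p_yes > 0.2]
```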

Example: Dealer evaluation (regression) usage scenario

* Collect information about the past performance of your dealers two years ago and associate how much of your product they sold 1 year ago
* Train a model to predict how much a dealer will sell based on the available information
* Check the quality and robustness of the model (Ki, Kr)
* Apply the model to all of your dealers today: the model associates each dealer with an estimation of how many products he will sell,
* Sum up the estimates to predict how much you will sell next year. This is the base line for your sales forecast.

In my next post I will include screenshots on how to build an automated regression model using KXEN.

Ajay's disclaimer: I am a consultant to KXEN for social networks.

TwitterFox – Twitter for busy people

Here is a nice Firefox plugin for people who want to start using Twitter without losing too much time. It sits nicely in one corner and delivers gentle tweets – think of it as an instant messenger for people who are big in terms of number of followers and need to use Twitter, but busy in terms of time. The screenshot says it all; all you need to do is start using Firefox and install the plugin from http://twitterfox.net/.

Highly recommended for non-users of Twitter who are curious about what this thing is all about.

Screenshots courtesy of myself and the gentle people at http://twitterfox.net/.

(Screenshot: the DecisionStats WordPress dashboard in Firefox, with the TwitterFox icon in the status bar)

TwitterFox is a Firefox extension that notifies you of your friends’ tweets on Twitter.

This extension adds a tiny icon on the status bar which notifies you when your friends update their tweets. Also it has a small text input field to update your tweets.

Install TwitterFox

If you want to get updates of TwitterFox, feel free to follow @TwitterFox.

New Features and Changes in Version 1.7.7.1

  • Supported Firefox 3.1b3
  • Added a context menu to each tweet, which has:
    • Copy
    • Re-tweet
    • Open this tweet in new tab
    • Delete tweet
  • Auto extract is.gd and bit.ly links.
  • Added Mark all as read menu item to main context menu.
  • Increased contrast of background color of read/unread messages.
  • Added in-reply-to-status-id parameter for status update.
  • Added da-DK, th-TH, vi-VN, ar-SA, ar, and kw-GB translations.
  • Bug fixes.

Interview KNIME Fabian Dill

We have covered KNIME.com's open source platform earlier. On the eve of its new product launch, KNIME.com co-founder Fabian Dill reveals his thoughts in an exclusive interview.

From the Knime.com website

The modular data exploration platform KNIME, originally developed solely at the University of Konstanz, Germany, enables the user to visually create data flows (or pipelines), execute selected analysis steps, and later investigate the results through interactive views on data and models. KNIME already has more than 2,000 active users in diverse application areas, ranging from early drug discovery and customer relationship analysis to financial information integration.

Ajay – What prompted you personally to be part of KNIME rather than joining a big technology company? What does the future hold for KNIME in 2009-10?

Fabian – I was excited when I first joined the KNIME team in 2005. Back then, we were working exclusively on the open source version, backed by some academic funding. Being part of the team that put together such a professional data mining environment from scratch was a great experience, and growing this into a commercial support and development arm has been a thrill as well. The team, the diverse experiences gained from helping get a new company off the ground, and being involved in everything it takes to make it successful made it unthinkable for me to work anywhere else.

We continue to develop the open source arm of KNIME and many new features lie ahead: text, image, and time series processing as well as better support for variables. We are constantly working on adding new nodes. KNIME 2.1 is expected in the fall, and some of the ongoing development can already be found on the KNIME Labs page (http://labs.knime.org).

The commercial division is providing support and maintenance subscriptions for the freely available desktop version. At the same time we are developing products which will streamline the integration of KNIME into existing IT infrastructures:

  • the KNIME Grid Support lets you run your compute-intensive (sub-) workflows or nodes on a grid or cluster;

  • KNIME Reporting makes use of KNIME's flexibility in order to gather the data for your report and provides simplified views (static, or interactive dashboards) of the resulting workflow and its results; and

  • the KNIME Enterprise Server facilitates company-wide installation of KNIME and supports collaboration between departments and sites by providing central workflow repositories, scheduled and remote execution, and user rights management.

Ajay -Software as a service and Cloud Computing is the next big thing in 2009. Are there any plans to put KNIME on a cloud computer and charge clients for the hour so they can build models on huge data without buying any hardware but just rent the time?

Fabian – Cloud computing is an agile and client-centric approach and therefore fits nicely into the KNIME framework, especially considering that we are already working on support for distributed computing of KNIME workflows (see above). However, we have no immediate plans for KNIME workflow processing on a per-use charge or similar. That’s an interesting idea, though. The way KNIME nodes are nicely encapsulated (and often even distributable themselves) would make this quite natural.

Ajay – What differentiates KNIME from other products such as RPro and RapidMiner, for example? What are the principal challenges you have faced in developing it? Why do customers like and dislike it?

Fabian – Every tool has its strengths and weaknesses depending on the task you actually want to accomplish. The focus of KNIME is to support the user in his or her quest to understand large and heterogeneous data and make sense of it. For this task you cannot rely only on classical data mining techniques wrapped in a command line or otherwise configurable environment; simple, intuitive access to those tools is required, along with support for visual exploration using interactive linking and brushing techniques.

By design, KNIME is a modular integration platform, which makes it easy to write your own nodes (with the easy-to-use API) or integrate existing libraries or tools.

We integrated Weka, for example, because of its vast library of state-of-the-art machine learning algorithms, the open source program R – in order to provide access to a rich library of statistical functions (and of course many more) – and parts of the Chemistry Development Kit (CDK). All these integrations follow the KNIME requirements for easy and intuitive usage so the user does not need to understand the details of each tool in great depth.

A number of our commercial partners, such as Schroedinger, Infocom, Symyx and Tripos, among others, also follow this paradigm and similarly integrate their tools into KNIME. Academic collaborations with ETH Zurich, Switzerland, on the High Content Screening platform HC/DC represent another positive outcome of this open architecture. We believe that this strictly result-oriented approach, based on a carefully designed and professionally coded framework, is a key factor in KNIME's broad acceptance. I guess this is another big differentiator: right from the start, KNIME has been developed by a team of software developers with decades of industrial software engineering experience.

Ajay – Are there any Asian plans for KNIME? Any other open source partnerships in the pipeline?

Fabian – We have a Japan-based partner, Infocom, who operates in the field of life sciences. But we are always open to other partnerships, supporters, or collaborations.

In addition to the open source integrations mentioned above (Weka, R, CDK, HC/DC), there are many other different projects in the works and partnerships under negotiation. Keep an eye on our blog and on our Labs@KNIME page (labs.knime.org).

ABOUT

KNIME – development started in January 2004. Since then: 10 releases; approx. 350,000 lines of code; 25,000 downloads; an estimated 2000 active users. KNIME.com was founded in June 2008 in Zurich, Switzerland.

Fabian Dill – has been working for and with KNIME since 2005; co-founder of KNIME.com.

Interview Visual Numerics Alicia McGreevey


Here is an interview with the head of marketing of Visual Numerics, Alicia McGreevey.

Visual Numerics® is the leading provider of data analysis software, visualization solutions and expert consulting for technical, business and scientific communities worldwide (see http://www.vni.com ).

Ajay – Describe your career in science so far. How would you explain embeddable analytics to a high school student who has to decide between getting an MBA or a science degree?

Alicia – I think of analytics as analyzing a situation so you can make a decision. To do that objectively, you need data about your situation. Data can be anything: foreign currency exchange rates, the daily temperature here in Houston, or Tiger Woods' record at the Masters tournament when he's not leading after the third round.

Embedding analytics is simply making the analysis part of an application close to, or embedded with, your data. As an example, we have a customer in Germany, GFTA (Gesellschaft Fuer Trendanalysen), who has built an application that embeds analytics to analyze historic and live tick foreign exchange rate data. Their application gives treasuries and traders predictions on what is about to happen to exchange rates so they can make good decisions on when to buy or sell.

Embedding analytics is as much a business discipline as it is science. Historically, our analytics have been used predominantly by the government and scientific community to perform heavy science and engineering research. As business intelligence becomes increasingly important to compete in today’s marketplace, our analytics can now be found driving business decisions in industries like financial services, healthcare and manufacturing. Partners like Teradata and SAP are embedding our analytics into their software as a way to extend their current offerings. As their customers demand more custom BI solutions to fit unique data sets, our analytics provide a more affordable approach to meet that need. Customers now have an option to implement custom BI without incurring the massive overhead that you would typically find in a one-size-fits-all solution.

If you’re a student, I’d recommend you invest time and course work in the area of analytics regardless of the discipline you choose to study. The term analytics is really just a fancy term for math and statistics. I’ve taken math and statistics courses as part of a science curriculum and as part of a business curriculum. Being able to make optimal decisions by objectively analyzing data is a skill that will help you in business, science, engineering, or any area.

Ajay – You have been working behind the scenes quietly building math libraries that power many partners. Could you name a few success stories so far?

Alicia – One of the most interesting things about working at Visual Numerics is our customers. They create fascinating analytic applications using mathematic and statistical functions from our libraries. A few examples:

  • Total, who you probably know as one of the world’s super major oil companies, uses our math optimization routines in an application that automatically controls the blending of components in the production of gasoline, diesel and heavy fuels. By making best use of components, Total helps minimize their refining costs while maximizing revenue.

  • The Physics Department at the University of Kansas uses nonlinear equation solvers from our libraries to develop more efficient particle beam simulations. By simulating the behavior of particle beams in particle accelerators, scientists can better design particle accelerators, like the LHC or Large Hadron Collider, for high-energy research.

  • A final example that I think is interesting, given the current economic situation, is from one of our financial customers, RiskMetrics Group. RiskMetrics uses functions from our libraries to do financial stress testing that allows portfolio fund managers to simulate economic events, like the price of oil spiking 10% or markets diving 20%. They use this information to predict impacts on their portfolios and make better decisions for their clients.

Ajay – What have been the key moments in Visual Numerics' path so far?

Alicia – Our company has been in business for over 38 years, rooted in the fundamentals of mathematics and statistics. It started off as IMSL, offering IMSL Numerical Libraries as a high performance computing tool for numerical analysis. Before visualization was fashionable, we saw visualization as an important part of the data analysis process. As a result, the company merged with Precision Visuals, makers of PV-WAVE (our visual data analysis product) in the 1990s to become what is now known as Visual Numerics.

Looking back at recent history, a major event for Visual Numerics was definitely when SAP AG licensed the libraries at the end of 2007. For several years leading up to 2007, we’d seen increased interest in our libraries from independent software vendors (ISVs). More and more ISVs with broad product offerings were looking to provide their customers with analytic capabilities, so we had invested considerably in making the libraries more attractive to this type of customer. Having SAP, one of the largest and most respected ISVs in the world, license our products gave us confidence that we could be a valued OEM partner to this type of customer.

Ajay – What are the key problems you face in your day-to-day job as a Visual Numerics employee? How do you have fun when not building math libraries?

Alicia – In marketing, our job is to help potential users of our libraries understand what it is we offer so that they can determine if what we offer is of value to them. Often the hardest challenge we face is simply finding that person. Since our libraries are embeddable, they’ve historically been used by programmers. So we’ve spent a lot of time at developer conferences and sponsoring developer websites, journals and academic programs.

One product update this year is that we've made the libraries available from Python, a dynamic scripting language. Making IMSL Library functions available from Python basically means that someone who is not a trained programmer can now use the math and stats capabilities in the IMSL Libraries just like a C, Java, .NET or Fortran developer. It's an exciting development, though it brings with it the challenge of letting a whole new set of potential users know about the capabilities of the libraries. It's a fun challenge though.

On the more fun side of things, you may be interested to know that our expertise in math and statistics led us to some Hollywood fame. At one point in time, we were selected to review scripts for the crime-busting drama NUMB3RS. NUMB3RS aired on CBS in the US and features an FBI Special Agent who recruits his brilliant mathematician brother to use the science of mathematics, with its complex equations, to solve the trickiest crimes in Los Angeles. So yes, the math behind the show is real, and it is exciting indeed to see how math can be applied in all aspects of our lives, including ferreting out criminals on TV!

Ajay – What is the story ahead? How do you think Visual Numerics can help demand forecasting and BI say bye to the recession?

Alicia – We're seeing more success stories from customers using analytics and data to make good decisions, and I think the more organizations leverage analytics, the faster we'll emerge from this economic slump.

As an example, we have a partner, nCode International, who makes software to help manufacturers collect and analyze test data and use the analysis to make design decisions. Using it, automobile manufacturers can, for example, analyze real-world driving pattern data for different geographic areas (e.g., emerging markets like China and India versus established markets like the USA and Europe) and design the perfect vehicle for specific markets.

So the analytic successes are out there and we know that organizations have multitudes of data. Certainly every organization that we work with has more data today than ever before. For analytics to help us say Bye to the recession, I think we need to continue to promote our successes, make analytic tools available to more users, and get users across multiple disciplines and industries using analytics to make the best possible decisions for their organizations.

Personal Biography:

As Director of Marketing for Visual Numerics, Alicia is an authority on how organizations are using advanced analytics to improve performance. Alicia brings over 15 years of experience working with scientists and customers in the planning and development of new technology products and in developing go-to-market plans. She has a B.A. in Mathematics from Skidmore College and an M.B.A. from the University of Chicago Booth School of Business.

Does Twitter reduce Blogging?

One more post on Twitter, you may sigh, but wait. I am examining Twitter as an economic complement or substitute to blogging and trying to come up with a mathematical rule to disprove the null hypothesis:

Twitter does not affect the blogging of individuals or communities as a whole. Or does it?

Twitter reduces blogging because

  1. Twitter is easier to do. Creating a blog is a different ball game.

  2. Tweeting is two-way and interactive, while blogging is mostly a one-way broadcast.

  3. People respond to tweets and re-tweet them much more than they comment on or forward blog posts. This is due to the inherent design of the software.

  4. Twitter is chaotic, but so is real life, in which the human brain processes different information from people like colleagues, family and friends and sorts it. Blogging has a structure which helps the reader more than the writer.

  5. It is easier to tweet and faster to get your point across than in Blogging.

  6. People allocate a set amount of time for social media activities and personal branding. This may be elastic, but not totally so. Hence the rise of Twitter time in people's lives would mean less time to read and write blogs.

Now to a more quantitative study.

We get statistics from Technorati's State of the Blogosphere and add in WordPress stats to boot.

(credit -http://technorati.com/blogging/state-of-the-blogosphere/ )

A chart of total WordPress.com blogs since launch:

(credit- http://en.wordpress.com/stats/ )

Note new signups can be seen for WordPress.com at http://en.wordpress.com/stats/signups/

Fatigue could be a reason why Twitter is hotting up while Blogging sees steady state growth.

The following figure from Technorati's 2008 report sums it up best.

http://technorati.com/blogging/state-of-the-blogosphere/who-are-the-bloggers/

But when I compare the June 2008 blogging frequency numbers with the 2007 report, the figures are not directly comparable.

(Source -http://technorati.com/blogging/state-of-the-blogosphere/the-how-of-blogging/ )

http://www.sifry.com/alerts/archives/000493.html

It seems that blog posts did get a boost with the 2008 elections, and the current low traffic may simply be due to a lack of issues in the blogosphere. The rise in Twitter traffic is also due to the creation of applications by third-party providers, and this trend has led to Twitter becoming the number 3 social media site.

Based on the data, it does not seem that Twitter reduces blog posts to a significant degree. After all, Twitter is also a great medium to disseminate or spread the word on good blog posts.

It is simply too early to say that Twitter is reducing blogging, though there seem to be clear trends along that line.
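For readers who want to test this on their own data rather than eyeballing the charts above, here is a minimal sketch of one way to formalize the null hypothesis: a paired t-test of each blogger's monthly post counts before and after they started tweeting. The CSV file and column names are hypothetical placeholders; you would supply your own numbers.

```python
# A sketch of one way to formalize the null hypothesis: compare each blogger's
# monthly post count before and after they started tweeting. The CSV and its
# column names are hypothetical; supply your own data.
import pandas as pd
from scipy import stats

df = pd.read_csv("blogger_post_counts.csv")  # columns: blogger, posts_before, posts_after

t_stat, p_value = stats.ttest_rel(df["posts_before"], df["posts_after"])
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# A small p-value with posts_after lower than posts_before would suggest Twitter
# is substituting for blogging; otherwise we fail to reject the null hypothesis.
```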

What about you? If you are a blogger, is your blog post frequency affected by your tweeting activities?