Does the Internet need its own version of credit bureaus?

Data miners love data. The more data they have, the better the models they can build. Consumers do not love data so much and generally find sharing it a cumbersome task. They need to be incentivized to fill out survey forms and to sign up for loyalty programs. Lawyers and privacy advocates love to use examples of improper data collection and usage as the harbinger of an ominous scenario. George Orwell’s 1984 never “mentioned” anything about Big Brother trying to sell you one more loan, credit card or product.

Data generated by customers is now growing without their needing to fill out forms and surveys. This data is about their preferences, tastes and choices, and it is growing in size and depth because it is generated from social media channels on the Internet. It is this data that can be, and is, captured by social media analytics.

Mobile data is also growing. Usage of location based applications and of the Internet from mobile phones is leading to further increases in data about consumers. Increasingly, location based applications help to provide a much more relevant context to the data generated. Mobile data alone is expected to grow to 15 exabytes by 2015.

People want to have more and more conversations online publicly, share pictures and activity, and interact with a large number of people whom they have never met. But they resent that information being used or abused without their knowledge.

Also, the Internet is increasingly being consolidated into a few players like Microsoft, Amazon, Google and Facebook, who are unable to reach agreements to share that data between themselves. Interestingly, you can use Yahoo as a data middleman between Google and Facebook.

At the same time, more and more purchases are being made online by customers, and Internet advertising has grown at a much higher rate than other media.
Internet retail sales have the advantage that better demand predictability can lead to lower inventories, as retailers need not stock up displays to look good. An Amazon warehouse need not keep material simply to stock up its shelves like a K-Mart does.

Our Hypothesis – An Analogy with how Financial Data Marketing is managed offline

  1. Financial information regarding spending and saving is much more sensitive yet the presence of credit bureaus alleviates these concerns.
  2. Credit bureaus collect information from all sources, then aggregate and anonymize the individual components accordingly. They use SSN as a unique identifier.
  3. The Internet has a unique number too, called the Internet Protocol (IP) address.
  4. Should there be a unique identifier, like an Internet Security Number, for the Internet to ensure an adequate balance between the need for privacy and the need for appropriate targeting?

After all, no one complains about privacy intrusions if their credit bureau data is aggregated, rolled up, anonymized and turned into a propensity model for sending them direct mailers.
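To make the "aggregate, roll up and anonymize" idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the records, field names, salt and minimum group size are all invented, and a salted hash is strictly pseudonymization rather than full anonymization. It is not a description of how any real credit bureau works.

```python
import hashlib
from collections import defaultdict

# Hypothetical raw records keyed by a unique identifier (all values invented).
RECORDS = [
    {"id": "111-11-1111", "region": "north", "age_band": "25-34", "spend": 120.0},
    {"id": "222-22-2222", "region": "north", "age_band": "25-34", "spend": 80.0},
    {"id": "333-33-3333", "region": "north", "age_band": "25-34", "spend": 100.0},
    {"id": "444-44-4444", "region": "south", "age_band": "35-44", "spend": 300.0},
]


def pseudonymize(identifier: str, salt: str) -> str:
    """Replace the raw identifier with a salted SHA-256 digest."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()


def roll_up(records, min_group_size=3):
    """Aggregate spend by (region, age_band) and drop groups that are too small.

    Small groups are suppressed (an assumed rule) so that a segment
    cannot be traced back to one or two individuals.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["region"], rec["age_band"])].append(rec["spend"])
    return {
        key: {"count": len(vals), "mean_spend": sum(vals) / len(vals)}
        for key, vals in groups.items()
        if len(vals) >= min_group_size
    }


# Only the aggregated segments would be handed to a marketer.
segments = roll_up(RECORDS)
# The single "south" record is suppressed because its group is too small.
```

The design choice worth noting is that the marketer only ever sees `segments`, never the individual rows or the raw identifiers.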

Advertising using Social Media and Internet

https://www.facebook.com/about/ads/#stories

1. A business creates an ad
Let’s say a gym opens in your neighborhood. The owner creates an ad to get people to come in for a free workout.
2. Facebook gets paid to deliver the ad
The owner sends the ad to Facebook and describes who should see it: people who live nearby and like running.
3. The right people see the ad
Facebook only shows you the ad if you live in town and like to run. That’s how advertisers reach you without knowing who you are.
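The three steps above can be sketched in a few lines of Python. This is my own toy illustration of the idea, not Facebook's actual system: the `User` and `AdCriteria` classes and the `deliver_ad` function are hypothetical names. The point is that the advertiser supplies criteria and gets back only a count, never the identities.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AdCriteria:
    """What the advertiser tells the platform (hypothetical structure)."""
    town: str
    interest: str


@dataclass
class User:
    """What the platform knows about a user (hypothetical structure)."""
    user_id: str
    town: str
    interests: set = field(default_factory=set)


def deliver_ad(users, criteria):
    """Match users inside the platform and report only a count to the advertiser."""
    audience = [
        u for u in users
        if u.town == criteria.town and criteria.interest in u.interests
    ]
    report = {"impressions": len(audience)}  # no user identifiers in the report
    return audience, report


users = [
    User("u1", "Springfield", {"running", "cooking"}),
    User("u2", "Springfield", {"chess"}),
    User("u3", "Shelbyville", {"running"}),
]
gym_ad = AdCriteria(town="Springfield", interest="running")

# The audience list stays inside the platform; the gym owner sees only the report.
audience, report = deliver_ad(users, gym_ad)
```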

Adding in credit bureau data and legislative regulation for anonymizing and handling private data can expand the Internet selling market, which is much more efficient from a supply chain perspective than the offline display and shop models.

Privacy Regulations on Marketing using Internet data
Should laws on opt-out, do-not-mail and do-not-call lists be extended to do-not-show-ads and do-not-collect-information lists on social media? In the offline world, you can choose to be part of direct marketing or opt out of it by enrolling yourself in various do-not-solicit lists. On the Internet, the only way to opt out of advertisements is to use the Adblock plugin if you are a Google Chrome or Firefox user. Even Facebook gives you many more ads than you need to see.

One reason for so many ads on the Internet is the lack of central anonymized data repositories for giving high quality data to these marketing companies. Software that can be used for social media analytics is already available off the shelf.

The growth of the Internet has helped carve out a big industry for web analytics, so it is a matter of time before social media analytics becomes a multi-billion dollar business as well. What new developments will be unleashed in this brave new world is just a matter of time, and of course of the social media data!

Google Webinar on Web Analytics

A Google webinar on web analytics, recommended for anyone with anything to do with the WWW.

From

http://analytics.blogspot.com/2011/11/webinar-reaching-your-goals-with.html

Webinar: Reaching Your Goals with Analytics

Is your website performing as well as it could be? Do you want to get more out of your digital marketing campaigns, including AdWords and other digital media? Do you feel like you have gaps in your current Google Analytics setup?

We’ve heard from many of our users who want to go deeper into their Analytics — with so much data, it can be hard to know where to look first. If you’d like to move beyond standard “pageview” metrics and visitor statistics, then please join us next Thursday:

Webinar: Reaching Your Goals with Analytics
Date: Thursday, December 1
Time: 11am PST / 2pm EST
Sign up here!

During the webinar, we’ll cover:

  • Key questions to ask for richer insights from your data
  • How to define “success” (for websites, visitors, or campaigns)
  • How to set up and use Goals
  • How to set up and use Ecommerce (for websites with a shopping cart)
  • How to link AdWords to your Google Analytics account

Whatever your online business model — shopping, lead-generation, or pure content — these tools will deliver actionable insights into your buying cycle.

This webinar will be led by Joe Larkin, a technical specialist on the Google Analytics team, and it’s designed for intermediate users of Google Analytics. If you’re comfortable with the basics, but you’d like to do more with your data, then we hope you’ll join us next week!

Interview Zach Goldberg, Google Prediction API

Here is an interview with Zach Goldberg, the product manager of the Google Prediction API, the next generation machine learning, analytics-as-an-API service and a state of the art cloud computing, model building browser app.
Ajay- Describe your journey in science and technology from high school to your current job at Google.

Zach- First, thanks so much for the opportunity to do this interview Ajay!  My personal journey started in college where I worked at a startup named Invite Media.   From there I transferred to the Associate Product Manager (APM) program at Google.  The APM program is a two year rotational program.  I did my first year working in display advertising.  After that I rotated to work on the Prediction API.

Ajay- How does the Google Prediction API help an average business analytics customer who is already using enterprise software and servers to generate business forecasts? How does the Google Prediction API fit in with or complement other APIs in the Google API suite?

Zach- The Google Prediction API is a cloud based machine learning API.  We offer the ability for anybody to sign up and within a few minutes have their data uploaded to the cloud, a model built and an API to make predictions from anywhere. Traditionally the task of implementing predictive analytics inside an application required a fair amount of domain knowledge; you had to know a fair bit about machine learning to make it work.  With the Google Prediction API you only need to know how to use an online REST API to get started.

You can learn more about how we help businesses by watching our video and going to our project website.

Ajay- What are the additional use cases of the Google Prediction API that you think traditional enterprise software in business analytics ignores, or is not so strong on? For what use cases would you suggest an enterprise NOT use the Google Prediction API?

Zach- We are living in a world that is changing rapidly thanks to technology.  Storing, accessing, and managing information is much easier and more affordable than it was even a few years ago.  That creates exciting opportunities for companies, and we hope the Prediction API will help them derive value from their data.

The Prediction API focuses on providing predictive solutions to two types of problems: regression and classification. Businesses facing problems where there is sufficient data to describe an underlying pattern in either of these two areas can expect to derive value from using the Prediction API.
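(An aside from me, not from Zach: for readers unfamiliar with the two problem types he mentions, here is a toy Python sketch. Regression predicts a number and classification predicts a label. The functions `fit_line` and `nearest_label` and all the data are my own illustrative assumptions, and this says nothing about how the Prediction API builds its models internally.)

```python
def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_label(examples, x):
    """Classification: return the label of the closest training example."""
    return min(examples, key=lambda ex: abs(ex[0] - x))[1]


# Regression: predict a number (invented data lying on y = 2x + 1).
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
forecast = slope * 5.0 + intercept

# Classification: predict a category (invented spend values and labels).
training = [(10.0, "low spender"), (100.0, "high spender")]
label = nearest_label(training, 80.0)
```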

Ajay- What are your separate incentives to teach about Google APIs to academics or researchers in universities globally?

Zach- I’d refer you to our university relations page

Google thrives on academic curiosity. While we do significant in-house research and engineering, we also maintain strong relations with leading academic institutions world-wide pursuing research in areas of common interest. As part of our mission to build the most advanced and usable methods for information access, we support university research, technological innovation and the teaching and learning experience through a variety of programs.

Ajay- What is the biggest challenge you face while communicating about the Google Prediction API to traditional users of enterprise software?

Zach- Businesses often expect that implementing predictive analytics is going to be very expensive and require a lot of resources.  Many have already begun investing heavily in this area.  Quite often we’re faced with surprise, and even skepticism, when they see the simplicity of the Google Prediction API.  We work really hard to provide a very powerful solution and take care of the complexity of building high quality models behind the scenes so businesses can focus more on building their business and less on machine learning.

 

 

Business Metrics

Business Metrics (a partial extract from my upcoming book “R for Business Analytics”)

Business Metrics are important variables that are collected on a periodic basis to assess the health and sustainability of a business. They should have the following properties-

1) What is a Business Metric- The absence of regular collection or updating of the business metric could cause business disruption through incorrect and incomplete decision making.

2) Cost of Business Metrics- The costs of collecting, storing and updating the business metric are less than the opportunity costs of wrong decisions caused by lack of information on that business metric.

3) Continuity in your Business Metrics- The business metrics are consistent when compared across time periods and business units. If necessary, the assumptions for smoothing the comparisons should be listed in the business metric presentation itself.

4) Simplify your Business Metrics- Business metrics can also be derived from other business metrics. To avoid clutter, only the most important business metrics should be presented, or the metrics with the biggest deviation from past trends should be mentioned.

5) Normalize your Business Metrics- The scale of the business metric units should be comparable to other business metrics, as well as significant enough to emphasize the difference in numbers.

6) Standardize your Business Metrics- Dimensions of business metrics should be increased to enhance comparison and contrast without enhancing complexity. This means adding an extra dimension for analysis, such as time, geography, employee or business owner, rather than a 2 by 2 comparison.
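As a rough illustration of points 4 and 5 above, the sketch below picks out the metrics with the biggest deviation from their past trend, using a relative deviation so that metrics on very different scales stay comparable. The book extract is about R, but this sketch is in Python. The metric names, the numbers, and the choice of a simple historical mean as the "past trend" are all my own assumptions.

```python
from statistics import mean

# Hypothetical history of three metrics on very different scales.
HISTORY = {
    "weekly_sales": [100.0, 102.0, 98.0, 100.0],
    "support_tickets": [40.0, 42.0, 38.0, 40.0],
    "site_visits": [1000.0, 1010.0, 990.0, 1000.0],
}
LATEST = {"weekly_sales": 101.0, "support_tickets": 60.0, "site_visits": 900.0}


def relative_deviation(history, latest):
    """Deviation of the latest value from the historical mean, as a fraction."""
    baseline = mean(history)
    return (latest - baseline) / baseline


def top_deviations(history_by_metric, latest_by_metric, k=2):
    """Return the k metrics whose latest value deviates most from past trend."""
    scored = [
        (name, relative_deviation(history_by_metric[name], latest_by_metric[name]))
        for name in history_by_metric
    ]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:k]


# Only the metrics that moved the most make it into the presentation.
highlights = top_deviations(HISTORY, LATEST)
```

With these invented numbers, support tickets (up by half) and site visits (down by a tenth) are highlighted, while weekly sales (up by one percent) is left out to avoid clutter.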

Funny Economics

Some wry observations from me on the world of economics-

1) 150 years after the humiliation of their country in the Opium Wars, Chinese mandarins have somehow convinced their leaders and military to park 2 trillion in assets in Anglo Saxon debt. If Greece getting a 50% discount on its loan is the new precedent, when will the USA force its lenders to the negotiation table?

2) Income inequality and protests are something the Arabs and Israelis have in common, besides being the sons of Abraham of course. Note the Persians are not considered the same as Arabs.

3) Advance knowledge of geo-political events can and does ensure Western financial dealers have an edge on the sovereign funds in the other hemisphere. What used to be the playgrounds of Eton has now shifted to the pubs of Boston and So Cal.

4) After spending 1 trillion USD on arms in the past decade (funded by the guys in item 1), the United States military forces are in a much better, more advanced position to wage simultaneous wars.

5) Can a war in the Korean peninsula affect war in the Persian sphere of influence? Just follow the money, baby.

6) Saudi Wahabis continue to fund terror despite losing a lot of money in the economic meltdown of the past few years. For every $1 increase in Saudi oil revenue, western oil companies, traders and financiers make more, much more.

7) Demographics is an important key to economics. An aging Japan and a stagnant West are one cause of the shift from manpower intensive warfare to cyber warfare. Plus, cyber warfare is good business. Underpopulated Russia and the Arabs continue to lack true economic potential.

8) There are new economic incentives to develop tools to disseminate as well as distort information flow in real time in a hyper-connected digital world.

Google Doodle for Diwali Greetings

Hey,

If you like Diwali the festival and want Google to create a doodle for it, just send an email to proposals@google.com

http://en.wikipedia.org/wiki/Diwali

Diwali or Deepavali, popularly known as the “festival of lights,” is a festival celebrated between mid-October and mid-November for different reasons. For Hindus, Diwali is one of the most important festivals of the year and is celebrated in families by performing traditional activities together in their homes. For Jains, Diwali marks the attainment of moksha or nirvana by Mahavira in 527 BC.

Diwali is an official holiday in India, Nepal, Sri Lanka, Myanmar, Mauritius, Guyana, Trinidad & Tobago, Suriname, Malaysia, Singapore, and Fiji.

The name “Diwali” is a contraction of “Dipawali” (Sanskrit: दीपावली Dīpāwalī), which translates into “row of lamps”. Diwali involves the lighting of small clay lamps (diyas or dīpas; Sanskrit: दीप) filled with oil to signify the triumph of good over evil. During Diwali, all the celebrants wear new clothes and share sweets and snacks with family members and friends.

The festival starts with Dhanteras, on which most Indian business communities begin their financial year. The second day of the festival, Naraka Chaturdasi, marks the vanquishing of the demon Naraka by Lord Krishna and his wife Satyabhama. Amavasya, the third day of Diwali, marks the worship of Lakshmi, the goddess of wealth, in her most benevolent mood, fulfilling the wishes of her devotees. Amavasya also tells the story of Lord Vishnu, who in his dwarf incarnation vanquished Bali and banished him to Patala. It is on the fourth day of Diwali, Kartika Shudda Padyami, that Bali went to Patala and took the reins of his new kingdom there. The fifth day is referred to as Yama Dvitiya (also called Bhai Dooj), and on this day sisters invite their brothers to their homes.

While the story behind Diwali and the manner of celebration vary from region to region (festive fireworks, worship, lights, sharing of sweets), the essence is the same – to rejoice in the Inner Light.

from-

http://www.google.com/doodle4google/history.html

How can Google users/the public submit ideas for doodles?

The doodle team is open to user ideas; requests for doodles can be sent to proposals@google.com

How to make an analytics project?

Some of the process methodologies I have used and been exposed to while making analytics projects are-

1) DMAIC/Six Sigma

While Six Sigma was initially a quality control system, it has also been very successful in managing projects. The various stages of an analytical project can be divided using the DMAIC methodology.

DMAIC stands for

  • Define
  • Measure
  • Analyze
  • Improve
  • Control

Related to this is DMADV (“Design For Six Sigma”)

  • Define
  • Measure and identify CTQs
  • Analyze
  • Design
  • Verify

2) CRISP-DM
CRISP-DM stands for Cross Industry Standard Process for Data Mining

CRISP-DM breaks the process of data mining into six major phases, and these can be used for business analytics projects as well.

  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

3) SEMMA
SEMMA stands for

  • Sample
  • Explore
  • Modify
  • Model
  • Assess

4) ISO 9001

ISO 9001 is a certification as well as a philosophy for making a Quality Management System to measure, reduce and eliminate errors and customer complaints. Any customer complaint or follow-up has to be treated as an error, logged, and investigated for control.

5) LEAN
LEAN is a philosophy to eliminate waste in a process. Applying LEAN principles to analytics projects helps a lot in eliminating project bottlenecks and in resolving technology compatibility and data quality issues. I think LEAN would be great for data quality issues and IT infrastructure design, because that is where the maximum waste is observed in analytics projects.

6) Deming’s Plan-Do-Check-Act cycle.
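
To show how one of these methodologies maps onto actual work, here is a toy walk through the five SEMMA steps listed above, written in Python. It is only a sketch under my own assumptions: the data is invented (y is exactly 2 times x, with one missing value), the function names simply mirror the step names, and the "model" is a deliberately trivial least-squares slope through the origin.

```python
import random

# Hypothetical data: y is exactly 2 * x, with one missing target value.
ROWS = [{"x": float(i), "y": 2.0 * i} for i in range(1, 21)]
ROWS[4]["y"] = None


def sample(rows, n, seed=0):
    """Sample: draw n rows at random (seeded so the run is repeatable)."""
    return random.Random(seed).sample(rows, n)


def explore(rows):
    """Explore: summarize the row count and the number of missing targets."""
    missing = sum(1 for r in rows if r["y"] is None)
    return {"rows": len(rows), "missing_y": missing}


def modify(rows):
    """Modify: drop rows with a missing target."""
    return [r for r in rows if r["y"] is not None]


def model(rows):
    """Model: least-squares slope for y = slope * x (line through the origin)."""
    return sum(r["x"] * r["y"] for r in rows) / sum(r["x"] ** 2 for r in rows)


def assess(slope, rows):
    """Assess: mean absolute error of the fitted slope on the given rows."""
    return sum(abs(r["y"] - slope * r["x"]) for r in rows) / len(rows)


drawn = sample(ROWS, 20)                     # Sample (here: all rows, shuffled)
summary = explore(drawn)                     # Explore
clean = modify(drawn)                        # Modify
train, holdout = clean[:14], clean[14:]      # hold some rows back for assessment
slope = model(train)                         # Model
error = assess(slope, holdout)               # Assess
```

The same skeleton could be relabelled with the CRISP-DM or DMAIC phase names; the methodologies differ mostly in how much weight they give to the business steps before and after the modeling.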