Quantitative Modeling for Arbitrage Positions in Ad KeyWords Internet Marketing

Assume you treat an ad keyword as an equity stock. There are slight differences in the cost for advertising for that keyword across various locations (Zurich vs Delhi) and various channels (Facebook vs Google) . You get revenue if your website ranks naturally in organic search for the keyword, and you have to pay costs for getting traffic to your website for that keyword.
An arbitrage position is defined as a riskless profit when cost of keyword is less than revenue from keyword. We take examples of Adsense  and Adwords primarily.
There are primarily two types of economic curves on the foundation of which commerce of the  internet  resides-
1) Cost Curve- Cost of Advertising to drive traffic into the website  (Google Adwords, Twitter Ads, Facebook , LinkedIn ads)
2) Revenue Curve – Revenue from ads clicked by the incoming traffic on website (like Adsense, LinkAds, Banner Ads, Ad Sharing Programs , In Game Ads)
The cost and revenue curves are primarily dependent on two things
1) Type of KeyWord-Also subdependent on
a) Location of Prospective Customer, and
b) Net Present Value of Good and Service to be eventually purchased
For example , keyword for targeting sales of enterprise “business intelligence software” should ideally be costing say X times as much as keywords for “flower shop for birthdays” where X is the multiple of the expected payoffs from sales of business intelligence software divided by expected payoff from sales of flowers (say in Location, Daytona Beach ,Florida or Austin, Texas)
2) Traffic Volume – Also sub-dependent on Time Series and
a) Seasonality -Annual Shoppping Cycle
b) Cyclicality– Macro economic shifts in time series
The cost and revenue curves are not linear and ideally should be continuous in a definitive exponential or polynomial manner, but in actual reality they may have sharp inflections , due to location, time, as well as web traffic volume thresholds
Type of Keyword – For example ,keywords for targeting sales for Eminem Albums may shoot up in a non linear manner after the musician dies.
The third and not so publicly known component of both the cost and revenue curves is factoring in internet industry dynamics , including relative market share of internet advertising platforms, as well as percentage splits between content creator and ad providing platforms.
For example, based on internet advertising spend, people belive that the internet advertising is currently heading for a duo-poly with Google and Facebook are the top two players, while Microsoft/Skype/Yahoo and LinkedIn/Twitter offer niche options, but primarily depend on price setting from Google/Bing/Facebook.
It is difficut to quantify  the elasticity and efficiency of market curves as most literature and research on this is by in-house corporate teams , or advisors or mentors or consultants to the primary leaders in a kind of incesteous fraternal hold on public academic research on this.
It is recommended that-
1) a balance be found in the need for corporate secrecy to protest shareholder value /stakeholder value maximization versus the need for data liberation for innovation and grow the internet ad pie faster-
2) Cost and Revenue Curves between different keywords, time,location, service providers, be studied by quants for hedging inetrent ad inventory or /and choose arbitrage positions This kind of analysis is done for groups of stocks and commodities in the financial world, but as commerce grows on the internet this may need more specific and independent quants.
3) attention be made to how cost and revenue curves mature as per level of sophistication of underlying economy like Brazil, Russia, China, Korea, US, Sweden may be in different stages of internet ad market evolution.
For example-
A study in cost and revenue curves for certain keywords across domains across various ad providers across various locations from 2003-2008 can help academia and research (much more than top ten lists of popular terms like non quantitative reports) as well as ensure that current algorithmic wightings are not inadvertently given away.
Part 2- of this series will explore the ways to create third party re-sellers of keywords and measuring impacts of search and ad engine optimization based on keywords.

Does the Internet need its own version of credit bureaus

Data Miners love data. The more data they have the better model they can build. Consumers do not love data so much and find sharing data generally a cumbersome task. They need to be incentivize for filling out survey forms , and for signing to loyalty programs. Lawyers, and privacy advocates love to use examples of improper data collection and usage as the harbinger of an ominous scenario. George Orwell’s 1984 never “mentioned” anything about Big Brother trying to sell you one more loan, credit card or product.

Data generated by customers is now growing without their needing to fill out forms and surveys. This data is about their preferences , tastes and choices and is growing in size and depth because it is generated from social media channels on the Internet.It is this data that can be and is captured by social media analytics.

Mobile data is also growing, including usage of location based applications and usage of Internet from the mobile phone is leading to further increases in data about consumers.Increasingly , location based applications help to provide a much more relevant context to the data generated. Just mobile data is expected to grow to 15 exabytes by 2015.

People want to have more and more conversations online publicly , share pictures , activity and interact with a large number of people whom  they have never met. But resent that information being used or abused without their knowledge.

Also the Internet is increasingly being consolidated into a few players like Microsoft, Amazon, Google  and Facebook, who are unable to agree on agreements to share that data between themselves. Interestingly you can use Yahoo as a data middleman between Google and Facebook.

At the same time, more and more purchases are being done online by customers and Internet advertising has grown much above the rate of growth of other mediums of communication.
Internet retail sales have the advantage that better demand predictability can lead to lower inventories as retailers need not stock up displays to look good. An Amazon warehouse need not keep material to simply stock up it shelves like a K-Mart does.

Our Hypothesis – An Analogy with how Financial Data Marketing is managed offline

  1. Financial information regarding spending and saving is much more sensitive yet the presence of credit bureaus alleviates these concerns.
  2. Credit bureaus collect information from all sources, aggregate and anonymize the individual components accordingly.They use SSN as a unique identifier.
  3. The Internet has a unique number too , called the Internet Protocol Address (I.P) 
  4. Should there be a unique identifier like Internet Security Number for the Internet to ensure adequate balance between the need for privacy as well as the need for appropriate targeting? 

After all, no one complains about privacy intrusions if their credit bureau data is aggregated , rolled up, and anonymized and turned into a propensity model for sending them direct mailers.

Advertising using Social Media and Internet

https://www.facebook.com/about/ads/#stories

1. A business creates an ad
Let’s say a gym opens in your neighborhood. The owner creates an ad to get people to come in for a free workout.
2. Facebook gets paid to deliver the ad
The owner sends the ad to Facebook and describes who should see it: people who live nearby and like running.
The right people see the ad
3. Facebook only shows you the ad if you live in town and like to run. That’s how advertisers reach you without knowing who you are.

Adding in credit bureau data and legislative regulation for anonymizing  and handling privacy data can expand the internet selling market, which is much more efficient from a supply chain perspective than the offline display and shop models.

Privacy Regulations on Marketing using Internet data
Should laws on opt out and do not mail, do not call, lists be extended to do not show ads , do not collect information on social media. In the offline world, you can choose to be part of direct marketing or opt out of direct marketing by enrolling yourself in various do not solicit lists. On the internet the only option from advertisements is to use the Adblock plugin if you are Google Chrome or Firefox browser user. Even Facebook gives you many more ads than you need to see.

One reason for so many ads on the Internet is lack of central anonymize data repositories for giving high quality data to these marketing companies.Software that can be used for social media analytics is already available off the shelf.

The growth of the Internet has helped carved out a big industry for Internet web analytics so it is a matter of time before social media analytics becomes a multi billion dollar business as well. What new developments would be unleashed in this brave new world is just a matter of time, and of course of the social media data!

Occupy the Internet

BORN IN THE USA

Continue reading “Occupy the Internet”

Revolution Webinar Series #Rstats

Revolution Analytics Webinar-

 

Featured Webinar
David Champagne REGISTER NOW
Presenter David Champagne
CTO, Revolution Analytics
Date Tuesday, December 20th
Time 11:00AM – 11:30AM Pacific 
Click here for the webinar time in your local time zone

Big Data Starts with R

Traditional IT infrastructure is simply unable to meet

the demands of the new “Big Data Analytics” landscape.   Many enterprises are turning to the “R” statistical programming language and Hadoop (both open source projects) as a potential solution. This webinar will introduce the statistical capabilities of R within the Hadoop ecosystem.  We’ll cover:

  • An introduction to new packages developed by Revolution Analytics to facilitate interaction with the data stores HDFS and HBase so that they can be leveraged from the R environment
  • An overview of how to write Map Reduce jobs in R using Hadoop
  • Special considerations that need to be made when working with R and Hadoop.

We’ll also provide additional resources that are available to people interested in integrating R and Hadoop.

 

Upcoming Webinars
Wed, Dec 14th
11:00AM – 11:30AM PT
Revolution R Enterprise – 100% R and MoreR users already know why the R language is the lingua franca of statisticians today: because it’s the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise. In this webinar, author and blogger David Smith will introduce the additional capabilities of Revolution R Enterprise.
 Archived Webinars-
Revolution Webinar: New Features in Revolution R Enterprise 5.0 (including RevoScaleR) to Support Scalable Data AnalysisRevolution R Enterprise 5.0 is Revolution Analytics’ scalable analytics platform.  At its core is Revolution Analytics’ enhanced Distribution of R, the world’s most widely-used project for statistical computing.  In this webinar, Dr. Ranney will discuss new features and show examples of the new functionality, which extend the platform’s usability, integration and scalability

 

Graphs in Statistical Analysis

One of the seminal papers establishing the importance of data visualization (as it is now called) was the 1973 paper by F J Anscombe in http://www.sjsu.edu/faculty/gerstman/StatPrimer/anscombe1973.pdf

It has probably the most elegant introduction to an advanced statistical analysis paper that I have ever seen-

1. Usefulness of graphs

Most textbooks on statistical methods, and most statistical computer programs, pay too little attention to graphs. Few of us escape being indoctrinated with these notions:

(1) numerical calculations are exact, but graphs are rough;

(2) for any particular kind of statistical data there is just one set of calculations constituting a correct statistical analysis;

(3) performing intricate calculations is virtuous, whereas actually looking at the data is cheating.

A computer should make both calculations and graphs. Both sorts of output should be studied; each will contribute to understanding.

Of course the dataset makes it very very interesting for people who dont like graphical analysis too much.

From http://en.wikipedia.org/wiki/Anscombe%27s_quartet

 The x values are the same for the first three datasets.

Anscombe’s Quartet
I II III IV
x y x y x y x y
10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58
8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76
13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71
9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84
11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47
14.0 9.96 14.0 8.10 14.0 8.84 8.0 7.04
6.0 7.24 6.0 6.13 6.0 6.08 8.0 5.25
4.0 4.26 4.0 3.10 4.0 5.39 19.0 12.50
12.0 10.84 12.0 9.13 12.0 8.15 8.0 5.56
7.0 4.82 7.0 7.26 7.0 6.42 8.0 7.91
5.0 5.68 5.0 4.74 5.0 5.73 8.0 6.89

For all four datasets:

Property Value
Mean of x in each case 9 exact
Variance of x in each case 11 exact
Mean of y in each case 7.50 (to 2 decimal places)
Variance of y in each case 4.122 or 4.127 (to 3 d.p.)
Correlation between x and y in each case 0.816 (to 3 d.p.)
Linear regression line in each case y = 3.00 + 0.500x (to 2 d.p. and 3 d.p. resp.)
But see the graphical analysis –
While R has always been great in emphasizing graphical analysis, thanks in part due to work by H Wickham and others, SAS products and  language has also modified its approach at http://www.sas.com/technologies/analytics/statistics/datadiscovery/
 SAS Visual Data Discovery combines top-selling SAS products (Base SASSAS/STAT® and SAS/GRAPH®), along with two interfaces (SAS® Enterprise Guide® for guided tasks and batch analysis and JMP® software for discovery and exploratory analysis).
 and  ODS Statistical Graphs at
While ODS Statistical graphs is still not as smooth as say R’s GGPLOT2 http://tinyurl.com/ggplot2-book, it still is a progressive step
Pretty graphs make for better decisions too !

 

 

My Digital Trail

Someone I know recently mentioned that I have an extensive Digital Trail. I do.

I have 7863 connections at http://www.linkedin.com/in/ajayohri, 31 likes at https://www.facebook.com/ajayohri and 19 likes at https://www.facebook.com/pages/Ajay-Ohri/157086547679568, 409 friends (and 13 subscribers) at https://www.facebook.com/byebyebyer .On twitter I have 499 followers at http://twitter.com/0_h_r_1 and 344 followers at http://twitter.com/rforbusiness , and even on Google Plus some 617 people circling me at https://plus.google.com/116302364907696741272 (besides 6 other pages on G+)

Even my Youtube channel at http://www.youtube.com/decisionstats is more popular than I am in non-digital life. my non existant video blog at http://videosforkush.blogspot.com/ and my poetry blog at http://poemsforkush.wordpress.com/, and my comments on other social media, and my blurbs on my tumblr http://kushohri.tumblr.com/, and you get a lot of my psych profile.

Why do I do leave so much trail digitally?

For one reason- I was a bit of introvert always and technology set me free, the opportunity to think and yet be relaxed in anonymous chatter.

For the second reason- I am divorced and my wife got my 4 yr old son’s custody. Even though I talk to him once a day for a couple of minutes, somehow I hope when he grows, he reads my digital trail , maybe even these words, on the kind of man I was and the phases and seasons of life I went through.

 

That is all.

 

 

What are you thankful for?

I am thankful for-1) God and Jesus Christ and Rama and Allah and Buddha and their priests and intermediaries and everybody taking care of me. or atleast trying hard to take care of me.2) Earth my planet for nourishing me despite me polluting her, her fresh air and her delicious fruit and marvellous wine and hearty bread.

3) Fellow Human Beings for being nice to me when they feel curt, for displaying civilized manners, and working together in a vast invisible web of commerce, trade and exchange to meet our needs.

4) Scientists and Engineers who create wonderful technology by spending hours , months , years of their lives and giving it up for free on the Internet.

5) Powerful people who take time to mentor unknown wild cards, and young people to rejuvenate with new exciting ideas.

6) people who appreciate my poetry and people who appreciate my technology. and people who criticize only in the intention of me striving to create something better.

Continue reading “What are you thankful for?”