I had given up on Blogspot ever getting a makeover, in favor of the nice themes at WordPress, but man, the new CEO at Google is really shaking things up here.
Check out the nice features for customizing themes at Blogspot.
For the past year or two I have noticed a lot of statistical analysis using #rstats/R on unstructured text generated in real time by the social network Twitter. From an analytic point of view, Google Plus is an interesting social network: it is new, and it arrived after the analytic tools had become relatively refined. It is thus an interesting use case for studying the evolution of people's behavior, measured globally, AFTER text-mining tools have matured, so we can measure how that behavior varies as the social network and its user interface evolve.
It would also be a nice benchmark for doing sentiment analysis across multiple social networks.
There are already some interesting use cases of analyzing Twitter in R.

The Console lets you see and manage the following project information:

| API | Per-User Limit | Used | Courtesy Limit |
|---|---|---|---|
| Google+ API | 5.0 requests/second/user | 0% | 1,000 queries/day |
API Calls
GET https://www.googleapis.com/plus/v1/people/userId
Different API methods require parameters to be passed either as part of the URL path or as query parameters. Additionally, there are a few parameters that are common to all API endpoints. These are all passed as optional query parameters.
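As a quick illustration, assembling such a request URL with query parameters is straightforward in R. This is just a sketch: `plus_url` is a hypothetical helper (not part of any official client), and the `key` value would be your API key from the Console above.

```r
# Build a people.get request URL with optional query parameters.
# plus_url is a hypothetical helper, not an official client function.
plus_url <- function(userId, key, fields = NULL) {
  base <- sprintf("https://www.googleapis.com/plus/v1/people/%s", userId)
  params <- c(key = key)
  if (!is.null(fields)) params <- c(params, fields = fields)
  query <- paste(names(params),
                 sapply(params, URLencode, reserved = TRUE),
                 sep = "=", collapse = "&")
  paste(base, query, sep = "?")
}

plus_url("me", "YOUR_API_KEY")
# "https://www.googleapis.com/plus/v1/people/me?key=YOUR_API_KEY"
```

The same helper takes the optional common parameters (like `fields`) as extra query-string entries.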
Resources in the Google+ API are represented using JSON data formats. For example, retrieving a user’s profile may result in a response like:
{
  "kind": "plus#person",
  "id": "118051310819094153327",
  "displayName": "Chirag Shah",
  "url": "https://plus.google.com/118051310819094153327",
  "image": {
    "url": "https://lh5.googleusercontent.com/-XnZDEoiF09Y/AAAAAAAAAAI/AAAAAAAAYCI/7fow4a2UTMU/photo.jpg"
  }
}
While each type of resource will have its own unique representation, there are a number of common properties that are found in almost all resource representations.
In requests that can respond with potentially large collections, such as Activities list, each response contains a limited number of items, set by maxResults (default: 20). Each response also contains a nextPageToken property. To obtain the next page of items, pass this value of nextPageToken to the pageToken property of the next request. Repeat this process to page through the full collection.
For example, calling Activities list returns a response with nextPageToken:
{
  "kind": "plus#activityFeed",
  "title": "Plus Public Activities Feed",
  "nextPageToken": "CKaEL",
  "items": [
    {
      "kind": "plus#activity",
      "id": "123456789",
      ...
    },
    ...
  ]
  ...
}
To get the next page of activities, pass the value of this token in with your next Activities list request:
https://www.googleapis.com/plus/v1/people/me/activities/public?pageToken=CKaEL
As before, the response to this request includes nextPageToken, which you can pass in to get the next page of results. You can continue this cycle to get new pages — for the last page, “nextPageToken” will be absent.
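The paging pattern above can be sketched in R. This is just the control flow: `fetch_page` is a stand-in for whatever HTTP call you use (for example, RCurl's `getForm` against the Activities list endpoint with the `pageToken` parameter) and is an assumption, not an official client function.

```r
# Collect all items from a paged collection by following nextPageToken.
# fetch_page(token) stands in for one Activities list request (assumed).
fetch_all_items <- function(fetch_page) {
  items <- list()
  token <- NULL
  repeat {
    page <- fetch_page(token)      # one request; token = NULL means first page
    items <- c(items, page$items)  # accumulate this page's items
    token <- page$nextPageToken    # token for the next page
    if (is.null(token)) break      # nextPageToken absent => last page
  }
  items
}
```

For example, against a stub that serves two pages (items "a", "b" then "c"), the loop returns all three items in order and stops when the token disappears.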
I was looking at the site http://www.google.com/adplanner/static/top1000/index.html
and I saw this list (below), which I put in a Google Doc at https://docs.google.com/spreadsheet/pub?hl=en_US&hl=en_US&key=0AtYMMvghK2ytdE9ybmVQeUxMeXdjWlVKYzRlMkxjX0E&output=html.
I then decided to divide page views by users to check the maths. Either:
Facebook is AAAAAmazing! (and the Russian social network is amazing too!)
or
The maths is wrong (maybe sampling, maybe virtual pageviews caused by friend-stream refreshes),
because an average of 1,136 page views per unique visitor per month means about 36 page views per visitor a day!
| Rank | Site | Category | Unique Visitors (users) | Page Views | Views/Visitor |
|---|---|---|---|---|---|
| 1 | facebook.com | Social Networks | 880,000,000 | 1,000,000,000,000 | 1,136 |
| 29 | linkedin.com | Social Networks | 80,000,000 | 2,500,000,000 | 31 |
| 38 | orkut.com | Social Networks | 66,000,000 | 4,000,000,000 | 61 |
| 40 | orkut.com.br | Social Networks | 62,000,000 | 43,000,000,000 | 694 |
| 65 | weibo.com | Social Networks | 42,000,000 | 2,800,000,000 | 67 |
| 66 | renren.com | Social Networks | 42,000,000 | 3,300,000,000 | 79 |
| 84 | odnoklassniki.ru | Social Networks | 37,000,000 | 13,000,000,000 | 351 |
| 90 | scribd.com | Social Networks | 34,000,000 | 140,000,000 | 4 |
| 95 | vkontakte.ru | Social Networks | 34,000,000 | 48,000,000,000 | 1,412 |
and
| Rank | Site | Category | Unique Visitors (users) | Page Views | Page Views/Visitor |
|---|---|---|---|---|---|
| 1 | facebook.com | Social Networks | 880,000,000 | 1,000,000,000,000 | 1,136 |
| 2 | youtube.com | Online Video | 800,000,000 | 100,000,000,000 | 125 |
| 3 | yahoo.com | Web Portals | 590,000,000 | 77,000,000,000 | 131 |
| 4 | live.com | Search Engines | 490,000,000 | 84,000,000,000 | 171 |
| 5 | msn.com | Web Portals | 440,000,000 | 20,000,000,000 | 45 |
| 6 | wikipedia.org | Dict | 410,000,000 | 6,000,000,000 | 15 |
| 7 | blogspot.com | Blogging | 340,000,000 | 4,900,000,000 | 14 |
| 8 | baidu.com | Search Engines | 300,000,000 | 110,000,000,000 | 367 |
| 9 | microsoft.com | Software | 250,000,000 | 2,500,000,000 | 10 |
| 10 | qq.com | Web Portals | 250,000,000 | 39,000,000,000 | 156 |
See the complete list at http://www.google.com/adplanner/static/top1000/index.html
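The division itself is easy to check in R. The numbers below come from the facebook.com row above; dividing by 31 days per month is my assumption.

```r
pageviews <- 1e12   # 1,000,000,000,000 page views per month (facebook.com)
visitors  <- 880e6  # 880,000,000 unique visitors per month
per_month <- pageviews / visitors
round(per_month)       # about 1136 views per visitor per month
round(per_month / 31)  # roughly 36-37 views per visitor per day
```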
But after all that, I was quite happy to see Google Fusion Tables within Google Docs. Databases as a service? Not quite, but still quite good; let's see how it goes.
https://www.google.com/fusiontables/DataSource?dsrcid=implicit&hl=en_US&pli=1
http://googlesystem.blogspot.com/2011/09/fusion-tables-new-google-docs-app.html
But what interests me more is
http://code.google.com/apis/fusiontables/docs/developers_guide.html
The Google Fusion Tables API is a set of statements that you can use to search for and retrieve Google Fusion Tables data, insert new data, update existing data, and delete data. The API statements are sent to the Google Fusion Tables server using HTTP GET requests (for queries) and POST requests (for inserts, updates, and deletes) from a Web client application. The API is language agnostic: you can write your program in any language you prefer, as long as it provides some way to embed the API calls in HTTP requests.
The Google Fusion Tables API does not provide the mechanism for submitting the GET and POST requests. Typically, you will use an existing code library that provides such functionality; for example, the code libraries that have been developed for the Google GData API. You can also write your own code to implement GET and POST requests.
Also see http://code.google.com/apis/fusiontables/docs/sample_code.html
| Language | Library | Public repository | Samples |
|---|---|---|---|
| Python | Fusion Tables Python Client Library | fusion-tables-client-python/ | Samples |
| PHP | Fusion Tables PHP Client Library | fusion-tables-client-php/ | Samples |
An easy way to learn how to use an API can be to look at sample code. The table above provides links to some basic samples for each of the languages shown. This section highlights particularly interesting samples for the Fusion Tables API.
| Language | Featured samples | API version |
|---|---|---|
| cURL | | SQL API |
| Google Apps Script | | SQL API |
| Java | | SQL API |
| Python | | Docs List API |
| Android (Java) | | SQL API |
| JavaScript – FusionTablesLayer | Using the FusionTablesLayer, you can display data on a Google Map. Also check out FusionTablesLayer Builder, which generates all the code necessary to include a Google Map with a Fusion Table Layer on your own website. | FusionTablesLayer, Google Maps API |
| JavaScript – Google Chart Tools | Using the Google Chart Tools, you can request data from Fusion Tables to use in visualizations or to display directly in an HTML page. Note: responses are limited to 500 rows of data. | Google Chart Tools |
Google Fusion Tables is dedicated to providing code examples that illustrate typical uses, best practices, and really cool tricks. If you do something with the Google Fusion Tables API that you think would be interesting to others, please contact us at googletables-feedback@google.com about adding your code to our Examples page.
Updated: You can use Google Fusion Tables from within R, via the client code at http://andrei.lopatenko.com/rstat/fusion-tables.R
library(RCurl)  # needed for postForm/getForm below

ft.connect <- function(username, password) {
  url = "https://www.google.com/accounts/ClientLogin"
  params = list(Email = username, Passwd = password, accountType = "GOOGLE", service = "fusiontables", source = "R_client_API")
  connection = postForm(uri = url, .params = params)
  if (length(grep("error", connection, ignore.case = TRUE))) {
    stop("Wrong username or password")
  }
  authn = strsplit(connection, "\nAuth=")[[c(1,2)]]
  auth = strsplit(authn, "\n")[[c(1,1)]]
  return (auth)
}
ft.disconnect <- function(connection) {
}

ft.executestatement <- function(auth, statement) {
  url = "http://tables.googlelabs.com/api/query"
  params = list(sql = statement)
  connection.string = paste("GoogleLogin auth=", auth, sep="")
  opts = list(httpheader = c("Authorization" = connection.string))
  result = postForm(uri = url, .params = params, .opts = opts)
  if (length(grep("<HTML>\n<HEAD>\n<TITLE>Parse error", result, ignore.case = TRUE))) {
    stop(paste("incorrect sql statement:", statement))
  }
  return (result)
}
ft.showtables <- function(auth) {
  url = "http://tables.googlelabs.com/api/query"
  params = list(sql = "SHOW TABLES")
  connection.string = paste("GoogleLogin auth=", auth, sep="")
  opts = list(httpheader = c("Authorization" = connection.string))
  result = getForm(uri = url, .params = params, .opts = opts)
  tables = strsplit(result, "\n")
  tableid = c()
  tablename = c()
  for (i in 2:length(tables[[1]])) {
    str = tables[[c(1,i)]]
    tnames = strsplit(str, ",")
    tableid[i-1] = tnames[[c(1,1)]]
    tablename[i-1] = tnames[[c(1,2)]]
  }
  tables = data.frame(ids = tableid, names = tablename)
  return (tables)
}
ft.describetablebyid <- function(auth, tid) {
  url = "http://tables.googlelabs.com/api/query"
  params = list(sql = paste("DESCRIBE", tid))
  connection.string = paste("GoogleLogin auth=", auth, sep="")
  opts = list(httpheader = c("Authorization" = connection.string))
  result = getForm(uri = url, .params = params, .opts = opts)
  columns = strsplit(result, "\n")
  colid = c()
  colname = c()
  coltype = c()
  for (i in 2:length(columns[[1]])) {
    str = columns[[c(1,i)]]
    cnames = strsplit(str, ",")
    colid[i-1] = cnames[[c(1,1)]]
    colname[i-1] = cnames[[c(1,2)]]
    coltype[i-1] = cnames[[c(1,3)]]
  }
  cols = data.frame(ids = colid, names = colname, types = coltype)
  return (cols)
}
ft.describetable <- function(auth, table_name) {
  table_id = ft.idfromtablename(auth, table_name)
  result = ft.describetablebyid(auth, table_id)
  return (result)
}

ft.idfromtablename <- function(auth, table_name) {
  tables = ft.showtables(auth)
  tableid = tables$ids[tables$names == table_name]
  return (tableid)
}
ft.importdata <- function(auth, table_name) {
  tableid = ft.idfromtablename(auth, table_name)
  columns = ft.describetablebyid(auth, tableid)
  numcols = nrow(columns)  # one row of the description per table column
  column_spec = ""
  for (i in 1:numcols) {
    column_spec = paste(column_spec, columns[i, 2])
    if (i < numcols) {
      column_spec = paste(column_spec, ",", sep="")
    }
  }
  mdata = matrix(columns$names,
                 nrow = 1, ncol = numcols,
                 dimnames = list(c("dummy"), columns$names), byrow = TRUE)
  select = paste("SELECT", column_spec)
  select = paste(select, "FROM")
  select = paste(select, tableid)
  result = ft.executestatement(auth, select)
  rows = strsplit(result, "\n")
  for (i in 3:length(rows[[1]])) {
    row = strsplit(rows[[c(1,i)]], ",")
    mdata = rbind(mdata, row[[1]])
  }
  output.frame = data.frame(mdata[2:length(mdata[,1]), 1])
  for (i in 2:ncol(mdata)) {
    output.frame = cbind(output.frame, mdata[2:length(mdata[,i]), i])
  }
  colnames(output.frame) = columns$names
  return (output.frame)
}
quote_value <- function(value, to_quote = FALSE, quote = "'") {
  ret_value = ""
  if (to_quote) {
    ret_value = paste(quote, paste(value, quote, sep=""), sep="")
  } else {
    ret_value = value
  }
  return (ret_value)
}
converttostring <- function(arr, separator = ", ", column_types) {
  con_string = ""
  for (i in 1:(length(arr) - 1)) {
    value = quote_value(arr[i], column_types[i] != "number")
    con_string = paste(con_string, value)
    con_string = paste(con_string, separator, sep="")
  }
  if (length(arr) >= 1) {
    value = quote_value(arr[length(arr)], column_types[length(arr)] != "number")
    con_string = paste(con_string, value)
  }
  return (con_string)  # was missing: the built string must be returned
}
ft.exportdata <- function(auth, input_frame, table_name, create_table) {
  if (create_table) {
    create.table = "CREATE TABLE "
    create.table = paste(create.table, table_name)
    create.table = paste(create.table, "(")
    cnames = colnames(input_frame)
    for (columnname in cnames) {
      create.table = paste(create.table, columnname)
      create.table = paste(create.table, ":string", sep="")
      if (columnname != cnames[length(cnames)]) {
        create.table = paste(create.table, ",", sep="")
      }
    }
    create.table = paste(create.table, ")")
    result = ft.executestatement(auth, create.table)
  }
  if (length(input_frame[,1]) > 0) {
    tableid = ft.idfromtablename(auth, table_name)
    columns = ft.describetablebyid(auth, tableid)
    column_spec = ""
    for (i in 1:length(columns$names)) {
      column_spec = paste(column_spec, columns[i, 2])
      if (i < length(columns$names)) {
        column_spec = paste(column_spec, ",", sep="")
      }
    }
    insert_prefix = "INSERT INTO "
    insert_prefix = paste(insert_prefix, tableid)
    insert_prefix = paste(insert_prefix, "(")
    insert_prefix = paste(insert_prefix, column_spec)
    insert_prefix = paste(insert_prefix, ") values (")
    insert_suffix = ");"
    insert_sql_big = ""
    for (i in 1:length(input_frame[,1])) {
      data = unlist(input_frame[i,])
      values = converttostring(data, column_types = columns$types)
      insert_sql = paste(insert_prefix, values)
      insert_sql = paste(insert_sql, insert_suffix)
      insert_sql_big = paste(insert_sql_big, insert_sql)
      if (i %% 500 == 0) {  # flush every 500 rows to keep requests small
        ft.executestatement(auth, insert_sql_big)
        insert_sql_big = ""
      }
    }
    ft.executestatement(auth, insert_sql_big)
  }
}
I saw a posting for careers with Revolution Analytics. Now I am probably on the wrong side of an H1 visa and the C/R skill-o-meter, but these look great for any aspiring R coder. It includes freelance opportunities as well.
http://www.revolutionanalytics.com/aboutus/careers.php
We have many opportunities opening up—among them:
| Job Title | Location |
|---|---|
| Pre-sales Consultants / Technical Sales | Palo Alto, CA |
| Parallel Computing Developer | Palo Alto, CA or Seattle, WA |
| R Programmer (Freelance) | Palo Alto, CA |
| Software Training Course Developer (Freelance) | Palo Alto, CA |
| Build / Release Engineer | Seattle, WA |
| QA Engineer | Seattle, WA |
| Technical Writer | Seattle, WA |
Please send your resume to careers@revolutionanalytics.com
2) Indeed.com
Searching for “R” jobs (with quotes), and not just R jobs, gives better results in search engines and on job sites. It is still a tough keyword to search, but it is getting better.
You can use this RSS feed http://www.indeed.co.in/rss?q=%22R%22++analytics+jobs or the send-by-email option to get alerts.
icrunchdata has a good number of analytics jobs, and again, using the keyword “R” within quotes, you can see lots of jobs here:
http://www.icrunchdata.com/ViewJob.aspx?id=334914&keys=%22R%22
There used to be a Google Group on R jobs, but it is too low volume compared to the actual number of R jobs out there.
Note that the big demand is for analytics, and knowing more than one platform helps you more in the job search than knowing just a single language.
Assume I am a blogger using both Adsense and Adwords.
Suppose Adwords costs me X dollars per click, and Adsense pays me Y dollars per click.
Then a unique arbitrage opportunity would arise if
Y times the CTR on my blog > X times the CTR on my ad campaign.
Is it possible? Theoretically, yes. Long tail of the Internet, yes.
However, since there is a time lag within which the rates would converge, the Adsense rate would go lower or the Adwords rate would go higher.
Is there a tool you can use to find keywords with short-lived arbitrage opportunities, much like trading algorithms and quants do in finance?
Just asking!
Hint: it's a trick math puzzle 🙂
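For what it's worth, the stated condition is easy to play with in R. All the numbers below are made up purely for illustration:

```r
x <- 0.50             # Adwords cost per click in dollars (hypothetical)
y <- 0.40             # Adsense payout per click in dollars (hypothetical)
ctr_blog     <- 0.04  # CTR on the blog's Adsense units (hypothetical)
ctr_campaign <- 0.02  # CTR on the Adwords campaign (hypothetical)

# The condition from above: Y * CTR(blog) > X * CTR(campaign)
arbitrage <- (y * ctr_blog) > (x * ctr_campaign)
arbitrage  # TRUE here: 0.016 earned vs 0.010 spent per impression
```

With these toy numbers the condition holds, but in practice the lag-and-convergence effect described above would erode any such gap.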
I searched for-
Ajay Ohri
About 97,600 results (0.37 seconds)
Then I searched for
Sergey Brin-
About 3,470,000 results (0.29 seconds)
About 35 times more results in roughly 20% less time. Huh!!!
Why show me 300,000 pages of results anyway?
The point I am trying to make is: why does Google have more than 3 pages to show in the default view?
Wouldn't they save time, bandwidth and storage if they just showed 3 pages for every search result and gave more than 3 pages only in Advanced Search?
I personally like the high-quality Number 7 ranked search result for Sergey Brin,
because it is ranked 3 places higher than his own blog http://too.blogspot.com/
