Apps for Google Drive

I kind of liked the fact that Google Drive has a lot of apps already- even though it is quite young.

Especially the mechanical engineer in me liked the AutoCAD app and the video editing apps, the online bitcoin wallet, free project scheduling app, the cloud’s first (?) open office document reader and etc

Developers would especially like playing with the OAuth Playground app for Google Drive on the Google Chrome platform.

Check out  for yourself.

Webscraping using iMacros

The noted Diamonds dataset in the ggplot2 package of R is actually culled from the website

However it has ~55000 diamonds, while the whole Diamonds search engine has almost ten times that number. Using iMacros – a Google Chrome Plugin, we can scrape that data (or almost any data). The iMacros chrome plugin is available at while notes on coding are at

Imacros makes coding as easy as recording macro and the code is automatcially generated for whatever actions you do. You can set parameters to extract only specific parts of the website, and code can be run into a loop (of 9999 times!)

Here is the iMacros code-Note you need to navigate to the web site before running it

TAG POS=1 TYPE=DIV ATTR=CLASS:paginate_enabled_next










and voila- all the diamonds you need to analyze!

The returning data can be read using the standard delimiter data munging in the language of SAS or R.

More on IMacros from


Automate your web browser. Record and replay repetitious work

If you encounter any problems with iMacros for Chrome, please let us know in our Chrome user forum at

Our forum is also the best place for new feature suggestions :-)

iMacros was designed to automate the most repetitious tasks on the web. If there’s an activity you have to do repeatedly, just record it in iMacros. The next time you need to do it, the entire macro will run at the click of a button! With iMacros, you can quickly and easily fill out web forms, remember passwords, create a webmail notifier, and more. You can keep the macros on your computer for your own use, use them within bookmark sync / Xmarks or share them with others by embedding them on your homepage, blog, company Intranet or any social bookmarking service as bookmarklet. The uses are limited only by your imagination!

Popular uses are as web macro recorder, form filler on steroids and highly-secure password manager (256-bit AES encryption).

Software Review- Google Drive versus Dropbox

Here are some notes from reviewing Google Drive vs Dropbox

1) Google Drive gives more free space upfront  than Dropbox.5GB versus 2GB

2) Dropbox has a referral system 500 mb per referral while there is no referral system for Google Drive

3) The sync facility with Google Docs makes Google Drive especially useful for prior users of Google Docs.

4) API access to Google Drive is only for Chrome apps which is intriguing!

Apps will not have any API access to files unless users have first installed the app in Chrome Web Store.

You can use the Dropbox API much more easily –

See the platforms at

Choose your platform:

iOS Android Python Ruby


(though I wonder if you set the R working directory to the local shared drive for Google Drive it should sync up as well but of course be slower –

5) Google Drive icon is ugly (seriously, dude!) , but the features in the Windows app is just the same as the Dropbox App. Too similar 😉


6) Upgrade space is much more cheaper to Google Drive than Dropbox ( by Google Drive prices being exactly  a quarter of prices on Dropbox and max storage being 16 times as much). This will affect power storage users. I expect to see some slowdown in Dropbox new business unless G Drive has outage (like Gmail) . Existing users at Dropbox probably wont shift for the small dollar amount- though it is quite easy to do so.


Install Google Drive on your local workstation and cut and paste your Dropbox local folder to the Google Drive local folder!!

7) Dropbox deserves credit for being first (like Hotmail and AOL) but Google Drive is almost better in all respects!

Google Drive

5 GB of Drive (0% used)
10 GB of Gmail (48% used)
1 GB of Picasa (0% used)


25 GB
2,49 $ / Month
+25 GB for Drive and Picasa
Bonus: Your Gmail storage will be upgraded to 25 GB.
Choose this plan

100 GB
4,99 $ / Month
+100 GB for Drive and Picasa
Bonus: Your Gmail storage will be upgraded to 25 GB.
Choose this plan

 Need more storage?

Up to 16 TB available


Current account type

Large DropboxDropbox Badge greenFree
Up to 18 GB (2 GB + 500 MB per referral)
Account info 

Other account types

Large DropboxDropbox Badge orange50 GB +
Pro 50
+1 GB per referral, up to +32 GB
$9.99/month or $99.00/year Upgrade to Pro 50
Large DropboxDropbox Badge purple100 GB +
Pro 100
+1 GB per referral, up to +32 GB
$19.99/month or $199.00/year Upgrade to Pro 100
Triple DropboxDropbox For Teams Badge1 TB +
Plans starting at 1 TB
Large shared quota, centralized admin and billing, and more!




April Fool's Day- Catblock!

Since Anonymous didnt disrupt the internet on April Fools Day by overloading the DNS Servers! , the best April Fool’s day imho goes to Adblock- that  nifty extension that allows you to block ads.

Well for today- it replaced ads with funny cats- and you can even buy the cats for ads extension  permanently. That’s right cats take over the Internet!

Only 2% of Chrome and Firefox users block ads! so what are you waiting for- this is how the NYTimes looks for me!!


Replace ads with cats-

for chrome here-

for firefox here-

read more on catblock here-

but if you want to buy catblock—

see this


Using R for Cloud Computing – made very easy and free by BioConductor

I really liked the no hassles way Biocnoductor has put a cloud AMI loaded with RStudio to help people learn R, and even try using R from within a browser in the cloud.

Not only is the tutorial very easy to use- they also give away 2 hours for free computing!!!

Check it out-

Step 1

Step 2

Step 3

and wow! I am using Google Chrome to run R ..and its awesome!

Interesting- check out two hours for free — all you need is a browser and internet connection

Note on Internet Privacy (Updated)and a note on DNSCrypt

I noticed the brouaha on Google’s privacy policy. I am afraid that social networks capture much more private information than search engines (even if they integrate my browser history, my social network, my emails, my search engine keywords) – I am still okay. All they are going to do is sell me better ads (maybe than just flood me with ads hoping to get a click). Of course Microsoft should take it one step forward and capture data from my desktop as well for better ads, that would really complete the curve. In any case , with the Patriot Act, most information is available to the Government anyway.

But it does make sense to have an easier to understand privacy policy, and one of my disappointments is the complete lack of visual appeal in such notices. Make things simple as possible, but no simpler, as Al-E said.


Privacy activists forget that ads run on models built on AGGREGATED data, and most models are scored automatically. Unless you do something really weird and fake like, chances are the data pertaining to you gets automatically collected, algorithmic-ally aggregated, then modeled and scored, and a corresponding ad to your score, or segment is shown to you. Probably no human eyes see raw data (but big G can clarify that)


( I also noticed Google gets a lot of free advice from bloggers. hey, if you were really good at giving advice to Google- they WILL hire you !)

on to another tool based (than legalese based approach to privacy)

I noticed tools like DNSCrypt increase internet security, so that all my integrated data goes straight to people I am okay with having it (ad sellers not governments!)

Unfortunately it is Mac Only, and I will wait for Windows or X based tools for a better review. I noticed some lag in updating these tools , so I can only guess that the boys of Baltimore have been there, so it is best used for home users alone.


Maybe they can find a chrome extension for DNS dummies.

Why DNSCrypt is so significant

In the same way the SSL turns HTTP web traffic into HTTPS encrypted Web traffic, DNSCrypt turns regular DNS traffic into encrypted DNS traffic that is secure from eavesdropping and man-in-the-middle attacks.  It doesn’t require any changes to domain names or how they work, it simply provides a method for securely encrypting communication between our customers and our DNS servers in our data centers.  We know that claims alone don’t work in the security world, however, so we’ve opened up the source to our DNSCrypt code base and it’s available onGitHub.

DNSCrypt has the potential to be the most impactful advancement in Internet security since SSL, significantly improving every single Internet user’s online security and privacy.


The DNSCurve project adds link-level public-key protection to DNS packets. This page discusses the cryptographic tools used in DNSCurve.

Elliptic-curve cryptography

DNSCurve uses elliptic-curve cryptography, not RSA.

RSA is somewhat older than elliptic-curve cryptography: RSA was introduced in 1977, while elliptic-curve cryptography was introduced in 1985. However, RSA has shown many more weaknesses than elliptic-curve cryptography. RSA’s effective security level was dramatically reduced by the linear sieve in the late 1970s, by the quadratic sieve and ECM in the 1980s, and by the number-field sieve in the 1990s. For comparison, a few attacks have been developed against some rare elliptic curves having special algebraic structures, and the amount of computer power available to attackers has predictably increased, but typical elliptic curves require just as much computer power to break today as they required twenty years ago.

IEEE P1363 standardized elliptic-curve cryptography in the late 1990s, including a stringent list of security criteria for elliptic curves. NIST used the IEEE P1363 criteria to select fifteen specific elliptic curves at five different security levels. In 2005, NSA issued a new “Suite B” standard, recommending the NIST elliptic curves (at two specific security levels) for all public-key cryptography and withdrawing previous recommendations of RSA.

Some specific types of elliptic-curve cryptography are patented, but DNSCurve does not use any of those types of elliptic-curve cryptography.


Does the Internet need its own version of credit bureaus

Data Miners love data. The more data they have the better model they can build. Consumers do not love data so much and find sharing data generally a cumbersome task. They need to be incentivize for filling out survey forms , and for signing to loyalty programs. Lawyers, and privacy advocates love to use examples of improper data collection and usage as the harbinger of an ominous scenario. George Orwell’s 1984 never “mentioned” anything about Big Brother trying to sell you one more loan, credit card or product.

Data generated by customers is now growing without their needing to fill out forms and surveys. This data is about their preferences , tastes and choices and is growing in size and depth because it is generated from social media channels on the Internet.It is this data that can be and is captured by social media analytics.

Mobile data is also growing, including usage of location based applications and usage of Internet from the mobile phone is leading to further increases in data about consumers.Increasingly , location based applications help to provide a much more relevant context to the data generated. Just mobile data is expected to grow to 15 exabytes by 2015.

People want to have more and more conversations online publicly , share pictures , activity and interact with a large number of people whom  they have never met. But resent that information being used or abused without their knowledge.

Also the Internet is increasingly being consolidated into a few players like Microsoft, Amazon, Google  and Facebook, who are unable to agree on agreements to share that data between themselves. Interestingly you can use Yahoo as a data middleman between Google and Facebook.

At the same time, more and more purchases are being done online by customers and Internet advertising has grown much above the rate of growth of other mediums of communication.
Internet retail sales have the advantage that better demand predictability can lead to lower inventories as retailers need not stock up displays to look good. An Amazon warehouse need not keep material to simply stock up it shelves like a K-Mart does.

Our Hypothesis – An Analogy with how Financial Data Marketing is managed offline

  1. Financial information regarding spending and saving is much more sensitive yet the presence of credit bureaus alleviates these concerns.
  2. Credit bureaus collect information from all sources, aggregate and anonymize the individual components accordingly.They use SSN as a unique identifier.
  3. The Internet has a unique number too , called the Internet Protocol Address (I.P) 
  4. Should there be a unique identifier like Internet Security Number for the Internet to ensure adequate balance between the need for privacy as well as the need for appropriate targeting? 

After all, no one complains about privacy intrusions if their credit bureau data is aggregated , rolled up, and anonymized and turned into a propensity model for sending them direct mailers.

Advertising using Social Media and Internet

1. A business creates an ad
Let’s say a gym opens in your neighborhood. The owner creates an ad to get people to come in for a free workout.
2. Facebook gets paid to deliver the ad
The owner sends the ad to Facebook and describes who should see it: people who live nearby and like running.
The right people see the ad
3. Facebook only shows you the ad if you live in town and like to run. That’s how advertisers reach you without knowing who you are.

Adding in credit bureau data and legislative regulation for anonymizing  and handling privacy data can expand the internet selling market, which is much more efficient from a supply chain perspective than the offline display and shop models.

Privacy Regulations on Marketing using Internet data
Should laws on opt out and do not mail, do not call, lists be extended to do not show ads , do not collect information on social media. In the offline world, you can choose to be part of direct marketing or opt out of direct marketing by enrolling yourself in various do not solicit lists. On the internet the only option from advertisements is to use the Adblock plugin if you are Google Chrome or Firefox browser user. Even Facebook gives you many more ads than you need to see.

One reason for so many ads on the Internet is lack of central anonymize data repositories for giving high quality data to these marketing companies.Software that can be used for social media analytics is already available off the shelf.

The growth of the Internet has helped carved out a big industry for Internet web analytics so it is a matter of time before social media analytics becomes a multi billion dollar business as well. What new developments would be unleashed in this brave new world is just a matter of time, and of course of the social media data!