Webscraping using iMacros

The well-known diamonds dataset in the ggplot2 package for R is actually culled from the website http://www.diamondse.info/diamond-prices.asp

However, it has only ~55,000 diamonds, while the whole diamond search engine has almost ten times that number. Using iMacros, a Google Chrome plugin, we can scrape that data (or almost any data). The iMacros Chrome plugin is available at https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp and notes on coding are at http://wiki.imacros.net

iMacros makes coding as easy as recording a macro: the code is automatically generated for whatever actions you perform. You can set parameters to extract only specific parts of the website, and the code can be run in a loop (up to 9999 times!).

Here is the iMacros code. Note that you need to navigate to the website http://www.diamondse.info/diamond-prices.asp before running it.

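VERSION BUILD=5100505 RECORDER=CR
' Work inside the first frame of the page
FRAME F=1
' Suppress the extraction popup and keep going if a step fails
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
' Extract the text of the sixth table on the page (presumably the results table)
TAG POS=6 TYPE=TABLE ATTR=TXT:* EXTRACT=TXT
' Click the "next page" control
TAG POS=1 TYPE=DIV ATTR=CLASS:paginate_enabled_next
' Save the extracted text to a file in the default folder
SAVEAS TYPE=EXTRACT FOLDER=* FILE=test+3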

And voilà: all the diamonds you need to analyze!

The extracted data can then be read with the standard delimited-text data munging tools in SAS or R.
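
For readers who would rather use Python than SAS or R, the same delimited-text reading might look roughly like this. It is only a minimal sketch, assuming the saved extract is comma-delimited with quoted fields and a header row. The function name read_extract, the column names, and the sample rows are all made up for illustration and are not taken from the site.

```python
import csv
import io


def read_extract(text, delimiter=","):
    """Parse delimited text into a list of rows, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Strip stray whitespace from every cell and drop rows that are empty.
    rows = [[cell.strip() for cell in row] for row in reader]
    return [row for row in rows if any(row)]


# Hypothetical sample, NOT real scraped data: the column layout is assumed.
sample = '"shape","carat","price"\n"Round","0.30","500"\n\n"Oval","1.00","4000"\n'

rows = read_extract(sample)
header, records = rows[0], rows[1:]
# Turn each record into a dict keyed by the header names.
diamonds = [dict(zip(header, record)) for record in records]
print(len(diamonds), diamonds[0]["shape"])
```

If the real file turns out to use a different separator, passing it as the delimiter argument should be enough; for an actual file on disk, read its contents with open(...).read() first.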

More on iMacros from

https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp/details

Description

Automate your web browser. Record and replay repetitious work

If you encounter any problems with iMacros for Chrome, please let us know in our Chrome user forum at http://forum.iopus.com/viewforum.php?f=21

Our forum is also the best place for new feature suggestions :-)
----

iMacros was designed to automate the most repetitious tasks on the web. If there’s an activity you have to do repeatedly, just record it in iMacros. The next time you need to do it, the entire macro will run at the click of a button! With iMacros, you can quickly and easily fill out web forms, remember passwords, create a webmail notifier, and more. You can keep the macros on your computer for your own use, use them within bookmark sync / Xmarks or share them with others by embedding them on your homepage, blog, company Intranet or any social bookmarking service as bookmarklet. The uses are limited only by your imagination!

Popular uses are as web macro recorder, form filler on steroids and highly-secure password manager (256-bit AES encryption).


Facebook Gmail Killer Threatens to Commit Hara-Kiri Live on AOL TechCrunch if Unsuccessful

As per TechCrunch (http://techcrunch.com/2010/11/11/facebook-gmail-titan/):

Project Titan — a web-based email client that we hear is unofficially referred to internally as its “Gmail killer”. Now we’ve heard from sources that this is indeed what’s coming on Monday during Facebook’s special event, alongside personal @facebook.com email addresses for users.

Now TechCrunch always tells the Truth, and the Gospel as per Mike is always right, especially when he is talking of the gates of heaven and angels.

Again, as per the newly rich Mike Arrington (who qualifies to be an angel investor himself, except AOL has locked in his, err, wings):

Our understanding is that this is more than just a UI refresh for Facebook’s existing messaging service with POP access tacked on. Rather, Facebook is building a full-fledged webmail client, and while it may only be in early stages come its launch Monday, there’s a huge amount of potential here.

Facebook has the world’s most popular photos product, the most popular events product, and soon will have a very popular local deals product as well.  It can tweak the design of its webmail client to display content from each of these in a seamless fashion (and don’t forget messages from games, or payments via Facebook Credits). And there’s also the social element: Facebook knows who your friends are and how closely you’re connected to them; it can probably do a pretty good job figuring out which personal emails you want to read most and prioritize them accordingly.

Oh, and assuming our sources prove accurate, this explains the timing of the Google/Facebook slap fight over contact information.

In an exclusive chat with Decisionstats, Senior VP Eduard Patel Bumberg said: "This is it. I am going to kill Gmail. In this movie I just had a small part in the men's room while they had the groupies. If we finally kill Gmail, I hope to get a much bigger part in Social Network 3."

The new Facebook email gives you less spam, primarily because it leans on its contacts in the Cosa Nostra of spam and tells them: no spam to .fb books.

Yes, anyone who is someone in spam has had a connection in the spam pie at Facebook, such as creating 50 million duplicate accounts just before the movie launched, or inflating the number of daily Farmville players, invites, and links.

Arringutan even covered some of it in an earlier FB game called scamville.

Saint Mark and Mike would have approved of Senior VP Eduard Patel Bumberg's decision to either kill Gmail or commit hara-kiri live on Ustream. It is good for the sequel.