Home » Posts tagged 'search engine'
Tag Archives: search engine
Using Torrents and Pirate Bay from within India using Amazon AWS
So okay, I was a bit harsh here . I apologize.
http://www.decisionstats.com/anonymous-operation-india-using-amazon-aws-to-go-to-piratebay/
Anonymous India is doing a protest (on a Saturday which is mostly an off in the Tech Industry in India)
The Indian govt-judiciary-pvt ISPs are blocking the Internet-particularly PirateBay (and OTHER sites).
You can access these sites by going through Amazon AWS.
Screenshots can be clicked to increase resolution below-
You can read how to configure an Amazon AWS Windows Instance for free here. Free as in free speech but not as free beer
You can attach your Amazon Windows Instance to your local drive here. See screenshot
Basically you edit the RDP file (Remote Desktop File) as below.
You can then download the torrent from Amazon to your local drive like this.
and you can now click on torrent to start your downloads like this.
Note- Pirate Bay is just a search engine for Torrents- the basic data is peer to peer.
Security Update-
You should change your default password in Windows which you got here-
2)
3)
Possible Digital Disruptions by Cyber Actors in USA Electoral Cycle
Some possible electronic disruptions that threaten to disrupt the electoral cycle in United States of America currently underway is-
1) Limited Denial of Service Attacks (like for 5-8 minutes) on fund raising websites, trying to fly under the radar of network administrators to deny the targeted fundraising website for a small percentage of funds . Money remains critical to the world’s most expensive political market. Even a 5% dropdown in online fund-raising capacity can cripple a candidate.
2) Limited Man of the Middle Attacks on ground volunteers to disrupt ,intercept and manipulate communication flows. Basically cyber attacks at vulnerable ground volunteers in critical counties /battleground /swing states (like Florida)
3) Electro-Magnetic Disruptions of Electronic Voting Machines in critical counties /swing states (like Florida) to either disrupt, manipulate or create an impression that some manipulation has been done.
4) Use search engine flooding (for search engine de-optimization of rival candidates keywords), and social media flooding for disrupting the listening capabilities of sentiment analysis.
5) Selected leaks (including using digital means to create authetntic, fake or edited collateral) timed to embarrass rivals or influence voters , this can be geo-coded and mass deployed.
6) using Internet communications to selectively spam or influence independent or opinionated voters through emails, short messaging service , chat channels, social media.
7) Disrupt the Hillary for President 2016 campaign by Anonymous-Wikileak sympathetic hacktivists.
Webscraping using iMacros
The noted Diamonds dataset in the ggplot2 package of R is actually culled from the website http://www.diamondse.info/diamond-prices.asp
However it has ~55000 diamonds, while the whole Diamonds search engine has almost ten times that number. Using iMacros – a Google Chrome Plugin, we can scrape that data (or almost any data). The iMacros chrome plugin is available at https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp while notes on coding are at http://wiki.imacros.net
Imacros makes coding as easy as recording macro and the code is automatcially generated for whatever actions you do. You can set parameters to extract only specific parts of the website, and code can be run into a loop (of 9999 times!)
Here is the iMacros code-Note you need to navigate to the web site http://www.diamondse.info/diamond-prices.asp before running it
VERSION BUILD=5100505 RECORDER=CR
FRAME F=1
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
TAG POS=6 TYPE=TABLE ATTR=TXT:* EXTRACT=TXT
TAG POS=1 TYPE=DIV ATTR=CLASS:paginate_enabled_next
SAVEAS TYPE=EXTRACT FOLDER=* FILE=test+3
and voila- all the diamonds you need to analyze!
The returning data can be read using the standard delimiter data munging in the language of SAS or R.
More on IMacros from
https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp/details
Description
Automate your web browser. Record and replay repetitious work
If you encounter any problems with iMacros for Chrome, please let us know in our Chrome user forum at http://forum.iopus.com/viewforum.php?f=21 Our forum is also the best place for new feature suggestions---- iMacros was designed to automate the most repetitious tasks on the web. If there’s an activity you have to do repeatedly, just record it in iMacros. The next time you need to do it, the entire macro will run at the click of a button! With iMacros, you can quickly and easily fill out web forms, remember passwords, create a webmail notifier, and more. You can keep the macros on your computer for your own use, use them within bookmark sync / Xmarks or share them with others by embedding them on your homepage, blog, company Intranet or any social bookmarking service as bookmarklet. The uses are limited only by your imagination! Popular uses are as web macro recorder, form filler on steroids and highly-secure password manager (256-bit AES encryption).
Note on Internet Privacy (Updated)and a note on DNSCrypt
I noticed the brouaha on Google’s privacy policy. I am afraid that social networks capture much more private information than search engines (even if they integrate my browser history, my social network, my emails, my search engine keywords) – I am still okay. All they are going to do is sell me better ads (maybe than just flood me with ads hoping to get a click). Of course Microsoft should take it one step forward and capture data from my desktop as well for better ads, that would really complete the curve. In any case , with the Patriot Act, most information is available to the Government anyway.
But it does make sense to have an easier to understand privacy policy, and one of my disappointments is the complete lack of visual appeal in such notices. Make things simple as possible, but no simpler, as Al-E said.
Privacy activists forget that ads run on models built on AGGREGATED data, and most models are scored automatically. Unless you do something really weird and fake like, chances are the data pertaining to you gets automatically collected, algorithmic-ally aggregated, then modeled and scored, and a corresponding ad to your score, or segment is shown to you. Probably no human eyes see raw data (but big G can clarify that)
( I also noticed Google gets a lot of free advice from bloggers. hey, if you were really good at giving advice to Google- they WILL hire you !)
on to another tool based (than legalese based approach to privacy)
I noticed tools like DNSCrypt increase internet security, so that all my integrated data goes straight to people I am okay with having it (ad sellers not governments!)
Unfortunately it is Mac Only, and I will wait for Windows or X based tools for a better review. I noticed some lag in updating these tools , so I can only guess that the boys of Baltimore have been there, so it is best used for home users alone.
Maybe they can find a chrome extension for DNS dummies.
http://www.opendns.com/technology/dnscrypt/
Why DNSCrypt is so significant
In the same way the SSL turns HTTP web traffic into HTTPS encrypted Web traffic, DNSCrypt turns regular DNS traffic into encrypted DNS traffic that is secure from eavesdropping and man-in-the-middle attacks. It doesn’t require any changes to domain names or how they work, it simply provides a method for securely encrypting communication between our customers and our DNS servers in our data centers. We know that claims alone don’t work in the security world, however, so we’ve opened up the source to our DNSCrypt code base and it’s available onGitHub.
DNSCrypt has the potential to be the most impactful advancement in Internet security since SSL, significantly improving every single Internet user’s online security and privacy.
and
http://dnscurve.org/crypto.html
The DNSCurve project adds link-level public-key protection to DNS packets. This page discusses the cryptographic tools used in DNSCurve.
Elliptic-curve cryptography
DNSCurve uses elliptic-curve cryptography, not RSA.
RSA is somewhat older than elliptic-curve cryptography: RSA was introduced in 1977, while elliptic-curve cryptography was introduced in 1985. However, RSA has shown many more weaknesses than elliptic-curve cryptography. RSA’s effective security level was dramatically reduced by the linear sieve in the late 1970s, by the quadratic sieve and ECM in the 1980s, and by the number-field sieve in the 1990s. For comparison, a few attacks have been developed against some rare elliptic curves having special algebraic structures, and the amount of computer power available to attackers has predictably increased, but typical elliptic curves require just as much computer power to break today as they required twenty years ago.
IEEE P1363 standardized elliptic-curve cryptography in the late 1990s, including a stringent list of security criteria for elliptic curves. NIST used the IEEE P1363 criteria to select fifteen specific elliptic curves at five different security levels. In 2005, NSA issued a new “Suite B” standard, recommending the NIST elliptic curves (at two specific security levels) for all public-key cryptography and withdrawing previous recommendations of RSA.
Some specific types of elliptic-curve cryptography are patented, but DNSCurve does not use any of those types of elliptic-curve cryptography.
Adding / to robots. text again
So I tried to move without a search engine , and only social sharing, but for a small blog like mine, that means almost 75% of traffic comes via search engines.
Maybe the ratio of traffic from search to social will change in the future,
I have now enough data to conclude search is the ONLY statistically significant driver of traffic ( for a small blog)
If you are a blogger you should definitely try and give the tools at Google Webmaster a go,
eg
https://www.google.com/webmasters/tools/googlebot-fetch
URL Googlebot type Fetch Status Fetch date
http://decisionstats.com/ Web Denied by robots.txt 1/19/12 8:25 PM
http://decisionstats.com/ Web Success URL and linked pages submitted to index 12/27/11 9:55 PM
Also from Google Analytics, I see that denying search traffic doesnot increase direct/ referral traffic in any meaningful way.
So my hypothesis that some direct traffic was mis-counted as search traffic due to Chrome, toolbar search – well the hypothesis was wrong
Also Google seems to drop url quite quickly (within 18 hours) and I will test the rebound in SERPs in a few hours. I was using meta tags, blocked using robots.txt, and removal via webmasters ( a combination of the three may have helped)
To my surprise search traffic declined to 5-10, but it did not become 0. I wonder why that happens (I even got a few Google queries per day) and I was blocking the “/” fron robots.txt.
Net Net- The numbers below show- as of now , in a non SOPA, non Social world, Search Engines remain the webmasters only true friend (till they come up with another panda or whatever update
)
Going off Search Radar for 2012 Q1
I just used the really handy tools at
https://www.google.com/webmasters/tools/crawl-access
, clicked Remove URL
and submitted http://www.decisionstats.com
and I also modified my robots.txt file to
User-agent: *
Disallow: /
Just to make sure- I added the meta tag to each right margin of my blog
“<meta name=”robots” content=”noindex”>”
Now for last six months of 2011 as per Analytics, search engines were really generous to me- Giving almost 170 K page views,
Source Visits Pages/Visit
1. google 58,788 2.14
2. (direct) 10,832 2.24
3. linkedin.com 2,038 2.50
4. google.com 1,823 2.15
5. bing 1,007 2.04
6. reddit.com 749 1.93
7. yahoo 740 2.25
8. google.co.in 576 2.13
9. search 572 2.07
I do like to experiment though, and I wonder if search engines just -
1) Make people lazy to bookmark or type the whole website name in Chrome/Opera toolbars
2) Help disguise sources of traffic by encrypted search terms
3) Help disguise corporate traffic watchers and aggregators
So I am giving all spiders a leave for Q1 2012. I am interested in seeing impact of this on my traffic , and I suspect that the curves would not be as linear as I think.
Is search engine optimization over rated? Let the data decide….
I am also interested in seeing how social sharing can impact traffic in the absence of search engine interaction effects- and whether it is possible to retain a bigger chunk of traffic by reducing SEO efforts and increasing social efforts!












