
Tag Archives: search engine

Possible Digital Disruptions by Cyber Actors in the USA Electoral Cycle

Some possible electronic disruptions that threaten the electoral cycle currently underway in the United States of America:

1) Limited denial-of-service attacks (say, for 5-8 minutes) on fundraising websites, trying to fly under the radar of network administrators while denying the targeted fundraising website a small percentage of funds. Money remains critical in the world's most expensive political market; even a 5% drop in online fundraising capacity can cripple a candidate.

2) Limited man-in-the-middle attacks on ground volunteers to disrupt, intercept, and manipulate communication flows. These are essentially cyber attacks aimed at vulnerable ground volunteers in critical counties and battleground/swing states (like Florida).

3) Electromagnetic disruption of electronic voting machines in critical counties and swing states (like Florida), to disrupt them, manipulate them, or create the impression that some manipulation has been done.

4) Search engine flooding (for search engine de-optimization of rival candidates' keywords) and social media flooding to disrupt the listening capabilities of sentiment analysis.

5) Selective leaks (including using digital means to create authentic, fake, or edited collateral) timed to embarrass rivals or influence voters; these can be geo-coded and mass-deployed.

6) Using Internet communications to selectively spam or influence independent or opinionated voters through email, short messaging service (SMS), chat channels, and social media.

7) Disruption of the Hillary for President 2016 campaign by Anonymous/Wikileaks-sympathetic hacktivists.

Webscraping using iMacros

The well-known diamonds dataset in R's ggplot2 package is actually culled from the website http://www.diamondse.info/diamond-prices.asp

However, it has ~55,000 diamonds, while the whole diamond search engine has almost ten times that number. Using iMacros, a Google Chrome plugin, we can scrape that data (or almost any data). The iMacros Chrome plugin is available at https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp while notes on coding are at http://wiki.imacros.net

iMacros makes coding as easy as recording a macro: the code is automatically generated for whatever actions you perform. You can set parameters to extract only specific parts of the website, and the code can be run in a loop (up to 9,999 times!).

Here is the iMacros code. Note that you need to navigate to http://www.diamondse.info/diamond-prices.asp before running it:

VERSION BUILD=5100505 RECORDER=CR
' Work inside the first frame of the page
FRAME F=1
' Suppress the extraction preview popup and keep looping past errors
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
' Extract the sixth table on the page as text
TAG POS=6 TYPE=TABLE ATTR=TXT:* EXTRACT=TXT
' Click the "next page" control of the results pager
TAG POS=1 TYPE=DIV ATTR=CLASS:paginate_enabled_next
' Append the extracted text to the output file
SAVEAS TYPE=EXTRACT FOLDER=* FILE=test+3

And voila, all the diamonds you need to analyze!

The returned data can be read using standard delimited-file handling in SAS or R.
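For illustration (in Python here, though the same idea applies in SAS or R), here is a minimal sketch of parsing such delimited rows. The column layout and values below are made up, and the real extract's delimiter may differ:

```python
import csv
import io

# Illustrative rows only (made-up values) in the rough shape a diamond
# search returns: carat, cut, color, clarity, price
sample = io.StringIO(
    "0.30,Ideal,E,VS2,684\n"
    "1.01,Premium,G,SI1,5432\n"
)
rows = list(csv.reader(sample))
prices = [int(r[-1]) for r in rows]
print(len(rows), prices)  # 2 [684, 5432]
```

In practice you would point the reader at the saved extract file instead of the in-memory sample.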

More on iMacros from

https://chrome.google.com/webstore/detail/cplklnmnlbnpmjogncfgfijoopmnlemp/details

Description

Automate your web browser. Record and replay repetitious work

If you encounter any problems with iMacros for Chrome, please let us know in our Chrome user forum at http://forum.iopus.com/viewforum.php?f=21

Our forum is also the best place for new feature suggestions :-)
----

iMacros was designed to automate the most repetitious tasks on the web. If there’s an activity you have to do repeatedly, just record it in iMacros. The next time you need to do it, the entire macro will run at the click of a button! With iMacros, you can quickly and easily fill out web forms, remember passwords, create a webmail notifier, and more. You can keep the macros on your computer for your own use, use them within bookmark sync / Xmarks or share them with others by embedding them on your homepage, blog, company Intranet or any social bookmarking service as bookmarklet. The uses are limited only by your imagination!

Popular uses are as web macro recorder, form filler on steroids and highly-secure password manager (256-bit AES encryption).


How to learn to be a hacker easily

1) Are you sure? It is tough to be a hacker. And football players get all the attention.

2) Really? Read on.

3) Read Hacker’s Code

http://muq.org/~cynbe/hackers-code.html

The Hacker’s Code

“A hacker of the Old Code.”

  • Hackers come and go, but a great hack is forever.
  • Public goods belong to the public.*
  • Software hoarding is evil.
    Software does the greatest good given to the greatest number.
  • Don’t be evil.
  • Sourceless software sucks.
  • People have rights.
    Organizations live on sufferance.
  • Governments are organizations.
  • If it is wrong when citizens do it,
    it is wrong when governments do it.
  • Information wants to be free.
    Information deserves to be free.
  • Being legal doesn’t make it right.
  • Being illegal doesn’t make it wrong.
  • Subverting tyranny is the highest duty.
  • Trust your technolust!

4) Read How to be a hacker by

Eric Steven Raymond

http://www.catb.org/~esr/faqs/hacker-howto.html

or just get the Hacker Attitude

The Hacker Attitude

1. The world is full of fascinating problems waiting to be solved.
2. No problem should ever have to be solved twice.
3. Boredom and drudgery are evil.
4. Freedom is good.
5. Attitude is no substitute for competence.
5) If you are tired of reading English, maybe I should move on to technical stuff.
6) Create your hacking space: a virtual disk on your machine.
You will need to learn a bit of Linux. If you are a Windows user, I recommend creating a VMware partition with Ubuntu.
If you like Mac, I recommend the more aesthetic Linux Mint.
How to create your virtual disk:
read here.
Download VMware Player here
http://www.vmware.com/support/product-support/player/
Download the ISO image of the operating system here
http://ubuntu.com
Downloading is the longest part of this exercise.
Now just do what is written here
http://www.vmware.com/pdf/vmware_player40.pdf
or, if you want to experiment with other ways to use Windows and Linux together, just read this
http://www.decisionstats.com/ways-to-use-both-windows-and-linux-together/
Moving data back and forth between your new virtual disk and your old real disk
http://www.decisionstats.com/moving-data-between-windows-and-ubuntu-vmware-partition/
7) Get Tor to hide your IP address when on the Internet.
https://www.torproject.org/docs/tor-doc-windows.html.en
8a) Block ads using the Adblock Plus plugin when surfing the Internet (like 14.95 million other users).
https://addons.mozilla.org/en-US/firefox/addon/adblock-plus/
8b) and use MafiaaFire Redirector to reach elusive websites
https://addons.mozilla.org/en-US/firefox/addon/mafiaafire-redirector/
9) Get a BitTorrent client at http://www.utorrent.com/
This will help you download stuff.
10) Hacker Culture Alert:
This instruction is purely for sharing the culture, not the techie work, of being a hacker.
The website The Pirate Bay acts like a search engine for BitTorrents.
http://thepiratebay.se/
Visiting it is considered bad since you can get lots of music, videos, movies etc. for free, without paying copyright fees.
The website 4chan is considered a meeting place to meet other hackers. The site can be visually shocking.
http://boards.4chan.org/b/
You need to at least set up these systems and read the websites, then come back in N months' time for the second part in this series on how to learn to be a hacker. That will be the coding part.
END OF PART  1
Updated: sorry, I have been a bit delayed on the next part. Will post soon.

Note on Internet Privacy (Updated) and a note on DNSCrypt

I noticed the brouhaha over Google's privacy policy. I am afraid that social networks capture much more private information than search engines do. Even if Google integrates my browser history, my social network, my emails, and my search engine keywords, I am still okay: all they are going to do is sell me better ads (rather than just flood me with ads hoping for a click). Of course, Microsoft could take it one step further and capture data from my desktop as well for better ads; that would really complete the curve. In any case, with the Patriot Act, most information is available to the government anyway.

But it does make sense to have an easier-to-understand privacy policy, and one of my disappointments is the complete lack of visual appeal in such notices. Make things as simple as possible, but no simpler, as Einstein said.

Privacy activists forget that ads run on models built on AGGREGATED data, and most models are scored automatically. Unless you do something really weird or fake, chances are the data pertaining to you gets automatically collected and algorithmically aggregated, then modeled and scored, and an ad corresponding to your score or segment is shown to you. Probably no human eyes ever see the raw data (but the big G can clarify that).

(I also noticed Google gets a lot of free advice from bloggers. Hey, if you were really good at giving advice to Google, they WILL hire you!)

On to a tool-based (rather than legalese-based) approach to privacy.

I noticed that tools like DNSCrypt increase internet security, so that all my integrated data goes straight to the people I am okay with having it (ad sellers, not governments!).

Unfortunately it is Mac-only, and I will wait for Windows or X-based tools for a better review. I noticed some lag in updating these tools, so I can only guess that the boys of Baltimore have been there; it is best used by home users alone.

Maybe they can find a chrome extension for DNS dummies.

http://www.opendns.com/technology/dnscrypt/

Why DNSCrypt is so significant

In the same way the SSL turns HTTP web traffic into HTTPS encrypted Web traffic, DNSCrypt turns regular DNS traffic into encrypted DNS traffic that is secure from eavesdropping and man-in-the-middle attacks.  It doesn’t require any changes to domain names or how they work, it simply provides a method for securely encrypting communication between our customers and our DNS servers in our data centers.  We know that claims alone don’t work in the security world, however, so we’ve opened up the source to our DNSCrypt code base and it’s available on GitHub.

DNSCrypt has the potential to be the most impactful advancement in Internet security since SSL, significantly improving every single Internet user’s online security and privacy.

and

http://dnscurve.org/crypto.html

The DNSCurve project adds link-level public-key protection to DNS packets. This page discusses the cryptographic tools used in DNSCurve.

Elliptic-curve cryptography

DNSCurve uses elliptic-curve cryptography, not RSA.

RSA is somewhat older than elliptic-curve cryptography: RSA was introduced in 1977, while elliptic-curve cryptography was introduced in 1985. However, RSA has shown many more weaknesses than elliptic-curve cryptography. RSA’s effective security level was dramatically reduced by the linear sieve in the late 1970s, by the quadratic sieve and ECM in the 1980s, and by the number-field sieve in the 1990s. For comparison, a few attacks have been developed against some rare elliptic curves having special algebraic structures, and the amount of computer power available to attackers has predictably increased, but typical elliptic curves require just as much computer power to break today as they required twenty years ago.

IEEE P1363 standardized elliptic-curve cryptography in the late 1990s, including a stringent list of security criteria for elliptic curves. NIST used the IEEE P1363 criteria to select fifteen specific elliptic curves at five different security levels. In 2005, NSA issued a new “Suite B” standard, recommending the NIST elliptic curves (at two specific security levels) for all public-key cryptography and withdrawing previous recommendations of RSA.

Some specific types of elliptic-curve cryptography are patented, but DNSCurve does not use any of those types of elliptic-curve cryptography.
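Coming back to why encrypting DNS matters: an ordinary DNS query travels as plaintext, so anyone on the path can read the name being resolved. A minimal sketch (my own illustration, following the RFC 1035 packet layout; the ID and flag values are arbitrary) makes this visible:

```python
import struct

def build_dns_query(domain, qtype=1):
    """Build a minimal RFC 1035 DNS query packet (qtype 1 = A record)."""
    # Header: arbitrary ID, recursion-desired flag, one question, no answers
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.split(".")
    ) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN

packet = build_dns_query("example.com")
# The name being resolved is readable to anyone who can see the packet
print(b"example" in packet, b"com" in packet)  # True True
```

DNSCrypt's contribution is precisely to wrap packets like this in encryption between the client and the resolver.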

Adding / to robots.txt again

So I tried to move without a search engine, relying only on social sharing, but for a small blog like mine, that means almost 75% of traffic comes via search engines.
Maybe the ratio of traffic from search to social will change in the future.

I now have enough data to conclude that search is the ONLY statistically significant driver of traffic (for a small blog).
If you are a blogger, you should definitely give the tools at Google Webmasters a go, e.g.

https://www.google.com/webmasters/tools/googlebot-fetch

URL                        Googlebot type   Fetch status                                        Fetch date
http://decisionstats.com/  Web              Denied by robots.txt                                1/19/12 8:25 PM
http://decisionstats.com/  Web              Success (URL and linked pages submitted to index)   12/27/11 9:55 PM

Also, from Google Analytics, I see that denying search traffic does not increase direct/referral traffic in any meaningful way.

So my hypothesis that some direct traffic was mis-counted as search traffic (due to Chrome and toolbar search) was wrong :)

Also, Google seems to drop URLs quite quickly (within 18 hours), and I will test the rebound in SERPs in a few hours. I was using meta tags, blocking via robots.txt, and removal via Webmasters (a combination of the three may have helped).

To my surprise, search traffic declined to 5-10 visits, but it did not become 0. I wondered why that happens (I even got a few Google queries per day) when I was blocking "/" in robots.txt. A likely reason: robots.txt only stops crawling; URLs Google already knows about can remain in the index, and a crawler that cannot fetch a page also cannot see its noindex meta tag.

Net net: the numbers below show that, as of now, in a non-SOPA, non-social world, search engines remain the webmaster's only true friend (till they come up with another Panda or whatever update ;) )

Going off Search Radar for 2012 Q1

I just used the really handy tools at

https://www.google.com/webmasters/tools/crawl-access

clicked Remove URL

https://www.google.com/webmasters/tools/crawl-access?hl=en&siteUrl=http://decisionstats.com/&tid=removal-list

and submitted http://www.decisionstats.com

and I also modified my robots.txt file to

User-agent: *
Disallow: /

Just to make sure, I also added this meta tag to the right margin of my blog:

<meta name="robots" content="noindex">
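As a sanity check (a sketch using Python's standard urllib.robotparser, feeding it the same rules directly rather than fetching them over HTTP), these rules do deny everything to every crawler:

```python
from urllib import robotparser

# The same rules as the robots.txt above, supplied as lines of text
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])
rp.modified()  # mark the rules as freshly loaded so can_fetch() answers

print(rp.can_fetch("Googlebot", "http://decisionstats.com/"))    # False
print(rp.can_fetch("*", "http://decisionstats.com/some-post/"))  # False
```

Well-behaved crawlers honor this, but as noted above it does not force already-indexed URLs out of the index by itself.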

Now, for the last six months of 2011, as per Analytics, search engines were really generous to me, giving almost 170K page views:

Source             Visits   Pages/Visit
1. google          58,788          2.14
2. (direct)        10,832          2.24
3. linkedin.com     2,038          2.50
4. google.com       1,823          2.15
5. bing             1,007          2.04
6. reddit.com         749          1.93
7. yahoo              740          2.25
8. google.co.in       576          2.13
9. search             572          2.07
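A quick tally of the sources listed above (my own arithmetic; note this covers only these top sources, not all traffic, so the share comes out higher than the ~75% overall figure):

```python
# Visits from the Analytics table above (top sources only)
visits = {
    "google": 58788, "(direct)": 10832, "linkedin.com": 2038,
    "google.com": 1823, "bing": 1007, "reddit.com": 749,
    "yahoo": 740, "google.co.in": 576, "search": 572,
}
# The rows I am treating as search-engine traffic
search_sources = {"google", "google.com", "bing", "yahoo",
                  "google.co.in", "search"}

total = sum(visits.values())
search = sum(v for s, v in visits.items() if s in search_sources)
print(f"search share of listed sources: {search / total:.0%}")  # 82%
```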

I do like to experiment, though, and I wonder if search engines just:

1) Make people too lazy to bookmark or type the whole website name in Chrome/Opera toolbars

2) Help disguise sources of traffic by encrypted search terms

3) Help disguise corporate traffic watchers and aggregators

So I am giving all spiders a leave of absence for Q1 2012. I am interested in seeing the impact of this on my traffic, and I suspect that the curves will not be as linear as I think.

Is search engine optimization overrated? Let the data decide… :)

I am also interested in seeing how social sharing can impact traffic in the absence of search engine interaction effects, and whether it is possible to retain a bigger chunk of traffic by reducing SEO efforts and increasing social efforts!

Automatically creating tags for big blogs with WordPress

I use the Simple Tags plugin in WordPress for automatically creating and posting tags. I am hoping this makes the site easier to navigate. Given that I had not been a very efficient tagger before, this plugin can be really useful for creating tags for more than 100 (or 1,000) posts, especially on WordPress-based blog aggregators.
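I have no idea how Simple Tags implements its suggestions internally, but the flavor of frequency-based auto-tagging can be sketched in a few lines of Python (the stopword list, minimum word length, and sample post are arbitrary choices of mine):

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
             "is", "for", "on", "with"}

def suggest_tags(text, n=3):
    """Naive auto-tagger: the n most frequent non-stopword terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words
                     if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]

post = ("R is a language for statistics. R packages like ggplot2 "
        "make statistics and graphics in R easier.")
print(suggest_tags(post))  # 'statistics' ranks first
```

Real plugins lean on external term-extraction services (as the feature list below shows), which do much better than raw word counts.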

The plugin is available here:

Simple Tags is the successor of the Simple Tagging Plugin. This is THE perfect tool to manage your WP terms for any taxonomy.

It was written with this philosophy: best performance, more security, and a lot of new functions.

This plugin is developed on WordPress 3.3, with the constant WP_DEBUG set to TRUE.

  • Administration
  • Tag suggestions from the Yahoo! Term Extraction API, OpenCalais, Alchemy, Zemanta, Tag The Net, and the local DB, via AJAX requests
    • Compatible with TinyMCE, FCKeditor, WYMeditor and QuickTags
  • Tags management (rename, delete, merge, search and add tags, edit tag IDs)
  • Mass edit tags (more than 50 posts at once)
  • Auto-link tags in post content
  • Auto tags!
  • Type-ahead tag input / AJAX autocompletion
  • Click tags
  • Possibility to tag pages (not only posts) and include them in the tag results
  • Easy configuration! (in WP admin)

The above plugin can be combined with the RSS Aggregator plugin for search engine optimization purposes.

Ajay: You can also combine this plugin with the RSS auto-post blog aggregator (read instructions here) and create SEO-optimized blog aggregation or curation.

Related -http://www.decisionstats.com/creating-a-blog-aggregator-for-free/
