Brilliant Search Engine Idea (Google, please steal it!)

Google’s search algorithm is being polluted by commercial entities that game the system.

Establish a criterion called Commercial Density. Commercial Density refers to the ratio of links to commercial sites (i.e., advertisements) on a page to the text content on the same page:

  • if the ratio is low (or even zero), then the web page contains few or no ads
  • if the ratio is high, then the web page contains lots of ads and a relatively small amount of content.
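
A minimal sketch of how such a score might be computed, taking Commercial Density as the number of commercial links divided by the number of words of text, so that a low value means few ads, as in the bullets above. Everything here is an assumption for illustration: the function name commercial_density, the tiny COMMERCIAL_HOSTS set, and the idea that an ad can be recognized by the host it links to. A real search engine would need a far better signal for what counts as commercial.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical set of ad/affiliate hosts (placeholder names, not real data).
COMMERCIAL_HOSTS = {"ads.example.com", "shop.example.net"}


class _PageStats(HTMLParser):
    """Counts words of text and links that point at commercial hosts."""

    def __init__(self):
        super().__init__()
        self.words = 0
        self.commercial_links = 0
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            host = urlparse(href).hostname or ""
            if host in COMMERCIAL_HOSTS:
                self.commercial_links += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Count whitespace-separated words, ignoring script and style bodies.
        # Note that the anchor text of links is counted as text too.
        if not self._skip:
            self.words += len(data.split())


def commercial_density(html):
    """Return commercial links per word of text for one HTML page."""
    stats = _PageStats()
    stats.feed(html)
    stats.close()
    if stats.words == 0:
        # No text at all: all ads (infinite density) or an empty page (zero).
        return float("inf") if stats.commercial_links else 0.0
    return stats.commercial_links / stats.words
```

A page with no links to the listed hosts scores 0.0, and the score rises as ad links crowd out the text.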

Right now, searches for well-known consumer products or names of celebrities lead to YouTube (no surprise) or to metasites or category listings. When searching for Jessica Jay (see my previous article), I notice that most of the search results are meaningless: links to ringtones, song lyrics or YouTube videos. Actually, the YouTube links are perhaps the cleanest search results I found (no ads), and the comment section has more information than I find elsewhere.

If I had a way to filter out/exclude content with a high saturation of ads, that would help me find useful content. One reason I end up going to Wikipedia first is not that I love Wikipedia (see my thoughts about Wikipedia and Digital Maoism), but that it’s relatively ad-free.

Alas, even that is changing. Wikipedia’s anti-spam policies are, ironically, leading to a bias against independent media in favor of mainstream media outlets. Wikipedia may be inadequate, but for now, it’s still all we have.

I don’t dislike ad-supported media. Far from it. For example, The New Yorker, Time and even CNN sometimes have great content. But many database-driven sites are savvy at catching search queries regardless of relevance. One sign of this kind of gaming is high commercial density. Google already has a way to filter adult content; why can’t it also filter results by commercial density as well?
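
To make the filtering idea concrete, here is a rough sketch that assumes each search result already carries a commercial-density score (ad links per word of text). The Result type, the filter_by_density function and the 0.05 default cutoff are all hypothetical, chosen only for illustration; nothing here reflects how Google actually ranks or filters results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    commercial_density: float  # assumed to be precomputed for each page


def filter_by_density(results, max_density=0.05):
    """Keep results at or below the cutoff, preserving their original order."""
    return [r for r in results if r.commercial_density <= max_density]


# Made-up example data.
results = [
    Result("https://example.org/essay", 0.0),
    Result("https://example.com/ringtones", 0.4),
    Result("https://example.net/review", 0.03),
]
clean = filter_by_density(results)
```

Like an adult-content filter, the cutoff would ideally be a user setting, so that someone who does want shopping results could raise it or switch it off.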

