Our last blog post explained what duplicate content is and how it can affect your site’s search engine indexing and ranking.
In a nutshell, when multiple copies of a block of content or even an entire page live on more than one URL, search engines will attempt to pick the “best” version and filter out the copies when returning search results, and sometimes even drop duplicate pages from their indexes altogether. The problems for webmasters include:
Google, Yahoo and MSN recently announced that they will all support a new tag, called the Canonical URL Tag, which you can use to specify which version of a page you want the search engine to crawl, index and choose when it needs to apply a filter.
The tag is a piece of code that you put in the <head> section of your page:
<link rel="canonical" href="http://yoursite.com/whatever-your-URL-looks-like"/>
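Once the tag is in place, it's worth confirming that a page really does expose it. Here is a minimal sketch in Python, using only the standard library, that pulls the canonical URL out of a page's HTML. The names CanonicalFinder and find_canonical and the sample page are our own illustrations, not part of any search engine's or module's tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found in a document."""

    def __init__(self):
        super().__init__()
        self.canonical_urls = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical_urls.append(attributes["href"])


def find_canonical(html):
    """Return a list of canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical_urls


# Hypothetical sample page, reusing the placeholder URL from the tag above.
page = """<html><head>
<title>Example page</title>
<link rel="canonical" href="http://yoursite.com/whatever-your-URL-looks-like"/>
</head><body></body></html>"""

print(find_canonical(page))
```

An empty list means the page declares no canonical URL. More than one entry probably means two templates or modules are each adding their own tag, which is worth cleaning up.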
As a temporary solution, Drupal users can already install Joost de Valk’s Canonical URL module to manage this information directly through the CMS. Note, however, that what is offered is simply a patched version of Global Redirect. There is a patch to officially add the functionality into the Global Redirect module. At ImageX Media we’re looking forward to the patch landing and plan to offer canonical URLs to customers as part of our standard SEO package.
Please note that search engines consider this tag a “hint” and not a “directive”, meaning that they don’t have to obey your suggestion (they will consider other factors to determine relevance and proper use of the tag). Still, it’s highly recommended that you use this tag to give search engines a big push in the right direction.
For more information, check out the Google Webmaster Central blog’s official statement:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
Linda Bustos is an eCommerce Analyst for Elastic Path Software, an enterprise ecommerce framework. Linda blogs daily about Internet Marketing for online retail at the Get Elastic eCommerce Blog.