What are Canonical URLs and how can I use them in Drupal to help SEO?

February 17th, 2009 by Linda Bustos

Our last blog post explained what duplicate content is and how it can affect your site’s search engine indexing and ranking.

In a nutshell, when multiple copies of a block of content, or even an entire page, live on more than one URL, search engines will attempt to pick the “best” version and filter out the copies when returning search results, and sometimes even drop duplicate pages from their indexes altogether. The problems for webmasters include:

  1. Links from other pages and sites are sometimes split across different versions, whereas if they all pointed to one page, it would have a much better chance of ranking well against other websites’ content (more links = more authority and trust).
  2. Search engines might not choose the version of your content that you want. For example, they might return your blog’s home page rather than the post that matched the user’s search, because your home page has more links and authority. This makes it hard for the searcher to locate the relevant content among the 10 blog posts displayed on your home page, so the searcher hits the “back” button.
  3. Search engines might not index your site fully, because they will use only so much bandwidth to crawl your site’s pages. If they spend this bandwidth indexing copies, valuable pages may be kept out of the index, and those pages cannot be returned in search results.

The Canonical URL Tag to the Rescue!

Google, Yahoo and MSN recently announced that they will all support a new tag, called the Canonical URL Tag, which you can use to specify which version of a page you want search engines to crawl, index and choose when they need to apply a filter.

The tag is a piece of code that you put in the head element of your page:

<link rel="canonical" href="http://yoursite.com/whatever-your-URL-looks-like" />
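
If your pages are generated by code rather than written by hand, the tag can be built from the URL you have chosen as canonical. The following is only a minimal sketch in Python, not how any Drupal module does it; the function name canonical_link_tag and the example URL are assumptions made for illustration. The one detail worth noting is that the URL should be HTML-escaped before it goes into the href attribute.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Return a <link rel="canonical"> element for the given absolute URL.

    Hypothetical helper for illustration only.
    """
    # quote=True also escapes double quotes, so the URL cannot break out
    # of the href="..." attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(url, quote=True))


print(canonical_link_tag("http://yoursite.com/products?id=7&sort=price"))
# prints: <link rel="canonical" href="http://yoursite.com/products?id=7&amp;sort=price" />
```

The resulting string belongs inside the head element of every page that is a copy of, or an alternate address for, the canonical URL.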

As a temporary solution, Drupal users can already install Joost de Valk’s Canonical URL module to manage this information directly through the CMS. Note, however, that what is offered is simply a patched version of Global Redirect. There is a patch to officially add the functionality to Global Redirect itself. At ImageX Media we’re looking forward to the patch landing, and we plan to offer canonical URLs to customers as part of our standard SEO package.

Please note that search engines consider this tag a “hint” and not a “directive”, meaning that they don’t have to obey your suggestion (they will consider other factors to determine relevance and proper use of the tag). Still, it’s highly recommended that you use this tag to give search engines a big push in the right direction.

For more information, check out the Google Webmaster Central blog’s official statement:

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

Linda Bustos is an eCommerce Analyst for Elastic Path Software, an enterprise ecommerce framework. Linda blogs daily about Internet Marketing for online retail at the Get Elastic eCommerce Blog.

6 Comments

February 18th, 2009 by dalin (not verified)

Just a minor correction. The canonical link doesn’t go inside of the title element. Like all link tags it just goes in the head element of the HTML doc.

It should also be noted that implementing this on your site isn’t all of a sudden going to put you in first place in the search results. In the real world there are few URLs within a standard Drupal site that this will affect.

February 18th, 2009 by Rick Vugteveen (IXM)

@dalin – Thanks for pointing out the error. It was actually a formatting issue on our end. I’ve updated the post to correct it.

While I believe that using canonical links will become an important best practice, I agree that this won’t have dramatic effects on your search results. However, I could see this being important for Drupal-based sites if content can be accessed from multiple URL aliases as well as via the node ID. It will also cut down on any duplicate content showing up from various Views listings (blog teasers etc.). Other than canonical URLs, the XML Sitemap module can also help tell search engines what to index.
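
To make the alias case concrete, here is a rough sketch of how several addresses for the same content could be collapsed to one canonical URL. It is written in Python purely for illustration and does not reflect how Drupal stores aliases; the ALIASES table, BASE_URL and the canonical_url function are all hypothetical.

```python
# Hypothetical alias table: each internal path maps to its aliases,
# with the preferred (canonical) alias listed first.
ALIASES = {
    "node/42": ["blog/canonical-urls", "seo/canonical-urls"],
}

# Assumed site address, for illustration only.
BASE_URL = "http://example.com"


def canonical_url(path: str) -> str:
    """Return the canonical absolute URL for a requested path."""
    path = path.strip("/")
    for internal, aliases in ALIASES.items():
        # The node path and every alias all resolve to the first alias.
        if path == internal or path in aliases:
            return f"{BASE_URL}/{aliases[0]}"
    # Paths with no known alias are treated as already canonical.
    return f"{BASE_URL}/{path}"


for requested in ("node/42", "seo/canonical-urls", "blog/canonical-urls"):
    print(requested, "->", canonical_url(requested))
```

All three requested paths resolve to the same address, which is the value that would go into the canonical tag on each of them.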

February 24th, 2009 by Gregory Heller (not verified)

Setting your canonical URL in your .htaccess file or in Google Webmaster Tools is probably going to have more of an impact than making this change in your header. Bottom line: you want your entire website to resolve at only one domain/URL. Between having that set up right and using the Global Redirect module on a Drupal site, you should be fine.

February 24th, 2009 by Rick Vugteveen (IXM)

@Gregory – Thanks for the comment. I’m assuming that you’re referring to the main URL for the site itself. Often sites can be accessed via both www.example.com and example.com. I agree that this is a duplicate content issue that needs to be taken care of. Personally I prefer not to use www: it is simpler, and “www” is technically a subdomain.
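
As an illustration of the www versus non-www point, the sketch below computes the single preferred address for a URL by dropping a leading “www.” from the host. It is a Python sketch under stated assumptions, not a server configuration: the function name strip_www is hypothetical, and on a real site the redirect itself would normally be issued by the web server or by a module such as Global Redirect.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(url: str) -> str:
    """Return the URL with a leading "www." removed from its host.

    Hypothetical helper: it only computes the preferred address and
    does not perform any redirect.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()  # host names are case-insensitive
    if host.startswith("www."):
        host = host[len("www."):]
    # Scheme, path, query string and fragment are left unchanged.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(strip_www("http://www.example.com/blog?page=2"))
# prints: http://example.com/blog?page=2
```

A request whose address differs from the value returned here is a candidate for a permanent redirect to the preferred host.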

June 25th, 2009 by DevidHussay (not verified)

hi friends Search engines might not choose the version of your content that you want. For example, it might choose to return your blog’s home page rather than the post that matched the user’s search because your home page has more links and authority. This makes it hard for the searcher to locate the relevant content among the 10 blog posts displayed on your home page. Searcher hits the “back” button.
DevidHussa

September 1st, 2009 by felipeducaa (not verified)

The Canonical URL Meta Tag revealed on Friday, February 20th 2009 is a new Meta Tag announced by Google, MSN, and Yahoo that will deal with some complicated duplicate content issues. This new tag, added to the <head> section of your HTML document, will tell search engines exactly what URL you would like this page to be indexed as. This means that if you have a long URL, or a URL containing parameters, you can have it indexed as the URL you specify in the Canonical Meta Tag, rather than having it indexed more than once under different URLs. You can also use this to stop Google and other search engines from indexing your page URLs both with and without the leading www. It will, for all intents and purposes, be treated as a 301 Redirect.



About ImageX Media

We're a Drupal design and development firm based in Vancouver BC. We're passionate about making sites that aren't just great looking, but work great too.