Canonicalization is the practice of telling search engines which pages on your website you want them to pay attention to. Search engines can't always tell whether similar URLs refer to the same page or to different pages. For example, Google could treat www.nbc.com, nbc.com and nbc.com/ as three different web pages even though all three lead to the same content.
This can be a serious problem for search engine optimization (SEO). Instead of seeing the single page that you want Google to rank, the search engine sees several different pages with the same content. Which page is the one Google should display in the search results? If it’s not clear to a search engine, it is possible that none of these pages will appear in the results.
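To make the problem concrete, here is a minimal Python sketch that collapses the three URL variants above into one comparable form. The function name `normalize_url` and its rules (default to https, drop a leading "www.", treat an empty path as "/") are assumptions chosen for illustration. They are not the rules any search engine actually applies.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Collapse common variants of a URL into one comparable form.

    Illustrative rules only (assumptions, not a search engine's logic):
    force https, lowercase the host, drop a leading "www.", and treat
    an empty path as "/".
    """
    # Bare hosts such as "nbc.com" have no scheme, so add one before parsing.
    if "://" not in url:
        url = "https://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    # Keep the query string, drop any fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


variants = ["www.nbc.com", "nbc.com", "nbc.com/"]
normalized = {normalize_url(v) for v in variants}
print(normalized)  # all three variants collapse to a single URL
```

A search engine has to make a similar guess on its own unless you tell it which version you prefer, which is what the rest of this article covers.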
Canonicalization - an Overlooked SEO Tactic
When a website is properly canonicalized, search engines can easily determine which page to pay attention to when it comes to ranking. The rel=canonical link element, an HTML tag placed in the head section of a page, lets webmasters tell search engines which page is the most important one when there are other pages with duplicate or very similar content.
Benefits to SEO
Choosing a proper canonical URL for every set of similar or duplicate URLs improves the SEO of your site. When search engines know which version is canonical, they can count the links pointing to all the different versions as links to that single version. Setting a canonical is similar to doing a 301 redirect, but without actually redirecting visitors.
Here is an example where there are two pages that share the same content, but only one page should show up in the search results:
- http://canonicalization.com/wordpress/search-plugin/ (this is the most important page for SEO, or the canonical page)
- http://canonicalization.com/wordpress/plugins/search/ (this is the non-canonical page)
It’s up to you to decide which page is the important one. That page is the canonical version; the other page is the non-canonical page. A rel=canonical link is placed in the HTML of the non-canonical page, inside its head section, and it points to the canonical version:
<link rel="canonical" href="http://canonicalization.com/wordpress/search-plugin/" />
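If you want to generate or check this tag programmatically, the following Python sketch shows one way to do it with the standard library only. The helper names `canonical_link_tag` and `find_canonical` are hypothetical and exist only for this example.

```python
from html import escape
from html.parser import HTMLParser
from typing import Optional


def canonical_link_tag(canonical_url: str) -> str:
    """Return the <link> tag to place in the <head> of a non-canonical page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> encountered."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# The non-canonical page from the example points to the canonical page.
tag = canonical_link_tag("http://canonicalization.com/wordpress/search-plugin/")
page = f"<html><head><title>Search</title>{tag}</head><body></body></html>"
print(find_canonical(page))
```

Running `find_canonical` over the HTML of your own pages is a quick way to confirm that each non-canonical page points where you expect.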
Canonicalization Protects Your Keywords
Canonicalization also helps prevent the pages on your website from cannibalizing one another's keywords. Consider a case in which Google returns the wrong page on your site for a search term, or a page on your site keeps jumping in and out of the search results. These issues may result from keyword cannibalization, also referred to as internal SEO cannibalization.
Keyword cannibalization occurs when multiple pages on a site compete for the exact same search word or phrase. It happens when several pages cover the same theme and the search engine can't figure out which page should appear for that search. Common types of keyword cannibalization include subdomain conflict, internal keyword cannibalization and multiple unrelated keywords ranking for the homepage. A canonical keyword map will help prevent such cannibalization.
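As a rough illustration, the Python sketch below flags keywords that more than one page is targeting. The function name `find_cannibalized_keywords`, the page paths and the keyword lists are all hypothetical; in practice the data would come from your own keyword research.

```python
from collections import defaultdict


def find_cannibalized_keywords(page_keywords):
    """Map each keyword targeted by more than one page to those pages.

    page_keywords: dict of page URL -> list of target keywords.
    Keywords are compared case-insensitively after trimming whitespace.
    """
    pages_by_keyword = defaultdict(list)
    for url, keywords in page_keywords.items():
        # A set avoids counting the same keyword twice for one page.
        for keyword in {k.strip().lower() for k in keywords}:
            pages_by_keyword[keyword].append(url)
    return {
        keyword: sorted(urls)
        for keyword, urls in pages_by_keyword.items()
        if len(urls) > 1
    }


# Hypothetical data for illustration only.
page_keywords = {
    "/wordpress/search-plugin/": ["wordpress search plugin"],
    "/wordpress/plugins/search/": ["WordPress search plugin", "plugin directory"],
    "/about/": ["about us"],
}
conflicts = find_cannibalized_keywords(page_keywords)
print(conflicts)
```

Each keyword in the result is a candidate for a canonical decision: one of the listed pages should own it.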
The Keyword Map
A canonical keyword map matches targeted keywords to specific canonical pages (the most important page).
Start by getting a view of all your website’s indexed content. This includes page titles, meta descriptions, heading tags, on-page content and internal link anchors. A great tool for this is Screaming Frog, which is free for up to 500 URLs. From the Screaming Frog dashboard you can export a list of all the pages that search engines can see and include in their search results. Remove any pages that return a 3xx (redirect), 4xx or 5xx (error) server status code.
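If the export is large, the status-code cleanup can be scripted. The Python sketch below keeps only rows with a 2xx status, which removes the 3xx, 4xx and 5xx rows as well as any row without a usable status. The column names "Address" and "Status Code" and the sample rows are assumptions for illustration; check them against the headers in your own export.

```python
import csv
import io


def successful_pages(csv_text, status_column="Status Code"):
    """Return the rows of a crawl export whose status code is 2xx.

    The default column name is an assumption; pass your export's
    actual header if it differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        try:
            status = int(row[status_column])
        except (KeyError, TypeError, ValueError):
            # Missing column, empty cell or non-numeric status: skip the row.
            continue
        if 200 <= status < 300:
            kept.append(row)
    return kept


# Hypothetical export data for illustration only.
sample_export = """Address,Status Code,Title 1
http://canonicalization.com/,200,Home
http://canonicalization.com/old/,301,Old
http://canonicalization.com/missing/,404,Not Found
http://canonicalization.com/wordpress/search-plugin/,200,Search Plugin
"""

for row in successful_pages(sample_export):
    print(row["Address"])
```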
Next, create your own spreadsheet, add the Screaming Frog data, and add a column for Canonical Keywords.
From here you can identify pages that have obvious overlap in keywords, page titles, headings and content.
At this stage you need to dig out your keyword research and start making decisions about which keywords belong to which pages. Where pages overlap, you will have to decide which page is the canonical version and apply the rel=canonical tag to the other page(s).
Tip: If you’re not sure which page should be the canonical version, you can choose the one that is the most visited.
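The tip above can be sketched in a few lines of Python. The function name `choose_canonical` and the visit counts are hypothetical; the numbers would normally come from your analytics tool.

```python
def choose_canonical(visits_by_url):
    """Pick the most visited URL as canonical; the rest should point to it.

    visits_by_url: dict of URL -> visit count for one set of overlapping pages.
    Returns (canonical_url, list_of_non_canonical_urls).
    On a tie, the URL listed first wins.
    """
    if not visits_by_url:
        raise ValueError("need at least one URL")
    canonical = max(visits_by_url, key=visits_by_url.get)
    others = [url for url in visits_by_url if url != canonical]
    return canonical, others


# Hypothetical visit counts for illustration only.
visits = {
    "http://canonicalization.com/wordpress/search-plugin/": 1200,
    "http://canonicalization.com/wordpress/plugins/search/": 300,
}
canonical, non_canonical = choose_canonical(visits)
print("canonical:", canonical)
print("add rel=canonical to:", non_canonical)
```

Visits are only one possible signal. If your keyword research points to a different page, that decision should take priority over this shortcut.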
The Myth of Duplicate Content
One of the main reasons for using canonicalization is to help search engines understand which page is the most important when there is duplicate or very similar content. There is plenty of confusion, however, when it comes to duplicate content.
Definition of Duplicate Content from Google
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin...
Google’s Andrey Lipattsev was adamant during a Google Hangout on June 16, 2016:
Google DOES NOT have a duplicate content penalty.
Likewise, John Mueller of Google says:
We don’t have a duplicate content penalty. It’s not that we would demote a site for having a lot of duplicate content.
Google won’t demote a site with lots of duplicate content, but they will certainly not rank it either.
SEO Best Practices for Duplicate Content
- Do NOT expect to rank high in Google with content found on other, more trusted sites, and don’t expect to rank at all if all you are using is automatically generated pages with no ‘value add’ content.
- Do NOT use boilerplate content. Boilerplate is defined as any text that is or can be reused in new contexts or applications without being greatly changed from the original. Google says to “minimize boilerplate repetition.”
- Provide rich, unique, relevant, and informative content. It sounds simple, but content like this is the kind of content search engines want in their indexes.
Canonicalization - the Final Piece
Canonicalization can protect and even enhance your overall SEO. Canonicalize your website's pages in the manner outlined above and you give search engines a clear signal about which pages to rank. Canonicalization for SEO does not require an abundance of resources, personnel, time or money. It is a fairly simple process that will help web searchers who are interested in your content, services, products or other offerings find your website.