A canonical link is a nifty, fairly new way to communicate with search engine crawlers and direct them to the preferred pages of your website. In the past, if you happened to have dynamically-generated directories or pages that caused duplicate content issues, you’d probably have been advised to exclude them via robots.txt.
Nowadays, since Google announced its preferred URL specification feature, the canonical tag, doing such a thing seems drastic. After all, you want to accommodate search engines as much as possible as they crawl through your site. We don’t want to be rude hosts, lest we get the spiders upset, and they exact cold revenge in the form of poor rankings. Plus, you don’t really want to see hundreds or thousands of crawling issues appear in your Webmaster Tools account, do you? So it’s best to listen to Google, which now recommends controlling your impulse to add exclusions to your robots.txt file; instead, turn to canonical links to battle internal duplicate content.
A Sadly Common SEO Horror Story
So you’ve done your research on SEO, and you’ve tattooed the slogan, “Content is king!” to your brain. You’ve written unique, top quality articles for every page on your website, each containing keywords that are relevant to your field. Once it all goes live, you anxiously check Google every day, waiting for your new website to appear on the first page. However, after a month or more of waiting, you find that it’s just not happening. In fact, you’re nowhere to be found on Google at all.
What could be the problem? You ask an SEO expert to have a look at your website, or maybe you run a page or two through Copyscape, and the bad news smacks you like a sack of bricks: you’ve got a nasty internal duplicate content issue. You see, you may have written all of your content yourself, vowing to never steal from another source, but all the while, under the hood, your website was a veritable duplicate content farm, churning out dynamically-generated variations of the same page over and over again, each containing the exact same article word-for-word.
This sort of thing can easily happen when you’re utilizing a blog or e-commerce platform for your website. In many cases, product pages or blog articles can be accessed through multiple paths and URLs, leading to over-indexation problems and penalties.
While your website may be a well-oiled machine where function and usability are concerned, it may be stirring up all sorts of problems with search engines behind the scenes. Fortunately, Google wants to be your friend, and for this reason, they’ve introduced the canonical tag.
Example of a Canonical Tag
Doing a little research on how to add canonical links to your website can save you a great deal of frustration – and grow your business or blog in the process. A canonical link lives under the hood of your website, in the <head> section of a page’s HTML, and looks something like this:
<link rel="canonical" href="http://www.youramazingwebsite.com/blog/dont-duplicate-this-content" />
What we have here is a canonical link added to a blog post entitled “Don’t Duplicate This Content.” This code communicates your preferred URL for the article. This way, if a variant page with the same article is generated through the blog’s sorting methods (e.g. tags) or other factors, Google will recognize that the content stems from a single, primary source (www.youramazingwebsite.com/blog/dont-duplicate-this-content), and all of the URLs with the same article that may have spawned dynamically after the fact shouldn’t lose you any brownie points.
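If you’d like to confirm that a page really declares the preferred URL you expect, a few lines of Python can pull the canonical link out of its HTML. This is only a rough sketch built on the standard library’s html.parser module; the CanonicalFinder class, the find_canonical helper, and the sample PAGE markup are made up for illustration and aren’t part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and we stop at the first canonical found.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page markup, reusing the example URL from this article.
PAGE = """<html><head>
<title>Don't Duplicate This Content</title>
<link rel="canonical" href="http://www.youramazingwebsite.com/blog/dont-duplicate-this-content" />
</head><body>The article goes here.</body></html>"""

print(find_canonical(PAGE))
```

Run the same check against each variant URL of an article (the tag page, the category page, and so on). Every one of them should report the same preferred address.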
If you’re seeing such issues arise within your website, you’re welcome to use a mixture of robots.txt and canonical links, though it’s probably best to experiment with a canonical link first. After some time, if you’re finding that your rankings still leave something to be desired, you can either submit a request to Google through Webmaster Tools, asking them to reconsider your website, or you can bust out the robots.txt and scorch any issues you’ve had by disallowing pages and directories – just don’t be surprised when the spiders complain, and your Webmaster Tools fills up with crawl errors.
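If you do end up reaching for robots.txt, it’s worth checking what a rule actually blocks before it goes live. The sketch below feeds a made-up robots.txt to Python’s built-in urllib.robotparser and asks which URLs a crawler would be allowed to fetch. The /blog/tag/ path and the is_crawlable helper are assumptions for the sake of the example, so swap in whatever directories your own platform generates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts crawlers out of the blog's tag pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/tag/
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if the rules above let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


for url in (
    "http://www.youramazingwebsite.com/blog/dont-duplicate-this-content",
    "http://www.youramazingwebsite.com/blog/tag/seo",
):
    print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

Python’s parser follows the basic robots.txt rules and may not match every detail of how a particular search engine reads the file, so treat the result as a sanity check rather than a guarantee.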