It’s time to unwind the giant mess of 301s, meta tags, and robots.txt hacks that we have in place, all aimed at eliminating "duplicate" content for search engines. We now have a simple way to tell search engines what the canonical representation of a URL is. That’s the promise of the new canonical tag, and I think it will work. Here’s the syntax:
<link rel="canonical" href="/the/trusted/url/of/the/page">

More info here:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

And note that it is also supported by Yahoo! and MSN.

Why am I so excited about it? Because I implemented it at Wikia, which was Google’s "trusted user" (note that Google mentions starwars.wikia.com as one of their examples).

Mediawiki has a problem with duplicate content. First, it has "soft" redirects, where two articles with different urls can point to the same content (which Google labels as "duplicates"). I had previously written extensions for Mediawiki that turn these into "hard" redirects (by issuing a Location: header with a 301 redirect). This showed a positive uplift for SEO, but it always felt like a hack. The canonical tag is a far more elegant solution, and improves performance by reducing 301’s.
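
To make the "soft" redirect case concrete, here is a minimal Python sketch (not the actual extension, which is not shown in this post). It assumes a hypothetical in-memory redirect table and hypothetical function names; the idea is only that every redirecting title ends up advertising the target article as its canonical url instead of issuing a 301.

```python
from html import escape

# Hypothetical redirect table: redirecting title -> target title.
# A real wiki would look this up in its database.
REDIRECTS = {
    "Old_Name": "New_Name",
    "Older_Name": "Old_Name",
}

def resolve_title(title, redirects, max_hops=5):
    """Follow redirects until a non-redirect title is reached.

    Following chains (and capping them at max_hops) is an assumption of
    this sketch, as is the cycle guard via the `seen` set.
    """
    seen = set()
    while title in redirects and title not in seen and len(seen) < max_hops:
        seen.add(title)
        title = redirects[title]
    return title

def canonical_tag_for(title, redirects):
    """Build the tag that the page for `title` would put in its <head>."""
    target = resolve_title(title, redirects)
    return '<link rel="canonical" href="/wiki/%s">' % escape(target, quote=True)

print(canonical_tag_for("Older_Name", REDIRECTS))
```

With this approach the redirecting url still serves content directly, so no extra round trip is needed, while search engines are told which url to credit.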

Second, there are many entry points into an article in Mediawiki:

/wiki/Article_Name
/wiki/index.php/Article_Name
/wiki/index.php?title=Article+Name
/wiki/Article_Name?action=view

All of the above urls will produce the *exact* same content in Mediawiki, but search engines will treat them as different urls, which splits page rank and may introduce the infamous duplication penalty.
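
As an illustration of how the four entry points listed above collapse into one, here is a rough Python sketch. The function name and the choice of /wiki/Article_Name as the canonical form are assumptions for this example, and it deliberately ignores percent-encoding and namespaces.

```python
from urllib.parse import urlsplit, parse_qs

def canonical_path(url):
    """Map any of the entry points above to /wiki/Article_Name."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "title" in query:
        # /wiki/index.php?title=Article+Name
        # parse_qs decodes '+' to a space, which is turned back into '_' below.
        title = query["title"][0]
    else:
        # Check the longer prefix first so index.php/ is not taken as a title.
        for prefix in ("/wiki/index.php/", "/wiki/"):
            if parts.path.startswith(prefix):
                title = parts.path[len(prefix):]
                break
        else:
            raise ValueError("not an article url: %r" % url)
    # Other query parameters (such as action=view) are simply dropped.
    return "/wiki/" + title.replace(" ", "_")

for url in (
    "/wiki/Article_Name",
    "/wiki/index.php/Article_Name",
    "/wiki/index.php?title=Article+Name",
    "/wiki/Article_Name?action=view",
):
    print(url, "->", canonical_path(url))
```

Whatever path the visitor arrived by, the page would then emit the same canonical href, so the variants stop competing with each other.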

Both of these problems can be easily solved with the new canonical tag, and it’s quite elegant.

I’ve written a new Mediawiki Extension for supporting the Google canonical href tag at Wikia. It’s open source, and available at Wikia’s SVN repo for all to use. I will be contributing it to the core Mediawiki software as an extension soon. Update: Now available in the Wikimedia SVN repo

This is a big help outside of Mediawiki as well. Take "printable" pages as an example, or even urls with extra parameters in the query string: the canonical tag can funnel all of the page rank into one version of the page.
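
For the generic case, a small Python sketch along these lines could generate the tag. It assumes that no query parameter selects different content (true for something like a printable flag or a tracking parameter, not true for a url like index.php?title=...), so the whole query string and fragment are dropped. The function name and example.com url are made up for the example.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

def canonical_link_tag(url):
    """Return a canonical <link> tag for url with query and fragment removed."""
    parts = urlsplit(url)
    # Keep scheme, host, and path; blank out the query string and fragment.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s">' % escape(clean, quote=True)

print(canonical_link_tag("http://example.com/some/page?printable=yes"))
```

A site where some parameters do matter would need an allow-list of parameters to keep rather than dropping them all.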

Kudos to Google (esp. Matt Cutts), Yahoo!, and MSN on coming together to provide a clean and elegant solution to help fight the duplicate content problem.

UPDATE: I believe this is part of Mediawiki core now, so the extension shouldn’t be necessary.