Canonicalization is an internal content duplication issue on a website. It occurs when the same information can be accessed (and indexed by search engines) at different URLs. This happens when URLs have not been standardised correctly, and the most common form is a site that can be reached at both its www and non-www hostnames.
For example, the same homepage content can be reached through five variations of the URL:
http://www.domain.com
http://www.domain.com/index.htm
http://domain.com
http://domain.com/index.htm
http://www.domain.com/index.htm?selection=26
Search engines may index every version of the URL, resulting in site-wide duplication, which can lead to content duplication penalties. Canonicalization problems are surprisingly common.
Canonicalization problems can be prevented by setting your preferred domain in Google Webmaster Tools, and by using 301 redirects to permanently send browsers and search engines to the preferred URL.
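
As a rough illustration of the redirect logic, the Python sketch below maps each homepage variation listed earlier onto one preferred URL and reports when a 301 would be needed. Everything in it is an assumption made for the example: the preferred host (www.domain.com), the index filename, the idea that the "selection" parameter does not change the content, and the function names canonical_url and redirect_for. A real site would normally put the equivalent rules in its web server or application configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# All values below are assumptions for this example, not settings from any real site.
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.domain.com"
KNOWN_HOSTS = {"domain.com", "www.domain.com"}
INDEX_PAGE = "/index.htm"
IGNORED_PARAMS = {"selection"}  # assumed not to change the page content


def canonical_url(url):
    """Return the preferred form of a URL that belongs to this site."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return url  # not one of our hostnames, leave it untouched

    # Treat an empty path and the index page as the directory root.
    path = parts.path or "/"
    if path.endswith(INDEX_PAGE):
        path = path[: -len(INDEX_PAGE)] + "/"

    # Drop query parameters that are assumed not to affect the content.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, urlencode(kept), ""))


def redirect_for(url):
    """Return (301, target) if the URL should be redirected, otherwise None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


if __name__ == "__main__":
    for variant in (
        "http://www.domain.com",
        "http://www.domain.com/index.htm",
        "http://domain.com",
        "http://domain.com/index.htm",
        "http://www.domain.com/index.htm?selection=26",
    ):
        print(variant, "->", canonical_url(variant))
```

With these assumptions, all five variations resolve to the same address, so a search engine following the 301 responses would see a single URL for the homepage.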













