Duplicate content on your site can hurt your Search Engine Results Page (SERP) rankings, and it is easy to miss when you are trying to track down and resolve ranking problems. A few common areas of website design tend to create duplicate content. By knowing where to look, and by understanding some of the tools available, you can work on getting your search engine rankings back to where they should be.
Today, I’ll discuss the most common sources of duplicate content and offer a few ideas about how to resolve the issues.
1. Reposted Articles on Blog and Parent Site
Many sites repost a few featured articles from their blog on their main page, and this can be a troublesome source of duplicate content. When a search engine like Google comes across duplicate content, it will attempt to determine the original source, and its choice could draw traffic away from the page you intended to rank.
The canonical link element is a way to signify which source of content is the original or preferred version, and it is supported by Google, Yahoo, and Microsoft.
You can implement this strategy within the same domain by adding the link element to the <head> of any page containing duplicate content, pointing it back to the primary page:
<link rel="canonical" href="http://www.rubberducks.com/blog/about-rubber-ducks"/>
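To confirm that a page really declares the canonical URL you expect, a small script can read the page's HTML and report the link element it finds. The following is a minimal sketch using only Python's standard library; the names `CanonicalFinder` and `find_canonical` and the sample page are illustrative assumptions, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element in a page."""

    def __init__(self):
        super().__init__()
        self.canonical_urls = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical_urls.append(attributes["href"])


def find_canonical(html_text):
    """Return the first canonical URL declared in html_text, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical_urls[0] if finder.canonical_urls else None


# Hypothetical page that reposts a blog article and points back to the original.
sample_page = """
<html><head>
<link rel="canonical" href="http://www.rubberducks.com/blog/about-rubber-ducks"/>
</head><body>Featured article text</body></html>
"""

print(find_canonical(sample_page))
```

Running a check like this over every page that reposts blog content is one way to spot pages where the canonical link is missing.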
Alternatively, you can use an <iframe> tag to embed content from your blog. This allows you to share blog content on a main page without duplicating it on the site. You should also watch for duplicate content from social media posts.
2. Repeated Words in URL Parameters
Repeated words are a common issue in URL parameters, especially when working with URL-driven category or search features. For example, a URL carrying such parameters can create duplicate content for a product page for orange ducks.
The primary page URL is the version without the added parameters.
Avoid word repetition, and check how your search or sort features may be creating several URLs for the same content. You can learn more about handling URL parameters in Google’s Webmaster Tools.
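To see how parameter variations map back to one primary page, you can normalize URLs in a script. This is a rough sketch under stated assumptions: the parameter names in `DISPLAY_ONLY_PARAMS`, the function name `primary_url`, and the example URL are all hypothetical, and your own site's parameters will differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that only change how a page is displayed.
DISPLAY_ONLY_PARAMS = {"sort", "view", "category"}


def primary_url(url):
    """Drop display-only and repeated query parameters from a URL."""
    parts = urlsplit(url)
    kept = []
    seen = set()
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name.lower() in DISPLAY_ONLY_PARAMS:
            continue  # does not change the content, so not part of the primary URL
        if (name, value) in seen:
            continue  # the same parameter and value repeated
        seen.add((name, value))
        kept.append((name, value))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical variation of an orange ducks product page.
variant = "http://www.rubberducks.com/ducks/orange?category=orange&category=orange&sort=price"
print(primary_url(variant))
```

The normalized value is also a reasonable candidate for the `href` of a canonical link element on each variation.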
3. Printer-Friendly Pages
Multiple versions of a page, such as a printer-friendly version, are a common source of duplicate content. Search engines compare pages largely by their text, so a printer-friendly version that drops images and media can look the same as the primary version, and Google might not know which one is the better target. Sites often generate a second URL for printer-friendly pages.
To keep this second version from counting as duplicate content, you can add a “noindex, follow” robots meta tag to the printer-friendly page, for example <meta name="robots" content="noindex, follow">. This instructs search engines not to index that version while still following its links. This article from Search Engine Guide shares a few ways to create printer-friendly page versions without creating duplicate content.
4. Session ID Pages
Similar to printer-friendly pages, some websites use session ID URL variations to track customer movement. By crawling through different session ID and search variations, Google can find many URLs that lead to the exact same page, and each of these variations can count as a source of duplicate content.
The easiest way to resolve this issue, while still using session ID tracking, is to tag any alternate URLs that are created. If you mark every session ID URL variation with the “noindex, follow” tag, search engines will not index those pages. This tag can help resolve many URL-related sources of duplicate content. You can read more about tags, tools, and tips to resolve duplicate content issues in my upcoming CEM article on the subject.
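Before tagging anything, it helps to know which session ID URLs collapse to the same page. The sketch below is an audit helper rather than the fix itself. It assumes you already have a list of crawled URLs, and the parameter names in `SESSION_PARAMS`, the function names, and the sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names for session-tracking parameters; adjust to your site.
SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}


def without_session_id(url):
    """Return the URL with any session-tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def duplicate_groups(urls):
    """Map each page to its URL variations, keeping pages with more than one."""
    groups = defaultdict(list)
    for url in urls:
        groups[without_session_id(url)].append(url)
    return {page: variants for page, variants in groups.items() if len(variants) > 1}


# Hypothetical crawl results.
crawled = [
    "http://www.rubberducks.com/ducks/orange?sessionid=abc123",
    "http://www.rubberducks.com/ducks/orange?sessionid=def456",
    "http://www.rubberducks.com/ducks/yellow",
]

for page, variants in duplicate_groups(crawled).items():
    print(page, "has", len(variants), "session ID variations")
```

Each group the script reports is a set of URLs that would need the tag, or a canonical link pointing at the page without the session ID.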
Do you know other common sources of duplicate content? Share your tips in the comments below!