As entrepreneurs, marketers and bloggers, we fear duplicate content. We have heard so much about it, and what sticks is the vague sense that it is something to avoid. Duplicate content has sparked many, many discussions. Read, for instance, the article “Updated: Does Duplicate Content Matter for SEO?”, which has two updates and even a change of title.

In this blog, I will highlight the part of the discussion that is most interesting to me as a translator: the fact that translations help prevent duplicate content. First, I will remind you of what duplicate content is. Then, I will give a brief overview of the debate on whether and when penalties apply. Finally, I will discuss how translations prevent duplicate content and which conditions apply.


What is duplicate content?

Google has defined duplicate content as “substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”

It is vital to understand that if you republish posts, press releases, news stories or product descriptions found on other sites, your pages are going to struggle to gain traction in Google’s search engine results pages.

The big penalties debate

However, does duplicate content lead to penalties? That appears not to be the case. Rather than penalizing your website, Google simply filters the duplicated pages out of its results, which can feel like the same thing.

John Mueller clearly states in this video that Google does not have a duplicate content penalty: “It is not that we would demote a site for having a lot of duplicate content. You do not get penalized for having this kind of duplicate content.”

Shaun Anderson of Hobo-Web explains how this works: Google does not like using the word ‘penalty’, but if your entire site consists of republished content, Google does not want to rank it. If you have a multiple-site strategy selling the same products, you are probably going to cannibalize your traffic in the long run rather than dominate a niche, as you used to be able to do.

This all comes down to how a search engine filters duplicate content found on other sites, and to the experience Google aims to deliver for its users. Mess up with duplicate content on a website and it might look like a penalty, because the end result is the same: important pages that once ranked might not rank again, and new content might not get crawled as quickly. Your website might even receive a ‘manual action’ for thin content.

However, Andy Crestodina, writing on Kissmetrics.com, claims that we are overreacting. He thinks we should consider the following:

Googlebot visits most sites every day. If it finds a copied version of something a week later on another site, it knows where the original appeared. Googlebot does not get angry and penalize. It moves on. That is pretty much all you need to know. Remember, Google has 2,000 math PhDs on staff. They build self-driving cars and computerized glasses. They are really, really good. Do you think they will ding a domain because they found a page of unoriginal text? A huge percentage of the internet is duplicate content. Google knows this. They have been separating originals from copies since 1997, long before the phrase “duplicate content” became a buzzword in 2005.

Best practice

I think we should use common sense and find the answer somewhere in the middle. As Anderson puts it: do not expect to rank high in Google with content found on other, more trusted sites, and do not expect to rank at all if all you are using is automatically generated pages with no added value.

There are exceptions to the rule (Anderson, for instance, claims that Google treats duplicate content within your own site differently), but it is best to have one single version of each piece of content on your site, with rich, unique text written specifically for that page. As Anderson says, Google wants to reward rich, unique, relevant, informative and remarkable content in its organic listings, and that focus has raised the quality bar over the last few years.

Example of how easy it is to create duplicate content

If a product has 7 URLs (one URL for each size of a shirt, for example), the title, meta description, product description and other elements of those extra pages will probably be created with boilerplate techniques. (Boilerplate is any text that can be reused in new contexts or applications without being greatly changed from the original.) By doing this, you create 7 URLs on the website that do not stand on their own; they are essentially duplicates of one another.
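To make this concrete, here is a hypothetical sketch of the head section of two of those seven pages; the shop name, product name and URLs are invented for this example. Only the size token changes, so from a search engine's perspective the pages are nearly identical.

<!-- hypothetical page: https://www.example-shop.com/shirt?size=s -->
<title>Classic Cotton Shirt - Size S | Example Shop</title>
<meta name="description" content="Buy the Classic Cotton Shirt in size S. Soft cotton, free shipping, easy returns.">

<!-- hypothetical page: https://www.example-shop.com/shirt?size=m -->
<title>Classic Cotton Shirt - Size M | Example Shop</title>
<meta name="description" content="Buy the Classic Cotton Shirt in size M. Soft cotton, free shipping, easy returns.">

Multiply that by seven sizes and you have seven pages competing with each other for the same searches.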

Translated content does not cause a duplicate-content issue

Does translated content cause a duplicate-content issue? No, a translation is not considered duplicate content by Google. Matt Cutts, head of the webspam team at Google, explains in the video below that an English version of a text and a French version are different, as each has been written with a specific target audience in mind. To see the full explanation, please watch the video.

Condition #1: Human translation prevents duplicate content; auto-generated machine translations do not

Please note that in the section above I used the word ‘written’, not ‘auto-generated’. Auto-generated text, in whatever language, is not rich, unique, relevant, informative and remarkable content. If you want to rank high in Google, do the work. Hire a professional to translate the content for you; he or she knows how to speak to the target audience and takes regional language and culture into account.

Condition #2: It is best to have the tailored translation on a regional site

Alexandra at Webmeup.com says that, when creating a foreign-language site for your company, you need to tailor its content to the segment of users you are trying to reach with it. They probably want a slightly different message than the one you have for English-speaking audiences.

Furthermore, of the three regional site variants example.es, es.example.com and example.com/es, Google considers the country-code top-level domain (example.es) the best practice.

Condition #3: Tell Google the regional sites are regional sites rather than duplicates

Richard Michie of Global-lingo.com says that websites localized in the same language but intended for different regions can run into a problem. Let us use Spanish as an example once more.

Say you have a Spanish-language site targeted at the market in Spain, but you need it localized for the Mexican and Argentinian markets and for the Spanish-speaking market in the USA. If all four websites have very similar content, only one might appear in the search results, even though they are all on different country domains.

Michie tells us not to panic: there is a solution. To avoid duplicate-content problems, you need to tell Google that certain sites are intended for specific territories. Not only does this keep the sites from being filtered as duplicates; it also improves their visibility in local searches.

Let us assume that your Spanish site (example.es) is the original site and that the others (example.mx, example.ar and example.com) are localized versions. All you need to do is add a few lines of code to the header section of your site, which tell the search engine to treat the sites not as duplicates but as local versions.

The code to be placed in the header section of www.example.es is as follows:
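In sketch form, using the four example domains from above (real sites would use their own URLs), the hreflang markup looks roughly like this:

<!-- sketch: hreflang annotations for the four example domains -->
<link rel="alternate" href="http://www.example.es/" hreflang="es-es" />
<link rel="alternate" href="http://www.example.mx/" hreflang="es-mx" />
<link rel="alternate" href="http://www.example.ar/" hreflang="es-ar" />
<link rel="alternate" href="http://www.example.com/" hreflang="es-us" />

Each link element declares one language-region version of the page, so Google knows which regional audience each site serves.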

The key part of the code is hreflang="es-us". The first two characters denote the language version of the site, and the characters after the dash indicate the region to which the language refers; in this case, a Spanish-language website (es) for the market in the USA (us).

Correctly adding the code is essential to the success of this localization technique. Luckily, the Google Webmaster Tools pages give clear instructions on how to use the correct code, language and region indicators.

These explanations, tips and examples should help you get over your fear of duplicate content, or at least give you an idea of how to prevent it. You can find many other tips online.