Although some people have tried to bust the myth of a duplicate content penalty, blatantly copying and pasting content from somewhere else still works against your website’s optimization. There are exceptions to the rules on penalties and optimization for duplicate content, such as when:
- You’re “duplicating” content on LinkedIn or Medium
- You designate a canonical tag for the original source/master page
- You syndicate content on other websites with a canonical tag and/or a noindex meta tag
Even when you know these exceptions, it can be hard to be sure of what’s “allowed” and what isn’t.
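To make the canonical tag exception concrete, here is a minimal sketch of how you might check which URL a page declares as its original source. It uses only Python's standard library. The sample page and the example.com URL are hypothetical, and the helper names (`CanonicalFinder`, `find_canonical`) are our own for illustration, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before comparing.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical syndicated copy that points back to the original article.
syndicated_page = """
<html><head>
  <title>Syndicated post</title>
  <link rel="canonical" href="https://example.com/original-post">
</head><body><p>Same article text.</p></body></html>
"""

print(find_canonical(syndicated_page))
```

If the function returns None for a syndicated copy, that copy is not telling search engines where the original lives, which is worth fixing before worrying about anything else.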
A 2015 Raven Tools study revealed that about 29% of the pages it crawled contained duplicate content. Of course, if you’re confident in the originality of your content (a tool like Grammarly Premium can help you run a plagiarism check), there’s really not much to worry about.
However, if you have multilingual websites, you might find yourself wondering if translations of content across different websites will be flagged. Let’s set the record straight right now—translation is not a duplicate content issue.
But that’s the short answer. A better understanding of what duplicate content is, how it affects SEO, and what causes it can help you further optimize the websites you work with.
What is Duplicate Content?
Duplicate content is content that appears exactly the same, or almost exactly the same, on more than one page or website.
Google explains that “duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
The company adds, “Mostly, this is not deceptive in origin.”
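One way to get a feel for what “appreciably similar” could mean is to measure how many short word sequences two texts share. The sketch below uses word shingles and Jaccard similarity, a common textbook technique. It is only an illustration under our own assumptions (three-word shingles, made-up sample sentences), and it is not a description of how Google actually detects duplicates.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Made-up sample texts for illustration.
original = "Coconut water is the clear liquid found inside young green coconuts."
near_copy = original[:-1] + ", and it is popular."
translated = ("El agua de coco es el líquido claro que se encuentra "
              "dentro de los cocos verdes jóvenes.")

for label, text in [("same", original), ("near copy", near_copy),
                    ("translation", translated)]:
    print(label, round(jaccard_similarity(original, text), 2))
```

An exact copy scores 1.0, a lightly edited copy scores somewhere in between, and the Spanish version shares no word sequences with the English one at all, which lines up with the point that a translation is not the same content on the page.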
How Does Duplicate Content Affect Your SEO?
Though some SEOs argue that Google won’t directly penalize you for accidentally having duplicate content, duplication can still hurt your search engine ranking, especially on Google.
If there are multiple sources of similar content across the internet, Google can struggle to identify the most relevant result for any given query. Not knowing which content offering to rank higher, the search engine might not rank any of the pages with the same content (though this is an extreme example).
Not showing up on the first page of a search can be detrimental to a business. Not showing up anywhere is a surefire way to go down in flames.
According to Google, you’re only really going to get yourself in trouble with the search engine if you “engaged in deceptive practices.” If your site is flagged for this, it can be removed from the search engine results completely.
To get back in, Google explains, “Once you’ve made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.”
There is also a distinction between accidental duplicate content and obviously plagiarized content. If you believe someone else has stolen your content, you can ask Google to remove it from its search results. On a similar note, you can also ask Google to disavow spammy backlinks through Search Console.
Managing Translated Content
If you think about Google’s goal (to provide the most relevant information for any given query), it should become immediately clear that translated content would not be considered duplicate content. Someone searching for information on coconut water in English is not going to find an answer in Spanish as relevant as an answer in English.
According to Google’s former head of web spam, content in different languages, although identical in meaning, is still quite different, so it is not considered duplicate content. However, if the original content was simply dumped into Google Translate and then copied and pasted, it could trigger spam flags.
This kind of flagging is due to the automated nature of translation tools such as Google Translate. Without human review, such content can be low-quality and full of grammatical issues. By reviewing and editing the Google Translate output before publishing (perhaps by hiring a freelance writer who’s proficient in the languages you’re translating to and from), you can easily avoid this issue and ensure a better experience for those visiting your site.
What Can Cause Duplicate Content Issues?
The majority of duplicate content cases are not intentional. In fact, it’s very possible that you already have some duplication on your website.
One common cause of duplication is actively running and maintaining both http:// and https:// versions of a site with identical content.
If both versions are live, active, and visible to search engines, search engines will see their pages as duplicates.
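If you have a list of crawled URLs, spotting this kind of protocol duplication is straightforward. The following is a rough sketch using Python's standard library. The URL list is hypothetical, and the function name `scheme_duplicates` is ours. It only compares URLs it is given and does not fetch anything or confirm that the page content is actually identical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def scheme_duplicates(urls):
    """Return the pages that appear under both http:// and https://."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue
        # Key on everything except the scheme; treat "" and "/" as the same path.
        key = (parts.netloc.lower(), parts.path or "/", parts.query)
        groups[key].add(parts.scheme)
    return sorted(
        f"{host}{path}" + (f"?{query}" if query else "")
        for (host, path, query), schemes in groups.items()
        if schemes == {"http", "https"}
    )


# Hypothetical crawl export.
crawled = [
    "http://example.com/blog/post",
    "https://example.com/blog/post",
    "https://example.com/about",
    "http://example.com",
    "https://example.com/",
]

print(scheme_duplicates(crawled))
```

Any page this reports is reachable under both protocols, so it is a candidate for designating one version as the original, for example with the canonical tag mentioned earlier.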
In the same way, if your site has a “regular” and a “print” version of each article, the two can be categorized as duplicate content. In such cases, it’s best to keep one version out of the search index with a noindex meta tag. Otherwise, Google will choose one of them to list.
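As a quick sanity check, you can confirm that a print version really carries the noindex directive. Here is a small sketch with Python's standard library. The two sample pages are hypothetical, the class and function names are our own, and the check only looks at the generic robots meta tag in the HTML, not at crawler-specific meta tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        # Directives are comma-separated, e.g. "noindex, follow".
        content = attr_map.get("content") or ""
        self.directives.update(
            token.strip().lower() for token in content.split(",") if token.strip()
        )


def has_noindex(html):
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical regular and print versions of the same article.
regular_page = "<html><head><title>Article</title></head><body>Text</body></html>"
print_page = ('<html><head><meta name="robots" content="noindex, follow">'
              "<title>Article (print)</title></head><body>Text</body></html>")

print(has_noindex(regular_page), has_noindex(print_page))
```

Ideally only one of the two versions reports noindex. If neither does, Google is left to pick which one to list, as described above.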
Content created by web scraping, an automated process of extracting data from a website, is prone to being flagged as duplicate content. This is often the case for e-commerce websites: many of them sell the same products, and the product descriptions are scraped from somewhere else online (e.g., the original supplier) and added to new e-commerce stores unchanged.
Localized domains can also be a source of duplicate content. When geo-targeted websites (such as .co.uk for the United Kingdom or .ca for Canada) are all owned by the same company and target separate English-speaking locations, it’s easy to make the mistake of thinking that posting the same content on each won’t be recognized as duplication (spoiler alert: it will).
However, if the content has been translated into a different language and curated for a geo-targeted website, you should have no issues.
Final Thoughts: Is Translation an SEO Duplicate Content Issue?
The bottom line is that publishing good translations of the same content and information elsewhere on your site will not adversely impact your SEO. However, you are setting yourself up to be flagged as spam if you rely on automated translations and put no additional work into reviewing the content you’re publishing, even if it is on a geo-targeted website.
This does not mean that you don’t have to worry about duplicate content. As noted, there are many situations in which you can accidentally publish duplicate content without even realizing it is live on your website. Though this content may not cause you to be directly penalized, it moves your website away from being fully optimized.
At the very least, you need to ensure that you are not creating duplicate content out of laziness or in an attempt to deceive search engines. Being flagged for doing so can seriously damage your company and brand, because your website can be removed from search engine results until you have shown that you have remedied the situation. It’s simply not worth the risk.
Thankfully, professionally translating content and publishing it in multiple languages is not going to put you in that situation.
Tweet your questions about duplicate content @DigitalRDMS, and we’ll retweet our favorites with answers!