Thin content was one of the first SEO issues Google targeted with its Panda algorithm update in 2011. That update rocked the entire industry and kick-started the search giant’s war against low-quality content.
It also made life increasingly difficult for black hat SEOs trying to game the SERPs. However, there are plenty of genuine, technical reasons why you might end up with thin content on your website. In this article, we explain exactly what thin content is, how to find it on your site and what you need to do about it.
What is thin content?
Google describes thin content as having “little or no added value”. This is the description you’ll see if you’re unlucky enough to get a manual action warning in Google Search Console, informing you that you’ve been penalized for having thin content on your site.
You definitely don’t want one of those.
The question at this point is: what kind of content does Google consider to have “little or no added value”?
Back in the early Panda days, Google was mostly targeting deceptive uses of thin content – for example:
1. Content that’s automatically created
In this case we are looking at low-quality content, often created by basic machine concatenation, and offering limited, if any, value. For example, grabbing a news story in Spanish and then running it through Google Translate before adding it to your site – a big no-no.
We are starting to see examples of machines (or ‘robots’) writing high value content and this is something that will become more prevalent as AI and machine learning continue to improve. This does not fall into thin content but you would still want a human editor to review this type of content before publishing it.
2. Low-value affiliate content
Affiliate websites offering useful, comprehensive purchase advice have nothing to fear from Google. However, pages filled with affiliate links that offer no useful or relevant information for the end user are prime targets for getting hit by a search penalty.
If you’re in the affiliate game, stick to the following guidelines:
- Make sure your website has a purpose beyond that of any affiliate offering alone. Affiliate pages should make up only a small percentage of your total website.
- Add something new to the affiliate audience. Not only will this provide access to new online niches, fuelling your affiliate ROI, but it will also create the kind of value that encourages SEO success.
- Be objective; ask yourself whether there’s a reason why a user should land on your website before going to the actual product/service originator website. Remember, your site is an added step in the process between the user and their end destination, so there has to be a value-enhancing reason for them to take this detour.
- Only offer affiliate opportunities that are closely matched to your target audience. This helps you avoid diluting your offering, sending mixed messages and creating barriers to user engagement and interaction.
- When you refresh and improve your main website copy, remember to review, update and add depth of value to your affiliate content too. Don’t have scraped, duplicate affiliate content on your website – make it unique, better than any other examples and something of value to your website audience.
3. Content scraped (copied) from other sources
If you systematically add content to your website from external sources, you’re also at risk of a thin content penalty. There are a number of ways in which content is copied (or scraped) from other sources, a few of the more common ones being:
- Copying and pasting full articles that were not created by you.
- Adding external content in part, or in full, to your site without any extra unique value.
- Completing minor tweaks and changes to predominantly copied content.
- Using automated means to re-purpose content that exists externally, trying to display this content as unique.
- Embedding lots of other content types (video, images, infographics etc.) without bringing anything new or adding value.
4. Using doorway pages to rank in Google
Doorway pages are a means of spamming the search engine results pages (SERPs) with very thin content that targets a very specific term, or a close group of terms, with the purpose of sending this traffic to another website or destination.
This creates a poor user search experience and adds unwanted steps for the user to get to their desired end result. Often, doorway pages mean that the user ends up on a lower quality and less relevant search result page than required, resulting in excessive searching to discover the content they needed.
It’s all about adding value
Essentially, if your content is copied from anywhere else, generated by software or you’re creating pages with little or no content, you could be in trouble. Even if you’re not trying to be deceptive (for example, reposting relevant news stories), you have to question why Google would choose to rank your page when it’s simply repeating content that’s already available – it has nothing new or valuable to offer.
As Google explains over at Search Console Help:
“One of the most important steps in improving your site’s ranking in Google search results is to ensure that it contains plenty of rich information that includes relevant keywords, used appropriately, that indicate the subject matter of your content.
“However, some webmasters attempt to improve their pages’ ranking and attract visitors by creating pages with many words but little or no authentic content. Google will take action against domains that try to rank more highly by just showing scraped or other cookie-cutter pages that don’t add substantial value to users.”
It all comes down to adding substantial value to the end user because this is what Google aims to deliver as a search engine.
For more info on thin content, take a look at this video from Google’s former head of web spam, Matt Cutts:
It’s not a particularly recent video but everything Matt Cutts says is still relevant today.
What are the dangers of thin content?
While the most publicized danger of thin content is getting hit by a Google search penalty, your problems run much deeper than this if you’ve got too much of it. If Google’s algorithms can tell you’re using thin content deceptively, then you can bet the majority of users who visit your site can see it as soon as they land on the page.
Whatever your objectives are with the page, you’re not going to convince many people to take action this way. You’ll struggle to keep users on the page, encourage them to engage with your brand or inspire them to convert.
Essentially, this is the real danger of thin content: your marketing objectives are going to fall flat.
Now, in terms of the Google Search penalties, these can be pretty devastating and it helps to understand how Google’s Panda algorithm works.
Thin content and Google Panda algorithm updates
The Google Panda update was first released in 2011 with the purpose of de-valuing low-value and thin websites, to stop them from appearing so prominently in SERPs.
The other, less publicized, side of this update was the ranking gains (tied to content quality signals) that rewarded websites creating high-quality content.
Google Panda updates can impact (remember, this ‘impact’ can be positive or negative) a single page, a whole topic or theme, multiple themes, or entire websites.
The Panda filter applies a number of perceived content quality criteria as well as questions that the Google Quality Raters would be asking themselves when manually viewing content – things like:
- Does the content convey expertise, authority and trust (E-A-T)?
- Are the ‘Your Money or Your Life’ (YMYL) pages present and providing everything needed (think about pages tied to transactions, financial details, private information collection and more)?
- Is there depth of content? For example, do core service pages cover the main topic, plus supplemental information, and enable the user to immerse themselves into the topic (and discover more information easily, should they choose to)?
- Is the content accessible? Can it be accessed easily within the site structure? How quickly does the content load? Does the content work effectively on mobile devices?
The above is just the starting point for protecting your website and content from Panda.
It is important to get a second opinion on your content. Be objective and honest with yourself and your team about the quality of what is being produced, and how it needs to improve.
Not all thin content is deceptive
While the penalties for having too much thin content can be severe, there are quite a lot of scenarios where you’re naturally going to end up with content that could fall into this category.
Search results pages
If you have a search function on your website, the results pages are going to offer very little or no original content. This can’t be helped, of course. The purpose of a search results page is to show snippets of other pages across your site and help users choose the most relevant option.
Solution: Stop Google from crawling results pages by adding a disallow line for these pages to your robots.txt file.
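A disallow rule is easy to get wrong, so it can help to test it before you publish it. The snippet below is a minimal sketch using Python's standard-library robots.txt parser. The `/search` path, the example URLs and the `is_blocked` helper are assumptions made for illustration, and the standard-library parser only does simple prefix matching, so treat it as a sanity check rather than a replica of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: assumes internal search results live under /search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules stop this user agent from crawling url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


for url in (
    "https://example.com/search?q=running+shoes",
    "https://example.com/products/running-shoes",
):
    print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "crawlable")
```

If a results URL comes back as crawlable, the path in the disallow line probably doesn't match the way your site builds its search URLs.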
Photo and video galleries
In many cases, it’s perfectly reasonable to have a photo or video gallery on your website. You might be a wedding photographer, a marquee hire company or a business with a bunch of video case studies to show off.
If the purpose of this page is to allow visitors to browse your photos or videos and choose which ones they want to view, this causes some thin content issues. You probably don’t want a load of text getting in the way on the gallery page itself and your problems get worse if each image or video has its own dedicated page.
Solution: This really depends on how you structure your gallery. You might choose to create content for your gallery page and no-index the individual image/video pages, for example. Or you might take the opposite approach and create unique content for each image/video and no-index the gallery page.
Alternatively, you could create a carousel that displays all images/videos on the same URL – it all depends on what you want to rank for and the kind of content you’re planning to create.
Shopping cart pages
Shopping cart pages aren’t there to provide users with valuable content; they’re designed to help people manage orders and complete purchases. Technically, we’re in thin content territory here but the fix is pretty simple.
Solution: Once again, stop Google from crawling these pages by disallowing them in your robots.txt file, or keep them out of the index with a noindex robots meta tag.
Duplicate pages
Duplicate pages are a natural part of managing a website. Moving over to HTTPS from HTTP creates duplicates, as does having both www and non-www versions of your domain. Managing multilingual websites and recreating pages for multiple locations can also result in duplicates.
Technically, duplicate content isn’t quite the same thing as thin content but the two do overlap in certain cases.
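To get a feel for how many protocol and www duplicates you have, you can group the URLs from a crawl or sitemap export by host and path. The sketch below is one rough way to do that. The function names and example URLs are made up for illustration, and it deliberately ignores query strings, so it won't catch every kind of duplicate.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def variant_key(url: str) -> str:
    """Reduce a URL to host + path, ignoring scheme, 'www.' and trailing slash."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return f"{host}{path}"


def duplicate_groups(urls):
    """Group URLs that only differ by protocol, www prefix or trailing slash."""
    groups = defaultdict(list)
    for url in urls:
        groups[variant_key(url)].append(url)
    # Only keep keys that are reachable through more than one URL.
    return {key: found for key, found in groups.items() if len(found) > 1}


crawled = [
    "http://example.com/shoes",
    "https://www.example.com/shoes/",
    "https://example.com/about",
]
print(duplicate_groups(crawled))
```

Each group it returns is a candidate for a single preferred URL, with the other variants redirected or canonicalized to it.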
In many cases, thin content isn’t detrimental to the user experience at all. In fact, it’s sometimes better to forget about content and simply deliver the functionality users need – eg: shopping carts.
Luckily, keeping these pages safe from search penalties is relatively simple. By no-indexing pages, telling Google which version to index (canonical tags) and/or using 301 redirects to send users to the right place, non-deceptive thin content shouldn’t be a problem.
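If you want to confirm that a page is actually sending these signals, you can read them straight out of its HTML. The following is a minimal sketch using only Python's standard library. The class name, helper name and sample markup are assumptions for illustration, and it only looks at the robots meta tag and the canonical link element, not at HTTP headers or redirects.

```python
from html.parser import HTMLParser


class IndexSignalParser(HTMLParser):
    """Collects the robots meta directive and canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = (attr.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def index_signals(html: str) -> dict:
    parser = IndexSignalParser()
    parser.feed(html)
    parser.close()
    return {"noindex": parser.noindex, "canonical": parser.canonical}


# Hypothetical shopping cart page markup.
cart_page = """
<html><head>
  <meta name="robots" content="noindex, follow">
  <link rel="canonical" href="https://example.com/cart">
</head><body>Your cart</body></html>
"""
print(index_signals(cart_page))
```

Running a check like this across cart, search and gallery templates is a quick way to spot pages that were meant to be no-indexed but aren't.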
Can I have thin content on product pages?
This is one of the most common scenarios where thin and/or duplicate content occurs on a website. This is especially true if you’re selling multiple versions of the same or very similar product.
Naturally, brands try to avoid having duplicate content across these pages but it’s difficult to say the same thing in a hundred different ways.
It becomes a battle of thin content vs duplicate content and this causes a lot of confusion for website owners, SEOs and marketers in general.
The truth is, duplicate content is the lesser of two evils here and it’s better to provide users with comprehensive product details – even if they’re the same or similar – than publishing pages with very little (albeit unique) content.
Here’s what Google’s Andrey Lipattsev had to say about duplicate product pages during a Q&A on duplicate content with fellow Googler John Mueller:
“And even, that shouldn’t be the first thing people think about. It shouldn’t be the thing people think about at all. You should think, I have plenty of competition in my space, what am I going to do? And changing a couple of words is not going to be your defining criteria to go on. You know, the thing that makes or breaks a business.”
More to the point, there is no search penalty for duplicate content but there is for thin content.
So, when it comes to product pages, don’t worry too much about duplicate content for very similar products or variations of the same product. Instead, focus on optimizing for the best experience and giving Google any clues you can about which page to prioritize in terms of indexing.
The key takeaway from the Q&A on duplicate content is that when pages are similar (or the same), Google is looking for a way to differentiate between them and product descriptions are just one of the hundreds of factors it looks at.
Here are some tips:
- Provide full product details on every page
- List the key benefits of each product
- Include images and videos where relevant
- Create unique content where you can
- Avoid copying product descriptions from other sites (eg: Nike’s descriptions of its shoes you’re selling)
- Allow users to select different versions of the same product from a single page (sizes, colors, etc.)
- Use canonical tags if you want Google to index one version of the same or very similar pages
- Focus on adding value beyond product descriptions: page speed, mobile optimization, navigation, etc.
How do you find thin content on your site?
There are a number of ways to discover thin content (judged by word count, duplication and value) and a few of the more common ones can be seen below.
1. Copyscape
Using Copyscape (and other free tools), you can search the web for any content that has been copied from your domain, as well as any content on your own site that was copied (in part or in full) from external sites over the years.
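If you already have two pieces of text in hand, such as your page and a suspected copy, you can get a rough overlap score yourself. The sketch below compares overlapping five-word sequences (often called shingles) and reports the share the two texts have in common. This is not how Copyscape works internally. It is a simple stand-in technique, and the function names and sample sentences are made up for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "Thin content offers little or no added value to the people who land on your page"
suspect = "Thin content offers little or no added value to the visitors who land on your page"
print(f"Overlap: {similarity(original, suspect):.0%}")
```

A high score on a long passage is a prompt to look closer, not proof of copying, since quotes and boilerplate will also match.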
2. Google search operators
You can also use Google search operators to manually check Google for instances of content copying/scraping or duplication.
Here’s an example of what you need to do:
- Copy a selection of content that you feel may have been copied (consider more successful content types you have added to the site)
- Paste into Google (in this case assuming it was text content) within quotes (“”)
- Review the results
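If you need to run this check for a lot of snippets, the steps above can be sketched in a few lines of Python that build the search URL for you. The helper name and example domain are hypothetical. The optional `-site:` operator simply removes your own domain from the results so that only other sites remain.

```python
from typing import Optional
from urllib.parse import urlencode


def exact_match_search_url(snippet: str, exclude_domain: Optional[str] = None) -> str:
    """Build a Google search URL for an exact-match (quoted) snippet."""
    # Drop any double quotes in the snippet so they don't break the phrase match.
    phrase = snippet.replace('"', "").strip()
    query = f'"{phrase}"'
    if exclude_domain:
        # Hide results from your own site so only potential copies remain.
        query += f" -site:{exclude_domain}"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(exact_match_search_url("little or no added value", exclude_domain="example.com"))
```

Open the printed URL in a browser and review the results by hand, exactly as you would after pasting the quoted text into Google yourself.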
Here’s an example of the above in action, in this case checking for duplication of content from a post I created for Search Engine Journal:
As you can see, the first site appearing is the originator website, and as this content is opinion-driven, it is intended to be distributed, shared socially and used on other websites.
An important aspect of this is the purpose of the content, whether it’s to drive traffic back to the main website, encourage shares or something else.
3. Deep data platforms
I’ve been using our machine learning software Apollo Insights for nearly ten years. One of the ways in which I use the data is to locate pages that are not contributing towards total site success.
You can see this in action below (the ‘Page Activity’ widget):
Another metric I use Apollo Insights for is locating content with a limited word count.
Although more words don’t always mean better quality content, in most cases a page with very few words is unlikely to be providing the depth of user and search value needed to deliver an optimum search experience.
You can see this below using a deep data grid – in this case I am looking at depth of content based on expected content structural elements, things like the presence of multiple levels of header tags, and checking that the page is active and real:
Remaining with Apollo, ‘Auditor’ tells me how many pages have fewer words on them than I would expect from a high-quality website page. I can also look at the bigger picture and combine this knowledge with items like: external linking, framed content, pages orphaned off from the main website and much more.
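If you don't have access to a platform like this, you can approximate the word count and header tag checks with a short script. The sketch below is not how Apollo Insights works. It is a simplified stand-in that counts visible words and records which heading levels a page uses. The 300-word threshold, class name and sample markup are assumptions you should tune to your own site.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Counts visible words and records which heading levels are present."""

    SKIPPED = {"script", "style"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.words = 0
        self.headings = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            self.headings.add(tag)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore text inside <script> and <style>, which users never see.
        if not self._skip_depth:
            self.words += len(data.split())


def audit(html: str, min_words: int = 300) -> dict:
    """Flag a page as potentially thin if it falls below min_words."""
    stats = PageStats()
    stats.feed(html)
    stats.close()
    return {
        "words": stats.words,
        "heading_levels": sorted(stats.headings),
        "possibly_thin": stats.words < min_words,
    }


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Shoes</h1><p>Red running shoes.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(audit(sample))
```

A low word count is only a prompt to review the page by hand. Cart and gallery pages will trip the threshold without being a real problem.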
How do you fix thin content?
The first stage in fixing thin content is understanding what high-quality and value-enhancing content looks like in the first place. The example below is from Think With Google: ‘The Customer Journey to Online Purchase’.
Some of the key points which flag this as high quality for me include:
- The use of unique data to provide user meaning.
- The ability for the user to engage with the content and work with it to create new value.
- Mixed content types and content segmentation for easy understanding and skim reading.
- Responsive design, supporting universal access to information.
- Solving a problem – purposeful content is a key factor for truly valuable content creation.
- Detailed supporting information placing the report into context, backing up the stats and enabling further reading on the topic.
Using external comparisons is a great way to put in place the lowest benchmark for your own content quality. The goal is to create content on your website that is far better than any other examples available online.
Once you identify what ‘good’ looks like in your niche, you want to move towards creating ‘great’ content. At this stage, you need to find the content that doesn’t work at present (see previous section on ‘finding thin content’) and boost the content so that it can contribute more towards total site success, as well as its own standalone value.
“You will also need to find new opportunities for effective content creation. Don’t limit your content value by re-purposing alone, there is always an opportunity to create something amazing with digital content.”
Other tactics for creating new quality content include:
- Looking at real-time data changes for new content ideas and action points.
- Following social media trends to see what your audience needs.
- Keeping up to date with industry changes and regularly revisiting old and existing website copy.
- Looking at big data (all of the relevant data) so you can base decision making on more than gut feel.
- Creating tiered content strategies and aligning them – a blog post is great, but supporting it with an infographic and updating it based on the data you receive after it goes live is much better.
- Asking your audience what they want – after all, the content should primarily be there to help them meet their needs.