Replytocom links, those links people click to reply to previous comments on your website, can cause a major headache when Google and other search engines try to index your website.
Google’s crawler (also called a spider) typically follows every link it finds, so it automatically follows every single reply link on your website. That’s right. If you have 100 blog posts with 10 comments each, Google follows all 1,000 replytocom links.
When Google downloads all those extra pages, it sees the same content over and over again. This duplication can hurt your SEO rankings if your WordPress website isn’t configured correctly.
But even if WordPress is configured correctly, Google will waste your bandwidth downloading the same page over and over again, so here’s how to fix your replytocom links.
Is Your Site Misconfigured?
How important is it for you to fix your replytocom links? That depends on whether your site is configured correctly.
If your site works correctly, every page includes a special reference link which tells Google the page’s official (canonical) Web address (URL). That way Google and other search engines don’t penalize you for duplicate content, such as thousands of replytocom links.
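To make the duplicate-content idea concrete, here is a minimal Python sketch. It assumes reply links take the form http://example.com/my-post/?replytocom=123#respond (the example.com addresses and comment IDs are made up for illustration) and that the canonical address is simply the same URL with the replytocom parameter removed. The helper name strip_replytocom is hypothetical, not part of WordPress.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_replytocom(url: str) -> str:
    """Return the URL without its replytocom parameter and without the #fragment."""
    parts = urlsplit(url)
    # Keep every query parameter except replytocom.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "replytocom"
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical reply links for one post with three comments.
reply_links = [
    f"http://example.com/my-post/?replytocom={comment_id}#respond"
    for comment_id in (101, 102, 103)
]

# All three links collapse to a single page address.
unique_pages = {strip_replytocom(link) for link in reply_links}
print(unique_pages)  # {'http://example.com/my-post/'}
```

Three different links, one actual page: that is the duplication a crawler runs into.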
To check whether your pages have this special reference link, go to one of your blog posts and use your Web browser to view the page source code. (Right-click the page in Chrome or Firefox, or use the View menu in Internet Explorer.)
Near the top of the page, in the <head> section, look for a tag like this:
<link rel="canonical" href="http://example.com/your-post/" />
Obviously the href part will be different for your blog. The important part is the rel="canonical". That tells Google and other search engines what the official (canonical) URL for each page is.
If you have a canonical reference link for a page and Google downloads that page under a different URL, it won’t penalize you for duplicate content in its search engine rankings.
WordPress automatically creates canonical reference links by default, but some plugins and themes can break them, so I do recommend that you use the instructions above to ensure that your WordPress is working correctly. You only need to check one page on your blog—if it works on one page, it will work on all pages.
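If you would rather not read the page source by eye, the same check can be sketched in a few lines of Python using only the standard library. This is an illustration, not an official tool: the class and function names are hypothetical, and the sample HTML is a made-up stand-in for the source of one of your own posts.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Made-up page source for illustration; paste your own page source here instead.
sample = """<html><head>
<title>My Post</title>
<link rel="canonical" href="http://example.com/my-post/" />
</head><body>...</body></html>"""

print(find_canonical(sample))  # http://example.com/my-post/
```

A result of None would suggest the canonical reference link is missing from that page.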
If you don’t have canonical reference links, you could be in big trouble. Google probably thinks you have a large amount of duplicate content on your website, so it may be ranking your pages poorly in search engine results. Although there are a few ways to hack a solution, I highly recommend that you simply fix your WordPress so that it generates canonical reference links correctly.
Telling Google To Ignore Replytocom
Even if Google isn’t penalizing you by counting the same page multiple times, it’s still penalizing you by downloading the same page over and over again. This is particularly true if you have a popular site with many comments.
How big of a problem is this? I downloaded a page on Tips4PC.com the way Google does and it was 2.5 kilobytes. Multiply that by the over 1,000 articles on Tips4PC, each with about 20 comments on average, and Google could waste roughly 50 megabytes each time it does a full index of the site.
Now, 50 megabytes isn’t a big deal. It’s the equivalent of 50 to 200 medium or large images. But if Google indexes your site every day because it changes frequently, that’s about 1.5 gigabytes each month of wasted bandwidth.
Again, that’s probably not a big deal, but getting rid of that waste can help speed up your site slightly for your real visitors—especially since you can do it by simply getting rid of wasted Google indexing traffic.
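For anyone who wants to plug in their own numbers, here is a back-of-the-envelope version of that estimate in Python. It multiplies out the figures quoted above (2.5 KB per page, about 1,000 articles, about 20 comments each); the 30 crawls per month and the decimal units (1 MB = 1,000 KB) are my assumptions.

```python
PAGE_KB = 2.5               # size of one downloaded page, from the measurement above
ARTICLES = 1_000            # approximate number of articles
COMMENTS_PER_ARTICLE = 20   # approximate average, one reply link per comment
CRAWLS_PER_MONTH = 30       # assumption: one full crawl per day

reply_links = ARTICLES * COMMENTS_PER_ARTICLE
per_crawl_mb = reply_links * PAGE_KB / 1_000            # KB -> MB (decimal)
per_month_gb = per_crawl_mb * CRAWLS_PER_MONTH / 1_000  # MB -> GB (decimal)

print(f"{reply_links:,} reply links")        # 20,000 reply links
print(f"{per_crawl_mb:.0f} MB per crawl")    # 50 MB per crawl
print(f"{per_month_gb:.1f} GB per month")    # 1.5 GB per month
```

Change the four constants at the top to match your own site.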
Here’s how to control how Google sees these replytocom links:
- Go to your Google Webmaster Tools account.
- Click Site Configuration in the left menu to expand it.
- Click Settings, then URL parameters.
- Click Configure URL parameters on the right side.
- Look for the replytocom parameter in the list.
- Click the Edit link in the replytocom row.
You can also install a WordPress plugin to take care of replytocom links. For example: http://wordpress.org/plugins/replytocom-controller/
Block Replytocom Indexing
Website indexing services (called robots) such as Googlebot all check a file called robots.txt before they index pages on your site. This simple file tells them what pages they can’t visit.
See if your website already has a robots.txt file by visiting the following link: http://example.com/robots.txt. Replace example.com with your domain name.
If you have a file, you will need to update it to block replytocom links. If you don’t have a file, you will need to create a new file and place it in the root of your domain.
If you already have a file, download it to your computer and open it in a text editor such as Windows Notepad or Mac TextEdit. If you don’t have a robots.txt file, just open the text editor to a new file.
Most existing files start with a line that says: “User-Agent: *”. This means that section of the file applies to all website indexing services, whether they’re from Google, Microsoft Bing, or another company.
Underneath that line will be one or more lines labeled “Allow:” and “Disallow:”. It’s after the last one of these—but before you get to any other “User-Agent:” lines—that you want to add the following line:
Disallow: *?replytocom
Don’t worry—normal users will still be able to reply to comments on your site. The code above will only block robots from Google and other search engines from replytocom links.
If you didn’t have a robots.txt file already on your website, create one with the following two lines:
User-Agent: *
Disallow: *?replytocom
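To see which addresses a rule like this covers, here is a simplified Python sketch of wildcard matching, where * stands for any run of characters and a trailing $ anchors the end of the URL. It is an approximation for illustration, not Google's actual matcher, and rule_matches is a hypothetical helper. I wrote the matching by hand because, as far as I know, Python's built-in urllib.robotparser does not expand * wildcards. The sample paths are made up.

```python
import re


def rule_matches(pattern: str, path_and_query: str) -> bool:
    """Simplified robots.txt match: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where the pattern had '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # Rules are matched from the start of the path.
    return re.match(regex, path_and_query) is not None


RULE = "*?replytocom"  # the Disallow value from the file above

print(rule_matches(RULE, "/my-post/?replytocom=123"))         # True: blocked
print(rule_matches(RULE, "/my-post/"))                        # False: still crawlable
print(rule_matches(RULE, "/my-post/?page=2&replytocom=123"))  # False: not first parameter
```

The last case suggests the rule only catches replytocom when it is the first parameter after the question mark, so it is worth checking what your own reply links look like.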
Whether you edited an existing file or created a new one, save it as robots.txt and upload it to your website using FTP, SFTP, or your hosting company’s file manager. You want to put it in the root directory of your website, the same place you put the Google website verification page if you use Google’s webmaster tools.
There are other ways to fix this, too. For example, you can simply tick a box in the Thesis WordPress theme or the Yoast SEO plugin. Again, you will need to verify that these options actually work, as they sometimes conflict with other plugins and settings.
Within 24 hours, Google and other search engines will read the updated or new robots.txt file and stop trying to follow replytocom links.