Remember the host clustering update from May 22, 2013? Managing URL structure, and understanding how it plays out in rank tracking, is still something we see as a hurdle in the industry.

URL Structure Management Is a Challenge and an Opportunity

URL structure management is harder than it used to be. Getting your preferred landing page URLs to rank in the search engines has become exponentially more difficult since Penguin, even for branded terms.

Part of the problem is that when you build a new piece of content around specific terms (for a better user experience), you won’t see the impact of your marketing efforts until the new page outranks the current top ranking URLs.

Long gone are the days of watching new content scratch and fight its way to the top. Now, URLs abruptly jump pages. With this change, it has become trickier to measure our strategies.

Changing the URL structure of a website is something we see our clients doing on a daily basis. Site owners change their structure for many reasons, ranging from a CMS migration all the way to making URLs more user-friendly and easier to read.

We also see sites changing their architecture to provide a more SEO-friendly URL structure. Though URL structures are easy to bungle, they’re also a great professional opportunity: a Web Presence Manager who understands all the impacts and complexities of URL structure is critical for organizational success.

Knowing the nuances of URL structure management will help the performance of your entire business. In this post, I’ll share 10 URL challenges and opportunities to be aware of when you’re managing your web presence:

1. Duplicate content issues

Google will see all of the following URLs as distinct and, as a result, may have difficulty determining which version to display in its results.

www.mydomain.com

www.mydomain.com/

www.mydomain.com/index.htm

www.mydomain.com/index.htm?sid=1234

mydomain.com

mydomain.com/

mydomain.com/index.htm

mydomain.com/index.htm?sid=1234

If you don’t pick one single URL pattern and stick with it, you’ll wreak havoc on how search bots interact with your site. Each site is only allocated a certain amount of crawl attention from Google, so we need to provide a clean infrastructure that removes confusion. (This is where proper 301 rules and canonical link elements play a crucial role.)
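
To make that concrete, here is a minimal Python sketch of the kind of normalization rule a 301 layer might enforce. The preferred host, the treatment of index.htm, and the stripped sid parameter are assumptions for illustration, not a prescription for your site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed conventions for this sketch: force the "www" host, collapse
# "/index.htm" to "/", and strip session-style parameters such as "sid".
PREFERRED_HOST = "www.mydomain.com"
STRIP_PARAMS = {"sid"}

def canonical_url(url):
    """Return the single preferred form of a URL (the target of a 301)."""
    scheme, host, path, query, _ = urlsplit(url)
    if host in ("mydomain.com", "www.mydomain.com"):
        host = PREFERRED_HOST
    if path in ("", "/index.htm"):
        path = "/"
    params = [(k, v) for k, v in parse_qsl(query) if k not in STRIP_PARAMS]
    return urlunsplit((scheme or "http", host, path, urlencode(params), ""))

for variant in ("http://www.mydomain.com",
                "http://mydomain.com/index.htm?sid=1234"):
    print(variant, "->", canonical_url(variant))
# Both variants resolve to http://www.mydomain.com/
```

Every variant from the list above ends up at a single preferred URL, which is exactly the consistency the search bots need.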

2. Orphaned pages

Beware of orphaned pages on your site. These are URLs that are no longer linked from your internal navigation, which can turn them into what Google refers to as doorway pages. Doorway pages are often built for SEO without regard for user experience, making them prime targets for algorithm updates like Penguin, which looks for aggressive tactics.

Sometimes a page is orphaned by design (not for nefarious SEO purposes), and you must make sure those pages don’t dilute your site’s rankings. Since we work in a world of 1s and 0s, we need to follow strategies that don’t send false signals to users or to the bots crawling autonomously.
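
One way to surface these pages, sketched below in Python, is to compare the URLs you want indexed (your XML sitemap or an analytics export) against the URLs actually reachable through internal links. The sample sets here are placeholders you would replace with your own sitemap and crawl data.

```python
# Minimal sketch: a URL that appears in the sitemap (or an analytics export)
# but is never reached by following internal links is a candidate orphan.
sitemap_urls = {
    "/",
    "/products/widget-a/",
    "/products/widget-b/",
    "/landing/old-campaign/",   # hypothetical page dropped from navigation
}

# URLs discovered by crawling internal links from the home page
# (e.g. the output of your crawler of choice).
internally_linked = {
    "/",
    "/products/widget-a/",
    "/products/widget-b/",
}

for url in sorted(sitemap_urls - internally_linked):
    print("Orphaned (not internally linked):", url)
```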

3. Special characters and case sensitivity

Having special characters or case sensitivity within your directory paths is a formula for constant headaches. Your 404 report will often get flooded by bad internal and external links trying to reach these pages in different ways.

Make sure proper patterns and best practices are followed. A web presence manager should create strict rules for URL creation along with a set of redirect rules for any existing cases.
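
As a rough illustration of what those strict rules can look like, the Python below normalizes a path to lowercase, hyphen-separated slugs and flags requests that need a 301. The exact character rules are assumptions you would tune to your own URL conventions.

```python
import re

def clean_path(path):
    """Normalize a directory path to lowercase, hyphen-separated slugs."""
    path = path.lower()
    path = re.sub(r"[\s_]+", "-", path)        # spaces/underscores -> hyphens
    path = re.sub(r"[^a-z0-9/\-.]", "", path)  # drop other special characters
    return re.sub(r"-{2,}", "-", path)         # collapse repeated hyphens

def needs_redirect(requested_path):
    """True if the incoming request should be 301'd to its clean form."""
    return requested_path != clean_path(requested_path)

print(clean_path("/Category/Winter Coats/Men's.htm"))
# -> /category/winter-coats/mens.htm
print(needs_redirect("/category/winter-coats/mens.htm"))
# -> False
```

The redirect itself lives in your server or application layer; this only sketches the normalization logic those rules would share.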

4. Robots.txt havoc

There have been two instances in my 15+ year career where a client has run to me in a frenzy because their site migration completely killed their organic visibility. This is why I tell all of my clients who may be going through a redesign to be cautious when moving all files from a staging environment to production.

It is our job to verify the URL patterns included in the robots.txt file. Twice I have seen this file moved over when a site went live, with adverse impact. The first occasion took about a month to recover from. The second happened to a large publisher, and surprisingly, they recovered in less than a week after a two-week hit.
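
A simple pre-launch (and post-launch) sanity check can catch this. The sketch below uses Python’s standard-library robots.txt parser; the example.com URLs are placeholders for your own production pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical production site and a handful of URLs that must stay crawlable.
ROBOTS_URL = "http://www.example.com/robots.txt"
MUST_BE_CRAWLABLE = [
    "http://www.example.com/",
    "http://www.example.com/products/",
]

parser = RobotFileParser(ROBOTS_URL)
parser.read()  # fetches and parses the live robots.txt

for url in MUST_BE_CRAWLABLE:
    if not parser.can_fetch("Googlebot", url):
        print("WARNING: robots.txt blocks Googlebot from", url)
```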

5. Blanket redirects

Another traffic killer we see sites fall into is the blanket redirect to a specific page (often the home page). If you have products that go out of stock or are discontinued, then the last thing you should do from a natural search perspective is redirect the user back to the home page.

Most likely, the home page does not have a high relevance factor for the terms that were previously driving traffic to the old product. As a result, much of the equity that had been built for the initial URL is lost. As hard as it is to earn links and citations, throwing them away seems absurd.
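
One way to catch this pattern is to follow the redirect chain for a sample of retired product URLs and flag anything that lands on the home page. A rough sketch, assuming the third-party requests library and placeholder example.com URLs:

```python
import requests

# Hypothetical retired product URLs that should 301 to a relevant
# category page, not to the home page.
retired_urls = [
    "http://www.example.com/products/discontinued-widget/",
    "http://www.example.com/products/old-model-2012/",
]
HOME_PAGE = "http://www.example.com/"

for url in retired_urls:
    response = requests.get(url, allow_redirects=True, timeout=10)
    if response.history and response.url == HOME_PAGE:
        print("Blanket redirect to the home page:", url)
```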

6. Improper redirects

Unfortunately, many back-end engineers in today’s marketplace still do not know the industry best practices when it comes to applying redirects on a site. There is a time and a place for a 302 (temporary) redirect, but 302s can create confusion for the end user and inconsistencies in the URL served in the search results.

Going back to the duplicate content issue above, it is important to educate the engineering team on the importance of using the proper redirect for each situation.
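
To keep that conversation concrete, it helps to audit what status code each redirect actually returns. The sketch below (again assuming the requests library and placeholder URLs) checks only the first hop so a 302 isn’t hidden behind a later 301:

```python
import requests

# Placeholder list of old URLs that should return a permanent (301) redirect.
redirected_urls = [
    "http://www.example.com/old-page/",
    "http://www.example.com/old-category/old-product/",
]

for url in redirected_urls:
    response = requests.get(url, allow_redirects=False, timeout=10)
    location = response.headers.get("Location", "")
    if response.status_code == 302:
        print("302 (temporary) found, probably should be a 301:", url, "->", location)
    elif response.status_code == 301:
        print("301 OK:", url, "->", location)
    else:
        print("No redirect (status %s): %s" % (response.status_code, url))
```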

7. Canonical issues

Canonical link elements can be confusing to people not involved in Internet marketing. Even though they do an amazing job of fixing a lot of duplicate content issues, they will also be ignored site-wide if implemented incorrectly.

Make sure the implementation is spot on so the preferred URL can harvest as much equity as possible. Avoid blanket canonicals, use an absolute URL, and most importantly, make sure all pages don’t simply default their canonical to the same URL. Implement it properly the first time.
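
A lightweight audit can verify both points: that each canonical is an absolute URL and that the canonicals aren’t all collapsing to a single address. The sketch below uses only Python’s standard-library HTML parser, and the pages dictionary is stand-in data for your own crawl output.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of a page's <link rel="canonical"> element."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")

# Stand-in crawl data: URL -> raw HTML of that page.
pages = {
    "http://www.example.com/widgets/blue/":
        '<head><link rel="canonical" href="http://www.example.com/widgets/blue/"></head>',
    "http://www.example.com/widgets/red/":
        '<head><link rel="canonical" href="/"></head>',
}

canonicals = set()
for url, html in pages.items():
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical:
        canonicals.add(finder.canonical)
        if not finder.canonical.startswith("http"):
            print("Relative canonical (should be absolute):", url, "->", finder.canonical)

if len(pages) > 1 and len(canonicals) == 1:
    print("Warning: every page canonicalizes to", canonicals.pop())
```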

8. Pagination issues

One huge issue we still see in the e-commerce world is the way sites implement their paginated URLs. These pages appear to the search engines as duplicate content because of the similarities in their HTML markup; the only difference between paginated results is the list of products.

Google may not crawl your site very deeply if the pages themselves are not distinct. An available but seldom-used solution is the rel="next" and rel="prev" HTML link elements, which direct the bots to crawl as many of your site's product pages as possible.
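
For reference, rel="next" and rel="prev" are simply <link> elements in the <head> of each paginated page. Here is a minimal Python sketch that builds them for a hypothetical /category/widgets/ listing:

```python
def pagination_links(base_url, page, total_pages):
    """Build the rel="prev"/rel="next" link tags for one paginated page."""
    tags = []
    if page > 1:
        prev_url = base_url if page == 2 else "%s?page=%d" % (base_url, page - 1)
        tags.append('<link rel="prev" href="%s">' % prev_url)
    if page < total_pages:
        tags.append('<link rel="next" href="%s?page=%d">' % (base_url, page + 1))
    return tags

# Page 2 of a hypothetical five-page widget listing:
for tag in pagination_links("http://www.example.com/category/widgets/", 2, 5):
    print(tag)
# <link rel="prev" href="http://www.example.com/category/widgets/">
# <link rel="next" href="http://www.example.com/category/widgets/?page=3">
```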

9. Skipping Fetch as Google (Google Webmaster Tools)

The Fetch as Google feature in Google Webmaster Tools is one of my favorite ways to introduce new URLs to the Google index. Trust me on this one. When we promote new content, it needs to get found, and this process is the spark that lights the wick.

10. Measuring impact (tracking number of unique landing pages)

When it comes to navigating through the vast landscape of URLs that drive traffic to your site, Analytics and Content Insights are the way to cast light on your strategies.

While I admit it is important to watch the index rate in the search engines, the rise or fall does not necessarily indicate an issue (you may see a drastic drop when cleaning up duplicate content issues).

Creating content segments for URL structure patterns, along with tracking the number of unique URLs and their performance, is how we measure our success. With keywords now showing as “Not Provided,” we must turn to URL and content performance as top KPIs in our reporting.
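
As a simple illustration of that approach, the sketch below groups landing pages from a hypothetical analytics export into content segments by URL pattern and counts the unique URLs driving entrances in each. The segment patterns and sample rows are assumptions to adapt to your own site.

```python
import re
from collections import defaultdict

# Hypothetical analytics export: (landing page, entrances).
rows = [
    ("/blog/url-structure-tips/", 340),
    ("/blog/penguin-recovery/", 120),
    ("/products/widget-a/", 90),
    ("/products/widget-b/", 75),
    ("/", 60),
]

# Content segments defined by URL pattern (assumed for this example).
segments = {
    "Blog": re.compile(r"^/blog/"),
    "Products": re.compile(r"^/products/"),
}

unique_pages = defaultdict(set)
entrances = defaultdict(int)

for url, visits in rows:
    name = next((s for s, pattern in segments.items() if pattern.match(url)), "Other")
    unique_pages[name].add(url)
    entrances[name] += visits

for name in unique_pages:
    print("%s: %d unique landing pages, %d entrances"
          % (name, len(unique_pages[name]), entrances[name]))
```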

Can you think of an 11th URL structure challenge or opportunity? Share it with us and we will append this post with the best addition (giving you credit, obviously).