Today, Google is the most commonly used search engine in the world. In practice, this means that your online survival depends on Google or, more precisely, on the position Google gives your site on the results page. Consequently, each time Google announces an algorithm update, SEO analysts throw themselves into exhaustive studies to pinpoint what changed and how to react. Unfortunately for SEO (and fortunately for Internet users), Google is secretive, and no one can claim to know precisely what is going on inside it. So what can analysts concretely study to reach conclusions? The only option is to follow the ranking of a set of websites, quantify as many parameters as possible, and study how well each parameter predicts the ranking. This is the recipe behind the rank correlations that populate the Internet with fancy graphs.


Why do I say rank correlations are dangerous? I think, and this is a personal statement, that no matter how exact the ranking values are and how many of the studied parameters are really used by Google, a correlation does not prove that a parameter actually influences the ranking. Correlations do not explain ranking and barely give an idea of what really matters inside the ranking algorithm. Everyone is allowed to disagree, and that is the easy part. More importantly, I would like to tell you why I do not believe in correlation.

The goal of a correlation is to let us predict something we do not know yet from an existing dataset. For example, if we have 3 URLs with 100, 200 and 300 backlinks, placed at positions 100, 50 and 1 respectively, we could predict that a URL with 160 backlinks would be at about position 70. Such a prediction is possible only when the ranking is linearly correlated with the parameter. Unfortunately, there are many examples where the relationship is not linear. Keywords are one case where more is not better: beyond a certain threshold, keywords become spam and have a negative effect on the ranking.

For each correlation, we compare a single parameter with the ranking. This implicitly assumes that each parameter is independent and acts on the ranking by itself. We know very well that this is not true. For example, 1 backlink does not always equal 1 backlink. Backlinks are at least weighted by the PageRank of the page they originate from. Backlinks are clearly not independent and cannot be pulled out as single units acting on the ranking. The same is true on social media networks: links are now weighted by the author rank of the person sharing them, and the complexity goes further with the people re-sharing the posts. How can we then assign a single social value to a site and correlate it with the rank when there is such an intricate pathway behind it?
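To make the backlink example concrete, here is a minimal Python sketch. The first part fits a straight line through the three (backlinks, position) pairs quoted above and evaluates it at 160 backlinks. The second part uses an invented inverted-U score (my assumption for illustration, not anything Google publishes) to show a keyword effect that is fully determined by the keyword count and yet has no linear correlation with it. The helper names `fit_line` and `pearson` and the toy data are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Part 1: the three URLs from the text.
backlinks = [100, 200, 300]
positions = [100, 50, 1]
slope, intercept = fit_line(backlinks, positions)
predicted = slope * 160 + intercept
print(f"Predicted position for 160 backlinks: {predicted:.1f}")

# Part 2: a made-up threshold effect (assumption for illustration only).
# The score rises up to 5 keyword occurrences, then falls as it turns into spam.
keyword_counts = list(range(11))
scores = [-(k - 5) ** 2 for k in keyword_counts]
r = pearson(keyword_counts, scores)
print(f"Pearson correlation for the threshold effect: {r:.3f}")
```

With these toy numbers the fitted line lands close to position 70, in agreement with the figure above, while the threshold effect yields a correlation of essentially zero even though the score depends on nothing but the keyword count. A flat correlation does not mean a parameter is irrelevant, and a strong one does not mean the line can be trusted outside the observed range.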

In short, the relationship between many parameters and the ranking may look linear, but the truth is that we do not know what underlies the correlation; the real cause of the relationship remains a mystery. Rank correlations may be unique pieces of data, but we must be careful not to interpret them beyond what they can support.
