Social analytics suck

Yes, you’re getting analytics wrong.

Now, don’t go jumping up and down calling me every bad name in the book. It’s not your fault. Most businesses are getting their analytics wrong because they’re not:

  • monitoring the right metrics
  • creating insightful data visualizations
  • using advanced analytics like predictive analytics
  • making informed decisions based on thorough analysis

Monitor the right metrics

What are the right metrics?

  • They measure controllable factors, such as cost, output, reach, and sales, as well as uncontrollable factors, such as inflation and competition.
  • They correlate with your (SMART) goals. SMART goals are Specific, Measurable, Achievable, Realistic, and have a Time factor. For example: increase sales (specific) by 10% (measurable, achievable, and realistic) within the next 6 months (time). Not only must goals be SMART, they also need to match your mission and vision and reflect intermediate goals, not just terminal ones. An example of intermediate goals comes from the conversion funnel, where you might set intermediate goals for top-of-funnel outcomes.
  • They don’t measure vanity metrics or, if they do include them, more substantive metrics accompany them. What’s a “vanity metric”? Metrics such as Facebook likes, Twitter followers, page visits, and pins are vanity metrics. While there’s some validity behind these metrics in principle (the more likes, the more your message is amplified), the correlation between them and outcome variables (such as sales) is very weak. Instead, focus on metrics that more closely align with your outcome variables, such as inquiries, access to location information, and requests for pricing, as well as metrics reflecting movement down the conversion funnel, such as sentiment.

Craft insightful data visualizations

According to the Interaction Design Foundation,

Data visualization is only successful to the degree that it encodes information in a manner that our eyes can discern and our brains can understand. Getting this right is much more a science than an art, which we can only achieve by studying human perception. The goal is to translate abstract information into visual representations that can be easily, efficiently, accurately, and meaningfully decoded.

More important than individual metrics, data visualizations reflect the relationship of one metric to another (crosstabs) or patterns among data points.

The simplest form of data visualization is a chart or graph, such as a pie chart.

More insightful analytics might track variances, such as the difference between expected and actual results. Using these analytics helps you make better predictions and focuses attention on why sizable differences occurred.
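Variance tracking like this is easy to automate. The following sketch uses made-up monthly figures and an arbitrary 10% threshold for flagging sizable differences; both are assumptions for illustration:

```python
# Hypothetical monthly figures: flag metrics whose actual result
# deviates from the expected result by more than 10%.
expected = {"sales": 120000, "leads": 450, "ad_spend": 15000}
actual = {"sales": 98000, "leads": 470, "ad_spend": 15200}

for metric, exp in expected.items():
    variance = (actual[metric] - exp) / exp
    flag = "  <-- investigate" if abs(variance) > 0.10 else ""
    print(f"{metric}: {variance:+.1%}{flag}")
```

Here only sales (off by more than 10%) would be flagged for attention, which is the point: the variance report tells you where to look, not why.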

Adding color to your visualizations improves understanding and focuses attention on problem areas. For instance, P&G uses a data visualization in their delivery system. The visualization color codes receivers based on whether delivery is scheduled as on-time (green), possibly late (yellow to orange), or late (red) using GPS systems on trucks. Business size (as a function of their yearly order size) is represented by the size of their block in the visualization. Thus, if a big block, say for a business like WalMart, turns yellow, managers focus on improving delivery or diverting trucks destined for other buyers to fulfill WalMart’s needs on time. When a block is green, managers feel safe in ignoring that delivery to focus on those that are more in danger of failure.
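The color-coding logic described above can be sketched as follows. The thresholds, buyer names, and order sizes are hypothetical, not P&G's actual system:

```python
# A minimal sketch of status color coding for deliveries.
# Lateness thresholds (in minutes) are assumed for illustration.
def status_color(minutes_late):
    if minutes_late <= 0:
        return "green"    # on time: safe to ignore
    elif minutes_late <= 30:
        return "yellow"   # possibly late
    elif minutes_late <= 60:
        return "orange"   # probably late
    return "red"          # late

deliveries = [
    {"buyer": "BigBox Retail", "yearly_orders": 9_000_000, "minutes_late": 25},
    {"buyer": "Corner Store", "yearly_orders": 40_000, "minutes_late": 90},
]

# Sort so the largest buyers (the biggest blocks) are reviewed first.
for d in sorted(deliveries, key=lambda d: d["yearly_orders"], reverse=True):
    print(d["buyer"], status_color(d["minutes_late"]))
```

In a real dashboard these colors and order sizes would feed a treemap-style visualization; the sort order mirrors the managerial logic of attending to big blocks first.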

The US Census Bureau and other dominant players in information continue to upgrade analytics to focus more on visualizations like this one:

Constructed from US Census data, this graphical depiction of jobs by gender clearly shows which professions are gaining more male or female workers and which remain unchanged. As a politician or social agency, you can use this data visualization to focus attention on problem areas, such as computer science, where male domination continues to increase.

As a politician, you might argue that wage discrimination doesn’t exist; rather, the wage gap is a function of more women working in lower-paying jobs, such as secretarial, clerical, and teaching positions.

Use predictive analytics

Predictive analytics reflect relationships in a way that allows managers to predict the future. Predictive analytics don’t show what WILL happen; they show what is LIKELY to happen.

Unfortunately, businesses tend to misuse predictive analytics by simply extending a linear graph. The underlying assumption is that the factors are linear, which is likely not the case. Acting as though a graph will continue its linear pathway can lead to disastrous consequences, such as the collapse of the housing market, when homeowners kept buying new homes even after the market began turning. Thus, they’re getting analytics wrong.

Instead, businesses should build predictive models based on appropriate data, hence the emphasis on big data. But as your dataset gets larger, forcing every variable into a predictive model generates spurious correlations that lead to poor decision-making. Often you’ll end up with nonsense, such as sugar consumption being linked to sports ticket sales; other times you’ll find underlying causal factors outside the model. For instance, the length of women’s skirts correlates with economic conditions. Shortening women’s skirts won’t improve the economy because there isn’t a causal link between the two. Instead, both reflect consumer confidence, which, when high, results in both shorter skirts and more spending, which improves the economy.
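How easily do spurious correlations arise? The toy simulation below (all data random, all parameters arbitrary) screens 200 random "predictors" against a random outcome; several typically clear a |r| > 0.2 correlation threshold despite there being no real relationship at all:

```python
# Demonstrate spurious correlation: with enough candidate variables,
# some will "correlate" with the outcome purely by chance.
import random

random.seed(42)
n = 100
outcome = [random.gauss(0, 1) for _ in range(n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

spurious = 0
for _ in range(200):
    predictor = [random.gauss(0, 1) for _ in range(n)]  # pure noise
    if abs(pearson(predictor, outcome)) > 0.2:
        spurious += 1
print(f"{spurious} of 200 random variables 'correlate' with the outcome")
```

Every one of those flagged "relationships" is noise, which is why theory, not brute-force correlation, should drive which variables enter a model.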

Predictive models include variables LIKELY to affect, or be affected by, other variables based on theoretical assumptions, not simply the correlations among all the variables in your data. Tools like SPSS, R, and SAS are great for building theoretical models reflecting relationships between variables via linear regression, logit, or even clustering and MANOVA.

Econometrics takes this process one step further by fitting data into existing theoretical models. For instance, I built an algorithm for a client using an existing econometric model that fit my data — in this case a decay model. The resulting model did a good job of predicting and scoring leads for the sales force from the many tire kickers in the subscriber database using readership patterns.

If you’re not lucky enough to have an econometric model that fits your data, you can still collect data for variables you believe SHOULD correlate with your outcome variable. Using half the data, construct a model of the relationship looking something like this:

sales = .352 + .142 (ad spend) + .425 (salesforce strength) + .024 (GDP) + .23 (competitor strength)

Then, use the other half of the data to validate your findings by plugging it into the equation to ensure you get a reasonable approximation of the outcome variable. This split-sample (holdout) approach is a simple form of cross-validation.
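The fit-on-half, validate-on-half procedure can be sketched on synthetic data. The coefficients below reuse the illustrative values from the sales equation above; the data itself is simulated, so this shows the mechanics, not a real model:

```python
# Split-sample validation: fit a sales model on half the observations,
# then check prediction error on the held-out half. Data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Columns: ad spend, salesforce strength, GDP, competitor strength
X = rng.normal(size=(n, 4))
true_beta = np.array([0.142, 0.425, 0.024, 0.23])
sales = 0.352 + X @ true_beta + rng.normal(scale=0.1, size=n)

# Fit ordinary least squares on the first half.
X_fit = np.column_stack([np.ones(n // 2), X[: n // 2]])
beta_hat, *_ = np.linalg.lstsq(X_fit, sales[: n // 2], rcond=None)

# Validate on the second half.
X_val = np.column_stack([np.ones(n - n // 2), X[n // 2 :]])
pred = X_val @ beta_hat
rmse = np.sqrt(np.mean((pred - sales[n // 2 :]) ** 2))
print(f"holdout RMSE: {rmse:.3f}")  # should be near the noise level (0.1)
```

A holdout error far above the in-sample error is the warning sign: it means the model memorized quirks of the fitting half rather than capturing a stable relationship.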

A few caveats or you’ll do analytics wrong:

  1. you can only compare β-values accurately (those numbers in front of the variables) if the variables are measured on the same scale or standardized
  2. update your algorithm regularly, as relationships might change over time with the addition of new variables, the elimination of existing ones, or different β-values
  3. note that you only control the first two variables: ad spend and salesforce strength. Indirectly, you might exert some control over competitor strength by improving your own. GDP is normally beyond the control of firms.
  4. factors outside your model may be at work. Look at the R² to determine how much variance is explained by your model. If you’re explaining very little variance (low R²), you should add more variables to your model so you’re explaining more of the variance.
  5. relationships might not be linear. Consider log or other possible relationship types by graphing your data first. Graphical analysis might also uncover interactions between variables which should be added to your model.
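Caveats 1 and 4 can be made concrete in code. Below, the variable scales, coefficients, and data are all invented for illustration: standardizing the predictors makes the β-values directly comparable, and R² reports how much variance the model explains:

```python
# Standardized betas and R^2 on simulated data. The raw coefficient on
# ad spend (0.0001 per dollar) looks tiny next to the salesforce rating's
# (2.0 per point) only because the scales differ wildly.
import numpy as np

rng = np.random.default_rng(1)
n = 300
ad_spend = rng.normal(50_000, 10_000, n)   # dollars
sales_strength = rng.normal(5, 1, n)       # 1-10 rating
y = 0.0001 * ad_spend + 2.0 * sales_strength + rng.normal(0, 1, n)

X = np.column_stack([ad_spend, sales_strength])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)  # same scale: mean 0, sd 1
A = np.column_stack([np.ones(n), Xz])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ beta
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print("standardized betas:", np.round(beta[1:], 2))
print("R^2:", round(r2, 2))
```

On the standardized scale, one standard deviation of salesforce strength moves sales about twice as much as one standard deviation of ad spend, a comparison the raw coefficients hide; a low R² here would signal that caveat 4 applies and factors outside the model are at work.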

Informed decision-making

So far, we’ve talked mainly about modeling and presenting data. There’s a bigger problem that leads to doing analytics wrong: failing to use analytics to guide decision-making.

I know everyone says they use data to drive decisions, but that’s not always the case. Often analytics don’t reach decision makers in a timely fashion — hence the focus on real-time analytics.

In other cases, decision-makers don’t understand analytics (this is where visualizations really help) or deny the findings. I’ve worked with a number of clients who discounted analyses in favor of their own notions of how things work by saying the findings don’t apply to their industry, persona, or time period. While that might be a valid argument for not acting on an analysis, some testing should be done to verify whether the findings hold in the current context. Rejecting analyses because “that’s not how we’ve always done it” makes little sense.

Politics often wags the dog in organizations. Instead of making data-driven decisions, decision makers default to an alternative pathway due to political pressures or fear of reprisals.

Assuming correlation implies causation is a persistent problem in firms, leading to getting analytics wrong.

The problem is that correlation is different from causation. Correlation is when two or more things or events tend to occur at about the same time and might be associated with each other, but aren’t necessarily connected by a cause/effect relationship.

Dr. Wheeler, Carson-Newman University

Don’t get your analytics wrong

If businesses are all doing analytics wrong, how do you do analytics right?

Getting analytics wrong is easy; getting analytics right takes hard work and concerted effort.

  1. Hire the right people. Too often businesses hand off analytics to unskilled employees and new hires without industry knowledge. While this might save money in the short-run, the cost of poor decision-making is very high. Hiring the right people with BI (business intelligence) training and industry experience (to ask the right questions of the data) makes money in the long-run.
  2. Invest in great analytics tools that not only gather data but also create appropriate visualizations. Raw data has little value in driving good decision-making. And please don’t entrust your analytics effort to popular software analysis tools without careful vetting. I’ve personally found that many tools produce faulty data (even Google Analytics reports different metrics than Google AdWords, a serious concern since you pay for AdWords based on clicks), while others generate rather weak reporting. Favor interactive tools that allow users to choose which variables to see and which relationships to explore, since users might want to look at the data differently.
  3. Build predictive models based on REAL data and constantly update those models.