A good user experience equals more money. But how do we measure user experience? How do we know if it’s getting better or worse?
What is User Experience?
There are many definitions for “user experience.” To keep things concise, NN/g defines it like this: “’User experience’ encompasses all aspects of the end-user’s interaction with the company, its services, and its products.”
Continuing, it’s important to distinguish this from UI (user interface). Here’s a quote from Peep further defining UX:
“First of all, UI (user interface) is not UX (user experience). A car with all its looks, dashboard and steering wheel is the UI. Driving it is the UX. So the interface directly contributes to the experience (beautiful car interior makes a better experience sitting in one), but is not the experience itself.
Visual beauty is important for websites, but visual design is only one step in the process. A beautiful website might make a great first impression, but if it has terrible usability, users can’t figure out what to do, forms on the site don’t quite work, the error messages are not helpful and the copy on the website is vague, the overall experience will be quite bad.
Experience is also personal and subjective – and is greatly affected by our past experiences, personal preferences, mood and a myriad of other things.”
User experience is a subjective feeling, as each individual experiences the world through their own lens. The total experience of each user could depend on a variety of external factors, ranging from how they began their day to their mood to their socio-economic status.
Still, there are a variety of valid ways to measure usability and the overall user experience, both how people interact with each part of your site and how they experience it as a whole.
Why Measure User Experience?
“Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.”
― H. James Harrington
There are lots of reasons to measure the user experience, the main one being that you can pinpoint problem areas and work to improve them.
The other reason is to identify, quantify, and communicate UX to stakeholders. Finally, measuring user experience can give you greater clarity as to your positioning and competitive advantage.
UsabilityGeek puts it well, saying that, “ultimately, the primary objective of usability metrics is to assist in producing a system or product that is neither under- nor over-engineered.”
There are many metrics used to measure UX. For this article, however, I’ll focus on usability measurements for satisfaction. These break down more broadly into two categories: task level satisfaction and test level satisfaction.
Task Level Satisfaction Measurements
For both types of user satisfaction metrics, these numbers are determined by short questionnaires. With task level satisfaction, users should be given a questionnaire immediately after they attempt a task (whether or not they manage to complete the goal).
There are a few different types of these questionnaires, some more popular than others, but all attempt to gauge and quantify how difficult or easy it was to complete a certain task in a user test.
Some of the more popular ones are:
- ASQ: After Scenario Questionnaire (3 questions)
- NASA-TLX: NASA Task Load Index (6 subscales)
- SMEQ: Subjective Mental Effort Questionnaire (1 question)
- UME: Usability Magnitude Estimation (1 question)
- SEQ: Single Ease Question (1 question)
1. After Scenario Questionnaire (ASQ)
The After Scenario Questionnaire features three questions, post-task:
The ASQ is commonly used, and research has supported that it “has acceptable psychometric properties of reliability, sensitivity, and concurrent validity, and may be used with confidence in other, similar usability studies.”
2. NASA Task Load Index (NASA-TLX)
The NASA-TLX is “a widely-used, subjective, multidimensional assessment tool that rates perceived workload in order to assess a task, system, or team’s effectiveness or other aspects of performance.” It’s also been cited in over 4,400 studies.
The questionnaire is broken into two parts, the first part being divided into six subscales that are represented on a single page:
The next part lets the user weight the measurements based on what they thought was more important to the task.
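Those two parts combine into a single weighted workload score: each subscale rating is multiplied by its weight from the pairwise comparisons, and the weighted sum is divided by 15. Here’s a minimal sketch of that standard weighted scoring, assuming the usual 0-100 subscale ratings and tally weights (the ratings and weights below are made up for illustration):

```python
def nasa_tlx(ratings, weights):
    """Weighted NASA-TLX workload: each 0-100 subscale rating is
    multiplied by its tally from the 15 pairwise comparisons,
    then the weighted sum is divided by 15."""
    if sum(weights.values()) != 15:
        raise ValueError("pairwise-comparison tallies must sum to 15")
    return sum(ratings[k] * weights[k] for k in ratings) / 15

# Hypothetical ratings and weights for one participant:
ratings = {"mental": 70, "physical": 10, "temporal": 55,
           "performance": 40, "effort": 60, "frustration": 30}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}
print(round(nasa_tlx(ratings, weights), 1))  # → 57.7
```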
3. Subjective Mental Effort Questionnaire (SMEQ)
SMEQ is made up of just one scale, and it measures the mental effort that people feel was involved in a certain task.
According to Jeff Sauro in Quantifying the User Experience, SMEQ correlates highly with SUS scores, as well as completion time, completion rates, and errors.
SMEQ is easy to administer, and is supported by a good amount of research.
4. Usability Magnitude Estimation (UME)
Magnitude estimation is a technique standardly applied in psychophysics to measure judgments of sensory stimuli. According to The University of Edinburgh:
“The magnitude estimation procedure requires subjects to estimate the magnitude of physical stimuli by assigning numerical values proportional to the stimulus magnitude they perceive. Highly reliable judgments can be achieved for a whole range of sensory modalities, such as brightness, loudness, or tactile stimulation.”
Similarly, this process has been adopted for usability studies to measure perceived difficulty of tasks. In Quantifying the User Experience, Jeff Sauro wrote that, “the goal of UME is to get a measurement of usability that enables ratio measurement, so a task (or product) with a perceived difficulty of 100 is perceived as twice as difficult as a task (or product) with a perceived difficulty of 50.”
5. Single Ease Question (SEQ)
Finally, there’s the Single Ease Question, which is most recommended for task level satisfaction for its ease and correlation with other usability metrics.
It consists of just one question after a task:
What Do You Do With These Numbers?
The general purpose of task level questionnaires is to assign a more quantitative measure to the task experience (problems encountered or the number of steps to complete a task). As Jeff Sauro put it, “task level satisfaction metrics will immediately flag a difficult task, especially when compared to a database of other tasks.”
It also helps to compare actual task difficulty and completion rates to expected task difficulty. Here’s what Sauro said about that:
“It’s helpful to use a survey question that’s been vetted psychometrically like the Single Ease Question (SEQ). Even with the right question and response options you’ll want some comparable data.
Tip: Consider comparing the actual task difficulty rating with the expected task-difficulty rating. You can ask the same users or different users how difficult they think the task will be. The gap in expectations and retrospective accounts can reveal interaction problems.”
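As a sketch of that tip, with hypothetical seven-point SEQ ratings (higher = easier), the expectation gap per task could be computed like this; the task names and numbers below are invented for illustration:

```python
def expectation_gaps(expected, actual):
    """Mean actual-minus-expected ease rating per task (e.g. 7-point
    SEQ, higher = easier). Large negative gaps flag tasks that turned
    out much harder than users anticipated."""
    return {task: sum(actual[task]) / len(actual[task])
                  - sum(expected[task]) / len(expected[task])
            for task in expected}

# Hypothetical ratings from a small study:
expected = {"checkout": [6, 6, 5], "search": [5, 6, 6]}
actual   = {"checkout": [3, 2, 4], "search": [6, 5, 6]}
print(expectation_gaps(expected, actual))  # checkout gap ≈ -2.7 → investigate
```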
Really, though, the goal of task level questionnaires is to improve the scores over time. If you can improve the mean rating, you can quantify how design changes have improved the UX. Fixing these usability bottlenecks should, in turn, improve conversions (and revenue in the long run).
Test Level Satisfaction
If task level satisfaction is measured directly after each task is completed (successfully or not), then test level satisfaction is a formalized questionnaire given at the end of the session. It measures users’ overall impression of the usability and experience.
There are, again, a variety of questionnaires used, but I’m going to focus on two popular ones:
- SUS: System Usability Scale (10 questions)
- SUPR-Q: Standardized User Experience Percentile Rank Questionnaire (8 questions)
1. System Usability Scale (SUS)
The SUS asks ten questions:
- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.
For each question, users are given a scale of 1-5:
To score the SUS:
- For odd-numbered items: subtract one from the user’s response.
- For even-numbered items: subtract the user’s response from 5.
- This scales all values from 0 to 4 (with four being the most positive response).
- Add up the converted responses for each user and multiply that total by 2.5. This converts the range of possible values from 0 to 100 instead of from 0 to 40.
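The scoring steps above translate directly into a short function; a minimal sketch for one user’s ten responses:

```python
def sus_score(responses):
    """Convert ten SUS responses (1-5, in questionnaire order) to a 0-100 score."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for i, answer in enumerate(responses):
        if i % 2 == 0:   # odd-numbered questions (1st, 3rd, ...): response minus 1
            total += answer - 1
        else:            # even-numbered questions (2nd, 4th, ...): 5 minus response
            total += 5 - answer
    return total * 2.5

# Best possible answers (5 on the positive items, 1 on the negative items):
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # → 100.0
```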
2. Standardized User Experience Percentile Rank Questionnaire (SUPR-Q)
The SUPR-Q judges a website on whether it is usable, trustworthy, loyalty-inspiring, and visually appealing. It has 8 questions (including 1 NPS question). The trust questions vary based on whether or not the site is commerce-oriented. They are as follows:
- The website is easy to use. (usability)
- It is easy to navigate within the website. (usability)
- I feel comfortable purchasing from the website. (trust for commerce sites)
- I feel confident conducting business on the website. (trust for commerce sites)
- The information on the website is credible (trust for non-commerce)
- The information on the website is trustworthy (trust for non-commerce)
- How likely are you to recommend this website to a friend or colleague? (loyalty)
- I will likely return to the website in the future. (loyalty)
- I find the website to be attractive. (appearance)
- The website has a clean and simple presentation. (appearance)
Again, the Likert items are scored on a 1-5 scale, while the likelihood-to-recommend question uses a 0-10 scale.
As for scoring it, you add up the responses to the seven Likert questions and add half the score of the NPS question. The lowest possible score is a 7 and the maximum possible score is a 40. You can then compare your score to industry benchmarks.
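As a sketch of raw scoring under Sauro’s current eight-item SUPR-Q (seven 1-5 Likert items plus the 0-10 likelihood-to-recommend item), with invented example responses:

```python
def suprq_raw_score(likert_items, recommend_item):
    """Raw SUPR-Q score: sum of the seven 1-5 Likert items plus
    half of the 0-10 likelihood-to-recommend (NPS) item.
    Range: 7 (all 1s, recommend 0) to 40 (all 5s, recommend 10)."""
    if len(likert_items) != 7:
        raise ValueError("expected seven Likert items")
    return sum(likert_items) + recommend_item / 2

# Hypothetical responses from one user:
print(suprq_raw_score([4, 5, 4, 4, 5, 4, 4], 8))  # → 34.0
```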
3. Net Promoter Score (NPS)
NPS isn’t necessarily in the same category as the other questionnaires here, but I wanted to add it because it is a popular and effective method of measuring user experience and satisfaction (and it’s one of the questions on the SUPR-Q).
You’ve likely already heard all about the Net Promoter Score (NPS). NPS is calculated by asking one question…
“How likely are you to recommend (product or service) to a friend?”
Then, you break up responses into three chunks:
- Promoters (9-10). These are your happiest and most loyal customers that are most likely to refer you to others. Use them for testimonials, affiliates, etc.
- Passives (7-8). These customers are happy, but are unlikely to refer you to friends. They may be swayed to a competitor fairly easily.
- Detractors (0-6). Detractors are customers that are unhappy and can be dangerous for your brand, by spreading negative messages and reviews. Figure out their problems and fix them.
Then you simply subtract the percentage of detractors from the percentage of promoters to get your NPS.
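The calculation above is only a few lines in practice; a minimal sketch over a list of 0-10 responses:

```python
def net_promoter_score(responses):
    """NPS: percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return 100 * (promoters - detractors) / len(responses)

# 5 promoters, 3 passives, 2 detractors out of 10 responses: 50% - 20%
print(net_promoter_score([10, 9, 10, 9, 9, 8, 7, 7, 3, 6]))  # → 30.0
```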
It’s important to note that your score isn’t necessarily for comparing with your competitors; rather, it’s a benchmark for keeping track of how well you’re doing.
In this way, NPS is much like your conversion rate, in that a good NPS is better than the one you had last month.
While NPS is quite popular right now, there are also skeptics. Here’s how Craig Morrison of Usability Hour addresses that:
“When talking about the Net Promoter Score, you’ll often hear people say it isn’t accurate, or it doesn’t work, or that it depends on how you phrase the question, etc.
But the thing is, what are you actually doing right now to keep track of your users’ experience with your product?
Anything? Surveys? Interviews?
Many startups I work with are doing absolutely nothing. So while this system might have its flaws, it’s way better than doing nothing at all.
It’s the best way to keep track of how the changes you’re making to your product are affecting your user experience.”
What Do You Do With These Numbers?
Again, as with task level numbers, the goal with test level numbers is to improve the score.
However, there’s also value in comparing your numbers to competitors to get a realistic idea of where your customer satisfaction level is at. Here’s how Jeff Sauro put it:
“What do users think of the overall usability of the application? The System Usability Scale (SUS) is a popular usability questionnaire that’s free and short (at 10 items). It provides an overall estimate of usability.
Tip: Take your raw SUS score and convert it to a percentile rank relative to 500 other products or several industry benchmarks. If the product you are testing is consumer/productivity software you can compare your SUS score with 17 other products.”
You can also compare your Net Promoter Score to those in your industry, but there are also some cool and well-documented uses specific to the NPS. Sure, it could remain simply a benchmark of how you place within your industry. However, used wisely, NPS can have much greater benefits than benchmarking.
For example, check out this RJMetrics data: your best customers spend far and away the most money with you.
You can assume that your promoters are your best customers, or at least correlate highly with those that spend more. Therefore, knowing your ‘promoters,’ those that love your brand, can be valuable in planning targeted promotions and campaigns. As Questback said, “help your promoters tell everyone they ♥ you.”
Similarly, you can dig up a lot of user experience problems from the detractors. Which issues keep cropping up? Are there any trends?
Some say ignore the passives; some say don’t. Often, you can look at them as a growth segment, and figure out how to turn them into promoters.
So for NPS:
- Measure your score over time (hopefully you’re improving).
- Measure individual trends over time. Investigate those trends that are improving or those that aren’t.
- Help promoters tell the world about you. Also help them buy more stuff.
- Find insights via the detractors. Let them teach you things you can improve about your user experience.
A quick note, though, with these metrics: According to NN/g, “users generally prefer designs that are fast and easy to use, but satisfaction isn’t 100% correlated with objective usability metrics.” Actually, they’ve found users prefer the design with the highest usability metrics 70% of the time.
Though such preference/performance mismatches were weak in their data, it’s important to consider both performance metrics as well as preference metrics (such as those above). And most importantly, these usability questionnaires should be used as a tool in the pursuit of optimizing your UX to generate the most revenue and customer LTV. So don’t quit A/B testing any time soon :)
There is no single best way to measure your site’s UX. As Jeff Sauro put it, “there isn’t a usability thermometer to tell you how usable your software or website is.”
So instead, there are a few different metrics to measure different aspects of usability and the user experience. This article outlined only satisfaction metrics, so it necessarily left out other important metrics like completion rates, errors, task time, and the all important conversion rate. Still, the task level and the test level satisfaction numbers can give you a good indicator of your users’ satisfaction, at least as a benchmark to your optimization.
You don’t have to use all of the above questionnaires (obviously), so pick one from each category and start measuring.
Also, I highly recommend reading Jeff Sauro’s book, Quantifying the User Experience, as well as taking his Udemy course, “Practical Statistics for the User Experience.”