9 things most get wrong about usability testing – and how to fix them

January 28, 2022


How often do you conduct usability tests to find out whether your digital product-in-the-making is as good as you’d hope? 

If you’re like most, it’s not that often. Somehow it can feel like it’s not worth the effort.

Whatever your reason, one thing is for sure: the think-aloud usability test, the cornerstone of usability science that you were taught at some respectable institute, is not serving you well. 

I should know. I design interaction for a living at Reaktor. Usability testing has all too often felt too laborious and unproductive, hardly something that I would recommend organizing.

But the good news is that a few tweaks can make usability testing better! We’ll start with 3 easy fixes that you might already have in order but many have not; they improve the results and increase the value of usability testing. Then we’ll switch gears and kick your usability testing to a totally new level by looking at the 6 things you were most likely taught about usability testing – and that are completely wrong.

The 3 Easy Fixes You Can Do to Improve Your Usability Testing: 

Fix 1. You can test your design yourself

Fix 2. Pay attention to action, not words

Fix 3. Realistic test tasks

Then, The 6 Things You Were Taught About Usability Testing that Are Completely Wrong: 

  0. You are doing science
  1. The test setup should be reproducible
  2. The results need to be thoroughly documented
  3. Every user should use the same prototype
  4. Test tasks should be the same for every user
  5. Test tasks should be defined by you beforehand

Ready? Let’s start!

Fix #1: You can test your design yourself

Some people see an ethics problem in designers testing their own designs. The thinking is that hired design consultants will deliberately find problems to gain more design work. Or that designers will try to avoid seeing the problems revealed by the tests, since they’re so in love with their perfect design baby. 

Both of those problems may be real in some circumstances, but hiring an outside agency is by no means free of ethical problems of its own. The most pressing issue is that when you hire someone to find problems, problems they will find – whether they’re there or not.

It all boils down to your attitude. If you’re sincerely trying to make your design work, a usability test organized by you is going to be significantly more cost-effective than one organized by an outside agency. You know the domain, the typical tasks, and you have a direct communication channel with developers. 

Just do it. 

Fix #2: Pay attention to action, not words

It’s surprising how many supposed experts in the field of usability repeat the mantra of “we must listen to the users”. It’s as if whatever the users say were gospel of some kind.

Let’s face it: although your aunt Miriam is a witty lady who represents your intended user group, that doesn’t make her a reputable expert. Yes, she has her own valuable point of view, but so does everyone else. Being a user doesn’t make her words golden. At best, her comments bring in a fresh perspective. At worst, they’re worthless. 

What matters is how Aunt Miriam acts in the think-aloud test. She might complain out loud about a ton of things, but if she manages to book the trip she was supposed to, the complaints aren’t that significant. On the other hand, she might fail to reserve her seats on the train in complete silence, without even noticing it herself.

Usability testing is like any other user observation. Trust what the users do, not what they say, and focus your test efforts accordingly. The interview at the end of the test should be a short one to fill in the gaps and gather background knowledge. If you test for 20 minutes and interview for 40 minutes, you’re doing something other than testing.

Fix #3: Realistic test tasks

The tasks in your test should be as realistic as possible. It’s tempting to over-specify them, like “Find the first train from Helsinki to Turku next Saturday that arrives before 1 pm”, instead of describing an actual use case. That example is hardly a good representation of what humans do in the real world. Because it lacks an actual purpose, it’s too mechanical and unrealistic.

Real tasks are rich in context and a bit vague and ill-defined on the result side. Think of something like this: “You are planning a trip from Helsinki to visit your relatives in Turku for the first time next Saturday. You should arrive at their place before 2 pm and you are returning home for the evening. Their address is…”

The realism allows your aunt Miriam to immerse herself in the task, use her existing knowledge to make good decisions, and not approach the app as if it were a Mensa test.

These 3 fixes alone would make most of the usability tests I’ve seen at least three times more productive. But let’s look at some core tenets of scientific usability testing. You may have held on to them even when you are just trying to make a software product better. Let’s see what happens if we accept that they are false.

Fallacy #0: You are doing science

Trying to make some software better is actually pretty different from doing usability science. Nobody is interested in your chi-squared test hypothesis. The things that matter are what’s wrong with the software and how we should fix it. The scientific usability test is good for proving things and writing scientific papers, but most likely that’s not what you’re doing. Admit that, and you’ll be rewarded with new possibilities.

Fallacy #1: The test setup should be reproducible

In a scientific context, a reproducible setup is important. It enables others to verify your results by running the same test. But this is not your case. You’re developing a unique snowflake of a software product, and nobody will ever want to repeat what you’re doing. They can’t! Thus it doesn’t matter at all if your test setup isn’t that well defined and reproducible. It’s a temporary thing that you’ve thrown together to learn.

Fallacy #2: The results need to be thoroughly documented

Why would you create a fancy, detailed report when you could jot down one problem per post-it and discuss them with your team immediately? Ditch the video – you’re not going to watch it anyway, it’s boring and a huge waste of time. Ask the team to fix the obvious problems, and be on the lookout for evidence of problems you’re not sure of. The results matter, not the documentation.

Fallacy #3: Every user should use the same prototype

In a scientific usability test, you keep on testing the same prototype even when you already know some of its problems from the previous test rounds. This is because you want to gather data on how many users bump into the same issue. But if you’re simply making a design better, you don’t have to follow that rule. If there is an obvious flaw or a bug, have it fixed. Then continue with more tests. When you don’t need the numbers, there’s no point in testing what you already know isn’t working.

Fallacy #4: Test tasks should be the same for everyone

One of the biggest problems with test tasks is that the users have a hard time orienting themselves in the unfamiliar situation of the task. Your cousin Zaid with no kids would find it difficult to immerse himself in a task where he’s supposed to book a train trip for a family of four. Zaid’s choices would not be realistic, because the situation is completely foreign to him. That setup is perfectly normal for Aunt Miriam, though.

There is an easy fix: use tasks that suit the user. Singles like Zaid should book trips for one, family heads like Miriam should book trips for families. The same goes for all data. Singles most likely don’t want to stay in family hotels, for example. Thus you have one set of tasks for Zaid and another for Miriam.

This is pure heresy in the world of scientific usability testing. Now we can’t know how many users solved the particular task in a similar way. But in your testing, that wasn’t the point anyway, so it’s all good!

Fallacy #5: Test tasks should be defined beforehand

Let’s take it a step further: why do you have to give your users tasks in the first place? I would argue the best possible way to go about tasks is that the users first tell you what they would like to achieve, and then you watch them do exactly that.

This may sound totally ludicrous at first. But if you have managed to recruit people who could actually use your software, there’s a pretty big chance they have had a situation in their past where your system would have been useful. Thus you interview Zaid and Miriam first to understand their actual use cases, and then ask them to use your software for the same purpose.

Pros: you collect and understand realistic cases of your users and increase your knowledge. Cons: you must have a pretty well-functioning prototype at hand and be willing to give up some of your sense of control. But if you can pull this off, you get at least double the amount of data you would get from a traditional usability test.

Needless to say, this is again pure bonkers blasphemy from the point of view of scientific usability testing. But in practice and in your case, it works like a charm!

If you take all the steps above, you end up with something that resembles a classic usability test, but is different in meaningful ways. The essence is the same: users do tasks with your prototype for the first time, you focus on learnability problems, they think out loud while doing the tasks, you steer clear of guiding them, and so forth.

But now, you’ve also removed many of the constraints that make testing so laborious. You don’t need predefined test tasks, you don’t need the users to participate on the same or consecutive days, you can fix problems between test sessions, and you don’t create extensive reports.

The process is much lighter, produces better results for your needs, and is much more tempting to do. Perhaps we should call it “the practical usability test”. 
