Pros and Cons of Remote Usability Testing
Originally published on Johnnyholland.org in June 2010.
In-person user research used to be the only game in town, and as with most industry practices, its procedures were developed, refined, standardized, and then became entrenched in the corporate R&D product development cycle. Practically everything gets tested in a lab, hallway, or conference room: commercial web sites, professional and consumer software, even video games. But now we’ve also got remote usability testing.
Part of the appeal of formal lab research was that it provided a scientific-seeming basis for making decisions by using observational data, instead of someone’s error-prone gut instincts. Stakeholders appreciated the firm protocol and apparent reliability of properly managed lab research. But anyone who has sat through formal studies behind one-way mirrors knows that a lot of bullshit goes on—participants pretending to care, moderators pretending to understand, and stakeholders pretending to be open-minded. The appeal of what the kids call “guerrilla testing”—informal testing, where you simply grab someone within shouting distance and ask them to use your interface—is clear. It’s easy, fast, and can produce great results. Still, many user research practitioners stick with in-person methods simply because it’s what people have been doing for a long time.
There’s nothing specifically wrong with in-person research. But there is that whole Internet thing that’s been happening. It does have some unique properties we can take advantage of to do things that weren’t possible with old-school research. Like these things:
Insane Cost Savings
UserTesting.com is $39 per user. Compare that to flying to Chicago for three days to watch twelve people talk behind a one-way mirror, and that’s thousands of dollars in savings. Rolf Molich has been organizing the Comparative Usability Evaluation (CUE) studies for eight years, in which different usability teams independently evaluate the same site using their own methods. He knows something about comparing research techniques, and he makes the point that while there are advantages and disadvantages to a remote method like UserTesting.com, the “price/performance ratio was amazing” (that was before a price increase, but the cost is still quite low). Beyond travel expenses, other costs associated with in-person testing may be reduced or eliminated when you test remotely. The exception is guerrilla testing: with tools like Silverback, guerrilla in-person methods don’t have to cost much more than remote, but you’re usually more limited in your audience. So in terms of cost comparison, let’s just say that remote testing will usually offer big savings.
Time-Aware Research
Catching people in the middle of a task with a web or software intercept like ethnio (full disclosure: ethnio is our own product), then calling them within a few seconds to share their screen, lets you watch them use a tool remotely on their own timeline. That’s a degree of accuracy that never existed before. You could argue that it’s ethnographic in a way that physical observation can’t be, but you could also spend your whole life arguing about that. Let’s not. Tools like Revelation and the Track Your Happiness project at Harvard use participants’ native timelines to gain insight. That’s a really big deal.
By now UX researchers are familiar with the importance of understanding the usage context of an interface—the physical environment where people are normally using an interface. Remote research opens the door to conducting research that also happens at the moment in people’s real lives when they’re performing a task of interest. This is possible because of live recruiting (the subject of Chapter 3 of the book), a method that allows you to instantly recruit people who are right in the middle of performing the task you’re interested in, using anything from the Web to text messages. Time-awareness in research makes all the difference in user motivation: it means that users are personally invested in what they’re doing because they’re doing it for their own reasons, not because you’re directing them to; they would have done it whether or not they were in your study.
Consider the difference between these two scenarios:
- You’ve been recruited for some sort of computer study. The moderator shows you this online map Web app you’ve never heard of and asks you to use it to find some random place you’ve never heard of. This task is a little tricky, but since you’re sitting in this quiet lab and focusing—and you can’t collect your incentive check and leave until you finish—you figure it out eventually. Not so bad.
- You’ve been planning a family vacation for months, but you’ve been busy at work so you procrastinated a bit on the planning, and now it’s the morning of the trip and you’re trying to quickly print out directions between finishing your packing and getting your kids packed. Your coworker told you about this MapTool Web site you’ve never used before, so you decide to give it a shot, and it’s not so bad—that is, until you get stuck because you can’t find the freaking button to print out the directions, and you’re supposed to leave in an hour, but you can’t until you print these damn directions, but your kids are jumping up and down on their suitcases and asking you where everything is. Why can’t they just make this stupid crap easy to use? Isn’t it obvious what’s wrong with it? Haven’t they ever seen a real person use it before?
Circumstances matter a lot in user research, and someone who’s using an interface in real life, for real purposes, is going to behave a lot differently—and give more accurate feedback—than someone who’s just being told to accomplish some little task to be able to collect an incentive check. Time-awareness is an important concept, so we’ll bring it up again throughout this book to demonstrate how the concept relates to different aspects of the remote research process (recruiting, moderating, and so on).
Some interfaces just don’t make any sense to test outside their intended usage environment. If users need their own photos and videos to use a video editing tool, having them bring their laptop or media to a lab is an amazing hassle. Or, say you’re testing a recipe Web site that guides users step-by-step through preparing a meal; it wouldn’t make much sense to take people out of their kitchen, where they’re unable to perform the task of interest. When this is the case, remote research is usually the most practical solution, unless the users also lack the necessary equipment. We also call this the participant’s “technological ecosystem,” since their own devices and computing environment shape how they interact.
Democratization of User Testing
That’s right, I said it. Democracy. As in, anyone in the world, no matter how far removed from their potential audience, can conduct user testing with fewer obstacles than ever before. After ten years of user research, 260 studies, and 3,000 participants at bolt | peters, we’ve noticed a trend lately that more people are doing their own research than ever before. And it’s great. There’s no reason you need to hire a specialist to observe real-world technology behavior. And that’s coming from a specialist.
Even if you do have a lab, the users you want to talk to may not be able to get to it. This is actually the most common scenario: your interface, like most, is designed to be accessed and used all around the world, and you want to talk to users from around the world to get a range of perspectives. Will Chinese players like my video game? Is my online map widget intuitive even for users outside Silicon Valley? Big companies like Nokia and Microsoft are often able to conduct huge, ambitious research projects to address these questions, coordinating research projects in different labs around the world, flying researchers around in first class. If you don’t have the cash for an international longitudinal Gorillas-in-the-Mist project, then remote research is a no-brainer solution. If you can’t get to where your users are, test them remotely.
And Why Not?
Both in-person and remote UX research share the same broad purpose: to understand how people interact and behave with the interface you’ve made. There’s no need to set up a false opposition between the two approaches—one isn’t inherently better than the other. Despite the versatility of remote research, there are lots of reasons you might want to conduct an in-person study instead, most of which have to do with timing, security, equipment, or the type of interaction you want to have with participants.
Security Concerns
Security is often a concern for institutions like banks and hospitals, which deal in sensitive information, or companies concerned with guarding certain types of intellectual property. If you’re testing a top-secret prototype, you obviously don’t want to let people access it from their home computer, where it could be saved or screen-captured. On the other hand, you might also be doing a study on users who need to be secretive about sharing what’s on their screen—government employees, doctors, or lab technicians, for instance. Either way, you’ll want to test users in a controlled lab environment to keep things confidential, especially if what you’re testing is so hush-hush that you’ve got to have your users sign a nondisclosure form.
Inability to Use Screen-sharing
You might also want to use a lab if your users are unable to share their screen over the Internet, for whatever reason. Some studies (of rural users, cybercafe patrons, etc.) may require you to talk to users who don’t have reliable high-speed Internet connections, who own computers too slow or unstable to use screen sharing services effectively, or who have operating systems incompatible with the screen sharing tools you’re using. These restrictions only apply to moderated studies, for which you need to see what’s on your users’ screens.
The Need for Special Equipment
Depending on the interface you’re testing, you may need special software or physical equipment to run the study properly; this is most often the case with software that’s still under development. Getting users to install and configure the tools needed to run elaborate software can be a pain (though it’s not unheard of), and requiring users to own certain equipment can make recruiting needlessly difficult.
The Importance of Seeing the User’s Body
Some kinds of research require you to study things about the user that are difficult to gather remotely. Eye-tracking studies, for example, have become increasingly popular in UX research, and for that kind of study you need to bring the users to the eye-tracking device. Other studies might require you to attend to the participants’ physical movements, which can be difficult to capture with a stationary webcam.
You don’t necessarily have to choose between lab and remote methods. You can even conduct multiple studies on the same interface, using the findings from one study to add nuance to another. That’s probably excessive for the average study, but for really large-scale projects where you just want to gather every bit of information you can (a new version of a complex software program, an overhauled IA, etc.), being comprehensive can’t hurt.
By now you should have a good idea of whether remote research suits you. Give it a try—if it’s not your thing, you can always go back to lab testing. We won’t tell anyone.