Eight is Not Enough
One of the most popular questions we hear from web designers and usability professionals is: “How many users is enough when conducting usability tests?” Until recently, we had believed—and told our clients—that it wasn’t usually necessary to test with more than eight users. We based our recommendation on the widely held theory that eight users will detect almost all of your web site or software usability problems.
According to a paper published by Robert Virzi (1992), 80% of all usability problems can be found with four or five users. Additionally, the first few users will most likely detect the biggest usability problems. Jakob Nielsen and Thomas Landauer (1993) provided further support for this theory when they found that the first five users will uncover about 70% of the major usability problems and the next few users will find nearly all the remaining problems, up to 85% or so.
In a recent study, we decided to put the widely held belief that “eight users is enough” to the test. We conducted usability tests on an e-commerce web site using a very straightforward task: buying a CD from an online music store. We chose users who had a history of purchasing music online. We asked these users to make a shopping list of CDs they wanted to buy and gave them money to spend on these items. Based on previous findings, we expected to detect 85% of the site’s usability problems with the first eight users. We also suspected that all of the serious obstacles would be evident early. Finally, we expected each successive test to turn up more repeat usability problems and fewer new ones.
When we tested the site with 18 users, we identified 247 total obstacles-to-purchase. Contrary to our expectations, we saw new usability problems throughout the testing sessions. In fact, we saw more than five new obstacles for each user we tested.
Equally important, some serious problems did not show up until our later users. Even more surprising to us, repeat usability problems did not increase as testing progressed.
These findings clearly undermine the belief that five users will be enough to catch nearly 85% of the usability problems on a web site. In our tests, we found only 35% of all usability problems after the first five users. We estimated over 600 total problems on this particular online music site. Based on this estimate, it would have taken us 90 tests to discover them all!
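To make the arithmetic behind these percentages concrete, here is a minimal sketch of the cumulative discovery model commonly associated with Nielsen and Landauer, in which the share of problems found after n users is 1 - (1 - p)^n. This is an illustration only: the article does not say how its estimates of 600 problems or 90 tests were derived, the function names are hypothetical, and the model assumes every problem is equally likely to be found by every user, which the results above call into question.

```python
import math


def discovery_rate(fraction_found: float, users: int) -> float:
    """Per-user discovery probability p implied by finding
    `fraction_found` of all problems after `users` tests,
    assuming found(n) = 1 - (1 - p)**n."""
    return 1.0 - (1.0 - fraction_found) ** (1.0 / users)


def expected_fraction(p: float, users: int) -> float:
    """Share of all problems expected to be found after `users` tests."""
    return 1.0 - (1.0 - p) ** users


def users_needed(p: float, target: float) -> int:
    """Smallest number of users whose expected coverage reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Rate implied by the conventional claim: 85% of problems after five users.
p_claimed = discovery_rate(0.85, 5)

# Rate implied by the result reported here: 35% after five users.
p_observed = discovery_rate(0.35, 5)

if __name__ == "__main__":
    print(f"implied per-user rate (85% at 5 users): {p_claimed:.3f}")
    print(f"implied per-user rate (35% at 5 users): {p_observed:.3f}")
    print("users needed for 85% at the observed rate:",
          users_needed(p_observed, 0.85))
```

Under these assumptions, a lower per-user discovery rate sharply increases the number of users needed to reach the same coverage, which is the gap the study ran into.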
When we explored the difference between our results and previous findings, the disparity started to make sense to us. While we had tested users on an e-commerce site, Virzi and others had tested users on software products. Today’s web sites, particularly e-commerce sites, can be more complex than standard software products, which often confine users to a very limited set of activities. The tasks users perform on the web are also far more varied than the tasks they perform with most software applications. For example, our tests asked users to complete shopping tasks. No two users looked for the same product and no two users approached the site in the same way. The tasks depended on individual user characteristics and interests. Because of this added complexity, it’s understandable that more users are needed to detect the majority of usability problems on a web site.
These findings may sound daunting, but we’re not really advocating that you plan to bring in 90 users for your next major usability study. Rather, if you’re working on a large e-commerce site, or any web site at all, your site would likely benefit from ongoing testing. Instead of thinking of usability testing as a discrete activity that takes place every six months and involves six, eight, or twelve users, think about the advantages of ongoing usability testing, bringing in a user or two every week.
With this kind of plan, you’ll see over 20 users in six months. With more users testing your site, you’ll get more feedback, find more problems, and have more data, but there may be some less obvious advantages as well. When a design team gets into the mindset of regular testing, they can try out new ideas and find out whether these work without making the live site the testing ground. Because web sites undergo many incremental, seemingly small design changes in between drastic redesigns, there’s always new fodder for testing.
Also, constant exposure to users can benefit the whole team, which, with regular testing, will be less estranged from its user base. Understanding who uses the site, what language is important to them, what they’re trying to accomplish, and whether they can do it benefits the team, the site, and the company.