Usability testing
Usability testing is a technique used in user-centered interaction design to evaluate a product by testing it on users.
This can be seen as an irreplaceable usability practice, since it gives direct input on how real users use the
system.[1] This is in contrast with usability inspection methods, where experts use different methods to evaluate a
user interface without involving users.
Usability testing focuses on measuring a human-made product's capacity to meet its intended purpose. Examples of
products that commonly benefit from usability testing are foods, consumer products, web sites or web applications,
computer interfaces, documents, and devices. Usability testing measures the usability, or ease of use, of a specific
object or set of objects, whereas general human-computer interaction studies attempt to formulate universal
principles.
What usability testing is not
Simply gathering opinions on an object or document is market research or qualitative research rather than usability
testing. Usability testing usually involves systematic observation under controlled conditions to determine how well
people can use the product.[2] However, qualitative research and usability testing are often used in combination, to
better understand users' motivations and perceptions in addition to their actions.
Rather than showing users a rough draft and asking, "Do you understand this?", usability testing involves watching
people trying to use something for its intended purpose. For example, when testing instructions for assembling a toy,
the test subjects should be given the instructions and a box of parts and, rather than being asked to comment on the
parts and materials, they are asked to put the toy together. Instruction phrasing, illustration quality, and the toy's
design all affect the assembly process.
Methods
Setting up a usability test involves carefully creating a scenario, or realistic situation, wherein the person performs a
list of tasks using the product being tested while observers watch and take notes. Several other test instruments such
as scripted instructions, paper prototypes, and pre- and post-test questionnaires are also used to gather feedback on
the product being tested. For example, to test the attachment function of an e-mail program, a scenario would
describe a situation where a person needs to send an e-mail attachment, and ask him or her to undertake this task.
The aim is to observe how people function in a realistic manner, so that developers can identify problem areas and
what people like. Techniques popularly used to gather data during a usability test include think-aloud protocol,
co-discovery learning, and eye tracking.
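To make the structure of such a session concrete, here is a minimal sketch in Python of one way to record a
scenario's tasks and an observer's notes. The Task and Session structures and all field names are invented for
illustration; the article prescribes no particular format.

```python
# A minimal sketch of one way to record a usability-test session; the
# Task/Session structures and field names are invented for illustration.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    description: str                 # the scenario step the participant performs
    completed: bool = False
    time_seconds: float = 0.0
    observer_notes: List[str] = field(default_factory=list)

@dataclass
class Session:
    participant_id: str
    tasks: List[Task] = field(default_factory=list)

# The e-mail attachment scenario from the text, recorded for one participant.
session = Session(
    participant_id="P01",
    tasks=[Task(description="Send an e-mail with the report attached")],
)
task = session.tasks[0]
task.completed = True
task.time_seconds = 142.0
task.observer_notes.append("Hesitated before finding the attachment icon")
```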
Hallway testing
Hallway testing is a general method of usability testing. Rather than using an in-house, trained group of testers, just
five to six random people are brought in to test the product or service. The name of the technique refers to the fact
that the testers should be random people who pass by in the hallway.[3]
Hallway testing is particularly effective in the early stages of a new design when the designers are looking for "brick
walls," problems so serious that users simply cannot advance. Anyone of normal intelligence other than designers
and engineers can be used at this point. (Both designers and engineers immediately turn from being test subjects into
being "expert reviewers." They are often too close to the project, so they already know how to accomplish the task,
thereby missing ambiguities and false paths.)
Remote usability testing
In a scenario where usability evaluators, developers and prospective users are located in different countries and time
zones, conducting a traditional lab usability evaluation creates challenges both from the cost and logistical
perspectives. These concerns led to research on remote usability evaluation, with the user and the evaluators
separated over space and time. Remote testing, which facilitates evaluation in the context of the user's other tasks
and technology, can be either synchronous or asynchronous. Synchronous usability testing methodologies involve
video conferencing or employ remote application-sharing tools such as WebEx. The former involves real-time
one-on-one communication between the evaluator and the user, while the latter involves the evaluator and user
working separately.
Asynchronous methodologies include automatic collection of users' click streams, user logs of critical incidents that
occur while interacting with the application, and subjective feedback on the interface from users. Similar to an
in-lab study, an asynchronous remote usability test is task-based, and the platform records clicks and task times.
Hence, for many large companies this makes it possible to understand why visitors behave as they do when visiting
a website or mobile site. This style of testing also provides an opportunity to segment feedback by demographic,
attitudinal and behavioural type. The tests are carried out in the user's own environment (rather than a lab), helping
further simulate real-life scenario testing. This approach also provides a vehicle to easily solicit feedback from users
in remote areas quickly and with lower organisational overheads.
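As an illustration of the kind of automatic click-stream analysis described above, the following Python sketch
derives per-participant task times from an event log. The log format (timestamp in seconds, participant id, event
name) is invented for this example.

```python
# A minimal sketch of asynchronous click-stream analysis; the event-log
# format (timestamp in seconds, participant id, event name) is invented.
events = [
    (0.0,   "P01", "task_start"),
    (35.2,  "P01", "click_attach"),
    (61.8,  "P01", "task_end"),
    (0.0,   "P02", "task_start"),
    (120.4, "P02", "task_end"),
]

starts = {}       # when each participant started the task
task_times = {}   # how long each participant took

for timestamp, participant, name in events:
    if name == "task_start":
        starts[participant] = timestamp
    elif name == "task_end":
        task_times[participant] = timestamp - starts[participant]

print(task_times)  # {'P01': 61.8, 'P02': 120.4}
```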
Numerous tools are available to address the needs of both these approaches. WebEx and GoToMeeting are the most
commonly used technologies to conduct a synchronous remote usability test.[4] However, synchronous remote
testing may lack the immediacy and sense of presence desired to support a collaborative testing process. Moreover,
managing interpersonal dynamics across cultural and linguistic barriers may require approaches sensitive to the
cultures involved. Other disadvantages include reduced control over the testing environment and the distractions
and interruptions experienced by the participants in their native environment. One of the newer methods developed
for conducting a synchronous remote usability test is the use of virtual worlds. In recent years, conducting usability
testing asynchronously has also become prevalent, allowing testers to provide feedback in their own time and from
the comfort of their own home. Many tools are available online to facilitate this process, including UXArmy,
Optimizely, Usabilla and UserZoom.
Expert review
Expert review is another general method of usability testing. As the name suggests, this method relies on bringing in
experts with experience in the field (possibly from companies that specialize in usability testing) to evaluate the
usability of a product.
A heuristic evaluation or usability audit is an evaluation of an interface by one or more human factors experts.
Evaluators measure the usability, efficiency, and effectiveness of the interface based on usability principles, such as
the 10 usability heuristics originally defined by Jakob Nielsen in 1994.
Nielsen's usability heuristics, which have continued to evolve in response to user research and new devices,
include:
Visibility of System Status
Match Between System and the Real World
User Control and Freedom
Consistency and Standards
Error Prevention
Recognition Rather Than Recall
Flexibility and Efficiency of Use
Aesthetic and Minimalist Design
Help Users Recognize, Diagnose, and Recover from Errors
Help and Documentation
Automated expert review
Similar to expert reviews, automated expert reviews provide usability testing but through the use of programs
given rules for good design and heuristics. Though an automated review might not provide as much detail and
insight as reviews from people, they can be finished more quickly and consistently. The idea of creating surrogate
users for usability testing is an ambitious direction for the Artificial Intelligence community.
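As a toy illustration of this idea, the Python sketch below encodes two simple rules (images without alt text, and
links labelled only "click here") and scans a page for violations. The rules and code are invented for this example;
real tools check far more, and far more carefully.

```python
# A toy automated review: scan HTML for two common usability problems.
# The rules are invented for this sketch; real tools are far more thorough.
import re

def automated_review(html: str) -> list:
    findings = []
    # Rule 1: images should carry alt text (recognition/accessibility).
    for img in re.findall(r"<img\b[^>]*>", html, re.IGNORECASE):
        if "alt=" not in img.lower():
            findings.append(f"Image without alt text: {img}")
    # Rule 2: link labels should describe their target, not say "click here".
    for label in re.findall(r"<a\b[^>]*>(.*?)</a>", html, re.IGNORECASE | re.DOTALL):
        if label.strip().lower() == "click here":
            findings.append('Vague link label "click here"')
    return findings

sample = '<img src="logo.png"><a href="/help">click here</a>'
print(automated_review(sample))
```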
A/B testing
Main article: A/B testing
In web development and marketing, A/B testing or split testing is an experimental approach to web design
(especially user experience design), which aims to identify changes to web pages that increase or maximize an
outcome of interest (e.g., click-through rate for a banner advertisement). As the name implies, two versions (A and
B) are compared, which are identical except for one variation that might impact a user's behavior. Version A might
be the currently used version, while Version B is modified in some respect. For instance, on an e-commerce website
the purchase funnel is typically a good candidate for A/B testing, as even marginal improvements in drop-off rates
can represent a significant gain in sales. Significant improvements can be seen through testing elements like copy
text, layouts, images and colors. Multivariate testing or bucket testing is similar to A/B testing, but tests more than
two different versions at the same time.
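A common way to decide whether version B's conversion rate genuinely differs from version A's is a
two-proportion z-test. The Python sketch below uses only the standard library; the visitor and purchase counts are
invented for illustration.

```python
# A minimal sketch of evaluating an A/B test with a two-proportion z-test,
# using only the standard library; the counts below are invented.
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Version A: 200 purchases out of 5000 visitors; version B: 250 of 5000.
z, p = two_proportion_z(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p:.4f}")  # p < 0.05 suggests a real difference
```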
How many users to test?
In the early 1990s, Jakob Nielsen, at that time a researcher at Sun Microsystems, popularized the concept of using
numerous small usability tests, typically with only five test subjects each, at various stages of the development
process. His argument is that, once it is found that two or three people are totally confused by the home page, little is
gained by watching more people suffer through the same flawed design. "Elaborate usability tests are a waste of
resources. The best results come from testing no more than five users and running as many small tests as you can
afford." Nielsen subsequently published his research and coined the term heuristic evaluation.
The claim of "Five users is enough" was later described by a mathematical model[5] which states the proportion of
uncovered problems U as

U = 1 − (1 − p)^n

where p is the probability of one subject identifying a specific problem and n is the number of subjects (or test
sessions). As n grows, the curve rises asymptotically towards the number of real existing problems.
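To see why five users go a long way in this model, the short Python sketch below evaluates the formula for several
values of n, assuming p = 0.31 (an average problem-discovery rate reported in Nielsen's research; treat it as
illustrative, since p varies by product and task).

```python
# Proportion of usability problems uncovered: U = 1 - (1 - p)^n.
# p = 0.31 is an illustrative average from Nielsen's research; it varies
# by product, task, and user population.
def proportion_found(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} users: {proportion_found(0.31, n):.0%}")
# With p = 0.31, five users already uncover about 84% of the problems.
```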
In later research, Nielsen's claim has been questioned with both empirical evidence[6] and more advanced
mathematical models.[7]
Two key challenges to this assertion are:
1. Since usability is related to the specific set of users, such a small sample size is unlikely to be representative of
the total population, so the data from such a small sample is more likely to reflect the sample group than the
population it may represent.
2. Not every usability problem is equally easy to detect. Intractable problems slow the overall process, and under
these circumstances progress is much slower than predicted by the Nielsen/Landauer formula.[8]
It is worth noting that Nielsen does not advocate stopping after a single test with five users; his point is that testing
with five users, fixing the problems they uncover, and then testing the revised site with five different users is a better
use of limited resources than running a single usability test with 10 users. In practice, the tests are run once or twice
per week during the entire development cycle, using three to five test subjects per round, and with the results
delivered within 24 hours to the designers. The number of users actually tested over the course of the project can
thus easily reach 50 to 100 people.
In the early stage, when users are most likely to immediately encounter problems that stop them in their tracks,
almost anyone of normal intelligence can be used as a test subject. In stage two, testers will recruit test subjects
across a broad spectrum of abilities. For example, in one study, experienced users showed no problem using any
design, from the first to the last, while naive users and self-identified power users both failed repeatedly. Later on, as
the design smooths out, users should be recruited from the target population.
When the method is applied to a sufficient number of people over the course of a project, the objections raised above
are addressed: the sample size ceases to be small, and usability problems that arise with only occasional users
are found. The value of the method lies in the fact that specific design problems, once encountered, are never seen
again because they are immediately eliminated, while the parts that appear successful are tested over and over. While
it's true that the initial problems in the design may be tested by only five users, when the method is properly applied,
the parts of the design that worked in that initial test will go on to be tested by 50 to 100 people.
Example
A 1982 Apple Computer manual for developers advised on usability testing:
1. 1. "Select the target audience. Begin your human interface design by identifying your target audience. Are you
writing for businesspeople or children?"
2. 2. Determine how much target users know about Apple computers, and the subject matter of the software.
3. 3. Steps 1 and 2 permit designing the user interface to suit the target audience's needs. Tax-preparation software
written for accountants might assume that its users know nothing about computers but are expert on the tax code,
while such software written for consumers might assume that its users know nothing about taxes but are familiar
with the basics of Apple computers.
Apple advised developers, "You should begin testing as soon as possible, using drafted friends, relatives, and new
employees":
Our testing method is as follows. We set up a room with five to six computer systems. We schedule two
to three groups of five to six users at a time to try out the systems (often without their knowing that it is
the software rather than the system that we are testing). We have two of the designers in the room. Any
fewer, and they miss a lot of what is going on. Any more and the users feel as though there is always
someone breathing down their necks.
Designers must watch people use the program in person, because
Ninety-five percent of the stumbling blocks are found by watching the body language of the users.
Watch for squinting eyes, hunched shoulders, shaking heads, and deep, heart-felt sighs. When a user hits
a snag, he will assume it is "on account of he is not too bright": he will not report it; he will hide it ... Do
not make assumptions about why a user became confused. Ask him. You will often be surprised to learn
what the user thought the program was doing at the time he got lost.
Usability testing education
Usability testing has been a formal subject of academic instruction in different disciplines.
References
[1] Nielsen, J. (1994). Usability Engineering. Academic Press Inc, p. 165.
[2] http://jerz.setonhill.edu/design/usability/intro.htm
[3]
[4] http://www.boxesandarrows.com/view/remote_online_usability_testing_why_how_and_when_to_use_it
[5] Virzi, R. A. (1992). Refining the Test Phase of Usability Evaluation: How Many Subjects is Enough? Human Factors, 34(4), pp. 457-468.
[6] http://citeseer.ist.psu.edu/spool01testing.html
[7] Caulton, D. A. (2001). Relaxing the homogeneity assumption in usability testing. Behaviour & Information Technology, 20(1), pp. 1-7.
[8] Schmettow, M. (2008). Heterogeneity in the Usability Evaluation Process. In: England, D. & Beale, R. (eds.), Proceedings of HCI 2008, British Computing Society, 1, pp. 89-98.
External links
Usability.gov (http://www.usability.gov/)
A Brief History of the Magic Number 5 in Usability Testing (http://www.measuringusability.com/blog/five-history.php)