Introducing Evaluation
www.id-book.com
2011
The aims
How can the usability of a system be
evaluated?
How can usability problems be found and
improvements suggested?
Types of evaluation
3 broad categories, depending on the setting,
user involvement and level of control.
Controlled settings involving users
- usability testing & experiments in laboratories
and living labs.
Natural settings involving users
- field studies to see how the product is used in
the real world.
Any settings not involving users
- consultants critique, predict, analyze & model
aspects of the interface; analytics.
Living labs
People's use of technology in their
everyday lives can be evaluated in
living labs.
Such evaluations are too difficult to
do in a usability lab.
E.g., the Aware Home was embedded
with a complex network of sensors
and audio/video recording devices
(Abowd et al., 2000).
Evaluation methods
Methods: observing, asking users, asking experts, testing, modeling.
Settings: controlled settings, natural settings, without users.
Usability Testing
Usability testing refers to evaluating a product, website,
mobile app or system by testing it with representative
users with real-life scenarios and user tasks.
The goal is to identify any usability problems, collect
qualitative and quantitative data and determine the
participant's satisfaction with the product.
Usability Testing
1. Get representative users
5-10 participants
2. Define criteria for evaluation
Time to complete a task.
Time to complete a task after a specified time
away from the product.
Number and type of errors per task.
Number of errors per unit of time.
Number of navigations to online help or
manuals.
Number of users making a particular error.
Number of users completing a task successfully.
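As a rough illustration of how the criteria above might be computed from recorded sessions, here is a minimal Python sketch. The `Session` record layout, the field names and the sample numbers are all assumptions made for this example; they are not part of the original material.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    seconds: float      # time spent on the task
    errors: int         # number of errors made during the task
    help_visits: int    # navigations to online help or manuals
    completed: bool     # whether the task was completed successfully


def summarize(sessions):
    """Aggregate several of the evaluation criteria over all sessions of one task."""
    finished = [s for s in sessions if s.completed]
    total_seconds = sum(s.seconds for s in sessions)
    return {
        "mean_time_completed": mean(s.seconds for s in finished) if finished else None,
        "errors_per_task": mean(s.errors for s in sessions),
        "errors_per_minute": sum(s.errors for s in sessions) / (total_seconds / 60),
        "help_visits": sum(s.help_visits for s in sessions),
        "users_with_errors": sum(1 for s in sessions if s.errors > 0),
        "users_completed": len(finished),
    }


# Made-up data for illustration only.
sessions = [
    Session("P1", 120.0, 2, 1, True),
    Session("P2", 90.0, 0, 0, True),
    Session("P3", 150.0, 4, 2, False),
]
summary = summarize(sessions)
```

With only 5-10 participants, figures like these are best read as indicators of where problems lie rather than as statistically validated results.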
Usability Testing
3. Develop test scenario: setup + context + task
Choose relevant scenarios (typical vs extreme)
Keep task duration shorter than 30 minutes
Ensure identical conditions for all participants
4. Consider ethical issues
Debrief participants, get consent, etc.
5. Run pilot tests & refine design
Practice with staff and observers
6. Actual testing
Instruction of participants
Carry out test and record data
Usability Testing
7. Analysis
Statistics, e.g., mouse events, menu selections
Screen design: gaze tracking and course of task
completion
Post-task video confrontation and user interview
8. Report results and make recommendations
for improvement.
Usability lab with observers
watching a user & assistant
Usability testing & research
Usability testing
- Improve products
- Few participants
- Results inform design
- Usually not completely replicable
- Conditions controlled as much as possible
- Procedure planned
- Results reported to developers
Experiments for research
- Discover knowledge
- Many participants
- Results validated statistically
- Must be replicable
- Strongly controlled conditions
- Experimental design
- Scientific report to scientific community
Portable equipment for use in
the field
Task
iBook
eBay
Time Magazine
Kayak
Experiments
Predict the relationship between two or
more variables.
Independent variable is manipulated by
the researcher.
Dependent variable depends on the
independent variable.
Typical experimental designs have one or
two independent variables.
Validated statistically & replicable.
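To make the roles of the two kinds of variable concrete, the following sketch compares task times under two conditions using Welch's t statistic, computed with the standard library only. The choice of test, the condition names and the data are assumptions for illustration; the source does not prescribe a particular statistical test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups of measurements."""
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Independent variable: menu layout (two conditions, set by the researcher).
# Dependent variable: task completion time in seconds (measured).
# All numbers below are invented for illustration.
times = {
    "layout_a": [10.0, 12.0, 14.0],
    "layout_b": [16.0, 18.0, 20.0],
}
t = welch_t(times["layout_a"], times["layout_b"])
```

The statistic alone is not a verdict: deciding whether the difference is significant also needs the matching degrees of freedom and a p-value, which a statistics package would normally supply.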
Experimental designs
Different participants - single group
of participants is allocated randomly
to the experimental conditions.
Same participants - all participants
appear in both conditions.
Matched participants - participants
are matched in pairs, e.g., based on
expertise, gender, etc.
Different, same, matched
participant design
Design: Different
  Advantages: No order effects
  Disadvantages: Many subjects & individual differences a problem
Design: Same
  Advantages: Few individuals, no individual differences
  Disadvantages: Counter-balancing needed because of ordering effects
Design: Matched
  Advantages: Same as different participants but individual differences reduced
  Disadvantages: Cannot be sure of perfect matching on all differences
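The sketch below shows one possible way to implement random allocation for a different-participants design and counterbalancing for a same-participants design. The function names, participant IDs and condition labels are hypothetical, and rotating through every ordering is only one of several counterbalancing schemes.

```python
import random
from itertools import permutations


def allocate_between(participants, conditions, seed=0):
    """Different-participants design: shuffle, then deal people out to conditions."""
    rng = random.Random(seed)  # fixed seed so the allocation is repeatable
    shuffled = participants[:]
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


def counterbalance_within(participants, conditions):
    """Same-participants design: rotate through every order of the conditions."""
    orders = list(permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


people = ["P1", "P2", "P3", "P4"]  # hypothetical participant IDs
between = allocate_between(people, ["A", "B"])
within = counterbalance_within(people, ["A", "B"])
```

In `within`, half the participants meet condition A first and half meet B first, so any ordering effect is spread evenly across the two conditions.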
Field studies
Field studies are done in natural settings.
"In the wild" is a term for prototypes being
used freely in natural settings.
Aim to understand what users do naturally
and how technology impacts them.
Field studies are used in product design to:
- identify opportunities for new technology;
- determine design requirements;
- decide how best to introduce new
technology;
- evaluate technology in use.
UbiFit Garden: An "in the wild"
study
Analytical evaluation
Describe the key concepts associated
with inspection methods.
Explain how to do heuristic evaluation
and walkthroughs.
Explain the role of analytics in evaluation.
Describe how to perform two types of
predictive methods, GOMS and Fitts' Law.
Inspections
Several kinds.
Experts use their knowledge of users &
technology to review software usability.
Expert critiques (crits) can be formal or
informal reports.
Heuristic evaluation is a review guided
by a set of heuristics.
Walkthroughs involve stepping through
a pre-planned scenario noting potential
problems.
Heuristic evaluation
Developed by Jakob Nielsen in the early
1990s.
Based on heuristics distilled from an
empirical analysis of 249 usability
problems.
These heuristics have been revised for
current technology.
Heuristics being developed for mobile
devices, wearables, virtual worlds, etc.
Design guidelines form a basis for
developing heuristics.
Nielsen's original heuristics
1. Visibility of system status
2. Match between system and the real world
Speak the user's language, follow real-world conventions,
make information appear in a natural and logical order
3. User freedom and control
Provide a clearly marked emergency exit to leave an
unwanted state (undo and redo)
4. Consistency and standards
Users should not have to wonder whether different words,
situations, or actions mean the same thing.
5. Error prevention
Nielsen's original heuristics
6. Recognition rather than recall
7. Flexibility and efficiency of use
Cater to both inexperienced and experienced users, allow
users to tailor frequent actions
8. Aesthetic and minimalist design
Provide no irrelevant or rarely needed info
9. Help users recognize, diagnose and recover from errors
Error messages in plain language (no codes), precisely
indicate the problem, suggest a solution
10. Help and documentation
Provide help and documentation, easy to search, focus on
user task, list concrete steps to be carried out, not too large
Discount evaluation
Heuristic evaluation is referred to
as discount evaluation when 5
evaluators are used.
Empirical evidence suggests that
on average 5 evaluators identify
75-80% of usability problems.
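One way to see why a handful of evaluators can be enough is a simple model, in the spirit of the one proposed by Nielsen and Landauer, in which each evaluator independently finds a fixed share of the problems. The sketch below assumes a per-evaluator detection rate of 0.25; that value is chosen for illustration and is not a figure from the source, although it yields a result for 5 evaluators that falls within the 75-80% range quoted above.

```python
def proportion_found(evaluators, detection_rate):
    """Expected share of problems found by at least one of n independent evaluators."""
    return 1.0 - (1.0 - detection_rate) ** evaluators


# detection_rate = 0.25 is an assumed value for illustration only.
curve = {n: round(proportion_found(n, 0.25), 3) for n in range(1, 11)}
```

Under this assumption the curve flattens quickly: each added evaluator contributes less than the previous one, which is the usual argument for stopping at around five.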
No. of evaluators & problems
3 stages for doing heuristic
evaluation
Briefing session to tell experts what to
do.
Evaluation period of 1-2 hours in which:
- Each expert works separately;
- Take one pass to get a feel for the product;
- Take a second pass to focus on specific
features.
Debriefing session in which experts
work together to prioritize problems.
Advantages and problems
Few ethical & practical issues to
consider because users not involved.
Can be difficult & expensive to find
experts.
Best experts have knowledge of
application domain & users.
Biggest problems:
- Important problems may get missed;
- Many trivial problems are often identified;
- Experts have biases.
Heuristics for websites focus
on key criteria (Budd, 2007)
Clarity
Minimize unnecessary complexity &
cognitive load
Provide users with context
Promote positive & pleasurable user
experience
Walkthroughs
Walkthroughs are an alternative to heuristic
evaluation for predicting users' problems without
doing user testing.
They involve walking through a task with the product
and noting problematic usability features. Most
walkthrough methods do not involve users.
Cognitive walkthroughs
Focus on ease of learning.
Designer presents an aspect of the
design & usage scenarios.
Expert is told the assumptions
about user population, context of
use, task details.
One or more experts walk through
the design prototype with the
scenario.
Experts are guided by 3 questions.
The 3 questions
Will the correct action be sufficiently
evident to the user?
Will the user notice that the correct
action is available?
Will the user associate and interpret the
response from the action correctly?
As the experts work through the
scenario they note problems.
Pluralistic walkthrough
Variation on the cognitive walkthrough
theme.
Performed by a carefully managed team.
The panel of experts begins by working
separately.
Then there is managed discussion that
leads to agreed decisions.
The approach lends itself well to
participatory design.
Analytics
A method for evaluating user traffic
through a system or part of a system
Many examples including Google
Analytics, Visistat (shown below)
Times of day & visitor IP addresses
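As a small illustration of the kind of summary such tools produce, the sketch below counts page views by hour of day and by visitor IP address. The log format and the entries are invented for this example (the addresses come from a range reserved for documentation); real analytics products expose far richer data through their own interfaces.

```python
from collections import Counter
from datetime import datetime

# Hypothetical page-view log: (ISO timestamp, visitor IP address).
page_views = [
    ("2011-03-01T09:15:00", "192.0.2.10"),
    ("2011-03-01T09:40:00", "192.0.2.11"),
    ("2011-03-01T14:05:00", "192.0.2.10"),
    ("2011-03-02T09:55:00", "192.0.2.12"),
]


def views_by_hour(views):
    """Count page views for each hour of the day (0-23)."""
    return Counter(datetime.fromisoformat(ts).hour for ts, _ in views)


def views_by_ip(views):
    """Count page views per visitor IP address."""
    return Counter(ip for _, ip in views)


by_hour = views_by_hour(page_views)
by_ip = views_by_ip(page_views)
busiest_hour, busiest_count = by_hour.most_common(1)[0]
```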
Social action analysis
(Perer & Shneiderman, 2008)
Predictive models
Provide a way of evaluating products
or designs without directly involving
users.
Less expensive than user testing.
Usefulness limited to systems with
predictable tasks - e.g., telephone
answering systems, mobiles, cell
phones, etc.
Based on expert error-free behavior.
Keystroke level model
GOMS has also been developed to
provide a quantitative model - the
keystroke level model.
The keystroke model allows
predictions to be made about how
long it takes an expert user to
perform a task.
Response times for keystroke
level operators (Card et al., 1983)
Operator - Description - Time (sec)
K - Pressing a single key or button:
    Average skilled typist (55 wpm) - 0.22
    Average non-skilled typist (40 wpm) - 0.28
    Pressing shift or control key - 0.08
    Typist unfamiliar with the keyboard - 1.20
P - Pointing with a mouse or other device on a display to select an object. This value is derived from Fitts' Law which is discussed below. - 0.40
P1 - Clicking the mouse or similar device - 0.20
H - Bring home hands on the keyboard or other device - 0.40
M - Mentally prepare/respond - 1.35
R(t) - The response time is counted only if it causes the user to wait. - t
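A keystroke level prediction is simply the sum of the operator times for the steps of a task. The sketch below uses the times listed above; the operator label `P` for pointing, the choice of the 40 wpm keystroke value, and the example task are assumptions made for illustration.

```python
# Operator times in seconds, taken from the table above.
OPERATOR_SECONDS = {
    "K": 0.28,   # keystroke, average non-skilled typist (40 wpm)
    "P": 0.40,   # point with a mouse at an object on the display (label assumed)
    "P1": 0.20,  # click the mouse or similar device
    "H": 0.40,   # home hands on the keyboard or other device
    "M": 1.35,   # mentally prepare/respond
}


def predict_seconds(operators, response_times=()):
    """Sum operator times; response_times holds any R(t) waits in seconds."""
    return sum(OPERATOR_SECONDS[op] for op in operators) + sum(response_times)


# Hypothetical task: decide what to do, point at a text field, click it,
# move hands to the keyboard, then type a five-letter word.
task = ["M", "P", "P1", "H"] + ["K"] * 5
estimate = predict_seconds(task)
```

For this made-up task the estimate comes to about 3.75 seconds. Because the model assumes expert, error-free behavior, such a figure is a lower bound on what novice users would need.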
Using KLM to calculate time to
change gaze (Holleis et al., 2007)
Fitts' Law
(Fitts, 1954)
Fitts' Law predicts that the time to point
at an object using a device is a function
of the distance from the target object &
the object's size.
The further away & the smaller the
object, the longer the time to locate it &
point to it.
Fitts' Law is useful for evaluating
systems for which the time to locate an
object is important, e.g., a cell phone,
a handheld device.
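The text above does not give a formula, so the following sketch assumes the widely used Shannon formulation, T = a + b log2(D/S + 1), where D is the distance to the target and S is its size. The constants `a` and `b` are device-specific and normally fitted to measured data; the defaults below are placeholders, not measured values.

```python
from math import log2


def fitts_time(distance, size, a=0.0, b=0.1):
    """Predicted pointing time in seconds: a + b * log2(distance / size + 1).

    a and b are placeholder constants; in practice they are fitted to
    measurements for a particular device and user population.
    """
    if distance < 0 or size <= 0:
        raise ValueError("distance must be >= 0 and size must be > 0")
    return a + b * log2(distance / size + 1)


near_large = fitts_time(distance=100, size=50)  # close, big target
far_small = fitts_time(distance=700, size=10)   # distant, small target
```

Whatever constants are used, the distant, small target always yields the longer predicted time, which matches the qualitative statement of the law above.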
Key points
Evaluation & design are closely integrated in user-centered
design.
Some of the same techniques are used in evaluation as for
establishing requirements but they are used differently
(e.g., observation, interviews & questionnaires).
Three types of evaluation: laboratory based with users, in
the field with users, studies that do not involve users.
The main methods are: observing, asking users, asking
experts, user testing, inspection, modeling users' task
performance, and analytics.
Dealing with constraints is an important skill for evaluators
to develop.
Usability testing is done in controlled conditions.
Usability testing is an adapted form of experimentation.
Experiments aim to test hypotheses by manipulating certain
variables while keeping others constant.
The experimenter controls the independent variable(s) but not
the dependent variable(s).
There are three types of experimental design: different-participants, same-participants, & matched-participants.
Field studies are done in natural environments.
"In the wild" is a recent term for studies in which a prototype
is freely used in a natural setting.
Typically observation and interviews are used to collect field
studies data.
Data is usually presented as anecdotes, excerpts, critical
incidents, patterns and narratives.
Key points
Inspections can be used to evaluate
requirements, mockups, functional
prototypes, or systems.
User testing & heuristic evaluation may
reveal different usability problems.
Walkthroughs are focused so are suitable for
evaluating small parts of a product.
Analytics involves collecting data about
users' activity on a website or product.
The GOMS and KLM models and Fitts' Law
can be used to predict expert, error-free
performance for certain kinds of tasks.