Department of Mechanical, Materials and Manufacturing Engineering

Human-Computer Systems
MM4HCI
2013
Lecture 3: Evaluation methods and guidelines
Professor Sarah Sharples

AN EVALUATION FRAMEWORK

Outline
1. Understand what evaluation is for
2. Preparing for an evaluation
3. The range of evaluation techniques and their uses
4. Understanding some of the practical issues of applying evaluation methods

Main reading: Sharp et al., Chapters 12, 14 and 15

What is evaluation?
- Involving users, and user representatives, in the technology/ICT design and development process in a structured manner
- Capturing responses to a design or a design artefact
- Can be carried out at any point in the development process

[Diagram: usability goals (effective to use, efficient to use, safe to use, have good utility, easy to learn, easy to remember how to use) surrounded by user experience goals (satisfying, enjoyable, fun, entertaining, helpful, motivating, rewarding, emotionally fulfilling, aesthetically pleasing, supportive of creativity). Source: Preece et al., 2002]

Evaluation choice considerations
- Why: why are you conducting the evaluation?
- What: what do you have to evaluate (e.g. prototypes)?
- Who: who is going to help you (users, experts)?
- When: when in the development process?
- Where: do you need a clean environment, or context?
- How: what method are you going to use?

Why evaluate?
- Ensure a user-centred design: easy to learn, easy to use, efficient, useful, satisfying to use
- From a human factors perspective: safe for the operator, safe for the system, optimal system performance (Hollnagel and Woods, 2006)
- Inform and evolve the design (saves time and money); verify requirements (Chevalier and Kicka, 2006)
- Benchmarking and comparison

What data do you need to capture?
[Diagram: usability decomposed into satisfaction, ease of learning, and performance/efficiency]

What have you got to evaluate?

Lo-fi prototypes
- Benefits: cheap; addresses layout; proof-of-concept; open to participatory design and comment (Erickson, 1995)
- Drawbacks: navigation and flow limitations for evaluation; does not support good quantitative measures (e.g. errors)
- Best used early on, or for rapid re-designs

Hi-fi prototypes
- Benefits: complete functionality; supports quantitative evaluation (e.g. user error rates); marketing and sales tool; a living specification
- Drawbacks: expensive; time consuming; perceived limited scope for change
- Best used for quantitative user evaluation, and as part of proofs of concept crossing business functions

Who is going to be involved?
- Do you need to match against certain characteristics?
  Age, gender, education, prior knowledge
  Physical, cognitive and attitudinal implications
- Do any of your users pose particular challenges?
  Older adults, children, children with special needs
- Can you use novices, or HCI experts?
- And how many? (depends on method)

When in the development process?
[Diagram: evaluation effort plotted across the lifecycle (Requirements, Concept, Design and Development, Implementation, Deployment), contrasted with "last minute panic testing!!!" concentrated just before deployment]

Formative vs summative
Formative
- To inform the design process
- Explorative, using partially completed artefacts (prototypes)
- May be more qualitative or subjective

Summative
- A confirmation exercise
- To ensure the design meets its intended aims
- Often against a recognised standard or set of benchmarks (or initial requirements)

Where? (see Duh et al., 2005)
- Lab
- Simulation
- Real world

Evaluation as part of user experience
[Diagram: user experience at the centre of context, tasks, technology and users]

Context
- Does where they use the technology influence the interaction?
- Social or physical factors
- Temporality: short or long periods of use

Tasks
- Is how they do it important?
- Is performance relevant?
- Are you investigating functionality?

Technology
- Is the technology new?
- Is there a novel input or output?
- How much does the technology influence the interaction?

Users
- Are the users experts, or do they have prior knowledge?
- Do they have specific characteristics?

EVALUATION METHODS

Evaluation Approaches
Analytical
- Predictive evaluation methods

Field study
- Interpretive evaluation methods
- Collecting users' opinions

Lab study
- Experiments and benchmarking
- Usability studies

Read Sharp, Rogers & Preece (2007), Chapters 14 & 15, for more information

Analytical - Predictive evaluation
HCI experts use their knowledge of users and technology to evaluate interface usability
- Inspection methods and heuristics
- Accessibility (WCAG, 1999)
- User modelling: GOMS and KLM (see the KLM sketch after this list)
- Walkthroughs
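
The Keystroke-Level Model mentioned above predicts expert, error-free task time by summing standard operator times. A minimal sketch, using the commonly cited KLM operator estimates from Card, Moran and Newell; the task sequence itself is a hypothetical example:

```python
# Minimal Keystroke-Level Model (KLM) sketch: predict expert, error-free
# task time by summing standard operator times. Values are the commonly
# cited KLM estimates; the example task below is hypothetical.
OPERATOR_TIMES = {
    "K": 0.20,  # keystroke (average skilled typist)
    "P": 1.10,  # point with the mouse at a target
    "B": 0.10,  # mouse button press or release
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_predict(sequence):
    """Return the predicted task time in seconds for an operator sequence."""
    return sum(OPERATOR_TIMES[op] for op in sequence)

# Hypothetical task: prepare, point at a field, click, prepare, type "exit".
task = ["M", "P", "B", "B", "M", "K", "K", "K", "K"]
print(f"Predicted time: {klm_predict(task):.2f} s")  # -> 4.80 s
```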

Analytical - Heuristic evaluation
- ~5 HCI experts work independently
  General review of product
  Focus on specific features
- Structured expert reviewing against guidelines, e.g.
  use simple and natural language
  provide shortcuts
- Collate reviews to prioritise problems
- Five HCI experts typically find c.75% of the usability problems of an interface (see the discovery-rate sketch below)
- BUT see Cockton and Woolrych, 2002
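
The "five experts find c.75%" rule of thumb comes from modelling problem discovery with a fixed per-evaluator detection rate. A sketch of that model, assuming Nielsen's reported average detection rate; as Cockton and Woolrych (2002) point out, real detection rates vary widely:

```python
# Problem-discovery model behind the "five evaluators" rule of thumb:
# found(n) = 1 - (1 - L)**n, where L is the probability that a single
# evaluator detects any given problem. L = 0.31 is Nielsen's reported
# average; actual rates vary widely between studies.
L = 0.31  # assumed per-evaluator detection rate

for n in range(1, 8):
    found = 1 - (1 - L) ** n
    print(f"{n} evaluator(s): {found:.0%} of problems found")

# With L = 0.31, five evaluators find ~84% of problems; the slide's
# c.75% figure corresponds to a lower detection rate (L around 0.24).
```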

Analytical - Walkthroughs
Cognitive walkthrough: focus on ease of learning
- Scenario-based evaluation
- 3 main questions:
  Will the correct action be evident to the user?
  Will the user notice that the correct action is available?
  Will the user associate and interpret the response from the action correctly?

Pluralistic walkthrough (experts, or experts + users)
Participatory design

Analytical evaluation
Advantages
- Experienced reviewers
- Good experts will have knowledge of users
- Easy to set up and run the study

Disadvantages
- Can be difficult and expensive to find experts
- Users not involved
- Experts may have biases
- Some problems may get missed; trivial problems identified

Field study - Interpretive evaluation
Aims to enable designers to better understand how users use systems in context
- Qualitative data
- Description of performance/outcome

Field study - Data collection
Informal and naturalistic methods of data collection
- Observations, interviews, usage logging, focus groups (a minimal logging sketch follows this slide)

Contextual inquiry
- Originates from ethnography
- Observe the entire process of interface use, from switching on the computer to going home after task completion

Co-operative and participative evaluation
- Focus groups
- Development of prototypes
- Iterative design process
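
Usage logging, listed above, can be as simple as appending timestamped interaction events to a file for later analysis. A minimal, hypothetical sketch; the event names and log format are illustrative, not taken from any particular toolkit:

```python
# Minimal usage-logging sketch: append timestamped interaction events
# to a file for later analysis (task times, navigation paths, errors).
# The event names and log format here are hypothetical.
import json
import time

LOG_PATH = "usage_log.jsonl"  # one JSON object per line

def log_event(participant_id, event, detail=None):
    """Append one timestamped event record to the log file."""
    record = {
        "t": time.time(),              # Unix timestamp of the event
        "participant": participant_id,
        "event": event,                # e.g. "screen_open", "error"
        "detail": detail,
    }
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example: a participant opens the search screen, then makes an error.
log_event("P07", "screen_open", "search")
log_event("P07", "error", "empty query submitted")
```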

Field study - Interviews vs focus groups?
- Do you want opinions or actual tasks? You cannot get error / timing / task data from focus groups
- Are users familiar enough to remember usage?
- Do you have something to focus on?
- Focus groups need careful planning and careful facilitation
See Nielsen, 2000a

Field study methods
Advantages
- Reveals what really happens in the context of use
- Description of performance or outcome
- Users directly involved
- Works well for formative evaluation of prototypes

Disadvantages
- May not be easy to recruit participants
- Can be disruptive to the working environment
- True ethnographic studies require evaluator expertise
- Quality of results variable

Lab study - Experiments and benchmarking
Traditional approach to HCI
- Predicted relationship between variables
- Manipulate the Independent Variable (IV), measure the Dependent Variables (DV)
- Generally use time/error measurement (see the analysis sketch below)

Specific human factors measures
- Workload: NASA-TLX
- Situation awareness: SAGAT
- Body part discomfort
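
As a concrete illustration of the IV/DV pattern, the sketch below compares task completion times (the DV) between two interface versions (the IV) using an independent-samples t-test. The data are invented purely for illustration, and scipy is assumed to be available:

```python
# Simple lab-experiment analysis: IV = interface version (A vs B),
# DV = task completion time in seconds. Data invented for illustration.
from statistics import mean, stdev
from scipy import stats  # scipy assumed installed

times_a = [34.2, 29.8, 41.0, 36.5, 30.9, 38.1]  # interface A (seconds)
times_b = [27.4, 25.1, 31.6, 28.9, 24.8, 30.2]  # interface B (seconds)

print(f"A: mean {mean(times_a):.1f} s, sd {stdev(times_a):.1f} s")
print(f"B: mean {mean(times_b):.1f} s, sd {stdev(times_b):.1f} s")

# Independent-samples t-test on the dependent variable.
result = stats.ttest_ind(times_a, times_b)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```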

Lab study - Usability testing
An essential part of the evaluation process
- Structured interview and activity
- Observed and recorded (eye tracking, facial expressions, comments)
- Tends to be summative, towards the end of the process
- At the very least needs an interactive prototype
- Can then be backed up with a survey, e.g. SUS (see the scoring sketch below)
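
SUS scoring is standardised: each of the ten items is rated 1-5; odd-numbered (positively worded) items contribute (rating - 1), even-numbered (negatively worded) items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score out of 100. A minimal sketch; the example ratings are hypothetical:

```python
# System Usability Scale (SUS) scoring: ten items, each rated 1-5.
# Odd-numbered items contribute (rating - 1); even-numbered items
# contribute (5 - rating); the sum is scaled by 2.5 to give 0-100.
def sus_score(ratings):
    """Return the 0-100 SUS score for a list of ten 1-5 ratings."""
    if len(ratings) != 10:
        raise ValueError("SUS needs exactly 10 item ratings")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd)
        for i, r in enumerate(ratings)
    )
    return total * 2.5

# Hypothetical participant's ratings for items 1-10:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # -> 85.0
```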

Lab study methods
Advantages
- Studies conducted under controlled conditions
- Experiments provide quantitative measures
- Focus on specific aspects of design or user performance
- Usability testing provides qualitative results
- Highlights particular usability problems

Disadvantages
- Requires lab facilities and resources
- May require experimenter expertise
- Can be time consuming and expensive
- Unnatural setting may affect user behaviour
- Unrealistic tasks may not inform design

EVALUATION IN PRACTICE

DECIDE: a framework to guide evaluation
(Preece, Rogers and Sharp, 2002, Chapter 11)
- Determine the goals
- Explore the questions
- Choose the evaluation approach and methods
- Identify the practical issues
- Decide how to deal with the ethical issues
- Evaluate, analyze, interpret and present the data

Applying methods across a project
[Diagram: evaluation effort plotted across the lifecycle (Concept, Design and Development, Implementation, Deployment), with example activities: travel application concepts; presentation of privacy information; indoor navigation prototype testing; lab usability study; live field trials]

Practical issues
- Selection and recruitment of participants
- Number of participants
- Finding evaluators
- Control over the environment and study set-up
- Equipment
- Budget constraints
- Schedule/deadline
- Managing the session
- Stepping back in interviews and focus groups

Ethical issues
Develop an informed consent form
Participants have a right to:
- Know the goals of the study
- Know what will happen to the findings
- Privacy of personal information
- Leave when they wish
- Be treated politely

Example evaluation exercise
You are required to propose an evaluation programme to support the design of new voice technologies to help older adults interact with objects (e.g. furniture, electrical appliances) in their homes

Summary (1)
- There are many issues to consider before conducting an evaluation study
- These include the goals of the study, the approaches and methods to use, practical issues, ethical issues, and how the data will be collected, analysed and presented
- Evaluation and design are closely integrated in user-centred design

References
Chevalier, A., & Kicka, M. (2006). Web designers and web users: Influence of the ergonomic quality of the web site on the information search. International Journal of Human-Computer Studies, 64(10), 1031-1048.
Cockton, G., & Woolrych, A. (2002). Sale must end: should discount methods be cleared off HCI's shelves? Interactions, 9(5), 13-18.
Duh, H. B-L., Tan, G. C. B., & Chen, V. H. (2005). Usability evaluation for mobile device: a comparison of laboratory and field tests. In Proceedings of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 181-186). New York, NY: ACM Press.
Erickson, T. (1995). Notes on design practice: stories and prototypes as catalysts for communication. In J. M. Carroll (Ed.), Scenario-based design: Envisioning work and technology in system development (pp. 37-58). New York, NY: John Wiley & Sons.
Nielsen, J. (2000a). The use and misuse of focus groups. http://www.useit.com/papers
Nielsen, J. (2001). Nielsen's ten usability heuristics. Retrieved from www.useit.com
Preece, J., Rogers, Y., & Sharp, H. (2002). Interaction design: Beyond human-computer interaction. New York, NY: John Wiley & Sons.
Sharp, H., Rogers, Y., & Preece, J. (2007). Interaction design: Beyond human-computer interaction (2nd edition). New York, NY: John Wiley & Sons. Chapters 12, 13, 14 & 15.
Shneiderman, B. (1998). Designing the User Interface (3rd edition). Reading, MA: Addison-Wesley.
Standard Usability Measurement Inventory (SUMI). Retrieved January 2008 from http://sumi.ucc.ie/index.html
WCAG (1999). Web Content Accessibility Guidelines 1.0. Retrieved 28 February 2008 from http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/

Summary (2)
- Different evaluation approaches and methods are often combined in one study
- Triangulation involves using a combination of techniques to gain different perspectives, or analysing data using different techniques
- Dealing with constraints is an important skill for evaluators to develop
