To cite this article: ANDY WHITEFIELD, FRANK WILSON & JOHN DOWELL (1991) A framework for human factors evaluation,
Behaviour & Information Technology, 10:1, 65-79, DOI: 10.1080/01449299108924272
BEHAVIOUR & INFORMATION TECHNOLOGY, 1991, VOL. 10, NO. 1, 65-79
proposed that attempts to clarify what can be done towards which goals and
how it can be done. This highlights and discusses notions of system
performance, of assessment statements, and of assessment methods. The
paper concludes with a discussion of the implications of the framework for
evaluation practice.
1. Introduction
Evaluation is an integral part of any development process, whether for
considerations of cost, safety, production, maintainability, or whatever.
Evaluating the human factors aspects of interactive computer systems is simply
a part (albeit an important part) of this wider evaluation process. In practice,
human factors evaluation has tended to rely heavily on the experience of the
human factors specialist, who has had very little in the way of formal,
theoretical or tool support. This has begun to change recently, with an
increasing interest in understanding the nature of human factors evaluation and
in developing support for it (for instance, Lea 1988, Clegg et al. 1988, Dowell
and Long 1989a, Denley and Long 1990). This paper is intended to extend that
understanding and hence the consequent support.
The paper aims firstly to characterize and comment upon current
approaches to human factors evaluation. This is done in section 2. A
framework for evaluation, based on a definition of evaluation and on an
analysis of its methods, is outlined in section 3. The paper concludes with a
discussion of this framework in section 4.
2. Current approaches
Both personal experience and accounts of evaluations in the literature
suggest that human factors evaluation practice varies widely. This is due in part
to the variability in the processes and products of interactive system
development, but we would suggest it is also partly due to the absence of any
accepted framework for evaluation. Whilst acknowledging this variety in
human factors evaluation practice, it is important to characterize that practice
by identifying its major features. This will allow us to identify what would be
improvements to practice.
0144-929X/91 $3.00 © 1991 Taylor & Francis Ltd.
examples are those proposed by Clegg et al. (1988) and by Ravden and Johnson
(1989). These seek to provide the (non-specialist) practitioner with wide-ranging
questionnaire and checklist tools with which to evaluate a company's current or
potential computer systems. While these will almost certainly be useful to many
people, their orientation towards the non-specialist evaluator, and their
apparently implicit and experiential basis, make them difficult to use in
characterizing human factors evaluation practice.
The third area of research concerned with improving human factors
evaluation has been as part of a general approach to incorporating human factors
in system development known as usability engineering or design for usability.
These terms cover loosely-related work by several human factors researchers in
both industry and academia. Although the approach concerns systems
development in general and is therefore not concerned solely with evaluation, it
has included a number of explicit and implicit proposals about it. We shall use
Shackel (1986) as our principal reference for the work, since he attempts to
provide an integration and overview of the approach. We shall also use his
favoured term, design for usability. The reader should bear in mind, however,
that design for usability is not a single, unified, fully-developed and widely-
adopted approach that is completely and clearly represented by Shackel's paper.
Another view is presented by Whiteside et al. (1988).
Shackel summarizes the fundamental features of design for usability as:
User centred design - focused from the start on users and tasks;
Participative design - with users as members of the design team;
Experimental design - with formal tests of usability in pilot trials,
simulations, and full prototype evaluations;
Iterative design - design, test and measure, and redesign as a regular cycle
until results satisfy the usability specification;
User supportive design - help systems, manuals, training, and selection
should be treated as part of the design.
Evaluation within this approach is characterized by a reliance on iterative
design, on experimental testing with users, and on usability criteria. As Shackel
(1986, p. 54) puts it:
perhaps the most important feature of this process is that the usability goals
thus set become criteria by which to test the design as it evolves and to
improve it by iterative redesign. Such tests are embodied first in trials of early
versions of the design and later in formal evaluations of prototypes.
The second point about testing with users is whether it must be experimental.
It is not clear just what Shackel means by experimental. By talking about
comparative testing and 'formal and empirical' studies, one infers that he means
full control and manipulation of relevant variables, or something approaching it.
We prefer to adopt the view that there is a wide range of methods for user testing,
including informal observation, interviews, and questionnaires as well as
controlled experimentation, and all of these can be used for the purposes of
evaluation. To equate evaluation with fully controlled experiments is to confuse
the goal with a method of reaching the goal.
The third characteristic of evaluation within design for usability, and one on
which it lays strong emphasis, is the use of usability goals and criteria. Shackel
defines usability in terms of the operational factors of effectiveness, learnability,
flexibility, and attitude. Learnability, for example, is defined operationally as the
users' ability to perform a set range of tasks within some specified time frame,
based on some specified amount of training and user support, and within some
specified relearning time each time for intermittent users. Usability goals are set
point when they say 'Not all, or even most, aspects of usability experience can be
given operationally defined criteria and measured.' (p. 814).
Implicit in this criticism of usability goals and criteria is the view that the
notions of usability and performance within design for usability are too poorly
specified (and insufficiently distinguished) to fully support evaluation practice.
Essentially the view of usability is inflexible and its derivation is unclear. Thus it
follows from Shackel's statement (1986, p. 52) that 'usability can be defined in
terms of the interaction between user, task and system in the environment', that
usability is not an inherent property of a computer but is a property of a
particular system interaction. The implication of this is that usability must be
constructed differently for different system configurations and instances, but in
this respect Shackel's definition does not allow sufficient variation. The
definition does two things: it identifies four factors (effectiveness, learnability,
flexibility, attitude) and it specifies them in terms of performance measures
(time, errors, percentage of completed tasks, amount of training, etc). One
problem is that the factors are not explicitly derived and there is no clear
argument (other than experience) for this set as opposed to any other. Other
factors would certainly be possible. Sutcliffe (1988), for example, has proposed
the additional factor of coverage. Second, the particular performance
specification of the factors may be unnecessarily constraining. For example, why
should learnability always be specified in terms of time? Why not errors, or
memory recall, or ... ?
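The operational style of criterion under discussion can be sketched in code. In the sketch below (all factor names, measures, and thresholds are invented for illustration), nothing ties a factor such as learnability to any single measure; time, errors, or recall would serve equally well:

```python
from dataclasses import dataclass

@dataclass
class UsabilityCriterion:
    """One operationally defined usability goal (illustrative only)."""
    factor: str             # e.g., "learnability"
    measure: str            # e.g., "minutes of training before task set completed"
    target: float           # the specified criterion value
    higher_is_better: bool = False

    def met_by(self, observed: float) -> bool:
        """True if the observed measurement satisfies the criterion."""
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target

# Hypothetical criteria in the style of Shackel's operational definitions:
criteria = [
    UsabilityCriterion("learnability", "minutes of training before task set completed", 60.0),
    UsabilityCriterion("effectiveness", "% of task set completed without error", 90.0,
                       higher_is_better=True),
]

observed = {"learnability": 45.0, "effectiveness": 93.5}
results = {c.factor: c.met_by(observed[c.factor]) for c in criteria}
```

Because the measure is just a labelled field, a learnability criterion stated in errors or recall rather than time needs no change to the structure.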
Whiteside et al. (1988) have a better approach here, in that they do not limit
in advance the usability factors or their measurement criteria. On the other
hand, they do not limit the factors ('attributes') at all and provide no assistance
in identifying suitable factors for any evaluation - an approach equally unhelpful
in its own way.
It has been suggested (Brooke 1990) that some of the proponents of the use of
usability goals and criteria were, from an early stage, aware of many of the
criticisms made above. Nevertheless, they decided that these problems were
acceptable given their goal of establishing human factors considerations within
system development on the same (quantitative) terms as those of other factors.
While we accept this as a motivation, and accept that considering human factors
in these terms may well have been beneficial rather than detrimental, this does
not of course mean that the criticisms above are invalid.
This completes the review of the design for usability approach to evaluation.
Although this section has concentrated very much on the one approach, it has
not been the intention simply to criticize this particular approach. Rather our
intention has been to examine critically current approaches to evaluation, of
which design for usability is the clearest and most prevalent example, with the
aim of improving our understanding of evaluation, and hence its practice. Our
attempt to do this is presented in the next section.
3. A framework for evaluation

3.1. Definition
In an attempt to make explicit the concepts and relations of evaluation, we
begin by proposing a definition. Definitions are not absolute but are fashioned
for particular purposes; the purpose here is to characterize all instances of
evaluation so as to identify important commonalities. The definition is a re-
expression of that used by Long and Whitefield (1986): human factors evaluation
is an assessment of the conformity between a system's performance and its desired
performance. The various terms in the definition will be discussed in turn.
A system in the ergonomic sense (and as used throughout this paper) is a user and a computer
engaged upon some task within an environment. A set of hardware and software
components is therefore not a system in this sense but simply a computer. Note
that this interpretation requires that a complete evaluation of human-computer
interaction within a system must consider not just users and how they perform
but also computers and how they perform. One must understand how the
structures and behaviours of software and hardware determine performance as
well as how the users' mental and physical structures and behaviours do the
same.
The definition of course requires clarification of the concept of performance.
In their discussion of HCI engineering, Dowell and Long (1989b) define
performance as the system's effectiveness in accomplishing tasks. It is 'a two-
factor concept expressed as the quality of task product (i.e., how well the task's
outcome meets its goal) and the incurred resource costs (i.e., the resources
employed by both the user and the computer in accomplishing the task)'. A most
effective system would minimize the resource costs in performing a specified
task with a given quality of product.
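As a rough illustration of this two-factor notion of performance, the following sketch compares two hypothetical system configurations that produce the same quality of task product; the units and figures are invented, and user costs lump together the structural and behavioural costs distinguished below:

```python
from dataclasses import dataclass

@dataclass
class Performance:
    """Two-factor performance: task product quality plus resource costs."""
    product_quality: float   # how well the task outcome meets its goal (0..1)
    user_costs: float        # human resource costs (arbitrary units)
    computer_costs: float    # undifferentiated computer processing cost

    @property
    def total_costs(self) -> float:
        return self.user_costs + self.computer_costs

def more_effective(a: Performance, b: Performance) -> bool:
    """With product quality held constant, the more effective system
    is the one incurring lower total resource costs."""
    assert a.product_quality == b.product_quality, "compare at equal product quality"
    return a.total_costs < b.total_costs

# Two hypothetical configurations producing the same quality of product:
sys_a = Performance(product_quality=0.9, user_costs=12.0, computer_costs=3.0)
sys_b = Performance(product_quality=0.9, user_costs=8.0, computer_costs=5.0)
```

The assertion makes the comparison explicit: effectiveness is only defined here relative to a given quality of product.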
Dowell and Long (1989b) state that the resource costs incurred by the human
are of two kinds: structural and behavioural. Structural human resource costs are
the costs incurred in establishing and maintaining the mental and physical
structures that support behaviour during the task. Such costs are typically
incurred in educating and training users in the relevant skills and knowledge.
Behavioural human resource costs are the costs incurred in recruiting structures
to express behaviour during the task. They are both mental costs (e.g., in
memorizing, planning, and decision-making) and physical costs (e.g., in the use
of keyboards and pointing devices). Both mental structural and mental
behavioural costs can be differentiated into cognitive, conative, and affective
costs, relating to knowledge, motivation, and emotion respectively. Because they
concentrate on human factors (and not on software engineering) within HCI,
Dowell and Long treat computer resource costs simply as an undifferentiated
processing cost.
While this treatment of resource costs should be sufficient for our current
purposes, it is worth noting that further decomposition would almost certainly
be required for any instance of evaluation (e.g., cognitive structural costs could
be divided into the various classes of knowledge recruited for the task).
Decomposition will also be required for computer resource costs.
The performance of a system is determined by its behaviour. The system's
behaviour comprises the interacting behaviours of the user (e.g., sequencing
subtasks, remembering command syntax, pressing keys) and the computer (e.g.,
calculating balances, parsing inputs, displaying characters). Both user and
computer will have important behavioural limitations that constrain
performance (e.g., memory search mechanisms).
This notion of performance and behaviour therefore distinguishes what is
achieved from how it is achieved (the same performance can be achieved by
different behaviours) and distinguishes the quality of the task product from how
effectively it is produced (the same quality can be produced by systems of
different effectiveness). Both these distinctions help for evaluation purposes by
focusing on exactly what is to be assessed.
The notion of desired performance has similarities to that of specifying
3.3. Methods
The system development literature contains many discussions of what is
meant by method. There seems to be at least some basic agreement that a
method must contain both procedure and notation - neither is sufficient on its
own. For our purposes, we wish to consider as a method any means of
investigating human-computer systems that would support an assessment of
performance conformity. Thus a method must involve the system in some
form, and it must address system behaviour or task product quality and
resource costs. This is of course a very broad sense of the term method. It
includes on the one hand notations with little procedural content (for example,
theories, models, and representations), and on the other hand procedures with
little notational content (for example, knowledge of scientific and engineering
methodology).
Clearly this view of methods means that the number of methods to be
considered is very large. It is necessary to identify classes of method, both for
an adequate description and as a step towards the selection of appropriate
[Figure: the four classes of evaluation method, formed by crossing the presence of each system component in the evaluation - representational or real - for the user and for the computer:]

                                USER
                     Representational      Real
 COMPUTER
   Representational  Analytic methods      User reports
   Real              Specialist reports    Observational methods
and simulations all count as a real computer presence in the evaluation. They
utilize the medium of the computer to demonstrate, to varying degrees of
completeness and fidelity, how the computer under development will appear and
function, and they could all be set before users to interact with in evaluation
tests. On the other hand, specifications and notational models are
representational computer presences, as are users' mental representations of the
computer. They utilize symbolic representation in a non-computer medium and
require transformation or manipulation to demonstrate computer appearance
and functionality. Thus asking users to remember interface features, as might be
done in interviews or questionnaires, involves their symbolic mental
representations of the computer and not the real computer. Similarly, a code
inspection to check that a program meets a functional requirement involves
representations of the computer only and not the real computer.
Applied to the user component of the system, real similarly means actual
users or approximations of them. Thus experiments involving subjects from the
system's actual user population or from a different population (for example,
students) are both observational methods with the presences of real users
(although the former group would provide more accurate and reliable results). In
contrast, the presence of representational users means (explicit or implicit)
descriptions or models of the users. Thus some analytic methods contain explicit
models of users, and human factors engineers performing a specialist report
operate with written or implicit representations of users.
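The two-way classification above reduces to a simple lookup from the presence of each component to a method class; a minimal sketch:

```python
# The four method classes as a lookup on how each system component is
# present in the evaluation: "real" or "representational".
METHOD_CLASSES = {
    ("representational", "representational"): "analytic methods",
    ("representational", "real"): "specialist reports",   # real computer, represented users
    ("real", "representational"): "user reports",         # real users, represented computer
    ("real", "real"): "observational methods",
}

def method_class(user_presence: str, computer_presence: str) -> str:
    """Class of evaluation method for a given pair of presences."""
    return METHOD_CLASSES[(user_presence, computer_presence)]
```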
Given this basis for distinguishing classes of methods, we can now consider
each class in turn. To help in understanding the use of the methods for
evaluation, a number of comments about characteristics of the methods will be
made. The characteristics selected for comment are mostly drawn from the
operational characteristics of methods, as identified by Wilson and Whitefield
(1989). The comments themselves are based both on personal experience with
the methods and on published accounts of evaluations. More detailed
descriptions of the various methods can be found in a number of places, for
example, Meister (1986) and Lea (1988) discuss user reports and observational
methods; Reisner (1983) and Whitefield (1990) discuss analytic methods;
Hammond et al. (1985) discuss specialist reports.
It should be clear that analytic methods of evaluation involve the use of
representations of both system components - the user and the computer.
Normally this would mean manipulating models of the system to predict
performance. Examples in the human factors literature are the Keystroke Level
Model (Card et al. 1983) and the Blackboard Design Model (Whitefield 1989,
1990). The principal advantages of such methods are that they can be used early
in development (before any real computers exist), require few resources to apply
(including neither real users nor real computers), and are potentially fast. The
current disadvantages are that suitable modelling techniques are still under
investigation and development, and consequently the validity and reliability of
the methods are uncertain.
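As an illustration of such analytic modelling, a Keystroke Level Model prediction can be sketched as a sum of operator times. The values below are the commonly quoted Card et al. estimates, and the sketch omits the model's rules for deciding where mental-preparation operators occur:

```python
# Keystroke Level Model sketch: predict expert, error-free task time as a
# sum of operator times (seconds). A real analysis also applies the
# model's rules for placing M (mental preparation) operators.
KLM_OPERATORS = {
    "K": 0.2,    # press a key (skilled typist)
    "P": 1.1,    # point at a target with a mouse
    "H": 0.4,    # home hands between keyboard and mouse
    "M": 1.35,   # mental preparation
}

def klm_time(sequence: str) -> float:
    """Predicted time in seconds for an operator sequence such as 'HPMK'."""
    return round(sum(KLM_OPERATORS[op] for op in sequence), 2)

# e.g., point at a field, home to the keyboard, prepare, type four keys:
t = klm_time("PHMKKKK")
```

Such a prediction needs no real users and no real computer, which is precisely what makes analytic methods usable so early in development.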
Specialist reports involve one or more people who are not real users assessing
a version of the real computer. A typical method would be for a human factors
specialist to evaluate the screen design of a prototype version, making use of
relevant handbooks, guidelines, and experience. The specialist's use of the
computer could be unstructured or structured around particular tasks. Real users
are not involved, but the specialist will be using representations of them both
mentally and in the reference works. Note that these methods can be used by
other specialists, such as application domain experts or software engineers
(although to perform a human factors evaluation these other specialists would
need to focus on human factors issues). The advantages are that the methods are
relatively fast, use few resources, provide an integrated view, and can address a
wide range of behaviour. On the other hand, their reliability will vary between
specialists, and since their assessments are inevitably somewhat subjective, their
reports are likely to be incomplete, biased, and difficult to validate. A
comparison of specialist reports with observational methods is reported in
Hammond et al. (1985).
User reports involve real users but not a real computer. They typically involve
the use of questionnaires, interviews, or rating methods (which Lea (1988) refers
to collectively as survey methods). The methods are used to obtain data or
opinions from the users on some aspect of the system, where the users must rely
on their knowledge of the computer rather than interacting with it directly. Because
the data are subjective they are open to a number of inaccuracies. However,
relatively formal techniques for data collection and analysis are available, the
methods can be relatively quick, and it is the real users who are being involved.
Lea describes survey methods as indispensable.
The final class of evaluation methods is observational methods, which involve
a real system, i.e., real users interacting with a real computer. The set of such
methods is itself very large, ranging from informal observation of a single user to
full-scale experimentation with appropriate numbers of subjects and control of
variables. Although this range makes it difficult to generalize, observational
methods have the major advantage of investigating the performance of the real
system and ought therefore to reflect that performance more accurately than the
other methods. Simple observational methods (which can be used in conjunction
with user reports such as interviews) can be easy to conduct, and provide a
wealth of detailed data, but they can be difficult to integrate and interpret, and
time-consuming to analyse fully. More formal experimental methods provide
detailed, quantitative, and reliable answers to particular questions, but they tend
to be slow to conduct, use many resources, and require expertise in experimental
design as well as in interface issues.
This discussion of the advantages and disadvantages of the various classes of
method ends the description of the evaluation framework. The following section
concludes the paper with a discussion of some aspects of the framework.
4. Discussion
Given the framework described in section 3, this section considers some of its
implications and discusses its relationship with improvements to evaluation
practice.
Conceiving of evaluation as in the above framework has a number of
implications and consequences. Some of these have been identified as the
framework has been described (for example, that purely presentation evaluations
do not address the behaviour underlying performance; that the quality of the
task product should be distinguished from how effectively it is produced; and
that representations of users can serve as a means of including users in
development). However, it is worth identifying and discussing some of the major
implications of the framework that have not yet been spelled out.
The first point is that evaluation can be done during system development at
any time after the first description of the proposed system is produced (whether
this be a formal specification or a prototype or whatever). Of course the levels of
description of the system and its desired performance would need to be
comparable - thus highly abstract descriptions are unlikely to support
assessments against detailed low-level specifications of desired performance -
but some forms of evaluation could be conducted from an early stage. Whether
evaluations early in development can be as accurate as those later in
development is currently a question for empirical research (see, for example,
Dowell and Long 1989a).
The second important implication of the framework should be fairly clear:
evaluations address particular aspects of system performance and not
performance in general. Any evaluation requires a focus on certain system
behaviours and certain task products - it must be tailored to the given system
and to the intended goals of that system. It is therefore not the case that a
human factors evaluation can address all human factors aspects of a system -
something that many developers appear to assume, and something
that is to some extent assumed by the operational definition of usability in the
design for usability approach. An important requirement for the future
development of sound human factors evaluation practice is to construct tools
and methods that will enable evaluators first to select an appropriate focus for a
system evaluation, and second to recruit methods that will allow that focus to
be addressed.
considered by the developers in the light of all the other non-human factors
concerns that affect the system. The relationship between evaluation and
generation in design is a very close and complex one (Whitefield 1989) but one
should not make the mistake of requiring solutions as well as problems to arise
from an evaluation.
One aim of the framework described in section 3 is to contribute to
improvements in evaluation practice. How might it do this? First, we must point
out that the framework is not a prescriptive procedure for how to carry out an
evaluation. It might provide support for prescriptions (for example, Wilson and
Whitefield (1989) use part of the framework to discuss the selection and
configuration of appropriate methods) but it does not itself prescribe practice
procedure. Its main contribution to practice therefore, is as a means of clarifying
what can be done towards which goals and how it can be done. Thus it allows one
to identify the possible types of evaluation statements, the areas of performance
and behaviour that might be addressed, and the range of available methods. Such
clarifications support the appropriate recruitment and allocation of evaluation
resources.
Unfortunately it is extremely hard to demonstrate that the framework does
indeed lead to such improvements in practice. For one thing, it is a novel
proposal and therefore has had little opportunity to influence practice. For
another, any convincing empirical test of the issue would require excessive
resources - a wide range of systems, of evaluators, of methods, of evaluation
statements, and of performance variables. We have no plans to attempt such a
test.
The case that the framework does, or could contribute to improvements is
therefore threefold. First, we have found the ideas in the framework to be useful
for our own evaluation practice (e.g., Dowell and Long 1989a, Sutcliffe and
Whitefield 1989, Wilson 1989). Second, the framework does enable one to
identify problems with, or omissions from, particular evaluations or evaluation
approaches, and as such it suggests potential areas for improvement. Third, an
important demonstration would be that others choose to recruit (all or part of)
the framework in their own practice; this paper is part of an attempt to make the
framework available for that purpose.
Acknowledgments
This work was done while the authors were working on projects funded by the
Department of Trade and Industry, and by the Alvey Programme (projects
MMI/122 and MMI/151). We would like to thank our colleagues at the
Ergonomics Unit for discussions and for comments on an earlier draft.
References
BELLOTTI, V. 1988, Implications of current design practice for the use of HCI techniques.
In D. M. Jones and R. Winder (eds), People and Computers IV. Proceedings of HCI
88 (Cambridge: Cambridge University Press).
BROOKE, J. B. 1986, Usability engineering in office product development. In M. D.
Harrison and A. F. Monk (eds), People and Computers: Designing For Usability.
Proceedings of HCI 86 (Cambridge: Cambridge University Press).
BROOKE, J. B. 1990, Personal communication.
CARD, S. K., MORAN, T. P. and NEWELL, A. 1983, The Psychology of Human-Computer
Interaction (Hillsdale, New Jersey: Lawrence Erlbaum Associates).
CLEGG, C., WARR, P., GREEN, T., MONK, A., KEMP, N., ALLISON, G., LANSDALE, M.,
POTTS, C., SELL, R. and COLE, I. 1988, People and Computers: How To Evaluate
Your Company's New Technology (Chichester: Ellis Horwood).
DENLEY, I. and LONG, J. B. 1990, A framework for evaluation practice. In E. J. Lovesey
(ed.), Contemporary Ergonomics 1990 (London: Taylor & Francis).
DILLON, A. P. 1988, The role of usability labs in system design. In E. D. Megaw (ed.),
Contemporary Ergonomics 1988 (London: Taylor & Francis).
DOWELL, J. and LONG, J. B. 1989a, The 'late' evaluation of a messaging system design and
the target for 'early' evaluation methods. In A. Sutcliffe and L. Macauley (eds),
People And Computers V. Proceedings of HCI 89 (Cambridge: Cambridge
University Press).
DOWELL, J. and LONG, J. B. 1989b, Towards a conception for an engineering discipline of
human factors. Ergonomics, 32, 1513-1535.
GARDNER, A. and MCKENZIE, J. 1988, Human Factors Guidelines For The Design Of
Computer-Based Systems. Ministry of Defence and Department of Trade and
Industry.
HAMMOND, N., JORGENSEN, A., MACLEAN, A., BARNARD, P. and LONG, J. 1983, Design
practice and interface usability: evidence from interviews with designers. In
Proceedings of CHI 83 (New York: ACM), 40-44.
HAMMOND, N., HINTON, G., BARNARD, P., MACLEAN, A., LONG, J. and WHITEFIELD, A.
1985, Evaluating the interface of a document processor: a comparison of expert
judgement and user observation. In B. Shackel (ed.), Human-Computer Interaction -
INTERACT '84 (Amsterdam: North-Holland).
HOWARD, S. and MURRAY, D. M. 1987, A taxonomy of evaluation techniques for HCI. In
H-J. Bullinger and B. Shackel (eds), Human-Computer Interaction INTERACT '87
(Amsterdam: Elsevier Science Publishers).
JORGENSEN, A. 1990, Thinking-aloud in user interface design: a method promoting
cognitive ergonomics. Ergonomics, 33, 501-507.
KARAT, J. 1988, Software evaluation methodologies. In M. Helander (ed.), Handbook Of
Human-Computer Interaction (Amsterdam: Elsevier Science Publishers).
LEA, M. 1988, Evaluating user interface designs. In T. Rubin, User Interface Design For
Computer Systems (Chichester: Ellis Horwood).
LONG, J. B. and WHITEFIELD, A. D. 1986, Evaluating Interactive Systems. Tutorial given at
HCI '86, University of York, September 1986.
MEISTER, D. 1986, Human Factors Testing and Evaluation (New York: Elsevier).
RAVDEN, S. and JOHNSON, G. 1989, Evaluating Usability Of Human-Computer Interfaces
(Chichester: Ellis Horwood).
REISNER, P. 1983, Analytic tools for human factors of software. In A. Blaser and M.
Zoeppritz (eds), Lecture Notes in Computer Science No. 150 (Berlin: Springer-
Verlag).
WHITESIDE, J., BENNETT, J. and HOLTZBLATT, K. 1988, Usability engineering: our experience
and evolution. In M. Helander (ed.), Handbook Of Human-Computer Interaction
(Amsterdam: Elsevier Science Publishers).
WILSON, F. 1988, Human factors evaluations in the development and maintenance of
interactive computer systems. London HCI Centre Report LHC/EXT/ALV/EV/1.1F.
WILSON, F. 1989, Case studies in interactive systems design and evaluation: I. The
real-time subtitling system. London HCI Centre Report LHC/EXT/ALV/EV/4.1F.
WILSON, F. and WHITEFIELD, A. D. 1989, Interactive systems evaluation: mapping
methods to contexts. In E. Megaw (ed.), Contemporary Ergonomics 1989 (London:
Taylor & Francis).