
Measures and Interpretations of Vigilance Performance:

Evidence against the Detection Criterion

J. D. Balakrishnan, Purdue University, West Lafayette, Indiana

Operators’ performance in a vigilance task is often assumed to depend on their choice of a detection criterion. When the signal rate is low this criterion is set high, causing the hit and false alarm rates to be low. With increasing time on task the criterion presumably tends to increase even further, thereby further decreasing the hit and false alarm rates. Virtually all of the empirical evidence for this simple interpretation is based on estimates of the bias measure β from signal detection theory. In this article, I describe a new approach to studying decision making that does not require the technical assumptions of signal detection theory. The results of this new analysis suggest that the detection criterion is never biased toward either response, even when the signal rate is low and the time on task is long. Two modifications of the signal detection theory framework are considered to account for this seemingly paradoxical result. The first assumes that the signal rate affects the relative sizes of the variances of the information distributions; the second assumes that the signal rate affects the logic of the operator’s stopping rule. Actual or potential applications of this research include the improved training and performance assessment of operators in areas such as product quality control, air traffic control, and medical and clinical diagnosis.

Editor’s note: The following article was the subject of considerable – yet constructive and informative – debate throughout the review process. The reason is that it challenges certain widely accepted assumptions associated with estimates derived from the theory of signal detectability (TSD) and proposes an alternative approach. Given that TSD is one of the more robust human performance models we have, and has been widely used in many task contexts including vigilance, it is not surprising that such a challenge would generate controversy. Therefore, I felt that the article should be published and widely read so that you, the readers, could form your own opinion. We must always remember that science is an endless search for better explanations, not final answers. We must continually challenge convention if we are to move forward.

INTRODUCTION

In many operational and applied settings, fast and accurate decisions about the status of a system (e.g., the safety conditions of an aircraft, a surgery patient’s vital signs) must be made by human operators on the basis of data presented on an information display of some kind (e.g., a video terminal). The conditions requiring action are often relatively rare and unpredictable, making it necessary for the operator to maintain a high level of alertness over an extended period. Any breakdown in this vigilance or sustained attention to the display could cause the operator to overlook or respond too slowly to a mission-critical event. A classic example of the failure of sustained attention in an applied setting was the tendency of airborne British radar observers during
Requests for reprints should be sent to J. D. Balakrishnan, Department of Psychological Sciences, Purdue University, West
Lafayette, IN 47907; jdb@psych.purdue.edu. HUMAN FACTORS, Vol. 40, No. 4, December 1998, pp. 601–623.
Copyright © 1998, Human Factors and Ergonomics Society. All rights reserved.

Downloaded from hfs.sagepub.com at NORTHWESTERN UNIV LIBRARY on June 4, 2016


602 December 1998 – Human Factors

World War II to begin to miss the radar signals indicating the presence of enemy submarines in the area after about 30 min on watch (N. H. Mackworth, 1948, 1961). This so-called vigilance decrement has since been documented in a variety of real-world and laboratory settings (e.g., Davies & Parasuraman, 1982; See, Howe, Warm, & Dember, 1995; Warm & Jerison, 1984).

Typically, when a vigilance decrement is observed, it is accompanied by a decline in the number of incorrect responses to nonsignal events, or false alarms. Both effects are consistent with the notion that the operator’s attention level is somehow impaired by the passage of time, but both results could also be attributed to a reduction in the operator’s willingness to report a signal event; this could occur for a number of reasons. One of the main objectives of empirical vigilance research, therefore, has been to establish a general set of guidelines, or a “vigilance taxonomy,” that can be used to predict when and how the vigilance decrement will be manifested (e.g., Parasuraman, 1979; Parasuraman & Davies, 1977; Parasuraman & Mouloua, 1987; Parasuraman, Warm, & Dember, 1987; See et al., 1995; Warm, 1984).

For many vigilance researchers, the preferred method of separating the effects of attentional capacity on performance from the effects of an operator’s attitudes or performance strategy is to compute the sensitivity (d′) and response bias (β) statistics of signal detection theory (e.g., Davies & Parasuraman, 1982; Green & Swets, 1974; Proctor & Van Zandt, 1994). According to this general framework for discrimination behavior, the series of mental events that takes place when an operator monitors an information display over an extended period can be broken up into pairs of temporally distinct operations: encoding and decision making.

An overt response to a signal occurs when the encoded information, combined in some manner with the operator’s objectives and knowledge of the task, exceeds a detection criterion. This criterion is under the control of the operator, and its value ultimately defines the efficiency (or optimality) and the bias of the operator’s decision-making strategy. Some of the most widely accepted views of the general properties and limitations of human vigilance, including the effect of the signal rate on the operator’s detection rate and the change in detection rate over time, depend heavily on the notion of a detection criterion.

In its most general form (i.e., as a nonparametric description of the two stages involved in discrimination), the signal detection theory model is nonfalsifiable, but it is also uninformative about decision-making aspects of performance. In order to measure the contribution of the decision-making process, some additional, technical assumptions about the two processes (e.g., normality of the encoding distributions and a single detection criterion) are required. These assumptions can be – and have been – challenged for a number of reasons (e.g., Green & Swets, 1974; Lockhart & Murdock, 1970), and a number of alternatives to d′ and β have been proposed and studied (e.g., Macmillan & Creelman, 1990; See, Warm, Dember, & Howe, 1997; Swets, 1986). Few theorists, however, have found any serious reason to question the general concept of the detection criterion.

In this article, I review and apply a new set of “distribution-free” performance measures that I have recently developed to study decision-making processes in vigilance and other kinds of two-choice classification tasks (Balakrishnan, 1998; Balakrishnan & Ratcliff, 1996). These measures are based on the same fundamental principles of signal detection theory (i.e., statistical decision making) and, hence, the same concept of response bias (i.e., a description of the mapping between information states and responses). However, instead of making assumptions about the shapes of the encoding distributions, the new measures take advantage of the extra information about the decision-making process provided by subjective confidence judgments.

The results of these new analyses suggest that operators do not change the value of a detection criterion in response to changes in the signal rate in a vigilance setting. The fact that the signal rate does have a substantial impact on the operator’s detection rate must therefore be attributed to an entirely different type of response bias. On this basis, I argue that the significance of some of the most
DECISION PROCESSES IN VIGILANCE 603

well-established findings in the vigilance literature – including the supposed conservatism of the criterion placement and the supposed criterion increment with time on task – need to be reexamined.

In the first part of the paper, I review the basic concepts and measures of signal detection theory and their application to vigilance. I then describe the new distribution-free measures and use these to show that the detection criterion account of vigilance makes a testable prediction about subjective confidence judgments that is consistently contradicted by the data. This result reopens the question of why the operator’s detection rate is directly tied to the signal rate in vigilance. Accordingly, in the second part of the paper, I describe two potential modifications of the classical signal detection theory framework that would be consistent with both the effect of signal rate on detection rate and the observed properties of subjective confidence. In the final section, I consider some of the implications of rejecting the detection criterion construct for researchers whose main purpose is to measure and compare the performance levels of human operators under different experimental conditions.

BACKGROUND

In order to mimic some of the more common features of real-world vigilance settings, experimental studies of vigilance typically require the participant to monitor a series of distinct stimulus events presented at either fixed or irregular intervals. Most of these events are no-signal events, which require no overt response by the participant. Occasionally, however, a signal stimulus occurs, and the participant is asked to report this event. Under most conditions, participants occasionally respond incorrectly to a no-signal event (false alarms) or fail to respond to a signal event (misses). The rates of these two kinds of errors usually provide the primary basis for measuring and interpreting performance.

One of the more important dimensions of the original taxonomy developed by Parasuraman and Davies (1977) to organize the vigilance literature is the type of discrimination task carried out over the course of the vigilance period. In the successive form, one of two stimulus objects is presented, and the participant must refer to his or her memory of the objects in order to identify whether this object represents a signal or a no-signal event. In the simultaneous form, each event is defined by the presentation of two stimulus objects. These two objects are identical in the no-signal condition and different in some way (e.g., in size) in the signal condition. Given that the two objects are visible simultaneously, the task does not require any long-term memory for the physical properties of the two stimuli.

The second important dimension identified by Parasuraman and Davies (1977) is the average time between stimulus presentations (i.e., the event rate). By combining the two dimensions (type of task and event rate) Parasuraman and Davies found reasonably consistent evidence for decreases in perceptual sensitivity (i.e., a sensitivity decrement) in successive discrimination tasks when the event rate was high and for criterion effects under other conditions. More recent work (e.g., Deaton & Parasuraman, 1993; Parasuraman et al., 1987; See et al., 1995; Warm, Dember, Murphy, & Dittmar, 1992) has identified other variables (e.g., signal discriminability and cognitive vs. sensory tasks) that can influence the type of decrement observed. A detailed review and meta-analysis of previous findings is given by See et al. (1995).

Statistical Decision Making and Signal Detection Theory

When the signal detection theory measures are used to analyze and interpret vigilance data, the main assumption involved is that vigilance can be formally represented as a series of simple, two-step hypothesis tests. In Step 1, the operator collects some information from the display, and in Step 2, a strategy or rule is used to assign the resulting internal information state to a judgment (e.g., a detection response). Figure 1 illustrates the additional assumptions of the model associated with the d′ and β statistics.

In the two panels of the figure, the stimulus conditions – signal and no signal – represent identical perceptual sensitivity (equal spacing of the distributions). In the upper panel the

Figure 1. The effects of signal rate in the normal, equal variance model of signal detection theory. In the lower panel, the decision rule is biased toward the no-signal response but not enough to maximize the percentage of correct responses.

response criterion (XC) is unbiased, and in the lower panel it is biased toward the no-signal response. The bias measure β is defined as the ratio of the equal variance normal distributions (signal to no signal) at the point XC (e.g., Warm & Jerison, 1984). The interval labeled v in the lower panel of the figure represents a set of “biased mental states” – that is, in this region the states are mapped to a no-signal response, and the height of the no-signal distribution, fN(t), is less than that of the signal distribution, fS(t).

In order to appreciate the significance of the results that are described later, several implicit properties of this two-stage model must be clearly understood. First, note that bias and optimality are completely independent constructs in the model: Depending on the prior probabilities (or rates) of the no-signal and signal events, the decision rules represented in Figure 1 may or may not be optimal with respect to accuracy (i.e., they may or may not maximize the overall proportion of correct discrimination responses). Assuming that the signal occurs on one of four trials on average, the region labeled w in the lower panel of Figure 1 is a set of mental states for which the decision rule is suboptimal.

It is also important to recognize that the distributions in Figure 1 are assumed to represent the true or objective likelihoods (probability densities) of the states induced by the stimuli. Thus a decision rule is or is not biased depending on the mapping between mental states and their true likelihoods under the two stimulus conditions. These likelihood values presumably depend on both external, physical properties of the stimuli and internal properties of the perceptual system (i.e., on the noise level). Whether or not the decision maker consciously attempts to use a biased or suboptimal decision rule is an entirely different issue (i.e., an issue of the correct psychological interpretation of the observed decision rule; Balakrishnan, 1997).
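
For orientation, the classical statistics discussed above can be computed directly from a hit rate and a false alarm rate. The following Python sketch assumes the normal, equal variance model of Figure 1, in which d′ = z(H) – z(F) and β is the ratio of the signal density to the no-signal density at the criterion. The function names `sdt_measures` and `accuracy_maximizing_beta` are illustrative only and do not come from the article. The second function uses the standard result that, under this model, accuracy is maximized when β equals the ratio of the no-signal rate to the signal rate.

```python
from math import exp
from statistics import NormalDist


def sdt_measures(hit_rate, fa_rate):
    """Return (d_prime, beta) under the normal, equal variance model.

    Assumes both rates lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf           # z-score transform
    z_h, z_f = z(hit_rate), z(fa_rate)
    d_prime = z_h - z_f                # distance between the two means
    # Likelihood ratio f_S(x_c) / f_N(x_c) at the criterion x_c = -z_f.
    beta = exp((z_f ** 2 - z_h ** 2) / 2)
    return d_prime, beta


def accuracy_maximizing_beta(signal_rate):
    """Value of beta that maximizes percent correct for a given signal rate."""
    return (1 - signal_rate) / signal_rate


if __name__ == "__main__":
    d_prime, beta = sdt_measures(hit_rate=0.50, fa_rate=0.10)
    print(round(d_prime, 3), round(beta, 3))
    print(accuracy_maximizing_beta(0.25))
```

With a signal on one of four trials, the accuracy-maximizing value is 3. A hit rate of .50 with a false alarm rate of .10 gives a β of about 2.3, which is biased toward the no-signal response but falls short of that optimum, as in the lower panel of Figure 1.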

Finally, given that the effect of a change in decision-making variables (e.g., the signal rate) is represented in the model as a shift in the location of the detection criterion, it is important to notice that the mental states nearest to the criterion are the first to be remapped to a different response when these decision variables are manipulated. That is, a small change in the signal rate or payoff structure does not lead to an arbitrary remapping of states to responses but, rather, to a remapping of the states near the criterion. An important implicit assumption of this model, therefore, is that the decision process considers the states near the criterion to be associated with weak evidence or high uncertainty. More generally, the decision maker’s perception of the strength of the evidence (i.e., subjective confidence) is a monotonic, increasing function of the distance of the perceptual state from the detection criterion (Balakrishnan & Ratcliff, 1996). If this were not so, then the parameters of the model could still be estimated, but the theory could not explain why the degree of the criterion shift is a direct function of the degree of the bias manipulation.

The Ratings Paradigm

In a popular variation of the classical yes-no detection paradigm, the all-or-none detection response is replaced by a graded response (i.e., a confidence rating on a single, bipolar scale). High confidence in a no-signal response is indicated by lower values on the scale, and high confidence in a signal response is indicated by higher values on the scale. Given that the participant’s confidence state should be a monotonic function of the encoding effect of the stimulus, asking participants to rate their confidence is equivalent to asking them to set additional criteria on the continuum of encoding effects (e.g., Dorfman & Alf, 1969).

The ratings paradigm is often preferred to the yes-no detection task because it allows the researcher to test hypotheses about the sensory distributions using data from a single condition (i.e., without varying the stimulus presentation rates or the values of a payoff matrix). In order to take advantage of the additional information about the decision process provided by confidence ratings, the standard vigilance task must be modified so that the participants choose both a discrimination response (signal or no-signal) and a confidence rating after each stimulus event. The most efficient way to obtain both types of data is to add an explicit cutoff somewhere (e.g., in the middle) of the bipolar rating scale. That is, somewhere on the scale are two adjacent rating responses that represent the least confident no-signal and signal responses. The highest and lowest values on the scale represent the highest confidence signal and no-signal responses, respectively.

Alternatively, the participant may be asked to first give a signal or no-signal response and then a confidence rating. The same representation as the bipolar rating scale is easily obtained from these data by multiplying the no-signal response confidence ratings by –1. This deviation from the more common vigilance design obviously could have some effect on a participant’s performance. However, because the detection problem remains the same, there is no basis within signal detection theory for predicting that soliciting confidence judgments should fundamentally alter the nature of the detection process. In support of this, the data reported later replicate the two major results previously attributed to the placement of a detection criterion: (a) The hit rate is substantially lower than the correct rejection rate when the signal rate is low, and (b) the hit and false alarm rates both decrease with time on task.

DISTRIBUTION-FREE TESTS OF OPTIMALITY AND BIAS

Suboptimal Decision Rules

If the participant’s objective is to maximize the percentage of correct responses, then the optimal decision rule is to choose the response that is most likely to be correct on each trial. In other words, the participant should choose the response associated with the maximum of the two objective posterior probabilities, P[no signal | Ψ] and P[signal | Ψ], where Ψ is a random variable representing the information state. (Note that P[signal | Ψ] = 1 – P[no signal | Ψ].)

A simple and direct consequence of this fact leads immediately to a test for optimality that does not depend on any assumptions about the

distributions of the stimulus information (Ψ) under the two event conditions. Specifically, no matter what the transformation is from stimulus information (Ψ) to a confidence state E, the probability that the response associated with this state is correct must be greater than or equal to the probability that the response is incorrect (or, equivalently, greater than 0.5). If this were not true, then the decision rule would not be optimal, because the decision maker could increase the percentage of correct responses by reversing the discrimination response associated with these suboptimal confidence states.

To apply this assumption-free test to empirical data, the experimenter calculates the proportion of correct responses, P[no signal | R = kN], for each rating response R = kN associated with a no-signal response, and the proportion of correct responses, P[signal | R = kS], for each rating response associated with a signal response. If any of these values is less than 0.5 (allowing for estimation error), then it follows immediately that participants could improve their performance merely by changing their decision rule (i.e., by making the alternative discrimination response for each suboptimal rating response).

According to classical signal detection theory analyses of human discrimination performance, participants tend to be conservative in their response to objective bias manipulations. For example, when the signal rate is decreased, the estimated value of the detection criterion, XC, increases, but not enough to maximize the percentage of correct responses (Creelman & Donaldson, 1968; Green & Swets, 1974; Macmillan & Creelman, 1991; Maloney & Thomas, 1991). In other words, empirical estimates of the parameters of this model indicate that the participant’s decision rule is in fact biased when the signal rate is low, but it is not biased enough to be optimal (e.g., as in the lower panel of Figure 1). This means that there should exist a set of mental states for which the optimality test described earlier fails. Data confirming this prediction are presented later.

A Distribution-Free Index of Response Bias

In addition to assuming a specific distribution model, the bias statistic β of signal detection theory also assumes that there is only one, if any, contiguous region of biased mental states. In principle, several different biased regions could exist, and in fact the same decision rule could be biased simultaneously toward both responses (i.e., there could be one region of bias toward the no-signal response and another toward the signal response). A simple way to avoid both the distributional assumption and the assumption about the number of biased regions in signal detection theory is to redefine the bias index as the probability (relative frequency) of a biased response (or, in other words, the probability that the perceptual effect of the stimulus will fall into a biased response region),

pbias = pN ∫v fN(s)ds + pS ∫v fS(s)ds,

where pS and pN are the signal and no-signal rates, respectively, and v denotes all regions of confidence values for which the decision rule is biased. This distribution-free index is perfectly compatible with β of signal detection theory, given that for each pair of values β and d′ there exists a unique value of pbias (see Figure 1). Using the symbol βp to refer to this particular, distribution-dependent value, the distribution-free construct pbias and the signal detection theory statistic βp will be equal when the distributional assumptions of signal detection theory are in fact satisfied.

To estimate pbias without relying on a distributional assumption, the method proposed by Balakrishnan (1998) is to examine the empirically observable function,

UR(k) = FR,N(k) – FR,S(k), (1)

where FR,N(k) and FR,S(k) are the cumulative relative frequency histograms of confidence rating k on no-signal (N) and signal (S) trials, respectively. When this function is decreasing for any value of k associated with a no-signal response or increasing for any value of k associated with a signal response, then the decision rule is biased. Further, the total proportion of these “biased rating responses” in the data (henceforth, Ωp) provides a distribution-free estimate of the probability of a biased response, pbias.
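
The optimality test and the bias index described in this section can both be computed from a table of rating counts. The Python sketch below is one reading of those definitions, not the article's own procedure, and it makes several assumptions. Ratings run from 1 to K, with ratings up to `cutoff` treated as no-signal responses. U_R(0) is taken to be 0, so that "decreasing at k" means U_R(k) < U_R(k – 1). Ωp is taken to be the share of all trials that received a biased rating. The function names and the counts are invented for illustration, and no allowance is made for estimation error.

```python
from itertools import accumulate


def u_function(n_counts, s_counts):
    """U_R(k) = F_R,N(k) - F_R,S(k) for ratings k = 1..K (Equation 1)."""
    n_total, s_total = sum(n_counts), sum(s_counts)
    f_n = [c / n_total for c in accumulate(n_counts)]
    f_s = [c / s_total for c in accumulate(s_counts)]
    return [a - b for a, b in zip(f_n, f_s)]


def biased_ratings(n_counts, s_counts, cutoff):
    """Ratings where U_R falls on the no-signal side or rises on the signal side."""
    n_total, s_total = sum(n_counts), sum(s_counts)
    flagged = []
    for k, (n, s) in enumerate(zip(n_counts, s_counts), start=1):
        # Same sign as U_R(k) - U_R(k-1) = P(R=k|N) - P(R=k|S), kept in integers.
        step = n * s_total - s * n_total
        if (k <= cutoff and step < 0) or (k > cutoff and step > 0):
            flagged.append(k)
    return flagged


def omega_p(n_counts, s_counts, cutoff):
    """Proportion of all trials that received a biased rating response."""
    flagged = biased_ratings(n_counts, s_counts, cutoff)
    in_biased = sum(n_counts[k - 1] + s_counts[k - 1] for k in flagged)
    return in_biased / (sum(n_counts) + sum(s_counts))


def suboptimal_ratings(n_counts, s_counts, cutoff):
    """Ratings whose associated response is correct on fewer than half the trials."""
    flagged = []
    for k, (n, s) in enumerate(zip(n_counts, s_counts), start=1):
        correct = n if k <= cutoff else s
        if n + s > 0 and 2 * correct < n + s:
            flagged.append(k)
    return flagged


# Invented counts for an eight-point scale with a signal on one of four trials.
N_COUNTS = [150, 90, 18, 15, 24, 3, 0, 0]   # 300 no-signal trials
S_COUNTS = [2, 3, 5, 10, 20, 20, 20, 20]    # 100 signal trials
CUTOFF = 4                                  # ratings 1-4 are no-signal responses

if __name__ == "__main__":
    u = u_function(N_COUNTS, S_COUNTS)
    print("peak of U_R at rating", u.index(max(u)) + 1)
    print("biased ratings:", biased_ratings(N_COUNTS, S_COUNTS, CUTOFF))
    print("omega_p:", omega_p(N_COUNTS, S_COUNTS, CUTOFF))
    print("suboptimal ratings:", suboptimal_ratings(N_COUNTS, S_COUNTS, CUTOFF))
```

For these invented counts, U_R(k) peaks at Rating 3. Rating 4 is flagged as biased, giving Ωp = 25/400 = 0.0625. Rating 5 is flagged as suboptimal, because a signal response there is correct on only 20 of 44 trials. This mirrors the regions v and w in the lower panel of Figure 1: the bias test compares the two conditional rating distributions, whereas the optimality test also depends on the signal rate.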

The basic idea behind this new, distribution-free test is very simple. As is well-known (e.g., Green & Swets, 1974), the slope of the receiver operating characteristic (ROC) curve at any information state t is equal to the likelihood ratio of the two underlying distributions, fS(t)/fN(t), where f denotes the density function of the encoding effect of the stimulus. Note that this result refers to a plot of the hit rate against the false alarm rate in probability coordinates. A plot of these values using normal-normal coordinates, or z-score transforms, produces a linear function when the underlying distributions are normal. It is easy to show (see Balakrishnan, 1998) that the function UR(k) is increasing when the slope of the ROC curve is less than one and decreasing when this value is greater than one.

Because of the relationship in signal detection theory between the slope of the ROC curve and the likelihood ratio at the detection criterion (i.e., β), this model makes a testable prediction about the location of the peak of the function UR(k). Specifically, if the detection criterion is shifted in favor of the no-signal response, then the peak of the UR(k) function should be shifted toward the no-signal side of the bipolar rating scale. If the bias is toward the signal response, then the peak should occur on the signal response side of this scale.

This fundamental prediction of signal detection theory is illustrated in Figure 2,

Figure 2. An illustration of the relationship between properties of the confidence ratings distributions and the location of the detection criterion in signal detection theory. Changes in the signal rate change the location of the detection criterion and hence the objective likelihood ratios corresponding to the lowest confidence states. When the decision rule is biased, the log likelihood function crosses the value 0 (i.e., the likelihood ratio is greater than 1) between Rating Responses 3 and 4, and the log likelihood ratio at the detection criterion is therefore greater than 0.

which plots the UR(k) function corresponding to biased and unbiased decision rules in an experiment with four confidence levels per response (i.e., the bipolar rating scale is composed of the integers 1–8, and the yes-no cutoff is between 4 and 5). In this example, the decision rule is unbiased when the criterion defining the upper bound of Response 4 (i.e., the detection criterion) is set at the intersection of the two distributions, as in the left panels of Figure 2. Note that for each rating response k, the log likelihood ratio,

$$L(k) = \log \frac{P(R = k \mid S)}{P(R = k \mid N)},$$

increases with k and crosses the value 0 (i.e., the likelihood ratio crosses the value 1) between the lowest confidence no-signal and signal responses (i.e., Responses 4 and 5; see lower left panel of Figure 2).
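
The log likelihood ratio L(k) can be estimated from the same kind of rating counts. The Python sketch below is illustrative only. The function names and the counts are invented. Every rating is assumed to have been used at least once in each stimulus condition, since the logarithm is otherwise undefined. The first rating at which L(k) is positive is taken as a rough marker of where the function crosses 0.

```python
from math import log


def log_likelihood_ratios(n_counts, s_counts):
    """Estimate L(k) = log[P(R=k|S) / P(R=k|N)] for ratings k = 1..K."""
    if min(n_counts) <= 0 or min(s_counts) <= 0:
        raise ValueError("every rating needs a nonzero count in both conditions")
    n_total, s_total = sum(n_counts), sum(s_counts)
    return [log((s / s_total) / (n / n_total))
            for n, s in zip(n_counts, s_counts)]


def first_signal_favoring_rating(ratios):
    """Lowest rating k with L(k) > 0, or None if L(k) never exceeds 0."""
    for k, value in enumerate(ratios, start=1):
        if value > 0:
            return k
    return None


# Invented counts on an eight-point scale (yes-no cutoff between 4 and 5).
UNBIASED_N = [30, 25, 20, 15, 5, 3, 1, 1]
UNBIASED_S = [1, 1, 3, 5, 15, 20, 25, 30]
SHIFTED_N = [48, 30, 10, 5, 3, 2, 1, 1]
SHIFTED_S = [2, 3, 5, 10, 20, 20, 20, 20]

if __name__ == "__main__":
    for n, s in ((UNBIASED_N, UNBIASED_S), (SHIFTED_N, SHIFTED_S)):
        ratios = log_likelihood_ratios(n, s)
        print(first_signal_favoring_rating(ratios))
```

In the first invented data set, L(k) first becomes positive at Rating 5, so the crossing falls at the yes-no cutoff. In the second, it first becomes positive at Rating 4, so the crossing falls before the cutoff.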
When the detection criterion is shifted to the right, as in the upper right panel of Figure 2 (e.g., as is presumably the case when the no-signal rate is increased), the log likelihood ratios associated with the different rating responses cross the value 0 prior to the cutoff between no-signal and signal responses. In the unbiased case (left panels), the function UR(k) reaches its peak at the yes-no cutoff (Rating Response 4), whereas in the biased case (right panels), the function reaches its peak earlier (in this example, at Rating Response 3).

Underestimation of Bias Using the Distribution-Free Measure Ωp

If the test function UR(k) is found to be unimodal with its maximum at the yes-no cutoff (e.g., Rating Response 4 on an eight-point bipolar scale), then Ωp will be zero, suggesting that the decision rule is completely unbiased. However, in such a case it is possible that a biased response region near the detection criterion did exist but was too small to have any effect on the discrete function UR(k); a more detailed explanation of this potential limitation of Ωp is given in Balakrishnan (1998). Because of this, when Ωp is zero, the researcher should also calculate the proportion of least confident no-signal responses and the proportion of least confident signal responses. The maximum of these two values is an upper bound on the true proportion of biased responses. This property of the measure Ωp is illustrated in Figure 3.

Figure 3. Illustration of the potential for the function UR(k) to miss the bias in the decision rule if the spacing between criteria is too large. Given that the interval defining the response bias (i.e., the interval between the upper bound on “4” responses and the point at which the two density functions intersect) is contained within the interval between the upper bounds on “4” and “3” responses, the proportion of biased responses is less than the proportion of “4” responses.

Although the decision rule is biased, the spacing between the detection criterion and the immediately adjacent criterion on the left is too large for this bias to be detected from the function UR(k). To see why the proportion of “4” responses is an upper bound on the proportion of biased responses in this example, note that the actual proportion of biased responses is determined by the areas under the two distributions (see top panel of Figure 3) between the detection criterion (the upper bound for Rating Response 4) and the point

of intersection between the two distributions. This interval is contained within the interval between the detection criterion and the upper bound for Response 3, and hence the proportion of “4” responses is larger than the proportion of biased responses, pbias.

Given that the upper bound on the total amount of response bias when Ωp equals zero is determined by the proportion of lowest confidence no-signal and signal responses (and, hence, by the spacing between the detection criterion and the two immediately adjacent criteria), there are several possible ways a researcher can keep this upper bound to a minimum. These include (a) adding rating responses to the confidence scale (adding more criteria to decrease the spacing between them); (b) instructing the participants to be conservative about making the least confident responses on the rating scale (in effect, asking the participant to use small spacing near the detection criterion); and (c) using a sliding scale response procedure (essentially, a continuous measure of confidence; e.g., Watson, Rilling, & Bourbon, 1964). Each of these methods can be employed without increasing the required number of experimental conditions or the sample size. The same arguments given earlier about the effects of adding a confidence rating judgment to the vigilance task would also apply to the effects of these manipulations on the detection process.

New Methods and Their Relationship to Signal Detection Theory

The new analyses described previously differ from traditional signal detection theory analyses of ratings data in three ways. First, a cutoff between no-signal and signal responses must be added to the rating scale. Traditionally, the rating scale was used exclusively to study sensitivity and the shape of the ROC curve, and therefore an explicit cutoff between the two discrimination responses was unnecessary.

Second, the signal rate or some other factor presumably tied to the decision process should be set in such a way as to induce a bias. In the traditional approach, the purpose of the rating task was to avoid the need to vary the decision maker’s response bias, and therefore there was [...] Consequently, the rating task was always performed with equally likely stimuli.

Finally, instructions or a continuous valued rating scale must be introduced so that the proportion of the lowest confidence responses is small. In signal detection theory’s representation of rating data, the relative frequencies of the different rating responses are determined by the spacing of the criteria: A small response proportion indicates a small response bin. Small spacing gives the experimenter an estimate of the UR(k) function (or, equivalently, the slopes of the ROC curve) at positions close to the detection criterion, which is needed in order to place a sharp upper bound on the degree of response bias. Obviously, any one – or all three – of these manipulations may have some effect on the operator’s detection performance. However, because these manipulations are all perfectly consistent with the fundamental principles of signal detection theory (i.e., that there is a continuum of information states that the participant is obliged to divide into response bins using response criteria), there is no basis within this theoretical framework for claiming that the fundamental nature of the operator’s performance (i.e., the issue of whether or not an adjustable criterion is set on an information continuum) should change because of these changes in the details of the design.

EMPIRICAL APPLICATIONS

Although several aspects of vigilance, including the exact form of the vigilance decrement, appear to depend on the specific aspects of the vigilance task, two empirical results are consistent throughout this literature. First, when the signal rate is low, the correct rejection rate is substantially larger than the detection rate (the signal rate effect). As the signal rate increases, the difference between these two proportions decreases (e.g., Davies & Parasuraman, 1982). The most common interpretation of this well-established effect is the one associated with signal detection theory (i.e., the detection criterion is biased toward the more frequent stimulus event in the manner represented in the lower panel of Figure 1).
no obvious reason to combine the rating task Second, as the time on task increases, both the
with the standard bias manipulations. Con- hit and the false alarm rates decrease, suggesting
Downloaded from hfs.sagepub.com at NORTHWESTERN UNIV LIBRARY on June 4, 2016
610 December 1998 – Human Factors

that the initial bias against the detection response increases with time or, in other words, that the detection criterion increases (the criterion increment).

The main purpose of the experiments reported here was to apply the methods described previously to test the detection criterion account of the effect of signal rate and time-on-task on vigilance performance. In the first experiment, participants performed successive or simultaneous discrimination tasks for a continuous 50-min period. The participants were asked to respond to each stimulus presentation using a bipolar rating scale with a yes-no cutoff in the middle rather than to respond only to a perceived signal event, so that the necessary confidence rating data could be obtained. In the simultaneous discrimination condition, the signal rate was fixed at .1. The reason for performing both the simultaneous and successive discrimination tasks was not to compare performances between the two tasks but, instead, to add some generality to the conclusions about the relationship of decision-making processes to the signal rate and the time on task.

In addition to these two standard vigilance conditions, a third, mixed successive discrimination condition was run in which the signal rate varied between blocks of 100 events. On half of these blocks the probability of a signal event was .1, and on the other half it was .5. These data provide a basis for comparing the test function, UR(k), under conditions in which the optimal decision rule was either biased or unbiased for the same group of participants. Under no objective bias manipulation, both measures Ωp and βp would be expected to be zero or close to zero, with little implication for the study of the nature of response bias. Given that one demonstration of this easily predicted result seemed adequate by itself, the analogous simultaneous discrimination condition with equal no-signal and signal event rates was not included in the design.

EXPERIMENT 1

Method

Participants. Sixty undergraduate students from the psychology department participant pool at Purdue University participated in a single 50-min session in partial fulfillment of an introductory course requirement. All were prescreened for normal or corrected-to-normal vision, and they performed one of three experimental conditions (20 participants per condition): successive discrimination, simultaneous discrimination, or mixed successive discrimination.

Stimuli. In the two successive discrimination conditions, the signal and no-signal stimuli were defined by the length and height of an L-shaped figure presented in the approximate center of a computer monitor (14-inch, or 35.6-cm diagonal) for 0.25 s. The vertical and horizontal lengths of this figure were 4.9 cm and 2.1 cm, respectively, for the no-signal stimulus. For the signal stimulus, the vertical length was increased to 5.4 cm and the horizontal length was increased to 2.4 cm.

In the simultaneous discrimination condition, the events were defined by the simultaneous presentation of two L-shaped figures, one on the left (rightmost edge 2.2 cm to the left of the screen center), and the other on the right (leftmost edge 2.2 cm to the right of the horizontal screen center). The two figures were either identical (the no-signal stimulus) on both dimensions (horizontal and vertical length) or different on both dimensions (the signal stimulus). The horizontal and vertical lengths of the segments were 2.1 cm and 4.9 cm, respectively, for the figure presented on the left in both the no-signal and the signal stimulus. The lengths of these segments for the figure presented on the right were increased to 2.3 cm (horizontal) and 5.2 cm (vertical) for the signal stimulus. In all conditions, the two segments in each stimulus were 1 mm wide and viewing distance was approximately 40 cm. The physical sizes of the stimuli were chosen on the basis of pilot data so that the tasks would be moderately difficult. Given that the comparisons of interest were within conditions, no attempt was made to equate the difficulty levels of different conditions for the average participant.

Procedure. At the beginning of the session, the two stimuli (no-signal and signal) were presented simultaneously on the display screen with their correct identification labels. The experimenter explained the nature of the
DECISION PROCESSES IN VIGILANCE 611

task and asked the participants to press one key to begin the experimental trials or another key to repeat the demonstration (i.e., there was no practice or warm-up period). Each stimulus event began with a random foreperiod during which the screen was blank. This delay value was obtained by sampling from a truncated exponential distribution with a mean of 2.0 s. If this sampled value was less than 0.3 s or greater than 30 s, a new sample was taken (i.e., the minimum foreperiod was 0.3 s and the maximum foreperiod was 30 s). After the stimulus presentation (0.25 s), the participants entered a response using a 14-point bipolar rating scale.

There were 14 adjacent keys on the upper row of the computer keyboard labeled from left to right with decreasing integers from 7 to 1 (no-signal responses) and then increasing integers from 1 to 7 (signal responses). The two sides of the scale were labeled with the discrimination response that they represented, and the participants were asked to be conservative in their use of the extreme (low and high confidence) rating responses.

The participants were given a 2-s deadline to respond to the presentation of the stimulus. To indicate the amount of time remaining before the deadline was reached, an expanding horizontal line was presented in the upper portion of the display (9.4 cm below the top edge of the viewing screen) after the stimulus was removed. The length of this line increased at a constant rate (9.54 cm/s for 1.75 s) until its right-hand endpoint reached a target point indicating the completion of the time allowed for a response. To encourage participants to respond prior to the deadline as often as possible, an artificial reward system was used (participants were not paid). If a participant failed to respond prior to the end of the deadline, the screen flashed and a penalty of 50 points was assessed. This occurred on less than 3% of the trials across conditions. Participants were awarded 10 points for each correct response and were penalized 10 points for each incorrect response.

The correct response and the cumulative point scores as well as the points allotted after each event were presented on the screen for 1 s. The foreperiod for the subsequent event began immediately after the feedback for the previous response was removed from the display screen. Including the time to respond and the feedback, the total time for each event was between 1.3 and 33 s, and the average event rate was therefore less than 20 events/min.

In the mixed successive discrimination condition, the signal rate was chosen randomly from the two values .1 and .5 and was held constant at this value within a block of 100 events. At the beginning of each block, a message indicating the signal rate for the next block of trials was presented on the screen for 5 s. In the other two conditions, participants were told at the beginning of the session that the no-signal event would occur nine times more often than the signal event.

Results and Discussion

Two estimates of sensitivity, d′ and A′ (area under the confidence rating ROC curve; e.g., Green & Swets, 1974), and the two bias measures βp and Ωp are listed in Table 1. These

TABLE 1: Performance Statistics for the Four Conditions of Experiment 1

Condition                     d′      A′     CR     HR     βp     Ωp     Max(pbias)
Successive                    1.908   .815   .945   .619   .125   .000   .031
Simultaneous                  1.482   .727   .923   .521   .162   .000   .002
Mixed successive (Ps = .1)    1.542   .747   .932   .518   .163   .004   .039
Mixed successive (Ps = .5)    1.456   .800   .798   .733   .032   .000   .011

Note: CR and HR are the correct rejection and hit rates, respectively, and Max(pbias) is the estimated
upper bound on the proportion of biased responses.
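The foreperiod rule described in the Procedure (sample from an exponential distribution, then resample whenever the value falls outside 0.3 to 30 s) amounts to rejection sampling. The following is a minimal sketch, not the original experiment software. It assumes that the stated 2.0-s mean refers to the exponential distribution before truncation, which the text does not specify; the function name and parameter names are hypothetical.

```python
import random


def sample_foreperiod(rng, mean_s=2.0, min_s=0.3, max_s=30.0):
    """Draw one foreperiod in seconds by rejection sampling.

    Assumption: mean_s is the mean of the exponential distribution
    BEFORE truncation; the article does not say which mean is meant.
    """
    while True:
        delay = rng.expovariate(1.0 / mean_s)  # rate = 1 / mean
        if min_s <= delay <= max_s:
            return delay  # accept only values inside the allowed window


rng = random.Random(0)  # fixed seed so the sketch is reproducible
foreperiods = [sample_foreperiod(rng) for _ in range(10_000)]
mean_foreperiod = sum(foreperiods) / len(foreperiods)
print(f"min={min(foreperiods):.2f}s max={max(foreperiods):.2f}s "
      f"mean={mean_foreperiod:.2f}s")
```

Under this assumption the mean of the accepted foreperiods is somewhat larger than 2.0 s, because rejecting values below 0.3 s shifts the distribution upward.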

Figure 4. Estimates of the UR(k) functions for the four conditions of Experiment 1. The dashed vertical line
indicates the predicted location of the maximum of the function implied by the estimated value of βp given
in Table 1.

and subsequent estimates were obtained by combining the individual participants’ data sets together to form a single data set. Estimates of the UR(k) functions for each condition are shown in Figure 4. As expected, the correct rejection rate was considerably larger than the hit rate in the three low signal rate conditions, suggesting a substantial bias toward the no-signal response in each of these conditions. Given that the signal detection theory measure βp (and all of the other signal detection theory measures reviewed recently by See et al., 1997) is directly related to the size of this difference, its value was large in each of these conditions.

Also as expected, the difference between the two correct response proportions was considerably smaller when the signal and no-signal rates were equal. In contrast, the distribution-free measure Ωp was zero or very close to zero in all four conditions, suggesting that the decision rule was uniformly unbiased and therefore unrelated to the signal rate.

Given that there is only one region of biased mental states in signal detection theory’s representation of the decision process, this model predicts that UR(k) will be unimodal, with its mode shifted to the left of the yes-no cutoff when βp is large (see Figure 2 and Balakrishnan, 1998). Although the estimated functions are in fact unimodal, there is no leftward shift in the position of their maxima in any of the conditions. In each case, the peak of the function occurs at or very near the yes-no cutoff. Because the sample sizes were large enough for the estimated functions to be smooth and unimodal (i.e., the result cannot be attributed to estimation error), an upper bound on the proportion of biased responses is the estimate of Ωp (zero in all but one case)

Figure 5. Proportion correct conditioned on the discrimination response and confidence level for the four
conditions of Experiment 1. Any percentage correct value less than 0.5 indicates that overall proportion cor-
rect could be improved by changing the decision rule.

plus the proportion of rating responses at the maximum of the function (Rating Response 7 in all but one case). These values, also very small, are given in the right-hand column of Table 1.

Finally, to illustrate in a more concrete way the discrepancy between the signal detection theory representation of bias and the true likelihood ratios associated with the confidence ratings, Figure 4 also shows the point at which the UR(k) functions should have reached their maxima under the distributional assumptions of this model. These values are indicated by the dashed vertical lines connecting the estimated UR(k) function to the abscissa. Note that each of these predicted maxima occurs to the left of the observed maxima.

The Optimality Test

When the signal and no-signal event rates are equal, the unbiased decision rule is also the optimal decision rule (maximizing percentage correct). However, when the signal rate is low, an unbiased decision rule is necessarily suboptimal no matter what the shapes of the underlying distributions may be (i.e., equal variance normal or otherwise). Thus, if the decision rule in each of the four conditions of the experiment was in fact unbiased, it must also be suboptimal in three of the four conditions (i.e., when the signal rate was low) and optimal in the other (i.e., when the signal and no-signal rates were equal). Results of the optimality test defined earlier are shown in Figure 5. Consistent with the interpretation that the decision rule was in fact unbiased, the test clearly failed when the signal rate was low (the proportion of correct responses associated with several rating responses was less than 0.5), but it was clearly satisfied for all 14 rating responses when the signal and no-signal rates were equal.
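The logic of the optimality test can be illustrated with a small numerical sketch. The model below is hypothetical and is not the analysis applied to the data: it assumes equal-variance normal distributions (d′ = 1.5), an unbiased yes-no cutoff at the intersection of the two densities, and 14 rating bins of arbitrary width. For each rating it computes the proportion of correct responses; any value below .5 means that reversing that response would raise overall accuracy.

```python
from statistics import NormalDist


def proportion_correct_by_rating(signal_rate, d_prime=1.5,
                                 bin_width=0.5, levels=7):
    """P(correct | rating) for 2*levels ratings under an UNBIASED cutoff.

    Hypothetical model: no-signal ~ N(0, 1), signal ~ N(d_prime, 1).
    The yes-no cutoff is placed where the densities cross (d_prime / 2).
    """
    noise = NormalDist(0.0, 1.0)
    signal = NormalDist(d_prime, 1.0)
    cutoff = d_prime / 2.0
    # Interior bin edges; the two outermost rating bins are open-ended.
    edges = [cutoff + i * bin_width for i in range(-levels + 1, levels)]

    def mass(dist, j):
        lower = dist.cdf(edges[j - 1]) if j > 0 else 0.0
        upper = dist.cdf(edges[j]) if j < len(edges) else 1.0
        return upper - lower

    result = []
    for j in range(2 * levels):
        p_noise = (1.0 - signal_rate) * mass(noise, j)
        p_signal = signal_rate * mass(signal, j)
        # Bins above the cutoff (j >= levels) are signal responses.
        p_correct = p_signal if j >= levels else p_noise
        result.append(p_correct / (p_noise + p_signal))
    return result


low_rate = proportion_correct_by_rating(0.1)
equal_rate = proportion_correct_by_rating(0.5)
print("signal rate .1:", [round(p, 2) for p in low_rate])
print("signal rate .5:", [round(p, 2) for p in equal_rate])
```

In this sketch the low-confidence signal responses fall below .5 when the signal rate is .1, whereas every rating stays at or above .5 when the two event rates are equal, which is the qualitative pattern reported for Figure 5.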

EXPERIMENT 2

Method

The main purpose of the second experiment was to replicate the results of Experiment 1 using a slightly different design. In the first experiment, the feedback was given immediately after the participant’s response to the stimulus event, and the foreperiod for the subsequent event began immediately thereafter (i.e., after the 1-s feedback display). This meant that by responding quickly to an event, the participant could decrease the total time between stimulus presentations, raising the possibility that he or she might introduce a subjective value on the speed of responses (i.e., in order to increase the pace of the task). Because response time incentives could potentially play an important role in alternative accounts of the decision-making process, the time between the stimulus presentation and the feedback display in Experiment 2 was fixed at 2 s, regardless of when the participant entered his or her response. Thus, responding more quickly would not decrease the total length of time between events.

There were 22 participants in the successive and simultaneous discrimination conditions (12 and 10 per condition, respectively), and 15 participants performed the mixed successive discrimination condition. The methods were otherwise identical to those of Experiment 1.

Results

Performance statistics for the four conditions are listed in Table 2. Overall, the results were very similar to those in Experiment 1. Most importantly, when the signal rate was low, the signal detection theory bias measure βp was large, whereas the distribution-free measure Ωp was zero or close to zero. Estimates of the UR(k) functions are shown in Figure 6. Once again, there was no indication of any leftward shift in the maximum of the function with respect to the yes-no cutoff. In addition, the point at which the UR(k) function should have reached a maximum under the assumptions of signal detection theory was consistently to the left of the observed location, indicating that βp overestimates the amount of bias. Results of the assumption-free test for optimality, shown in Figure 7, were also similar to those of Experiment 1.

GENERAL DISCUSSION

Implications for the Fixed-Sample Models of Vigilance

Although the statistical analyses described previously are new and somewhat complex, the conclusion to be drawn from them is fairly simple. Specifically, the signal rate had a significant impact on the detection rate but no effect on the logic of the decision rule (i.e., no effect on response bias as this notion is defined in signal detection theory). This seemingly paradoxical result has at least two important implications. First, given that the optimal decision rule is biased toward the more frequent stimulus event, the participants’ unbiased decision rule is necessarily suboptimal. Second, because the participants’ detection rate did depend on the signal rate, the relative frequency of the

TABLE 2: Performance Statistics for the Four Conditions of Experiment 2

Condition                     d′      A′     CR     HR     βp     Ωp     Max(pbias)
Successive                    2.365   .887   .947   .772   .070   .007   .037
Simultaneous                  2.109   .860   .942   .702   .095   .000   .034
Mixed successive (Ps = .1)    1.778   .812   .929   .621   .123   .024   .098
Mixed successive (Ps = .5)    2.045   .873   .886   .799   .043   .000   .014

Note: CR and HR are the correct rejection and hit rates, respectively, and Max(pbias) is the estimated
upper bound on the proportion of biased responses.
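As a cross-check on the sensitivity column of Table 2, d′ can be recomputed from the listed hit and correct rejection rates. This excerpt does not state how d′ was estimated, so the sketch below assumes the standard equal-variance formula d′ = z(HR) − z(1 − CR); the function and variable names are illustrative only.

```python
from statistics import NormalDist


def d_prime(hit_rate, correct_rejection_rate):
    """Equal-variance d' = z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    false_alarm_rate = 1.0 - correct_rejection_rate
    return z(hit_rate) - z(false_alarm_rate)


# (CR, HR, d' as listed) copied from Table 2.
table2 = {
    "Successive": (0.947, 0.772, 2.365),
    "Simultaneous": (0.942, 0.702, 2.109),
    "Mixed successive (Ps = .1)": (0.929, 0.621, 1.778),
    "Mixed successive (Ps = .5)": (0.886, 0.799, 2.045),
}

recomputed = {name: d_prime(hr, cr) for name, (cr, hr, _) in table2.items()}
for name, (_, _, listed) in table2.items():
    print(f"{name:28s} listed={listed:.3f} recomputed={recomputed[name]:.3f}")
```

The recomputed values agree with the listed ones to within rounding of the three-decimal rates, which suggests (but does not prove) that the tabled d′ values rest on the equal-variance formula.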

Figure 6. Estimates of the UR(k) functions for the four conditions of Experiment 2. The dashed vertical line
indicates the predicted location of the maximum of the function implied by the estimated value of βp given
in Table 2.

two stimulus events must have substantial effects on the encoding distributions.

For obvious reasons, few if any theorists have explicitly predicted that the signal rate should have absolutely no effect on the participant’s decision-making process. Several theorists, however, have argued that this variable should have at least some effect on the encoding distributions. J. F. Mackworth (1970) and others (see Craig, 1979), for example, have pointed out that because the signal stimulus occurs infrequently when the signal rate is low, it is not unreasonable to expect more across-trial variability in the participant’s internal representation of this stimulus. In order to account for the results described earlier, the signal rate must be assumed to affect only the variances of the two encoding distributions, as illustrated in Figure 8. Because the detection criterion is always set at the intersection of the two distributions, an increase in the variance of the signal distribution causes the hit rate to decrease and the correct rejection rate to increase.

One way to test the hypothesis that the distribution variances are inversely related to presentation rates is to plot the ROC curves for different signal rate conditions. Under some weak assumptions (e.g., Green & Swets, 1974), increasing the differences in the variances of the two distributions would increase the asymmetry of their corresponding ROC curves. Empirical ROC curves from the two mixed successive conditions of Experiment 1 are shown in Figure 9. With respect to the minor diagonal (i.e., the diagonal line connecting the upper left and lower right corners of the graph), the asymmetry in these curves is clearly more pronounced when the signal rate was low, consistent with J. F. Mackworth’s (1970) hypothesis. (For more detailed discussions of the analysis of vigilance ROC curves and their implications, see Davies & Parasuraman, 1982.)
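The claim that a criterion fixed at the intersection of the distributions yields a lower hit rate and a higher correct rejection rate as the signal variance grows can be checked numerically. The sketch below uses hypothetical parameter values (no-signal distribution N(0, 1), signal mean 1.5) and normal distributions, which the argument itself does not require. It places a single criterion at the upper crossing point of the two densities and ignores the second, far lower crossing that arises with unequal variances.

```python
import math
from statistics import NormalDist


def intersection_criterion(d_prime, sigma_signal):
    """Upper crossing point of the N(0, 1) and N(d_prime, sigma_signal**2)
    densities. This sketch only handles sigma_signal >= 1."""
    if sigma_signal < 1.0:
        raise ValueError("this sketch only covers sigma_signal >= 1")
    if math.isclose(sigma_signal, 1.0):
        return d_prime / 2.0
    # Equating the two log densities gives a*x**2 + b*x + c = 0.
    a = sigma_signal ** 2 - 1.0
    b = 2.0 * d_prime
    c = -(d_prime ** 2 + 2.0 * sigma_signal ** 2 * math.log(sigma_signal))
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def hit_and_correct_rejection(d_prime, sigma_signal):
    """Hit and correct rejection rates for a criterion at the crossing."""
    criterion = intersection_criterion(d_prime, sigma_signal)
    hit = 1.0 - NormalDist(d_prime, sigma_signal).cdf(criterion)
    correct_rejection = NormalDist(0.0, 1.0).cdf(criterion)
    return hit, correct_rejection


for sigma in (1.0, 1.5, 2.0):
    hit, cr = hit_and_correct_rejection(1.5, sigma)
    print(f"sigma_signal={sigma:.1f}  HR={hit:.3f}  CR={cr:.3f}")
```

With equal variances the two rates coincide; as the signal standard deviation increases, the crossing point moves to the right, so the hit rate falls while the correct rejection rate rises, in the direction of the signal rate effect.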

Figure 7. Percentage correct conditioned on the discrimination response and confidence level for the four
conditions of Experiment 2.

However, even granting that less frequent stimuli give rise to distributions with more variance, several questions about the nature of the discrimination process still remain to be answered. In particular, it is not clear why the decision rule is completely unbiased when the signal rate is low (i.e., why the detection criterion is always set exactly at the intersection of the distributions). It is also unclear why the same effect on the sizes of the distribution variances occurred in the simultaneous discrimination condition, in which the participant can directly compare the physical properties of the stimuli during each presentation.

Sequential Sampling Models of Vigilance

From the experimenter’s point of view, the fixed sample assumption is appropriate for vigilance tasks because no specific incentives are given to the participants to respond quickly. Under these conditions, the participants should continue to process the stimulus (or a memory trace) until all of the stimulus information has been exhausted. In such a case, the total time taken to respond might vary from trial to trial, but the participant’s behavior would still be formally equivalent to the fixed sample model. For a variety of reasons, however, it is not unreasonable to suppose that the participant might place some value on the speed of responses. For example, continued processing of the stimulus requires continued effort, which may lead to increased fatigue over the course of the vigilance period.

Further, participants do not always wait until (or respond immediately after) the stimulus has been removed from the screen. Presumably, they respond quickly when they believe they have sufficient information to make a correct response. However, they sometimes make errors when their response times are short, suggesting that they might be able

to improve their performance by taking more time to respond.

Figure 8. A distribution model that predicts the effect of signal rate on vigilance performance. Increasing the difference in the presentation rates of the two stimulus events changes the relative sizes of the variances of the distributions, but it has no effect on the degree of response bias.

In the variable or sequential sampling models of human performance, the participant continues to collect information until a condition for terminating the sampling process is reached. Once this stopping rule condition is satisfied, a decision rule determines the response to be emitted and, if necessary, the confidence level. (For a more detailed discussion of these kinds of models, see Green & Swets, 1974; Link & Heath, 1975; Luce, 1986; Thomas, 1971; Townsend & Ashby, 1983.) Given that the stopping rule in a sequential sampling model is inherently a decision-making process, these models have two potential sources of bias: the stopping rule and the decision rule. In order to develop statistical tests of these types of models, therefore, a different kind of bias measure is needed. Once again, such a measure can be derived by analyzing the behavior of an optimal decision maker.

Whatever the objective of the stopping rule might be, the optimal decision maker must still use whatever stimulus information was collected to compute the two objective probabilities, P[no signal | Ψi] and P[signal | Ψi], where Ψi is now a random vector containing the samples collected up to time i (the point at which sampling was terminated). The larger of these two probabilities would determine the response selected when the sampling process is terminated as well as the decision maker’s confidence in the accuracy of this response. Given that the reported confidence level of an optimal decision maker should be essentially a readout of the objective probability of a correct response at the point of the decision, a general test of bias for sequential sampling models is to determine whether the same nominal degree of reported confidence is associated with the same objective degree of accuracy; that is,

P[no signal | R = kN] = P[signal | R = kS],     (2)

where kN and kS are equivalent confidence levels associated with no-signal and signal responses, respectively (e.g., on an eight-point bipolar scale with a cutoff between Responses 4 and 5, Responses 2 and 7 both represent Confidence Level 3).

Notice that these same conditional probabilities also determine whether the decision rule is optimal in the sense defined earlier for the fixed sample model. Thus, for an optimal decision maker in a dynamic decision making context, each of the terms involved in the new test should be greater than or equal to .5.

To account for the transformation from continuous confidence values to discrete rating responses, each confidence rating in the optimal decision maker should be defined by an upper and a lower bound on the probability of a correct response. The probability of a correct response conditioned on the rating response will fall somewhere between these two bounds, and a more general test is therefore to verify that a higher confidence level is always associated with higher accuracy, even

when comparisons are drawn between different discrimination responses. For example, the proportion of correct responses corresponding to signal responses given at Confidence Level 3 should be larger than the proportion of correct responses corresponding to no-signal responses given at Confidence Level 2.

Figure 9. ROC curves for the mixed successive conditions of Experiment 1. The skew of the curve with respect to the minor diagonal (i.e., the line connecting the upper left and lower right corners of the figure) can be taken as an index of the differences in variances of the signal and no-signal distributions.

Although there are other possible forms of response bias in a dynamic system, this new test identifies a bias that is clearly a signature for the effects of signal rate on participants’ performance. Referring once more to the estimated functions in Figures 5 and 7, the type of discrimination response (i.e., signal or no signal) has no systematic effect on the proportion of correct responses associated with the different confidence levels when the signal and no-signal rates are equal, and it has a substantial effect on these values when the signal rate is low. Thus, response biases of the form defined by Equation 2 do exist empirically and are directly tied to the signal rate.

Psychological Interpretations

In addition to recovering the original concept of response bias as a factor directly related to the signal rate and affecting only the decision-making process (i.e., the stopping rule), the dynamic modeling framework for vigilance also provides a basis for explaining both the lack of bias in the decision rule and the apparently strong bias in the stopping rule. To predict the results of the Equation 2 bias test, it is enough to suppose that the participants accurately compute the heights of the encoding distributions associated with their stimulus samples but overestimate the signal rate when this value is low. Such a mechanism, already incorporated in classical, fixed-sample theories of discrimination, would cause a decision maker in a dynamic decision-making context to overestimate the probability P[signal | Ψi], leading to premature termination of the sampling process.

A similar approach would be to assume that the participant sets a higher value on correct signal responses than on correct no-signal responses. In this case, the participant intentionally adjusts his or her stopping rule to maximize the expected reward (which presumably would also depend on the speed of the response). Because they are mathematically equivalent, empirical tests to distinguish these two interpretations are difficult to formulate (Balakrishnan, 1997).

To illustrate the effect of this miscalculation on the objective stimulus probabilities associated with the different subjective confidence states, suppose that the sampling process is terminated when the perceived value of the probability of a correct response (i.e., the perceived value of the maximum of P[signal | Ψi] and P[no signal | Ψi]) is equal to .9. In this case, when the participant makes a signal response, the perceived value of P[signal | Ψi] at the point of the decision was .9, but the actual value of this probability was lower (because the signal rate was overestimated). Hence, the percentage correct associated with the signal responses will be less than .9. Similarly, when the participant makes a no-signal response, the perceived value of P[no signal | Ψi] at the point of the decision was .9, but the actual value of this probability was higher (because the no-signal rate was underestimated), and so the percentage of correct responses associated with this discrimination response will be greater than .9.
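The direction of these two effects follows from Bayes' rule in odds form (posterior odds = likelihood ratio × prior odds). The sketch below is illustrative only: the true signal rate of .1 matches the low signal rate conditions, but the perceived rate of .3 is a hypothetical value chosen for the example, and the function names are not from the original analysis.

```python
def odds(p):
    """Convert a probability to odds in favor."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds in favor back to a probability."""
    return o / (1.0 + o)


def actual_accuracy(perceived_confidence, true_signal_rate,
                    perceived_signal_rate):
    """Objective P(correct) for signal and no-signal responses when
    sampling stops at a fixed PERCEIVED probability of being correct."""
    target = odds(perceived_confidence)

    # Signal response: the likelihood ratio (signal : no signal) that the
    # evidence must reach for the perceived posterior odds to hit target.
    lr_signal = target / odds(perceived_signal_rate)
    acc_signal = prob(lr_signal * odds(true_signal_rate))

    # No-signal response: same reasoning with the roles reversed.
    lr_no_signal = target / odds(1.0 - perceived_signal_rate)
    acc_no_signal = prob(lr_no_signal * odds(1.0 - true_signal_rate))
    return acc_signal, acc_no_signal


# Hypothetical case: true signal rate .1, perceived signal rate .3.
acc_signal, acc_no_signal = actual_accuracy(0.9, 0.1, 0.3)
print(f"signal responses:    {acc_signal:.3f}")
print(f"no-signal responses: {acc_no_signal:.3f}")
```

With these hypothetical numbers, signal responses are objectively correct less often than the perceived .9 and no-signal responses more often, as the argument above requires. When the perceived and true signal rates coincide, both values equal .9.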

TABLE 3: Performance Statistics for the First and Second Half of the Sessions from the Successive and
Simultaneous Discrimination Conditions of Experiment 1

Condition d′ A′ CR HR βp Ωp

Successive
1st half 1.799 .820 .922 .645 .113 .026
2nd half 2.096 .811 .968 .594 .130 .000

Simultaneous
1st half 1.438 .747 .891 .581 .132 .000
2nd half 1.594 .706 .955 .458 .183 .000

To illustrate the relationship between the percentage of correct responses and the objective probability of a correct response at the point of the decision in a sequential sampling model, consider the following example. Suppose that a gambler will make a bet only when the odds are (exactly) four to one in his or her favor. Examining the (infinite) history of this gambler’s bets, exactly four out of five will turn out to be winners.

Vigilance Decrements and Criterion Increments

If the difference between the hit and correct rejection rates in vigilance experiments cannot be attributed to a biased decision rule, then it follows that the increase in this difference with time on task cannot be attributed to an increase in this type of bias. In this section, I use the new bias measures described previously to see how dynamic models of vigilance would interpret the effect of time on the detection rate. To do this, the data from the successive and simultaneous conditions of Experiment 1 were split into two parts (roughly 25 min each) and reanalyzed. Sensitivity and bias statistics for the two halves of the session are given in Table 3. Given that the event rate in both conditions was low, little or no decrease in sensitivity would be predicted by the taxonomy developed by Parasuraman and Davies (1977).

Although d′ actually increased slightly with time on task, the distribution-free index A′ should be considered a more appropriate measure given the violations of the equal variance assumption indicated by the ROC curves shown in Figure 9. This value decreased slightly for the two conditions of this experiment. More important, the experiment also replicated the classical effect of time on task: In each condition, the hit and false alarm rates decreased in the second half of the session, resulting in an increase in the bias measure βp. In contrast, the distribution-free measure Ωp was zero or close to zero, suggesting that the detection criterion account of this phenomenon is once again invalid.

Estimates of the new bias measure represented by Equation 2 are shown in Figure 10. For the no-signal responses, there was little effect of time on task on the probability of a correct response under the different confidence levels. There was a substantial effect of time, however, on the level of this function for the signal response trials: According to this new measure, the participants’ response bias decreased with time on task, reducing (but not eliminating) the suboptimal character of their decision rule.

Interpretations of the Effects of Time on Task in the Fixed and Variable Sample Models

In previous accounts of the vigilance decrement, the decrease in detection rate was attributed to an increment in the detection criterion corresponding to a decrease in the operator’s willingness to make a detection response. Assuming that the initial location of the criterion was conservative with respect to its optimal location, the increment would increase the overall percentage of correct decisions. One possible explanation for the decrease in the detection rate, therefore, was that participants learn to adjust their detection criterion as they gain experience with the task (e.g., Williges, 1969; see, however, Vickers, Leary, & Barnes,
620 December 1998 – Human Factors

1977, for a demonstration that estimates of the parameter β do not always approach the putative optimal value with time on task).

Although this description of the decision rule is inconsistent with the finding that the decision rule is always unbiased, a similar mechanism could account for the results shown in Figure 10 if it is added to the sequential sampling framework for discrimination. Specifically, if the participants initially overestimate the signal rate but the size of this estimation error decreases with experience (i.e., their representation of the signal rate becomes more accurate), then their tendency to terminate the sampling process prematurely would also be reduced. This would cause the objective probabilities associated with the confidence levels for the two discrimination responses to converge in the manner shown in Figure 9. Under the fixed sample model, the results shown in Figure 10 could be taken as evidence that the disparity in the variances of the signal and no-signal distributions decreases with time on task. Presumably, whatever mechanism is assumed to cause the difference in variances could also be assumed to weaken with time on task.

Figure 10. Effects of time on task on the accuracy predictions of subjective confidence (Equation 2 bias test) in the successive and simultaneous discrimination conditions of Experiment 1.

SUMMARY AND FINAL COMMENTS

A fundamental assumption of signal detection theory is that the objective likelihood ratio at the point XC (i.e., the location of the detection criterion) changes as a function of the participant’s decision-making attitude. This attitude is presumably rational at least to some degree. That is, if the signal stimulus becomes more likely, the participant shifts the detection criterion to some degree to favor a signal response. For several decades, this relatively simple, one-parameter representation of human decision processes has seemed to work extremely well, providing a highly plausible and consistent picture of human discrimination performance in a wide variety of experimental settings, including vigilance settings. For example, the model predicts that as the signal rate decreases, the hit and false alarm rates should both decrease, whereas the sensitivity measure, d′, should remain roughly the same. Empirical estimates of d′ do in fact remain roughly constant under the standard bias manipulations (e.g., Green & Swets, 1974), providing what appears to be a strong confirmation of the basic tenets of this formal theory.

Although many researchers would probably expect this model’s underlying assumptions to be violated to some degree (e.g., the distributions might not be exactly normal), few would expect them to be violated in an insidious way (i.e., in a way that would cause the model to fundamentally misrepresent the qualitative properties of human decision-making behavior). This is true despite the fact that for many years it has also been known that another class of models (sequential sampling models) can also account for these classical data, although they interpret them in very different ways (e.g., the bias is defined by the relative distances from the starting point of the two response thresholds).

The results described in this article present what appears to be a major challenge to the static interpretation of signal detection theory. Specifically, for the same reasons that this

model predicts a decrease in the hit and false alarm rates as the signal rate decreases (i.e., a change in the likelihood ratio at the detection criterion), it also predicts that the peak of the empirical function, UR(k), should shift toward the no-signal response side of a bipolar confidence rating scale. This prediction was consistently violated by empirical data in the two vigilance experiments reported here.

Although the signal rate did have a substantial impact on the detection rate, there was no shift in the peak of the estimated function UR(k). Similarly, if the detection criterion shifts further in favor of the no-signal response with increasing time on task (the criterion increment), then the peak of the UR(k) function should shift further in the direction of the no-signal response side of the rating scale. In both the first and second halves of the vigilance periods, the peak of the function was consistently fixed at the center of the rating scale, despite the fact that the hit and false alarm rates decreased with time on task. Thus, with respect to both the signal rate and time on task, the standard findings of previous vigilance studies were replicated, but an analysis of the distributions of subjective confidence revealed some apparently serious problems with signal detection theory’s representation of the decision-making process.

Alternative Interpretations of the Empirical Results

In order to estimate the value of the likelihood ratio at the criterion value, XC, the standard experimental methods associated with vigilance experiments needed to be modified. For example, the participants were asked to indicate not only their discrimination judgment but also their degree of confidence in this response. Further, they were also instructed to use the rating scale in a particular way (i.e., reserving the extremely underconfident responses of the scale for extremely uncertain states). Obviously, it is possible to simply dismiss confidence rating data as a measure of human performance.

In terms of the signal detection theory model, however, the rating scale is a logical extension of the yes-no detection task. In the latter case, the participant sets a single detection criterion on some psychological dimension, and in the former case, the participant sets this detection criterion and several others on the same psychological dimension. The difference is that the detection criterion determines the accuracy of the participant (the hit and false alarm rates) and, hence, is not arbitrary. The other criteria determine the relative frequencies of the different rating responses and the percentage correct values associated with these responses, and they are arbitrary in the sense that they do not affect the participant’s hit or correct rejection rates.

It is possible that subjective confidence as measured by the rating scale is contaminated in some way, causing the location of the detection criterion to be misrepresented. However, no mechanism exists within the framework of signal detection theory that would produce such an effect, and no obvious modification of this theory would account for the confidence ratings data. At the very least, therefore, the data presented here need to be accounted for before the interpretations associated with signal detection theory are taken for granted.

In addition to the confidence rating response, the methods employed in the two experiments described here also differed from the more typical vigilance tasks in two other ways. First, the participants were informed about the relative frequencies of the signal and no-signal conditions prior to performing the task (or prior to each block of trials in the mixed successive discrimination condition). The participants were also given feedback after each response. Both manipulations may be expected to have some effect on performance, but neither should be expected to cause the fundamental assumptions of signal detection theory to be violated. In fact, if anything, this information should have encouraged the participants to bias their decision rule in the appropriate way.

Finally, in other experiments (e.g., Balakrishnan, 1998), asymmetric payoffs were used to induce a response bias, and the results were the same (i.e., there was no evidence for biases of the type defined within the signal detection theory model). Thus, although it is possible that an alternative explanation could be formulated for the results described here

that preserves the notion of response bias from signal detection theory, it is not obvious what this would be, and certainly no alternative account would be possible without substantial revisions, additions, or both to the classical signal detection theory model.

Other Measures of Response Bias

As noted earlier, most experimental analyses of performance indices for vigilance data have involved quantitative comparisons of the degree of change in the measures or their statistical properties (e.g., their variance or correlations to sensitivity indices) under different experimental conditions. For example, in a recent review of the several bias measures derived from signal detection theory by Macmillan and Creelman (1991), See et al. (1997) concluded that the measure β was less “adequate” than the alternatives proposed by Macmillan and Creelman because it was less sensitive to supposed bias manipulations. Several points about the relevance and implications of this earlier work are worth clarifying. First, all of the measures compared in See et al. were based exclusively on the hit and false alarm rates. In the present study, the difference between these two values was, not surprisingly, substantial when the signal rate was low (see Tables 1 and 2), and therefore all of these bias measures would indicate a substantial bias toward the no-signal response.

Second, all of these measures depend, implicitly or explicitly, on a specific distribution model, and when they indicate that a bias exists, they also indicate that there exists a set of “biased mental states,” as illustrated in Figure 1. In other words, the differences in these measures are quantitative rather than qualitative. The results described in this article raise questions about the validity of all bias indices that represent response bias in this general way, not just the measure β.

Finally, and most important, the fact that biases of this particular kind do not appear to exist under standard bias manipulations (e.g., changes in the signal rate) should not be confused with the empirical fact that these manipulations have qualitatively different effects on the hit and false alarm rates than do sensitivity manipulations (e.g., changes in the physical similarity of the stimuli or the noise level). This difference clearly must have important implications for theories of vigilance – the main point of the present study is not that response biases do not exist but, merely, that response biases of the type defined in signal detection theory are not produced by standard bias manipulations.

Implications for Performance Measures

Given that all formal models of vigilance must define an information state at the point at which a decision is made about the current display condition, all models (fixed and variable sample) must define two distributions, one for each of the two stimulus events. Sensitivity indices such as d′ and A′ are measures of the relationship between these two distributions under the assumption that they are univariate (e.g., as is subjective confidence defined on a bipolar scale). Thus, although they were originally developed with the fixed-sample model in mind, these measures can also be used to analyze performance under the variable sample assumption, under the proviso that any differences in sensitivity level could be attributable entirely or in part to differences in the operator’s stopping rule. For example, when a sensitivity index decreases with time on task (i.e., the sensitivity decrement), it is not necessarily true that the operator’s attention level has decreased. At least in principle, the same result could also be attributed to a change in the stringency of the stopping rule.

Practical Applications

Understanding the nature of the decision processes involved in vigilance is obviously important with respect to theories of vigilance, but it may also have important ramifications for applied problems associated with information display. For example, if the suboptimal performance of an operator is related to the amount of time spent processing the display before an action is taken, then operators’ performance should improve if they are encouraged to respond more slowly to apparent signal occurrences. This kind of intervention might be particularly effective if the vigilance sensitivity decrement turns out to be a stopping rule effect.
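
The idea that a more stringent stopping rule raises accuracy at the cost of processing time can be illustrated with a textbook random walk. The sketch below is an assumption made for illustration, not the model fitted in this article: evidence arrives in unit steps, each step favors the correct response with probability p, and sampling stops when the evidence total first reaches +a (correct response) or −a (error). The function names and parameter values are hypothetical.

```python
def accuracy(p: float, a: int) -> float:
    """Probability that a +/-1 random walk started at 0 reaches +a before -a.

    p is the probability of a step toward the correct response and a is the
    stopping threshold (an illustrative stand-in for stopping-rule stringency).
    """
    q = 1.0 - p
    return p**a / (p**a + q**a)


def expected_samples(p: float, a: int) -> float:
    """Expected number of steps before the walk stops at +a or -a."""
    q = 1.0 - p
    if p == q:
        return float(a * a)  # unbiased walk: expected duration is a squared
    # Wald's identity: E[N] * (p - q) = a * P(correct) - a * P(error)
    return a * (2.0 * accuracy(p, a) - 1.0) / (p - q)


if __name__ == "__main__":
    # A stricter threshold buys accuracy with additional samples.
    for a in (1, 2, 4):
        print(a, round(accuracy(0.6, a), 3), round(expected_samples(0.6, a), 2))
```

With p = 2/3 and a = 2 the odds at the stopping point are (p/q)^a = 4 to 1 and the accuracy is 4/5, which matches the gambler example given earlier. Raising a increases both the accuracy and the expected number of samples, which is the trade-off implied by encouraging operators to respond more slowly.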

Another approach to manipulating the operator’s decision-making strategy would be to solicit confidence rating responses and then inform the operator about the objective probability of a correct response corresponding to their different confidence levels (i.e., in the manner of the Equation 2 bias test). In this way, a trainer or personnel manager could provide a concrete goal for operators that would allow them to optimize their decision-making behavior with respect to whatever objective is most appropriate for their task. This type of bias manipulation is not possible using traditional methods (e.g., defining an explicit payoff system or setting a target for the hit or false alarm rates), because the information needed to reveal the logic of the operator’s decision-making strategy is not available in the hit and false alarm rates alone.

ACKNOWLEDGMENTS

Parts of this research were supported by National Science Foundation Grant SBR-9709789 and NASA Dryden FRC Grant NCC2-374.

REFERENCES

Balakrishnan, J. D. (1997). Form and objective of the decision rule in absolute identification. Perception & Psychophysics, 59, 1049–1058.
Balakrishnan, J. D. (1998). Some more sensitive measures of sensitivity and response bias. Psychological Methods, 3, 68–90.
Balakrishnan, J. D., & Ratcliff, R. (1996). Testing models of decision making using confidence ratings in classification. Journal of Experimental Psychology: Human Perception and Performance, 22, 615–633.
Craig, A. (1979). Nonparametric measures of sensory efficiency for sustained monitoring tasks. Human Factors, 21, 69–78.
Creelman, C. D., & Donaldson, W. (1968). ROC curves for discrimination of linear extent. Journal of Experimental Psychology, 77, 514–516.
Davies, D. R., & Parasuraman, R. (1982). The psychology of vigilance. London: Academic.
Deaton, J. E., & Parasuraman, R. (1993). Sensory and cognitive vigilance: Effects of age on performance and subjective workload. Human Factors, 6, 71–97.
Dorfman, D. D., & Alf, E., Jr. (1969). Maximum-likelihood estimations of parameters of signal-detection theory and determination of confidence intervals – Rating-method data. Journal of Mathematical Psychology, 6, 487–496.
Green, D. M., & Swets, J. A. (1974). Signal detection theory and psychophysics. Huntington, NY: Krieger.
Link, S. W., & Heath, R. A. (1975). A sequential theory of psychological discrimination. Psychometrika, 40, 77–105.
Lockhart, R. S., & Murdock, B. B., Jr. (1970). Memory and the theory of signal detection. Psychological Bulletin, 74, 100–109.
Luce, R. D. (1986). Response times: Their role in inferring elementary mental organization. New York: Oxford University Press.
Mackworth, J. F. (1970). Vigilance and attention. Harmondsworth, England: Penguin.
Mackworth, N. H. (1948). The breakdown of vigilance during prolonged visual search. Quarterly Journal of Experimental Psychology, 1, 6–21.
Mackworth, N. H. (1961). Researches on the measurement of human performance. In H. W. Sinaiko (Ed.), Selected papers on human factors in the design and use of control systems (pp. 174–331). New York: Dover.
Macmillan, N. A., & Creelman, C. D. (1990). Response bias: Characteristics of detection theory, threshold theory, and “nonparametric” indexes. Psychological Bulletin, 107, 401–413.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user’s guide. Cambridge, England: Cambridge University Press.
Maloney, L. T., & Thomas, E. A. C. (1991). Distributional assumptions and observed conservatism in the theory of signal detectability. Journal of Mathematical Psychology, 35, 443–470.
Parasuraman, R. (1979). Memory load and event rate control sensitivity decrements in sustained attention. Science, 205, 924–927.
Parasuraman, R., & Davies, D. R. (1977). A taxonomic analysis of vigilance performance. In R. R. Mackie (Ed.), Vigilance: Theory, operational performance, and physiological correlates (pp. 559–574). New York: Plenum.
Parasuraman, R., & Mouloua, M. (1987). Interaction of signal discriminability and task type in vigilance decrement. Perception & Psychophysics, 41, 17–22.
Parasuraman, R., Warm, J. S., & Dember, W. N. (1987). Vigilance: Taxonomy and utility. In L. S. Mark, J. S. Warm, & R. L. Huston (Eds.), Ergonomics and human factors: Recent research (pp. 11–32). New York: Springer-Verlag.
Proctor, R. W., & Van Zandt, T. (1994). Human factors in simple and complex systems. Boston: Allyn and Bacon.
See, J. E., Howe, S. R., Warm, J. S., & Dember, W. N. (1995). Meta-analysis of the sensitivity decrement in vigilance. Psychological Bulletin, 117, 230–249.
See, J. E., Warm, J. S., Dember, W. N., & Howe, S. R. (1997). Vigilance and signal detection theory: An empirical evaluation of five measures of response bias. Human Factors, 39, 14–29.
Swets, J. A. (1986). Indices of discrimination or diagnostic accuracy: Their ROCs and implied models. Psychological Bulletin, 99, 110–117.
Thomas, E. A. C. (1971). Sufficient conditions for monotone hazard rate and application to latency-probability curves. Journal of Mathematical Psychology, 8, 303–332.
Townsend, J. T., & Ashby, F. G. (1983). Stochastic modeling of elementary psychological processes. Cambridge, England: Cambridge University Press.
Vickers, D., Leary, J., & Barnes, P. (1977). Adaptation to decreasing signal probability. In R. R. Mackie (Ed.), Vigilance: Theory, operational performance and physiological correlates (pp. 679–703). New York: Plenum.
Warm, J. S. (1984). An introduction to vigilance. In J. S. Warm (Ed.), Sustained attention in human performance (pp. 1–14). Chichester, England: Wiley.
Warm, J. S., Dember, W. N., Murphy, A. Z., & Dittmar, M. L. (1992). Sensing and decision-making components of the signal-regularity effect in vigilance performance. Bulletin of the Psychonomic Society, 30, 297–300.
Warm, J. S., & Jerison, H. J. (1984). The psychophysics of vigilance. In J. S. Warm (Ed.), Sustained attention in human performance (pp. 15–60). Chichester, England: Wiley.
Watson, C. S., Rilling, M. E., & Bourbon, W. T. (1964). Receiver operating characteristics determined by a mechanical analog to the rating scale. Journal of the Acoustical Society of America, 36, 283–288.
Williges, R. C. (1969). Within-session criterion changes compared to an ideal observer criterion in a visual monitoring task. Journal of Experimental Psychology, 81, 61–66.

J. D. Balakrishnan is an assistant professor in the Department of Psychological Sciences at Purdue University. He received his Ph.D. in cognitive mathematical psychology from the University of California, Santa Barbara, in 1991.

Date received: January 9, 1997
Date accepted: April 28, 1998