
TYPE OF AUTOMATION FAILURE: THE EFFECTS ON TRUST AND RELIANCE IN AUTOMATION

Jason D. Johnson, Julian Sanchez, Arthur D. Fisk, and Wendy A. Rogers
Georgia Institute of Technology
Atlanta, Georgia

Past automation research has focused primarily on machine-related factors (e.g., automation reliability) and human-related factors (e.g., accountability). Other machine-related factors, such as the type of automation error (misses or false alarms), have been noticeably overlooked. These two automation errors correspond to potential operator errors, omission (misses) and commission (false alarms), which have been shown to directly affect operators' trust in automation. This proposed research will examine how automation error type affects operator trust and begin to develop baseline trust measures as they relate to error type and participant age. It is expected that participants presented with more automation false alarms than misses will experience a larger degradation of subjective trust than those presented with equal numbers of false alarms and misses or with more automation misses than false alarms.

INTRODUCTION

As the human race continues to progress, so does the sophistication of the systems we use. From military command and control systems and aircraft to nuclear power plants and automobile assembly lines, operators encounter automated systems on a daily basis. Purely mechanical systems have been mostly replaced by computers and circuit boards, allowing automated systems to perform tasks at which human beings are historically poor performers, such as monitoring during unengaging tasks (Parasuraman, Mouloua, Molloy, & Hilburn, 1996). Although these systems are designed to increase performance and decrease errors, they are not perfect. In addition, "When humans are involved, errors will be made, regardless of the level of training, experience, or skill" (Park, 1997, p. 151).

Understanding constructs such as trust, in particular trust in automation, is imperative in optimizing the overall relationship between the operator and the system. Because of the potential for devastating accidents when trust in automation is not allocated at appropriate levels (e.g., over-trusting automation despite external cues of a potential malfunction), much research in the area of trust in automation has concerned function allocation. For instance, Lee and Moray's (1992) automated pasteurization plant experiments showed that when overall system performance was low, based on efficiency and the occurrence of faults, the operators' trust in automation was also low (i.e., they generally opted for manual control). Very little, however, has been done to investigate how, if at all, the type of automation error, false alarm or miss, affects an operator's trust in automation.

The decision to trust or not trust automation is likely related to the more general construct of trust. In the social psychology literature, trust is traditionally defined at an interpersonal level (e.g., "I do or do not trust a person or group"). In that regard, each person must decide whether to 1) give trust to another and 2) act in a trustworthy manner so as to gain others' trust. The first scenario is easily extrapolated to relationships between operators and machines. To illustrate, research on trust in teams (Harris & Provis, 2000) categorized trust into two levels: competence and intentions. The competence component of trust referred to a person's confidence that someone else (or some group) has the ability to perform in the manner in which they are expected to or have advertised. The competence component is easily mapped to an automated system: operators hold notions, both preconceived and developed over time, of an automated system's ability to perform, that is, its reliability.

Trust can be described as a subjective measure of one's confidence in something or someone else. In the context of automation, research has begun to distinguish this subjective rating from its closely related counterpart, reliance. Trust refers to operators' subjective reports of their feelings about the automation, whereas reliance refers to objective performance measures such as automation utilization or task efficiency (Wiegmann, Rich, & Zhang, 2001). Wiegmann et al. found that participants' subjective ratings of trust in automation may be lower than their usage of the automation; that is, trust ratings were less than 100 percent despite utilization rates of 100 percent. Wiegmann et al. also showed, in general, that as the reliability of the automated system decreased, the participants' subjective ratings of trust in the system decreased. The converse also appeared to apply. Although it is important to understand the effects of reliability in general, it is critical to understand how other changes in an automated system, such as the type of automation error, affect an operator's subjective rating of trust.

Types of Errors

When an automated engine status indicator malfunctions, one of two types of errors will occur. First, if a system malfunctions and the automation does not indicate a malfunction (the "signal" is not detected), a miss has occurred. Second, if the automation erroneously indicates a malfunction when the system is working properly (a non-existent signal is detected), a false alarm has occurred. Research to date has not examined how changes in these types of errors affect trust and reliance in the automation.

Based upon the two types of automation errors, operators can make two potential errors: omission and commission errors. Omission errors occur when an operator fails to respond to a system malfunction in instances when the automation monitoring the system fails to register or detect the malfunction. Commission errors occur when an operator inappropriately complies with the automation's directions concerning a system malfunction when no real malfunction exists (Skitka, Mosier, & Burdick, 2000).
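
The relation between the two automation error types and the two operator error types can be summarized compactly. The following minimal Python sketch is purely illustrative; it is not part of the proposed experiment's software, and the "hit" and "correct rejection" labels are standard signal detection terms rather than terms defined above.

def classify_trial(engine_malfunctioning: bool, indicator_alerted: bool) -> dict:
    """Classify one trial by the automation outcome and the operator error it can invite."""
    if engine_malfunctioning and indicator_alerted:
        return {"automation": "hit", "operator_risk": None}
    if engine_malfunctioning and not indicator_alerted:
        # Automation miss: the operator may fail to respond (omission error).
        return {"automation": "miss", "operator_risk": "omission"}
    if not engine_malfunctioning and indicator_alerted:
        # Automation false alarm: the operator may comply needlessly (commission error).
        return {"automation": "false alarm", "operator_risk": "commission"}
    return {"automation": "correct rejection", "operator_risk": None}

print(classify_trial(engine_malfunctioning=True, indicator_alerted=False))
# {'automation': 'miss', 'operator_risk': 'omission'}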

Aging and Automation

Along with the growing amount of research examining trust and reliance in automation (e.g., Dzindolet, Peterson, Pomranky, Pierce, & Beck, 2003; Sanchez, Fisk, & Rogers, 2004; Wiegmann et al., 2001), more researchers are interested in how these issues specifically relate to the effect of operator age. As "baby boomers" approach old age, the sheer number of older adults relying on automation seems to be increasing nearly exponentially. By determining age-related differences in one's approach to automation or formulation of trust in automation, systems can be designed and training programs developed with those differences in mind. Some research suggests that changes in trust developed on the basis of automation reliability differ with age (e.g., Sanchez et al., 2004). In their study, older adults showed a significant loss of trust in automation when reliability degraded from 100 to 80 percent and then again from 80 to 60 percent. Younger adults, on the other hand, only showed a statistically significant loss in trust when reliability dropped from 100 to 80 percent. Understanding such age-related differences in trust and reliance in automation is important but underdeveloped. The present study will add to our understanding of aging and the factors affecting use of automation.

PROPOSED METHOD

Participants

A total of 60 older and younger adults will participate in this study. The 30 younger adults will consist of males and females between the ages of 18 and 28, inclusive, drawn from the Georgia Institute of Technology's undergraduate psychology student population. The 30 older adults will be male and female volunteers between 65 and 75 years of age, inclusive, from Atlanta, GA and the surrounding areas.

Proposed Design

Participants will be trained to perform a dual-task scenario in which they monitor a flight simulator cockpit and respond to engine malfunctions (the "engine task") and monitor a radar scope and report the number and type of objects they see on the scope (the "radar task"). An engine status indicator will be available to assist in monitoring engine performance (see Figure 1). Participants will be instructed prior to training that the engine status indicator might not be 100 percent reliable and that it can be verified by selecting the "view gauges" button, which will display 100 percent reliable gauges.

Figure 1: Screen capture of the simulator interface.

The type of automation failure, false alarm or miss, will be a between-participants variable. Each age subgroup, younger adults and older adults, will be separated into three groups. Each group will receive one of three experimental conditions: predominantly misses, predominantly false alarms, or an equal number of misses and false alarms. The dependent variables will be engine task and radar task performance; the number of times the view gauges button is pushed (an objective trust measure); subjective trust ratings; and the conditional probabilities of operator error given an automation false alarm, an automation miss, or an automation error (either false alarm or miss).
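
To make the factorial structure concrete, the sketch below lays out the implied 2 (age group) x 3 (error-type condition) between-participants design with 10 participants per cell; the randomized assignment shown is illustrative only and is not a procedure specified in this proposal.

import random

AGE_GROUPS = ["younger", "older"]                      # 30 volunteers in each group
ERROR_CONDITIONS = ["predominantly_misses",
                    "predominantly_false_alarms",
                    "equal_misses_and_false_alarms"]

def assign_participants(per_cell: int = 10, seed: int = 0) -> dict:
    """Randomly split each age group's participants across the three conditions."""
    rng = random.Random(seed)
    assignment = {}
    for age in AGE_GROUPS:
        ids = list(range(1, per_cell * len(ERROR_CONDITIONS) + 1))   # IDs 1..30 within a group
        rng.shuffle(ids)
        for i, condition in enumerate(ERROR_CONDITIONS):
            assignment[(age, condition)] = ids[i * per_cell:(i + 1) * per_cell]
    return assignment

design = assign_participants()
print(len(design), "cells,", sum(len(v) for v in design.values()), "participants")   # 6 cells, 60 participants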

Apparatus

The experiment will be conducted using a low-fidelity cockpit simulator. Participants will interact with the simulation using a standard two-button mouse. Participants will be required to monitor engine performance and reset engines when they malfunction. In conjunction with monitoring engine performance in the top half of the screen, each participant will also monitor a radar scope on the bottom half of the screen and click on symbols as they appear on the radar scope.

Engine Task. The engine task consists of monitoring the engines to ascertain when an engine malfunction has occurred. The engine status indicator is designed to aid the operators in determining if an engine is malfunctioning. It will have a reliability of 80 percent (i.e., the engine status indicator will indicate a true engine malfunction 80 percent of the time). The remaining 20 percent will consist of false alarms and misses in condition-specific false-alarm-to-miss proportions of 75/25, 25/75, or 50/50. The 80 percent reliability of the engine status indicator was chosen to replicate a specific condition in previous research conducted by Sanchez et al. (2004).
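
As a concrete illustration of these proportions, the sketch below builds a randomized indicator-event schedule for each condition. The 80 percent reliability and the error splits come from the description above; the number of indicator events per block (100) is assumed purely for illustration and is not specified in this proposal.

import random

RELIABILITY = 0.80                               # proportion of correct indications
CONDITIONS = {                                   # (false-alarm share, miss share) of the errors
    "predominantly_false_alarms": (0.75, 0.25),
    "predominantly_misses":       (0.25, 0.75),
    "equal":                      (0.50, 0.50),
}

def build_schedule(condition: str, n_events: int = 100, seed: int = 0) -> list:
    """Return a shuffled list of indicator events for one block."""
    fa_share, _ = CONDITIONS[condition]
    n_errors = round(n_events * (1 - RELIABILITY))           # e.g., 20 of 100
    n_false_alarms = round(n_errors * fa_share)
    n_misses = n_errors - n_false_alarms
    events = (["correct"] * (n_events - n_errors)
              + ["false_alarm"] * n_false_alarms
              + ["miss"] * n_misses)
    random.Random(seed).shuffle(events)                      # randomize event order
    return events

schedule = build_schedule("predominantly_false_alarms")
print({event: schedule.count(event) for event in set(schedule)})
# e.g., {'correct': 80, 'false_alarm': 15, 'miss': 5}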

Radar Task. The radar task will consist of a radar scope that will display four different symbols. The participants will be instructed to select symbols each time they see them on the radar scope. Their goal will be to depress a symbol's corresponding button when they first notice that the symbol has appeared on the radar screen (see Figure 1). The performance data collected from the radar task will include the ratio of symbols selected to symbols displayed.

Questionnaire

The experiment will include a basic demographic questionnaire and subjective trust questionnaires. The trust questionnaires will be administered after training, after block one, and after block two. The trust questionnaire will provide information about each participant's overall trust in automated systems, trust in this particular automated system, and changes in trust based on the type and proportion of automation errors they encounter.

Procedure

Upon beginning the study and providing informed consent, each participant will complete a demographic questionnaire and an initial trust questionnaire. Participants will then perform a seven-minute training block to learn the engine and radar tasks. Following the completion of training, participants will perform two experimental blocks lasting 20 minutes each. At the completion of each block, participants will be given the trust questionnaire and performance feedback. Each participant will begin each block of testing with 1,000 points. They will lose five points for each "mistake" (e.g., missing a symbol on the radar task or failing to reset a malfunction on the engine task) and one point each time they press the view gauges button. Participants will be provided with a performance feedback scale to internally rank their performance, consisting of point ranges associated with popular "fighter jock" labels. Scores between 1,000 and 950 will be classified as "Top Gun", 949 to 900 as "Ace", 899 to 850 as "Instructor Pilot", 849 to 800 as "Pilot Trainee", and less than 800 as "Cadet".
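
A minimal sketch of this scoring rule and the feedback labels is shown below; the function and variable names are illustrative and do not correspond to the actual experiment software.

def block_score(mistakes: int, view_gauges_presses: int) -> int:
    """Each block starts at 1,000 points: -5 per mistake, -1 per view-gauges press."""
    return 1000 - 5 * mistakes - 1 * view_gauges_presses

def feedback_label(score: int) -> str:
    """Map a block score to its "fighter jock" feedback label."""
    if score >= 950:
        return "Top Gun"
    if score >= 900:
        return "Ace"
    if score >= 850:
        return "Instructor Pilot"
    if score >= 800:
        return "Pilot Trainee"
    return "Cadet"

score = block_score(mistakes=12, view_gauges_presses=30)   # 1000 - 60 - 30 = 910
print(score, feedback_label(score))                        # 910 Ace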

EXPECTED RESULTS

Analyses of variance (ANOVAs) will be performed to assess the main effects of type of automation error and age, as well as any interaction effects. In addition, conditional probabilities will be calculated to determine 1) the probability of an operator error given the occurrence of an automation error, 2) the probability of an operator error given the occurrence of a false alarm, and 3) the probability of an operator error given the occurrence of a miss.
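
These conditional probabilities can be estimated directly from per-trial logs, as in the sketch below; the trial record format shown here is assumed for illustration and is not the study's actual data format.

def conditional_p(trials: list, automation_outcomes: set) -> float:
    """Estimate P(operator error | automation outcome is in automation_outcomes)."""
    relevant = [t for t in trials if t["automation"] in automation_outcomes]
    if not relevant:
        return float("nan")
    return sum(t["operator_error"] for t in relevant) / len(relevant)

trials = [
    {"automation": "false_alarm", "operator_error": True},
    {"automation": "false_alarm", "operator_error": False},
    {"automation": "miss",        "operator_error": True},
    {"automation": "correct",     "operator_error": False},
]
p_given_any_error   = conditional_p(trials, {"false_alarm", "miss"})   # 2/3
p_given_false_alarm = conditional_p(trials, {"false_alarm"})           # 1/2
p_given_miss        = conditional_p(trials, {"miss"})                  # 1/1
print(p_given_any_error, p_given_false_alarm, p_given_miss)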

It is expected that the majority automation false alarms condition will produce larger drops in subjective trust than will the majority automation misses condition. The following passage from Shlomo Breznitz's Cry Wolf: The Psychology of False Alarms (1984) highlights the various potential consequences of continued false alarms on an operator:

    Each false alarm reduces the credibility of a warning system. The credibility loss following a false alarm episode has serious ramifications to behavior in a variety of response channels. Thus, future similar alerts may receive less attention. They may elicit weaker fear reactions. The threat may be perceived as less intense or less probable. People may overestimate their ability to cope with the danger if and when it materializes. Or, most important, they may reduce their willingness to engage in protective behavior. (p. 11)

Other researchers in addition to Breznitz (e.g., Bliss & Dunn, 2000) have supported the notion that persistent or pervasive false alarms negatively affect operator trust in automated systems. Very little, however, has been done to examine how, if at all, misses affect trust. One might imagine that automation misses would have a lesser effect on operator trust because the operator experiences less interaction with the system as a result of the automation not indicating the system malfunctions.

DISCUSSION

Trust and reliance in automation are dynamic traits that are significantly affected by many factors. Much research has been conducted to date examining how factors such as self-confidence, reliability, and prior knowledge of impending automation failure affect an operator's trust in increasingly automated systems (e.g., Dzindolet et al., 2003; Moray, Inagaki, & Itoh, 2000; Sanchez et al., 2004; Wiegmann et al., 2001). Very little has been done to examine how the type of automation failure affects an operator's trust and reliance in automation. This proposed study will establish the foundation upon which more elaborate studies can build. By holding the scenario description and reliability constant (i.e., referring only to malfunctions, not specifically to false alarms or misses, and maintaining an overall automation reliability of 80 percent), this study will isolate the type of automation failure and its effect on trust and reliance in the decision aid. One potential finding is that false alarms will produce lower levels of subjective trust. Increasing the number of false alarms committed by the decision aid will increase the number of messages displayed by the decision aid. After a false alarm or two, the participants may be more likely to press the "view gauges" button to confirm each subsequent decision aid malfunction indication. Misses, on the other hand, are completely unknown until the decision aid displays the message stating that the engine malfunctioned and was reset (i.e., "out of time" displayed in the action results window). The only way for the participants to confirm a miss is to try to completely prevent it (i.e., continuously press the view gauges button in an attempt to catch the miss when it is occurring). Lastly, this research may find that trust does not vary with the type of error experienced by the participants.

Proposed follow-up research will need to be conducted to examine some of the questions this study is not designed to answer. Future work should investigate the effects of changing the scenario in which the participants find themselves, such as monitoring a nuclear reactor or conducting DNA tests for a criminal trial. By doing this, participants will be put in situations that appear to foster tolerance of one type of error over the other. Also of interest, and closely related to the scenario, is the degree to which the cost of the automation failure is varied. Lastly, a multifactor experiment should be conducted in which all of the aforementioned factors are systematically varied to determine their levels of interaction. Those results, coupled with this single-factor experiment, will provide a more complete picture of the complex dynamics involved in determining and predicting operator trust and reliance in automation.

ACKNOWLEDGMENTS

The research team would like to extend a very special thanks to Neta Ezra for her contributions to programming the interface. This research was supported in part by contributions from Deere & Company, and we thank Jerry Duncan and Bruce Newendorp for their support and advice on this research. This research was also supported in part by National Institutes of Health (National Institute on Aging) Grant P01 AG17211 under the auspices of the Center for Research and Education on Aging and Technology Enhancement (CREATE).

REFERENCES

Bliss, J. P., & Dunn, M. C. (2000). Behavioral implications of alarm mistrust as a function of task workload. Ergonomics, 43, 1283-1300.
Breznitz, S. (1984). Cry wolf: The psychology of false alarms. Hillsdale, NJ: Lawrence Erlbaum Associates.
Dzindolet, M. T., Peterson, S. A., Pomranky, R. A., Pierce, L. G., & Beck, H. P. (2003). The role of trust in automation reliance. International Journal of Human-Computer Studies, 58, 697-718.
Harris, H., & Provis, C. (2000). Teams, trust and norms: There's more to teams as a manufacturing management system than some might think. Proceedings of the 8th International Conference on Manufacturing Engineering, Sydney, Australia, 1-5. (Available from http://www.smartlink.net.au/library/harris/teamstrustnorms.pdf)
Moray, N., Inagaki, T., & Itoh, M. (2000). Adaptive automation, trust, and self-confidence in fault management of time-critical tasks. Journal of Experimental Psychology: Applied, 6(1), 44-58.
Parasuraman, R., Mouloua, M., Molloy, R., & Hilburn, B. (1996). Monitoring of automated systems. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and applications (pp. 91-115). Mahwah, NJ: Lawrence Erlbaum Associates.
Park, K. S. (1997). Human error. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (pp. 150-173). New York: Wiley-Interscience.
Sanchez, J., Fisk, A. D., & Rogers, W. A. (2004, March). Age-related and reliability-related effects on trust of a decision support aid. Poster session presented at the 2004 Human Performance, Situation Awareness and Automation Technology Conference, Daytona Beach, FL.
Skitka, L., Mosier, K., & Burdick, M. (2000). Accountability and automation bias. International Journal of Human-Computer Studies, 52, 701-717.
Wiegmann, D., Rich, A., & Zhang, H. (2001). Automated diagnostic aids: The effects of aid reliability on users' trust and reliance. Theoretical Issues in Ergonomics Science, 2(4), 352-367.