
HUMAN FACTORS, 1992, 34(4), 429-439

Comparison of Four Subjective Workload Rating Scales

SUSAN G. HILL,1 EG&G Idaho, Inc., Idaho Falls, Idaho, HELENE P. IAVECCHIA, Computer Sciences Corporation, Moorestown, New Jersey, JAMES C. BYERS, EG&G Idaho, Inc., Idaho Falls, Idaho, ALVAH C. BITTNER, Jr., Battelle Human Affairs Research Center, Seattle, Washington, ALLEN L. ZAKLAD, CHI Systems, Spring House, Pennsylvania, and RICHARD E. CHRIST, U.S. Army Research Institute, Fort Bliss, Texas

1 Requests for reprints should be sent to Susan G. Hill, Human Factors Research Unit, Idaho National Engineering Laboratory, EG&G Idaho, Inc., P.O. Box 1625, Idaho Falls, ID 83415.

© 1992, The Human Factors Society, Inc. All rights reserved.

Four subjective workload scales were compared along four dimensions: sensitivity, operator acceptance, resource requirements, and special procedures. The scales were the Modified Cooper-Harper scale, the National Aeronautics and Space Administration Task Load Index (TLX), the Overall Workload (OW) scale, and the Subjective Workload Assessment Technique. Three U.S. Army systems were studied for potential workload concerns. Data from five different studies on the three systems were compared along the aforementioned four dimensions. Results indicate that all four scales are acceptable tools and are sensitive to different levels of workload. However, TLX and OW are consistently superior when considering sensitivity, as measured by factor validity, and operator acceptance. This research is an example of a systematic approach for examining human factors measurement tools.

INTRODUCTION

In recent years several books and reports have presented workload studies, theoretical discussions, and reviews of past research (e.g., Hancock and Meshkati, 1988; Lysaght et al., 1989; Moray, 1979; O'Donnell and Eggemeier, 1986). A growing number of different approaches and techniques for measuring workload have been discussed, developed, and used. Subjective techniques, in particular, have seemed to proliferate. The increasing number of techniques for measuring workload creates both an opportunity and a problem for human factors practitioners and researchers. On one hand, tools have been developed for a wide variety of situations; on the other hand, human factors specialists faced with choosing the most appropriate of these tools may find the information necessary to make an informed choice sparse or nonexistent.

The present systematic study of workload techniques was undertaken to provide such information. The Operator Workload Program, sponsored by the U.S. Army Research Institute, was a three-year basic and applied research effort specifically directed toward establishing guidance for the assessment of operator workload associated with the operation of army systems.

To establish such guidance, comparisons were made between and among workload techniques. These comparisons were made across a number of army systems and were used to examine the usefulness of various operator workload methodologies, to draw inferences about their relative value and usefulness in examining workload, and to assess workload for the army systems.

A major objective of the Operator Workload Program was to examine operator workload in a broad army context. Previous work regarding workload has been conducted primarily in relatively narrow aviation environments frequently set in laboratories (see Lysaght et al., 1989). However, much of the U.S. Army's interest is in ground-based and nonaviation systems in more operational settings. This has resulted in workload studies with characteristics different from those of traditional aviation studies. For example, the selection criteria for these operators (e.g., education, verbal skills, and general intelligence) were typically less stringent than those for pilots. Hence one part of the Operator Workload Program was concerned with examining and comparing workload measures for more diverse populations than those evaluated in more traditional studies.

It was also deemed critical to test the workload measures in field settings to yield important information about the operators and systems in these settings and about the usefulness of current workload methodologies under more operational conditions. Studies in controlled environments, such as laboratories and simulators, can yield important information and may be the only alternative for nonexistent or unfielded systems. However, with a programmatic goal to look at workload in more realistic, less controlled operational environments, the present research differs considerably from previous studies in the breadth of evaluation environments.

FOUR SUBJECTIVE WORKLOAD SCALES

The focus of the Operator Workload Program's empirical research was on operator workload ratings, often referred to as subjective rating techniques. This may refer to their apparently weaker objectivity relative to other workload measurement approaches (such as physiological measures; see Lysaght et al., 1989, for a critical evaluation of empirical and analytical approaches to measuring workload). It has been argued, however, that operator ratings are the most direct indicators of operator workload (Sheridan, 1980). In addition, operator ratings are among the least intrusive of all techniques because they can be administered after the task or mission is completed without disturbing the operator during task performance. Further, the techniques are flexible and portable; no equipment or special data collection devices are needed. Finally, these techniques can be quick and inexpensive to administer and analyze. These substantial advantages recommended their application in most operator workload investigations and as a focus of the Operator Workload Program.

The Operator Workload Program's comprehensive review of workload methodologies stated a number of conclusions concerning operator rating methods (Lysaght et al., 1989). Three of the most relevant here are as follows:

• Operator rating methods are sensitive to key mission variables and can provide valuable low-cost information.
• There are both unidimensional and multidimensional techniques, with different advantages and disadvantages.
• Three particular techniques, the National Aeronautics and Space Administration (NASA) Task Load Index (TLX), the Subjective Workload Assessment Technique (SWAT), and the Modified Cooper-Harper (MCH) scale, were found to have substantial validated sensitivity.

Lysaght et al. (1989) found a number of published studies that used subjective rating techniques. Comparisons have been made between two techniques (e.g., Vidulich and Tsang, 1985, compared SWAT and the NASA Bipolar scales; Warr, Colle, and Reid, 1986, compared MCH and SWAT). However, these studies did not statistically compare the efficacy of the measures, and no more than two methods were compared at any one time.

The review also identified four measures that were most researched or which had the greatest promise for field application: MCH (Wierwille and Casali, 1983), NASA TLX (Hart and Staveland, 1988), the Overall Workload (OW) scale (Vidulich and Tsang, 1987), and SWAT (Reid, Shingledecker, and Eggemeier, 1981). Three of these techniques (TLX, SWAT, and MCH) were selected because of the extent of previous validation; the other scale (OW) was chosen primarily because of its simplicity and suspected potential.

Modified Cooper-Harper Scale

The Modified Cooper-Harper scale is a 10-point unidimensional rating scale that results in a global rating of workload. The rating scale uses a decision tree to assist the rater in determining the most appropriate rating to assign. The MCH was developed for workload assessment in systems in which the task is primarily cognitive, rather than motor or psychomotor, and in which the original Cooper-Harper scale (Cooper and Harper, 1969) may not be appropriate.

NASA Task Load Index

The NASA Task Load Index uses six dimensions to assess workload: mental demand, physical demand, temporal demand, performance, effort, and frustration. Twenty-step bipolar scales are used to obtain ratings for these dimensions. A score from 0 to 100 (assigned to the nearest 5) is obtained on each scale. A weighting procedure is used to combine the six individual scale ratings into a global score; this procedure requires a paired comparison task to be performed prior to the workload assessments. Paired comparisons require the operator to choose which dimension is more relevant to workload for a particular task across all pairs of the six dimensions. The number of times a dimension is chosen as more relevant is the weighting of that dimension scale for a given task for that operator. A workload score from 0 to 100 is obtained for each rated task by multiplying the weight by the individual dimension scale score, summing across scales, and dividing by the sum of the weights (i.e., 15 for the 15 paired comparisons).
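To make the arithmetic concrete, the following Python sketch illustrates the weighted-score computation just described. It is an illustration only, not the software used in these studies; the example ratings and paired-comparison tallies are hypothetical.

```python
# Illustrative sketch of the NASA TLX weighted-score computation described above.
# All ratings and paired-comparison tallies below are hypothetical.

DIMENSIONS = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def tlx_global_score(ratings, weights):
    """Combine six 0-100 dimension ratings into a single global score.

    ratings: dimension -> rating on a 0-100 scale (assigned to the nearest 5)
    weights: dimension -> number of times that dimension was chosen across
             the 15 paired comparisons (the weights therefore sum to 15)
    """
    assert sum(weights.values()) == 15, "15 paired comparisons give weights summing to 15"
    weighted_sum = sum(ratings[d] * weights[d] for d in DIMENSIONS)
    return weighted_sum / 15.0  # divide by the sum of the weights

# Hypothetical data for one operator and one rated task
ratings = {"mental": 70, "physical": 20, "temporal": 55,
           "performance": 40, "effort": 65, "frustration": 30}
weights = {"mental": 5, "physical": 0, "temporal": 4,
           "performance": 2, "effort": 3, "frustration": 1}

print(tlx_global_score(ratings, weights))  # global workload score, 0-100
```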
Overall Workload Scale

Overall Workload is a rating of the subject's overall workload on a unidimensional scale of 0 to 100, with 0 representing very low workload and 100 representing very high workload. A single, 20-step bipolar scale is used to obtain this global rating. A score from 0 to 100 (assigned to the nearest 5) is obtained.

Subjective Workload Assessment Technique

The Subjective Workload Assessment Technique is a subjective rating technique developed by the U.S. Air Force Armstrong Aerospace Medical Research Laboratory at Wright-Patterson Air Force Base. SWAT uses three levels, low (1), medium (2), and high (3), for each of the three dimensions of time load, mental effort load, and psychological stress load to assess workload. It uses conjoint measurement and scaling techniques to develop a single, global rating scale with interval properties.

The use of SWAT entails three distinct steps. The first is called scale development.
All possible combinations of the three levels of each of the three dimensions are contained in 27 cards. Each operator sorts the cards into the rank order that reflects his or her perception of increasing workload. Conjoint scaling procedures are used to develop a scale with interval properties. The second step is event scoring, that is, the actual rating of workload for a given task or mission segment. In the third step, each three-dimension rating is converted into a numeric score between 0 and 100 using the interval scale developed in the first step.
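As a rough sketch of the scoring logic, the Python fragment below maps a (time, effort, stress) rating onto a 0-100 score through a 27-entry lookup table. The table here is a hypothetical, equal-weight additive stand-in for the operator-specific interval scale that conjoint scaling would produce from the card sort; it is not the published SWAT scaling procedure itself.

```python
# Sketch of SWAT event scoring: a (time load, mental effort load, stress load)
# rating, each level 1-3, is converted to a 0-100 score via a 27-entry scale
# built during scale development.  The scale below is a hypothetical additive
# stand-in; in actual use it comes from conjoint scaling of the card sort.
from itertools import product

def hypothetical_swat_scale():
    """Return a placeholder mapping from the 27 level combinations to 0-100."""
    scale = {}
    for time, effort, stress in product((1, 2, 3), repeat=3):
        # Equal additive weights for illustration; conjoint scaling would
        # estimate dimension weights from each operator's rank ordering.
        raw = (time - 1) + (effort - 1) + (stress - 1)   # ranges 0..6
        scale[(time, effort, stress)] = round(100 * raw / 6)
    return scale

scale = hypothetical_swat_scale()
event_rating = (2, 3, 1)    # medium time load, high mental effort load, low stress load
print(scale[event_rating])  # event score on the 0-100 interval scale
```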
DATA COLLECTION EFFORTS

Three U.S. Army systems, two ground-based and one aviation, were selected for study of potential workload concerns (Bittner et al., 1988): the Aquila Remotely Piloted Vehicle; the Line-of-Sight, Forward, Heavy (LOS-F-H) mobile air defense system; and the UH-60A Blackhawk helicopter.

A plan for investigating workload techniques was developed in which several specific studies were to be conducted for each system (Bittner et al., 1988). Each of the studies had specific objectives in examining operator workload methodologies. At the same time, effort was made to conduct tests in a similar manner so that comparison across all data collection efforts could be made. Each of the studies is briefly described in the following sections. The studies are summarized by subjects, objectives, method, and results in Table 1. The five studies consisted of one of the Aquila, three of the LOS-F-H, and one of the Blackhawk helicopter.

Most of the workload studies were part of previously scheduled U.S. Army test and evaluation efforts. One such type of test and evaluation field exercise was a Non-Developmental Item Candidate Evaluation (NDICE), which is an evaluation of two or more already developed systems that enables selection of the better performing system for further development. A Force Development Test and Evaluation (FDTE) is a field test on a system already selected for development. Both tests are operationally oriented and use trained operators in field environments.

Aquila FDTE

Workload assessments were conducted of four crews of operators of the Aquila Ground Control Station during an FDTE in the last quarter of 1987. Four operator workload rating scales were used: MCH, OW, SWAT, and TLX. The objectives of this investigation were (1) to explore the applicability of the workload scales under the conditions characterizing field evaluations (i.e., Army FDTEs) and (2) to evaluate ground operation workload variations during the Aquila Remotely Piloted Vehicle FDTE. Subsequent to completion of the FDTE, a rating scales questionnaire was administered which solicited judgments regarding the procedures and test instruments, particularly those used to assess operator workload.

Two broad conclusions were drawn from the Aquila FDTE evaluation of the use of the operator workload scales under field test conditions. First, operator workload measures may be successfully applied and evaluated within the stringent field test environment characterizing a U.S. Army FDTE. Second, TLX had both the highest validity and best user acceptance within the limited subject group and conditions of the present investigation. This study is documented in Byers, Bittner, Hill, Zaklad, and Christ (1988) and in Byers, Hill, Zaklad, and Christ (1989).

LOS-F-H NDICE

Retrospective workload assessments were conducted of six operators of the selected LOS-F-H candidate system 10 weeks after the candidate field evaluation, which was held during late 1987. Four operator workload rating scales were used: MCH, OW, SWAT, and TLX.
The objectives of this investigation were (1) to explore the applicability of the operator workload scales for obtaining retrospective workload assessments several weeks after a system field evaluation (i.e., Army NDICE); (2) to evaluate the relationship between system performance and the retrospective workload assessments of the crew members of the selected system; and (3) to compare the results of the present programmatic investigation with those from the Aquila study (Byers et al., 1988). Data on total system performance for each mission were provided to the authors by independent evaluators.

Substantial and significant multiple correlations between system performance and workload (as measured by TLX) were found. In particular, with R = 0.66, the resulting model relating system performance to TLX indicated generally decreasing performance with increases in workload. In addition, retrospective application of operator workload measures proved successful. This study is documented in Hill, Zaklad, Bittner, Byers, and Christ (1988).

LOS-F-H Generic

Generic workload ratings were obtained from five operators and nine subject matter experts of the LOS-F-H system during two separate data collection sessions. Four operator workload rating scales were used: MCH, OW, SWAT, and TLX. The objectives of this investigation were to explore the applicability of the workload scales for obtaining generic estimates from operators and subject matter experts and to explore operator workload in the baseline LOS-F-H, which was evaluated during the NDICE.

Two broad conclusions were drawn from this evaluation of the use of operator workload scales. First, generic ratings may be used to assess mission conditions and task segments while minimizing differences caused by specific mission idiosyncrasies. Second, no systematic differences were found between generic operator workload ratings made by subject matter experts and ratings made by crew members who had operated the system. This study is documented in Bittner, Byers, Hill, Zaklad, and Christ (1989).

LOS-F-H FDTE Basic

Workload assessment was conducted of seven operators during the LOS-F-H FDTE, which took place at Fort Bliss, Texas, from May through July 1988. Four workload rating scales were used: MCH, OW, SWAT, and TLX. The objectives of this investigation were to explore the applicability of the operator workload scales under the conditions characterizing field test evaluations and to evaluate operator workload during LOS-F-H operations.

The results of the subjective ratings of operator workload in the LOS-F-H FDTE indicated the following. Overall workload ratings were significantly different across crew positions. There were some significant effects of mission variables on workload. There were differences in both the magnitude and the dimensions of workload among mission segments. This study is documented in Hill, Byers, Zaklad, and Christ (1989).

UH-60A Blackhawk Helicopter

Operator workload data were collected from 10 two-man UH-60A crews during simulated one-hour resupply missions. Four operator workload rating scales were used: MCH, OW, SWAT, and TLX. In addition, a tool was developed to assess peak workload (the PW scale). The objectives of this investigation were (1) to determine the relationship between an analytical model's (the Task Analysis/Workload, or TAWL) prediction of workload (Bierbaum, Szabo, and Aldrich, 1987) and the actual workload reported by the pilot and copilot while flying a simulated mission; (2) to investigate differences between workload reported during the mission and workload recalled following the mission; (3) to examine operator acceptance of the various operator workload assessment techniques; and (4) to evaluate the effects of key mission variables on crew workload as well as the relationship between performance and workload.
TABLE 1

Summary of the Five Data Collection Studies (Subjects, Objectives, Method, and Results)
Four primary conclusions concerning workload were drawn from this investigation. First, the TAWL model has shown an ability to predict empirical data (r = 0.82 to 0.95, p < 0.01) and has substantial potential as an analytical workload estimation technique that may be applied to predict workload before system development. Second, empirical workload assessment techniques can be readily applied in an army aviation setting with favorable operator acceptance, with TLX and OW being the most preferred scales. Third, workload ratings have shown reasonable variation with key mission conditions. Fourth, the PW measure may be a useful addition to the repertoire of subjective workload assessment techniques but requires further study and validation. This study is documented in Iavecchia, Linton, Bittner, and Byers (1989) and in Iavecchia, Linton, Harris, Zaklad, and Byers (1989).

COMPARISON DIMENSIONS

The four subjective scales used in all studies (TLX, OW, MCH, and SWAT) were compared with one another along four dimensions: sensitivity (as measured by factor validity), operator acceptance, resource requirements, and special procedures.

Sensitivity

To examine how well each of the four scales was able to discriminate among different levels of task loading, factor analysis was performed on the aggregate data. For all studies, a single factor emerged, which was named the operator workload factor. The correlation of each scale with the operator workload factor was calculated and is the factor validity for that scale for that study. Sensitivity, as measured by the factor validity, is a description of the degree to which the scales measure workload.

The factor validity analysis for a given study was conducted in two stages. During the first stage, a principal component analysis was conducted on the sets of segment ratings collected across subjects and missions using BMDP4M (Dixon, 1983). Each set included overall workload ratings using four scales: TLX, SWAT, OW, and MCH. In all five data sets, this analysis revealed a single component, the operator workload factor. The results of this initial analysis supported the view that the four workload scales essentially provide assessments of a single common factor.
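A minimal sketch of this first-stage computation, using NumPy in place of BMDP4M and a hypothetical rating matrix (rows are subject-by-segment observations, columns are the four scales), might look as follows; the loadings printed at the end correspond to the factor validities described above.

```python
# Sketch of the first-stage analysis: principal components of the four overall
# workload ratings, with each scale's correlation with the first component
# taken as its factor validity.  Hypothetical data; the studies used BMDP4M.
import numpy as np

rng = np.random.default_rng(0)
scales = ["TLX", "SWAT", "OW", "MCH"]

# Hypothetical ratings: rows are subject-by-segment observations, columns are scales.
true_workload = rng.uniform(20, 80, size=60)
noise_sd = (5, 12, 8, 10)  # assumed noise level for each scale
ratings = np.column_stack([true_workload + rng.normal(0, sd, 60) for sd in noise_sd])

# Principal component analysis on the correlation matrix of standardized ratings.
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))  # ascending order
pc1 = z @ eigvecs[:, -1]                          # scores on the largest component
pc1 *= np.sign(np.corrcoef(pc1, z[:, 0])[0, 1])   # fix the arbitrary sign

print("proportion of total variance:", eigvals[-1] / eigvals.sum())
for name, column in zip(scales, z.T):
    print(name, "factor validity:", round(np.corrcoef(column, pc1)[0, 1], 3))
```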
Jackknife principal component analyses of the workload measures were conducted during the second stage in order to evaluate the stability of the factor loadings of the four scales (i.e., their correlations with the operator workload factor). Jackknife analysis, it should be noted, generally involves successive analyses (in the present case, principal component analyses), dropping subjects one at a time from a data set in order to allow analysis of the stability of parameter estimates (Hinkley, 1983). For example, in the LOS-F-H NDICE analysis with four factor loadings and six subjects, a 4 (loadings) x 6 (subject dropped) matrix was produced that could be analyzed by conventional repeated-measures analysis of variance (ANOVA). ANOVA using BMDP2V (Dixon, 1983) was used to examine significant differences among the workload scale factor loadings.
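The second stage can be sketched as a loop around the same analysis: drop one subject, recompute the four loadings, and collect the results into a loadings-by-subject-dropped matrix for a repeated-measures ANOVA. The fragment below is a hedged illustration with hypothetical data (6 subjects and an assumed 10 segments each), using statsmodels' AnovaRM as a stand-in for BMDP2V.

```python
# Sketch of the second-stage jackknife: recompute each scale's loading on the
# first principal component with one subject dropped at a time, then test for
# differences among scales with a repeated-measures ANOVA (a stand-in for the
# BMDP2V analysis).  All data are hypothetical (6 subjects x 10 segments assumed).
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
scales = ["TLX", "SWAT", "OW", "MCH"]
n_subjects, n_segments = 6, 10
true_workload = rng.uniform(20, 80, size=n_subjects * n_segments)
ratings = np.column_stack(
    [true_workload + rng.normal(0, sd, true_workload.size) for sd in (5, 12, 8, 10)]
)

def pc1_loadings(data):
    """Correlation of each column with the first principal component."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    _, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    pc1 = z @ eigvecs[:, -1]
    pc1 *= np.sign(np.corrcoef(pc1, z[:, 0])[0, 1])   # fix the arbitrary sign
    return [np.corrcoef(z[:, j], pc1)[0, 1] for j in range(data.shape[1])]

rows = []
for dropped in range(n_subjects):                      # jackknife: drop one subject
    keep = np.ones(n_subjects * n_segments, dtype=bool)
    keep[dropped * n_segments:(dropped + 1) * n_segments] = False
    for scale, loading in zip(scales, pc1_loadings(ratings[keep])):
        rows.append({"dropped": dropped, "scale": scale, "loading": loading})

jack = pd.DataFrame(rows)   # the 4 (loadings) x 6 (subject dropped) matrix, long form
print(AnovaRM(jack, depvar="loading", subject="dropped", within=["scale"]).fit())
```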
Operator Acceptance

Another source of comparative information on the four rating scales was operators' reactions to the scales. This dimension is of interest because increased operator acceptance of a subjective measurement tool may result in increased willingness to express a valid opinion that can be taken seriously and used.

Resource Requirements

For practical purposes, it is important to know the relative resource requirements of the scales (i.e., how much each costs to use). The differences among the scales are reflected in the time required for training, preparation, completion, data reduction, and analysis.

Special Procedures

Requirements of the multidimensional scales (TLX and SWAT) include special procedures, which consume additional time. Both TLX and SWAT require sorts, or judgments, that are designed to uncover the relative salience of the scales' component dimensions to each individual (independent of the workload ratings themselves).

RESULTS

Sensitivity

TLX has the highest factor validity (i.e., the greatest correlation with the operator workload factor) for each of the five studies, as seen in Table 2. Subsequent analysis revealed the orderings of the mean factor loadings for the other scales, also shown in this table. The simplest scale, OW, had the second highest average factor validity after TLX.
TABLE 2

Workload Scale Factor Validities

Study                 % Total Variance    Factor Validities

LOS-F-H NDICE                80           TLX (0.935) > OW (0.927) > MCH (0.862) > SWAT (0.860)
LOS-F-H generic              80           TLX (0.924) > OW (0.905) > MCH (0.904) > SWAT (0.778)
LOS-F-H FDTE basic           79           TLX (0.942) > SWAT (0.900) > OW (0.898) > MCH (0.818)
Aquila FDTE                  75           TLX (0.910) > SWAT (0.893) > OW (0.869) > MCH (0.833)
UH-60A                       71           TLX (0.899) > OW (0.872) > SWAT (0.805) > MCH (0.799)

Note: Breaks in the underline indicate statistically significant differences between factor validities (α = 0.05).
FDTE = Force Development Test and Evaluation; LOS-F-H = Line-of-Sight, Forward, Heavy; MCH = Modified Cooper-Harper scale; NDICE = Non-Developmental Item Candidate Evaluation; OW = Overall Workload scale; SWAT = Subjective Workload Assessment Technique; TLX = NASA Task Load Index; UH-60A = Blackhawk helicopter.

Operator Acceptance

The data show that TLX was liked best, OW was the easiest to complete, and TLX was rated best in its ability to represent workload (Table 3).

TABLE 3

Operator Acceptance Ratings

Rating Scale Attribute           TLX    OW    MCH    SWAT

Liked best                       1.5*   2     3.5    3
Easiest to complete              2      1     4      3
Best description of workload     1      2     3.5    3.5

* Average of mean rankings (1 = best, ..., 4 = worst) of 37 operators.
TLX = NASA Task Load Index; OW = Overall Workload scale; MCH = Modified Cooper-Harper scale; SWAT = Subjective Workload Assessment Technique.

Resource Requirements

OW took the least time to complete, TLX took the most, and MCH and SWAT were intermediate (Table 4). Table 4 can be readily understood if one looks at the number of judgments made for each scale: six for TLX, three for SWAT, and one each for MCH and OW. Although other times were not systematically measured, our experience suggests that OW requires substantially less time for training, preparation, and data reduction than do the other scales. Both TLX and SWAT require more time for data reduction and analysis because of the multidimensional nature of the scales.

TABLE 4

Time to Complete Each Rated Workload Situation (in Seconds)

                          TLX     OW     MCH     SWAT

Number of cases           38      33     27      27
Mean time to complete     51.3    9.8    29.1    33.6
Standard deviation        29.5    8.4    26.3    24.6

TLX = NASA Task Load Index; OW = Overall Workload scale; MCH = Modified Cooper-Harper scale; SWAT = Subjective Workload Assessment Technique.

Special Procedures

The additional information gained from the multidimensional representation of workload may justify the cost of the additional sorting procedures. However, in the case of TLX, our research suggests that its paired-comparison sort procedure may be skipped without compromising the measure (Byers, Bittner, and Hill, 1989). SWAT sorts presented problems to some of the subjects: 23 of 54 subjects did not initially produce adequate SWAT card sorts, a 43% failure rate on the first attempt to perform the SWAT sort as judged by the criteria presented in the SWAT user's guide (Armstrong Aerospace Medical Research Laboratory, 1987). Our observations suggest that the problems may be more pronounced for less verbal and less sophisticated operators.

DISCUSSION

Each of the four subjective workload techniques has unique characteristics. All measure workload, as shown by the moderate to high positive correlations with the operator workload factor (i.e., the factor validities). Two are unidimensional workload scales, which take less time to complete and require less data reduction time (OW and MCH). The other two are multidimensional scales, which traditionally entail special sorting procedures that take more time but which are justified by the additional information they provide (TLX and SWAT). Certainly the multidimensional scales' information may also be used diagnostically to address the question of what is causing the workload reported, and it may point toward ways to alleviate excessive workload.

All four scales are moderately to highly acceptable tools with respect to sensitivity to different levels of workload. However, TLX and OW are consistently superior in terms of sensitivity and have the strongest operator acceptance. With regard to resources and special procedures, the needs of the study must be examined to determine which tool may be most appropriate for the particular circumstance. The unidimensional scales, particularly OW, may be useful as screening tools to identify potential chokepoints of workload; the multidimensional tools, particularly TLX, may be used to obtain more detailed and diagnostic data.

This research is an example of a systematic approach to examining four subjective workload measurement tools. The comparative approach, attempting to draw inferences concerning workload techniques from several studies, has proved to be useful. This approach has provided information with which to make informed decisions about the selection of workload assessment techniques.
Such a systematic approach should be used to assess many human factors measurement tools.

ACKNOWLEDGMENTS

This work was supported by the U.S. Army Research Institute under Contract No. MDA903-86-C-0384 to Analytics, Inc. The views, opinions, and findings contained in this paper are those of the authors and should not be construed as an official Department of the Army position, policy, or decision. During the majority of the research program, the first five authors were affiliated with Analytics, Inc., Willow Grove, Pennsylvania. The authors acknowledge the contribution of other members of the team to the success of the Operator Workload Program: John P. Bulger, Regina M. Harris, Paul M. Linton, Robert J. Lysaght, and Michelle Sams.

REFERENCES

Armstrong Aerospace Medical Research Laboratory. (1987). Subjective workload assessment technique (SWAT): A user's guide. Wright-Patterson Air Force Base, OH: Author.

Bierbaum, C., Szabo, S., and Aldrich, T. (1987). A comprehensive task analysis of the UH-60 mission with crew workload estimates and preliminary decision rules for developing a UH-60 workload prediction model (Tech. Report MDA903-87-C-0523). Ft. Rucker, AL: Anacapa Sciences.

Bittner, A. C., Jr., Byers, J. C., Hill, S. G., Zaklad, A. L., and Christ, R. E. (1989). Generic workload ratings of a mobile air defense system (LOS-F-H). In Proceedings of the Human Factors Society 33rd Annual Meeting (pp. 1476-1480). Santa Monica, CA: Human Factors Society.

Bittner, A. C., Jr., Zaklad, A. L., Dick, A. O., Wherry, R. J., Jr., Herman, E. D., Bulger, J. P., Linton, P. M., Lysaght, R. J., and Dennison, T. W. (1988). Operator workload (OWL) assessment program for the army: Validation and analysis plans for three systems (ATHS, Aquila, LOS-F-H) (Tech. Report 2075-3b). Willow Grove, PA: Analytics, Inc.

Byers, J. C., Bittner, A. C., Jr., and Hill, S. G. (1989). Traditional and raw Task Load Index (TLX) correlations: Are paired comparisons necessary? In A. Mital (Ed.), Advances in industrial ergonomics and safety I (pp. 481-485). London: Taylor & Francis.

Byers, J. C., Bittner, A. C., Jr., Hill, S. G., Zaklad, A. L., and Christ, R. E. (1988). Workload assessment of a remotely piloted vehicle (RPV) system. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 1145-1149). Santa Monica, CA: Human Factors Society.

Byers, J. C., Hill, S. G., Zaklad, A. L., and Christ, R. E. (1989). Aquila system report (Tech. Report 2075-4A). Willow Grove, PA: Analytics, Inc.

Cooper, G. E., and Harper, R. P. (1969). The use of pilot ratings in the evaluation of aircraft handling qualities (NASA TN-D-5153). Moffett Field, CA: NASA Ames Research Center.

Dixon, W. J. (Ed.). (1983). BMDP statistical software. Los Angeles: University of California Press.

Hancock, P. A., and Meshkati, N. (Eds.). (1988). Human mental workload. Amsterdam: North-Holland.

Hart, S. G., and Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock and N. Meshkati (Eds.), Human mental workload (pp. 139-183). Amsterdam: North-Holland.

Hill, S. G., Byers, J. C., Zaklad, A. L., and Christ, R. E. (1989). Subjective workload ratings of the LOS-F-H mobile air defense missile system in a field test environment (Tech. Memo 5). Willow Grove, PA: Analytics, Inc.

Hill, S. G., Zaklad, A. L., Bittner, A. C., Jr., Byers, J. C., and Christ, R. E. (1988). Workload assessment of a mobile air defense missile system. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 1068-1072). Santa Monica, CA: Human Factors Society.

Hinkley, D. V. (1983). Jackknife methods. In S. Kotz, N. L. Johnson, and C. B. Read (Eds.), Encyclopedia of statistical sciences (Vol. 4, pp. 280-287). New York: Wiley.

Iavecchia, H. P., Linton, P. M., Bittner, A. C., Jr., and Byers, J. C. (1989). Operator workload in the UH-60 Blackhawk: Crew results vs. TAWL model predictions. In Proceedings of the Human Factors Society 33rd Annual Meeting (pp. 1481-1485). Santa Monica, CA: Human Factors Society.

Iavecchia, H. P., Linton, P. M., Harris, R. M., Zaklad, A. L., and Byers, J. C. (1989). UH-60 system report (Tech. Report 2075-4C). Willow Grove, PA: Analytics, Inc.

Lysaght, R. J., Hill, S. G., Dick, A. O., Plamondon, B. D., Wherry, R. J., Jr., Zaklad, A. L., and Bittner, A. C., Jr. (1989). Operator workload: Comprehensive review and evaluation of operator workload methodologies (ARI Tech. Report 851). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Moray, N. (Ed.). (1979). Mental workload: Its theory and measurement. New York: Plenum.

O'Donnell, R. D., and Eggemeier, F. T. (1986). Workload assessment methodology. In K. R. Boff, L. Kaufman, and J. P. Thomas (Eds.), Handbook of perception and human performance (pp. 42-1 to 42-49). New York: Wiley.

Reid, G. B., Shingledecker, C. A., and Eggemeier, F. T. (1981). Application of conjoint measurement to workload scale development. In Proceedings of the Human Factors Society 25th Annual Meeting (pp. 522-525). Santa Monica, CA: Human Factors Society.

Sheridan, T. (1980). Mental workload: What is it? Why bother with it? Human Factors Society Bulletin, 23(2), 1-2.

Vidulich, M. A., and Tsang, P. S. (1985). Assessing subjective workload assessment: A comparison of SWAT and the NASA-Bipolar methods. In Proceedings of the Human Factors Society 29th Annual Meeting (pp. 71-75). Santa Monica, CA: Human Factors Society.

Vidulich, M. A., and Tsang, P. S. (1987). Absolute magnitude estimation and relative judgment approaches to subjective workload assessment. In Proceedings of the Human Factors Society 31st Annual Meeting (pp. 1057-1061). Santa Monica, CA: Human Factors Society.

Warr, D., Colle, H., and Reid, G. (1986). A comparative evaluation of two subjective workload measures: The Subjective Workload Assessment Technique and the Modified Cooper-Harper Scale. Paper presented at the Symposium on Psychology in the Department of Defense, U.S. Air Force Academy, Colorado Springs, CO.

Wierwille, W. W., and Casali, J. G. (1983). A validated rating scale for global mental workload measurement application. In Proceedings of the Human Factors Society 27th Annual Meeting (pp. 129-133). Santa Monica, CA: Human Factors Society.