
PERSONNEL PSYCHOLOGY

1991, 44

EVALUATING CLASSIFICATIONS OF JOB BEHAVIOR:


A CONSTRUCT VALIDATION OF THE ABILITY
REQUIREMENT SCALES

EDWIN A. FLEISHMAN, MICHAEL D. MUMFORD


George Mason University

This article discusses the major inferential issues arising in the devel-
opment of behavioral classification systems. Subsequently, we discuss
the implications of these inferential issues for evaluating the construct
validity of systems designed to assess the requirements of human task
performance. Fleishman's (1975b, 1982) ability requirement taxonomy and its associated job analysis system, the Manual for the Ability Requirement Scales (MARS), were then evaluated with respect to these criteria. In particular, a variety of criteria relevant to internal and external validity was reviewed. It was found that the ability requirement taxonomy and the associated measurement system provide a meaningful
description of job activities with respect to these criteria. It was argued
that application of these construct validity principles might contribute
much to our understanding of human performance.

The varied research efforts of industrial and organizational psychologists are bound together by our concern with understanding people's
behavior in the workplace. Attempts to understand work behavior are,
however, contingent on prior description of this behavior (Ash, 1988).
Job analysis procedures attempt to generate a coherent description of
the job behaviors that are the focus of explanation. As a result, job anal-
ysis provides a foundation for work in personnel selection (Dunnette,
1966), performance appraisal (Latham & Wexley, 1980), training (Gold-
stein, 1986), job design (Davis & Wacker, 1988), and wage and salary
administration (Henderson, 1988).
If it is granted that job analysis provides the descriptive information
guiding personnel interventions, then we must apply descriptive systems
that yield a valid, substantive, meaningful description of people’s job ac-
tivities. We are not the first investigators to call for the validation of
job analysis systems (Ash & Levine, 1980; McCormick, 1979). In fact,
Chesler (1948), Levine, Ash, and Bennett (1980), Mumford, Weeks,
Harding, and Fleishman (1987), Myers, Gebhardt, Price, and Fleishman

We would like to thank Lee Friedman, Sigrid Gustafson, Al Hartman, and Bernard
Nickels for their comments concerning earlier drafts of this manuscript. Correspondence
and requests for reprints should be addressed to Dr. Edwin A. Fleishman, Department of
Psychology, George Mason University, Fairfax, Virginia 22030-4444.

COPYRIGHT © 1991 PERSONNEL PSYCHOLOGY, INC.


(1981), and Rupe (1956) have conducted studies intended to appraise the meaningfulness of certain job analysis systems. Wagner (1985), for
instance, attempted to appraise the meaningfulness of task data by as-
sessing convergence with aircraft maintenance reports. Similarly, Mum-
ford et al. (1987) considered how inferences derived from a job analysis
method yielding measures of occupational difficulty were borne out in
the actual training course content of Air Force technical schools. Levine
et al. (1980) used professional judgment to establish the meaningful-
ness of the descriptive information provided by alternative job analysis
systems for different kinds of personnel interventions. Fleishman and
Quaintance (1984), in particular, have reviewed alternative task classifi-
cation systems and discussed the scientific and conceptual issues involved
in their evaluation.
When one reviews available evidence bearing on the validity of job
analysis systems, it becomes clear that validation studies have been few
and far between. This state of affairs might be attributed to the belief
that a direct and comprehensive assessment of observable behaviors will
invariably yield an adequate description of job activities. This assump-
tion can be questioned, however, based on the complex nature of job
activities (Wagner, 1985), the subtle influences of organizational climate
(Madden, 1962), and the impact of cognitive biases on behavioral recon-
structions (Harvey & Lozada-Larson, 1988).
If we cannot use this a priori assumption to establish the construct
validity and meaningfulness of our job analysis systems, then we must
ask why few empirical studies have been conducted. One explanation
for this situation is that we lack general principles for establishing the
validity of the descriptive information provided by our job analysis sys-
tems. The intent of the present paper is to elucidate the nature of these
principles and to illustrate their application in validating descriptive in-
ferences derived from one job analysis system. In particular, we want to
emphasize the applications of general, scientific principles of construct
validity to issues involved in job analysis.
To delineate these principles, we will first show how all job analysis ef-
forts require an implicit or explicit classification of job activities. We will
then consider the nature of the summarization categories derived from
such classification efforts. This information will then be used to establish
a set of general strategies for evaluating the substantive meaningfulness
or construct validity of any system for classifying job tasks.
Having used the principles of classification to specify a general set
of validation issues, we will then attempt to illustrate the application of
these strategies to a particular job analysis system. Specifically, we will
examine the validation evidence compiled for the ability requirements
approach (Fleishman, 1975b, 1982). This approach resulted in a set of
behaviorally anchored scales covering a range of cognitive, psychomotor, physical, and sensory-perceptual abilities combined into a Manual for the Ability Requirement Scales (MARS) (Fleishman, 1975a, 1991). We
selected this particular job analysis system for assessment because (a) it
was explicitly developed and implemented in the context of a broader
study of classification principles (Fleishman & Quaintance, 1984), and
(b) it is commonly used to obtain summary descriptions of job require-
ments for use in selection, training, and career development efforts (see
Fleishman, 1988; Fleishman & Mumford, 1989a, 1989b).

Classification Issues

Principles and Procedures

To understand the role of classification in job analysis, it is necessary to consider the nature of jobs and the descriptive problems they
pose. McCormick (1979) notes that a job is reflected in the activities
performed by people for remuneration that in some way contribute to
the attainment of organizational goals. Organizations, however, create
a large, if not infinite, number of jobs that require a number of different
activities. Any attempt to observe every activity performed by each and
every worker would yield a confusing mass of information that has little
value for attaining a general understanding of work behavior. Thus, the
fundamental problem facing us in job analysis is to find a viable frame-
work for summarizing this massive body of descriptive information to
provide a more parsimonious and meaningful description of people's job
activities.
In the natural sciences, this simplification problem is solved through
the development and application of classification systems. A classifica-
tion system may be defined as a set of specified rules for describing the
structure and relationships among a set of objects drawn from some do-
main that permits similar units to be assigned to a smaller number of cat-
egories or classes (Simpson, 1961; Sokal, 1974; Sokal & Sneath, 1963). By assigning similar objects or units to a common category, it becomes possible, within the boundaries of the classification system, to treat these objects as functionally equivalent entities (Fleishman & Quaintance, 1984).
As a result, the search for understanding need not consider each object
or observation to be a lawful entity unto itself. Instead, laws may be con-
structed with reference to all members of a category, seen or unseen,
based on a limited, albeit adequate, number of observations within a
category.
If it was, indeed, possible to identify a universal order of the sort
sought by Aristotle (Crowson, 1970; Mayr, 1969), then application of
classifications might have provided an unambiguous solution to the simplification problem confronting us in job analysis. It has become apparent over the intervening millennia, however, that we cannot formulate a single, absolute, all-encompassing classification scheme. In fact, even
in the relatively circumscribed domain of job behavior, a variety of clas-
sification schemes have been applied (Fleishman, 1975b, 1982; Fleish-
man & Quaintance, 1984). Use of these varied approaches arises from
the fact that in complex observational domains a number of different
classification systems can be constructed that yield meaningful summary
descriptions.
According to Fleishman and Quaintance (1984), the most powerful
determinant of the nature of a classification system is its intended pur-
pose. The development of a classification system is rarely an end unto it-
self. Rather, classifications are formulated to facilitate description, pre-
diction, and understanding with respect to certain goals. These differ-
ences in intent lead to the development and application of very different
classification strategies. Fleishman and Quaintance organized the many
different task descriptive systems available into four general approaches:
(a) behavior description, (b) behavior requirements, (c) ability require-
ments, and (d) task characteristics. Investigators constructing a classi-
fication of job behavior for use in personnel selection might employ an
ability requirements approach (Fleishman, 1975b), while investigators
interested in man-machine design or training design might employ this
approach or a behavior requirements, behavior description, or task char-
acteristics strategy (Farina & Wheaton, 1973; Fleishman & Quaintance,
1984; Gagné, 1962; Miller, 1973).
Even given common goals, the specific procedures used in construct-
ing classifications can exert a powerful influence on the nature of the re-
sulting summary descriptions. Four basic operations are required to gen-
erate a classification (Fleishman & Quaintance, 1984; Mumford, Stokes,
& Owens, 1990): (a) specification of the domain of behaviors to be clas-
sified; (b) definition and measurement of the essential properties of be-
haviors lying in this domain; (c) appraisal of the relative similarity of
these behaviors to each other; and (d) specification of decision rules for
determining when behaviors display sufficient similarity to permit assign-
ment to a common category. However, a variety of procedures might be
used to meet each of these operational requirements. Thus, differences
in the nature of the procedures used to meet these operational require-
ments can lead to complex and pervasive differences in the nature of the
resulting classification schemes.
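The four operations just listed can be sketched in code. The following is a minimal illustration only, not the authors' procedure: the task names, the two ability properties, the ratings, and the distance threshold are all invented for the example.

```python
import numpy as np

# (a) Specify the domain: a small set of hypothetical job tasks.
tasks = ["read blueprint", "plan flight route", "lift crate", "carry load"]

# (b) Measure essential properties: invented ratings of each task's
#     demand for two abilities (spatial, physical) on a 1-7 scale.
ratings = np.array([
    [6.0, 1.0],
    [5.5, 1.5],
    [1.0, 6.5],
    [1.5, 6.0],
])

# (c) Appraise relative similarity: pairwise Euclidean distances.
diff = ratings[:, None, :] - ratings[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# (d) Apply a decision rule: tasks closer than a chosen threshold are
#     merged into a common category (simple single-link grouping).
threshold = 2.0
category = list(range(len(tasks)))
for i in range(len(tasks)):
    for j in range(i + 1, len(tasks)):
        if dist[i, j] < threshold:
            old, new = category[j], category[i]
            category = [new if c == old else c for c in category]

print(dict(zip(tasks, category)))
```

Here the two spatially demanding tasks end up sharing one category label and the two physically demanding tasks another; changing the properties measured in step (b) or the rule in step (d) would, as the text argues, yield a different classification of the same behaviors.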

Evaluation Principles

Construct systems. Variation in the nature and content of classification brought about by differences in purpose and method indicates that
taxonomic efforts cannot provide a single, unitary structure for the de-
scription, prediction, and understanding of job behavior. Given this sit-
uation, how is one to go about selecting the classification systems that
should be employed in describing job behavior? One answer to this
question would be to use the most effective classification scheme avail-
able with respect to one’s intended purpose (Fleishman & Quaintance,
1984). The term effective here, however, is not itself lacking in ambiguity, because at first glance it is not exactly clear how one should go about establishing the effectiveness of a classification. As Fleishman and Quaintance point out, taxonomists have not traditionally given much attention
to the quantitative evaluation of alternative classification schemes, often
stressing the utilitarian criteria (Miller, 1969; Miller, 1962, 1967).
The nature of classification efforts does, however, suggest a solution
to this problem. Any classification results in the assignment of units (in our case, behaviors) to a smaller number of categories. Each category,
as a result, contains a set of functionally equivalent units that have been
organized on the basis of certain observed similarities. These categories
must, therefore, reflect a set of constructs, defined by Cronbach (1971)
as entities for organizing experienced objects into categories permitting
general law-like statements.
As Cronbach (1971) and Messick (1989) point out, these categorical
constructs reflect a substantive organization of a set of objects and object
relationships. More specifically, the objects or behaviors assigned to a
common category are held to have similar meaning, as manifest in the
relationships they produce. Furthermore, these relationships differ from
those expected for objects or behaviors assigned to other categories. As
a result, the crucial evaluative issue to be addressed in any taxonomic
effort is whether category assignments do, indeed, provide a meaning-
ful basis for organizing behavior and defining behavioral relationships.
Construct validity is viewed as an attempt to establish the substantive
meaningfulness of a particular interpretive organization imposed on a
set of behavioral observations, such as people’s behavior in the work-
place (Cronbach & Meehl, 1955; Guion, 1978, 1980; Landy, 1986; Messick, 1975, 1980, 1989). Thus, the evaluation of classifications requires
a multifaceted construct validation effort intended to establish the sub-
stantive meaningfulness of the proposed categories.
Points of view differ concerning how one should establish the mean-
ingfulness of an interpretive structure (Cook & Campbell, 1979; Mulaik, 1986). However, James, Mulaik, and Brett (1984) and Simon (1953, 1957), among others, have argued that meaning arises from specification
of functional relationships permitting inferences concerning the likely
status of an object given certain known conditions. In essence, then,
meaning arises from a set of propositional relational statements that can
be shown to hold in reality. Classification systems, of course, involve a
set of propositional relationships concerned with the relative similarity
of category members and their characteristic properties. The proposi-
tional relationships implied by category status can, of course, be articu-
lated, elaborated, and formally specified. If these relationships are sub-
sequently substantiated in empirical tests, evidence for the meaningful-
ness of a classification will have been obtained.
Evaluation issues. This hypothesis testing framework indicates that
a well-articulated nomothetic net contributes to the evaluation of clas-
sifications. This nomothetic net, reflecting relationships implied by our
understanding of the phenomenon, provides a necessary basis for hy-
pothesis generation and the subsequent accumulation of confirmatory
evidence. Viable hypotheses, however, do not arise in a vacuum. Rather,
they require some understanding of the phenomenon at hand. Psycho-
logical theory, therefore, provides a basis for establishing the construct
validity of classification systems. As a result, the acquisition of validation
evidence is likely to progress more rapidly in areas where viable theories
providing well-specified relational networks are available.
Hypothesis generation, however, represents a minimum condition
for establishing the meaningfulness of a classification. Multiple hypotheses must be generated, and these hypotheses must be tested to provide
the requisite confirmatory evidence. Under these conditions, greater
confidence can be placed in the meaningfulness of a classification as the
number and diversity of the confirmed hypotheses increases (James, Mulaik, & Brett, 1984). In searching for evidence of a system's meaningfulness, the sheer number of confirmed relationships is not the only issue
to consider. Messick (1989) points out that stronger validation evidence
is provided by studies that provide confirmatory evidence for central,
theoretically important relationships. Greater confidence can also be
placed in the meaningfulness of a classification when this confirmatory
evidence has been accrued using a number of alternative methods (Cron-
bach, 1971). Similarly, Cook and Campbell’s (1979) notions of conver-
gent and divergent validity suggest that relational tests which serve to
rule out competing theoretical explanations will tend to permit stronger
inferences to be drawn concerning the classification's construct validity.
In evaluating evidence provided by these hypothesis tests, two other
considerations should be attended to. First, only rarely will it prove pos-
sible to test all relationships implied by the nature of the classification. Because future relational tests may yield disconfirmatory evidence, statements concerning a classification's construct validity must be
made in a tentative fashion, based on the evidence accrued to date. Be-
cause a variety of tests might be conducted, and spurious findings might
emerge, statements about the apparent meaningfulness or construct va-
lidity of a classification must consider the overall pattern of available ev-
idence.
One problem with our foregoing observations is that they do not pro-
vide specific standards and procedures for determining when a classi-
fication evidences adequate construct validity. In part, this deficiency
derives from the notion of theory-based hypothesis testing. Obviously,
the relational tests used to accrue evidence for the meaningfulness of a
classification are necessarily contingent on the substantive implications
of the categories at hand. Nonetheless, the goals and structure of all
classification efforts do lead to some general conclusions about the kind
of inferential tests that should be conducted and the sequence in which
they should be carried out.
To construct a classification, investigators must make a series of sub-
stantive and methodological decisions that permit initial category defini-
tion (Messick, 1989). These substantive and methodological decisions,
however, often imply certain expected outcomes to result from applica-
tion of certain procedures. These expected outcomes, associated with
the operations used in constructing a classification, provide a set of hy-
potheses that may be used to accrue some initial evidence indicative of
the system’s meaningfulness.
These observations suggest that attempts to obtain evidence indicative of a classification's construct validity should begin at the time the system is being constructed. Hypotheses should be formulated concerning
the outcomes expected after each of the four major operational steps: (a)
domain definition, (b) measurement of behavioral properties, (c) simi-
larity assessment, and (d) application of the decision rules for assigning
behaviors to categories. For instance, it might be hypothesized that ap-
plication of effective decision rules will result in 80% of the behaviors
being assigned to a single category. If this hypothesis is confirmed after
applying these decision rules, then evidence indicative of the system’s
meaningfulness will have been obtained.
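A hypothesis of this operational kind is simple to check mechanically. The sketch below assumes hypothetical decision-rule output mapping each behavior to one or more category labels; the behaviors, the labels, and the 80% criterion value are all illustrative.

```python
# Hypothetical output of a set of decision rules: each behavior mapped
# to the category label(s) it was assigned.
assignments = {
    "behavior_01": ["verbal"],
    "behavior_02": ["verbal"],
    "behavior_03": ["spatial"],
    "behavior_04": ["spatial", "psychomotor"],  # ambiguous assignment
    "behavior_05": ["psychomotor"],
}

# Proportion of behaviors assigned to exactly one category.
single = sum(1 for labels in assignments.values() if len(labels) == 1)
proportion = single / len(assignments)

# The operational hypothesis: at least 80% uniquely assigned.
print(proportion, proportion >= 0.80)  # 0.8 True
```

Confirming the criterion on real decision-rule output would, in the framework above, supply one piece of operational evidence for the system's meaningfulness; failure would point back to the decision rules themselves.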
Beyond operational tests of the sort described above, the nature of
classification systems permits two other kinds of inferences to be drawn
that might be used to provide construct validity evidence. Any classifi-
cation implies a set of expected relationships (a) among the behaviors
to be assigned to the categories, and (b) between the categories them-
selves. If hypotheses can be formulated concerning the kind of behaviors
that should be assigned to each category, or concerning expected relationships among the categories, then confirmation of these hypothesized
relationships can be used to provide construct validity evidence derived
from the structure of the classification. This internal validity evidence
not only permits initial inferences pertaining to the system’s meaningful-
ness, it provides a necessary prerequisite for construct validity evidence
focusing on external relationships.
Unlike internal validity tests that focus on relationships embedded
in the classification structure, tests of external validity focus on how the
categories are related to theoretically significant forms of behavior that
were not expressly considered in the initial classification effort. Thus,
one might ask whether intelligence is related to school grades. Confir-
mation of hypothesized relationships with forms of behavior that were
not considered in specifying the classification categories often provides
compelling evidence for the system’s meaningfulness or construct va-
lidity. This is especially likely to prove true when these hypotheses are
specified in such a way as to reflect the classification's intended purpose.
Additionally, hypotheses derived from the theory underlying the clas-
sifications that specify how categories condition or contribute to other
substantively relevant forms of behavior, while discounting alternative
causal explanations for the observed relationships, often yield particu-
larly compelling construct validity evidence.
Tests of a classification’s external validity often require a well-
specified, rather elaborate substantive understanding of the categories
and their implications for other forms of behavior. This theoretical un-
derstanding, however, will be subject to progressive refinement as a re-
sult of the findings obtained in these tests. These theoretical refine-
ments, along with the results obtained in operational tests and assess-
ments of internal validity, will, of course, lead to progressive refinements
in the nature of the classification (Snow & Lohman, 1989). Thus, construct validity and classification cannot be viewed as fixed, static entities.
Rather, they should be viewed as part of an integrated, ongoing effort
intended to reveal and refine the meaning imputed to categories held
to summarize certain forms of behavior. Over time, this progressive re-
finement in categories, theories, and hypotheses should lead to the emer-
gence of more sophisticated descriptive, predictive, and explanatory sys-
tems. To accumulate evidence of this sort, a diverse set of studies needs
to be carried out in order to address the many inferential issues involved
in a comprehensive classification effort.

Description of the Ability Requirements Approach

Overview

In the foregoing section, we specified the general principles underlying the validation of classifications and delineated the kinds of relational
hypotheses that might be used to obtain evidence of the classification’s
meaningfulness or construct validity. In the ensuing discussion, we will
attempt to elaborate these principles by illustrating their application in
establishing the construct validity of Fleishman’s (1972a, 1975a, 1975b,
1982) ability requirement taxonomy and the job analysis system asso-
ciated with this taxonomy. In reviewing evidence bearing on the rela-
tional hypotheses implied by the ability requirements classification, we
hope to demonstrate how an ongoing construct validation program, con-
sidering the varied inferential issues arising in classification efforts, can
contribute to the description, prediction, and understanding of job be-
havior.
In keeping with the principles sketched out earlier, much of the evi-
dence bearing on the construct validity of this classification is intimately
tied to the substantive nature and intent of the taxonomy. Hence, we will
first provide a brief overview of the nature and goals of this taxonomic
effort. Next, we will provide a brief description of the procedures used
to classify tasks in terms of their associated ability requirements.

Definitional Issues: Abilities, Tasks, and Units

Within the ability requirements approach, an ability is held to reflect a relatively enduring attribute of the individual that accounts for
differences in task performance across a number of situations (Fleish-
man, 1972a, 1982). Hence, an ability may reflect cognitive competen-
cies, such as deductive reasoning and spatial orientation, as well as other
performance-relevant attributes of the individual, such as sensory capac-
ities, physical abilities, psychomotor skills, interpersonal resources, and
social skills (Fleishman, 1964, 1972b; Fleishman & Quaintance, 1984;
Mumford & Nickels, 1990). This definition of an ability, however, does
not extend to skills developed in a particular situation, although it is rec-
ognized that abilities may develop over time and with exposure to mul-
tiple situations (Fleishman, 1972a; Snow & Lohman, 1984).
Given these assumptions, a straightforward strategy underlies appli-
cation of the ability requirements approach. More specifically, it is held
that the tasks people perform on their jobs differ in the extent to which
certain abilities are required for timely, efficient, or appropriate action.
Thus, tasks requiring similar abilities can be grouped together based on
similarity in the abilities relevant to performance. For instance, tasks involved in aerial navigation, blueprint reading, and dentistry might be
grouped together based on similarity in the demands made for spatial
visualization in task performance. Subsequently, this diverse set of tasks
can be summarized, described, and understood with respect to the un-
derlying ability category serving to specify similarity in terms of individ-
ual performance requirements.
By summarizing tasks in terms of more basic abilities, application of
the ability requirements approach results in four significant outcomes.
First, a large number of tasks performed in diverse job settings might
be summarized with respect to a limited number of ability constructs.
Second, by virtue of the abilities in use, these categories should general-
ize across a number of discrete job settings. Third, observed similarities
in differential task performance provide an objective basis for category
definition and task assignment. Fourth, in areas where inferences are to
be drawn concerning how human capacities influence performance, as is
the case in personnel selection or training, the resulting summarization
of job behavior should have substantial practical value.
In considering the assumptions underlying application of this ap-
proach, some attention should be given to the tasks that represent the
units to be classified. Tasks have been defined in many ways at different
levels of generality (Fleishman, 1982; Fleishman & Quaintance, 1984).
Based on Miller’s (1967) earlier work, Wheaton (1973) proposed the
operational task definition used in constructing the ability requirements
classification. This definition holds that a task reflects an organized set
of responses to a specified stimulus situation intended to bring about the
attainment of a certain goal state. This definition of a task is similar to
one proposed by Hackman (1968) and to McCormick’s (1979) stimulus,
operations, and response model (Fleishman & Mumford, 1989a, 1989b).
Thus, definition of the units to be summarized is of interest because
it specifies the boundary conditions for application of the ability require-
ment taxonomy. For instance, this definition of a task suggests that the
taxonomy is not intended to capture affective reactions to performance
(Kulik & Oldham, 1988). Similarly, it is not expressly concerned with
the environment in which the work is performed (Konz, 1988). Within
this model, however, if environmental conditions lead to changes in the
stimulus, operations, or goals involved in performance, multiple distinct
task statements would be called for. Similarly, if task interdependencies
change pertinent stimuli, operations, and goals, multiple task statements
reflecting the nature of these interdependencies should be specified.
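One way to make this task definition concrete is as a simple record type. This rendering is ours, not Wheaton's; the field names and example tasks are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """A task in the sense of Wheaton's (1973) operational definition:
    an organized set of responses to a specified stimulus situation,
    intended to bring about a certain goal state."""
    stimulus: str
    operations: tuple[str, ...]  # the organized set of responses
    goal: str

# A change in stimulus, operations, or goal calls for a distinct task
# statement, as when environmental conditions alter the work performed:
t1 = Task("pressure gauge reads low", ("check valve", "adjust flow"),
          "restore pressure")
t2 = Task("pressure gauge reads low", ("log fault", "notify supervisor"),
          "report malfunction")
print(t1 == t2)  # False: same stimulus, different operations and goal
```

Treating the three components as the identity of a task makes the boundary conditions in the text explicit: two records differing in any field are separate units for classification.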

Identifying Ability Categories

Clearly, the ability categories used to summarize job tasks play a cen-
tral role in applying this classification. To define these categories, Fleish-
man (1972a, 1975a, 1982) attempted to identify the fewest, most useful
independent ability categories for describing performance. The method-
ology used to identify these ability categories is illustrated in Fleishman’s
(1964, 1972b) research on psychomotor and physical abilities. This re-
search program involved a series of interlocking experimental and fac-
tor analytical studies. Here, tasks were explicitly designed or selected
for inclusion in task batteries to test certain hypotheses about the or-
ganization of abilities over a wide range of tasks. These task batteries
were administered to several hundred subjects. The resulting correla-
tions among the performances of these subjects on these tasks were then
factor-analyzed to define ability categories. Experimental studies were
subsequently designed to introduce variations in the conditions of task
performance aimed at sharpening, limiting, or broadening initial factor
definitions.
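The logic of these interlocking correlational studies can be illustrated with a small fabricated example: when two abilities underlie six tasks, decomposing the task intercorrelation matrix shows two factors absorbing most of the variance. The correlation values below are invented, and a plain eigendecomposition stands in here for the full factor-analytic procedure.

```python
import numpy as np

# Fabricated correlations among six task performances: tasks 1-3 are
# meant to draw on one ability, tasks 4-6 on another.
R = np.array([
    [1.00, 0.70, 0.60, 0.10, 0.10, 0.00],
    [0.70, 1.00, 0.65, 0.10, 0.00, 0.10],
    [0.60, 0.65, 1.00, 0.00, 0.10, 0.10],
    [0.10, 0.10, 0.00, 1.00, 0.70, 0.60],
    [0.10, 0.00, 0.10, 0.70, 1.00, 0.65],
    [0.00, 0.10, 0.10, 0.60, 0.65, 1.00],
])

# Eigendecomposition of the correlation matrix (the core computation
# behind principal-components and related factoring methods).
eigvals = np.linalg.eigvalsh(R)[::-1]   # largest first
explained = eigvals / eigvals.sum()

# Two dominant factors account for most of the variance in six tasks.
print(np.round(explained[:2], 2), explained[:2].sum() > 0.70)
```

The experimental side of the program then corresponds to systematically altering task conditions and checking whether the dominant-factor structure sharpens, narrows, or broadens accordingly.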
Without reviewing the list of resulting abilities, we can say that a lim-
ited number of categories (9 or 10 in the psychomotor area and 9 in the
physical performance area) seemed to account for most of the variance
in several hundred tasks investigated over many years. It became appar-
ent in these studies, furthermore, that these abilities convey a great deal
of information about task performance. For example, multilimb coordination summarized tasks involving the use of two hands, or of hands and feet, as seen in operating equipment. This summarization, however, did not extend to
tasks involving coordination when the whole body was in motion. It was
also possible to say that there was an ability common to simple reaction
tasks, both auditory and visual, although complicating the response or
stimulus brought another ability into play, termed response orientation.
These investigations, along with a variety of other factor analysis
studies, provided well-documented evidence indicating the kind of abil-
ity categories that might be used to generate summary descriptions of
human task performance. Consequently, specification of the relevant
summarization categories began with a comprehensive literature review
intended to identify abilities that had been empirically established in
earlier research concerned with human task performance (Theologus &
Fleishman, 1973; Theologus, Romashko, & Fleishman, 1973). The primary sources of evidence for the utility of potential cognitive constructs were the work of Guilford (1967) and Guilford and Hoepfner (1966) on
the structure of intellect, along with the work of French, Ekstrom, and
Price (1963). From these sources, initial ability categories were selected
according to the criterion that each ability had been identified in at least
10 separate factor analytic investigations. In the psychomotor and physical ability areas, the categories of concern were specified on the basis of Fleishman's (1964, 1966, 1972b) prior research.
This initial literature review yielded 37 abilities capable of summariz-
ing a large number of task performances. To guarantee the quality of the
resulting classification structure, however, a number of additional steps
were taken. The most important step involved generating operational
definitions for each ability. To formulate adequate operational definitions, the consolidated list of ability constructs and their associated definitions was reviewed by panels of subject matter experts and psychologists (Fleishman & Quaintance, 1984). Subsequent data, as well as interviews with informants, indicated the need for more coherent definitions
of each ability and specific examples of the tasks likely to be assigned to
these categories. These revisions were made in an iterative fashion, un-
til the refinements appeared to result in operationally viable definitions
and descriptions of each category yielding high interrater agreement.
It was also recognized that, due to the inherent limitations of any
historic research base, certain significant categories might have been
overlooked simply because they had received little attention or had only
been recently discovered. Thus, to satisfy a taxonomic criterion of com-
prehensiveness, an extended review of the experimental and measure-
ment literatures was conducted to identify any potentially significant cat-
egories that might have been overlooked in the initial review. This ex-
tended review, as well as subsequent reviews of more recent work (Car-
roll, 1976; Ekstrom, French, & Harmon, 1979; Horn, 1976; Peterson &
Bownas, 1982) indicated that this initial category set was not yet com-
plete. These observations led to the inclusion of a group of abilities
that had not been intensively studied, but nonetheless appeared to be of
some general importance in the description of human task performance,
such as sensory capacities, time sharing, and divided attention (Imhoff
& Levine, 1980; Schemmer, 1982), social skills, and interpersonal re-
sources (Fleishman, Cobb, & Spendolini, 1976; Mumford & Nickels,
1990).
Based on a number of expert reviews conducted in a variety of job
settings, these categories, listed in Table 1, seem to provide a reason-
ably comprehensive list of generic ability characteristics influencing task
performance. It is, however, true that these reviews have also led to the
identification of certain job-specific knowledges and skills that were not
intended to be covered by the taxonomy. Given this observation, and
the fact that ongoing research might lead to the identification of new
abilities, these categories, although comprehensive, do not necessarily
provide an exhaustive description of the attributes conditioning human
performance. We will describe some more recent developments later in
this article. For the present, we will confine our discussion to the evalua-
tion of the abilities taxonomy described in Table 1, which are the abilities
included in the Manual for the Ability Requirement Scales (MARS) eval-
uated in this article.

TABLE 1
Abilities Included in the Taxonomy*

1. Oral comprehension      14. Category flexibility     27. Arm-hand steadiness      40. Stamina
2. Written comprehension   15. Speed of closure         28. Manual dexterity         41. Near vision
3. Oral expression         16. Flexibility of closure   29. Finger dexterity         42. Far vision
4. Written expression      17. Spatial orientation      30. Wrist-finger speed       43. Visual color discrimination
5. Fluency of ideas        18. Visualization            31. Speed of limb movement   44. Night vision
6. Originality             19. Perceptual speed         32. Static strength          45. Peripheral vision
7. Memorization            20. Selective attention      33. Explosive strength       46. Depth perception
8. Problem sensitivity     21. Time sharing             34. Dynamic strength         47. Glare sensitivity
9. Mathematical reasoning  22. Control precision        35. Trunk strength           48. Hearing sensitivity
10. Number facility        23. Multilimb coordination   36. Extent flexibility       49. Auditory attention
11. Deductive reasoning    24. Response orientation     37. Dynamic flexibility      50. Sound localization
12. Inductive reasoning    25. Rate control             38. Gross body coordination  51. Speech recognition
13. Information ordering   26. Reaction time            39. Gross body equilibrium   52. Speech clarity

*Definitions of these abilities may be found in Fleishman (1975a, 1975b), Fleishman and
Quaintance (1984), Fleishman and Mumford (1988), and Fleishman and Reilly (1991).

Measuring Ability Requirements

It was clear that the ability category definitions summarized a great
deal of information about the varieties of task requirements included
within a category. The definitions also indicated distinctions with other
ability categories. For example, static strength was general to tasks in-
volving pushing, pulling, lifting, or having isometric requirements. Ex-
plosive strength, on the other hand, involved tasks such as jumping,
throwing, and sprinting, all of which require energy mobilization for
short bursts of effort. Once such a viable, comprehensive, if not fully
complete, set of ability categories and their definitions had been spec-
ified, there was still a need to devise a technique for appraising whether
new task performances could be described by these categories. A
number of strategies might have been used to generate these relational
statements. The most straightforward and flexible approach, however,
involved application of a rating format where judges were asked to eval-
uate the extent to which each ability was required for adequate task per-
formance.
The procedures used in constructing this rating format have been
described more fully in Fleishman (1975a, 1975b), Fleishman and Mum-
ford (1988), and Fleishman and Quaintance (1984). Initially, Theologus,
Romashko, and Fleishman (1973) and Theologus and Fleishman (1973)
presented descriptions of three laboratory tasks and tasks from three
jobs to a panel of 18 psychologists. Using the available ability defini-
tions and simple rating scales, panel members were asked to evaluate
these tasks in terms of the level (amount) of each ability required for ad-
equate task performance. When task ability ratings were compared to
the task loadings on the ability categories obtained in earlier factor an-
alytic studies, sufficient overlap was observed to argue for the feasibility
of this procedure (Theologus &amp; Fleishman, 1973). Encouraging support
for the procedure was provided by the sizable interrater agreement co-
efficients obtained for the ratings of abilities for each task. Agreement
for some abilities was lower than for others, and follow-up interviews
indicated that clearer instructions and more precise definitions might
improve reliability for these ability ratings.
After the requisite revisions had been made, 32 psychometricians and
25 other psychologists, drawn from diverse specialties, were presented
with the tasks described above. These judges were asked to rate each
task with respect to 37 ability dimensions. To obtain an index of relia-
bility, intraclass correlations were obtained from groups of 25, 15, and
5 judges, as well as for a single judge. It was found that 15 judges were
needed to obtain reliability coefficients in excess of .70 when agreement
for each task was assessed across abilities. Once again, however, the
judges’ comments indicated the need for further revision, suggesting that
the accuracy of evaluation would be improved by more behaviorally ori-
ented definitions of the abilities and rating categories.
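The reliability arithmetic in this step can be made concrete. The sketch below is an illustrative reconstruction, not the authors' analysis code: it computes a one-way intraclass correlation from a tasks-by-judges rating matrix and then uses the Spearman-Brown formula to find the smallest panel size whose averaged ratings reach a target reliability such as .70. The single-judge value in the usage note is hypothetical.

```python
def icc_one_way(ratings):
    """One-way ICC for an n-tasks x k-judges matrix of ratings.
    Returns (single-judge reliability, reliability of the k-judge mean)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)   # between-task MS
    msw = sum((x - m) ** 2 for row, m in zip(ratings, row_means)
              for x in row) / (n * (k - 1))                        # within-task MS
    icc1 = (msb - msw) / (msb + (k - 1) * msw)   # reliability of one judge
    icc_k = (msb - msw) / msb                    # reliability of the panel mean
    return icc1, icc_k

def judges_needed(icc1, target=0.70):
    """Spearman-Brown: smallest panel size whose averaged rating
    reaches the target reliability."""
    if icc1 <= 0:
        raise ValueError("single-judge reliability must be positive")
    k = 1
    while (k * icc1) / (1 + (k - 1) * icc1) < target:
        k += 1
    return k
```

For instance, a hypothetical single-judge reliability of .135 first steps up past .70 at a panel of 15 judges, mirroring the panel size reported above.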
Development of these behaviorally anchored rating scales was car-
ried out using the following procedures (Fleishman, 1975a, 1975b; Fleish-
man & Quaintance, 1984). First, psychologists familiar with the abilities
were asked to generate more detailed behavioral descriptions for the
high and low end of each scale. Second, these high-low descriptions
and ability definitions were presented to a series of panels, and panel
members were asked to generate examples of everyday tasks requiring
high, moderate, and low levels of the ability for adequate performance.
Third, after editing, more than 1,000 examples covering the 37 abilities
resulting from this operation were presented to a group that was asked
to rate, on a 7-point scale, the extent to which each task required each
ability. Fourth, the mean and standard deviations of these task ratings
were obtained. Fifth, and finally, tasks were selected as anchors for the
high, middle, and low points on each scale, based on their mean ratings
and low dispersion around these means. It was hoped that the presen-
tation of these rating scales, including familiar task examples with high
agreement on their positions on each scale, would contribute to a better
understanding of the general nature of the ability. In particular, it was
felt that the new, anchored scales would provide more reliable judgments
when raters were asked to use the scales to rate the ability requirements
of new tasks and jobs. Examples of the rating scales subsequently de-
veloped for written comprehension and static strength are presented in
Figures 1 and 2.
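The anchor-selection rule in the fifth step (a mean near the target scale point, low dispersion around it) amounts to a simple filter-then-minimize procedure. The sketch below is only an illustration of that logic; the candidate tasks and their rating statistics are hypothetical, loosely echoing the written comprehension examples.

```python
def pick_anchor(candidates, target, max_sd=1.0):
    """Pick a scale anchor: among candidate tasks whose ratings showed
    low dispersion (sd <= max_sd), return the one whose mean rating
    lies closest to the target scale point."""
    pool = [c for c in candidates if c[2] <= max_sd]
    if not pool:
        raise ValueError("no low-dispersion candidate available")
    return min(pool, key=lambda c: abs(c[1] - target))[0]

# hypothetical (task, mean rating, sd) triples on a 7-point scale
written_comprehension = [
    ("Read a road map", 1.4, 0.6),
    ("Understand an apartment lease", 4.1, 0.8),
    ("Understand a missile guidance repair manual", 6.4, 0.7),
    ("Skim a newspaper", 3.9, 2.3),   # too much disagreement to anchor
]
```

Items with large standard deviations are excluded first: disagreement about a task's scale position disqualifies it as an anchor, whatever its mean.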

Evaluating the Ability Requirements Approach

Having described the nature of the ability requirement taxonomy, we
will now examine the available evidence bearing on the construct validity
of the summary descriptions provided by this classification system. Ear-
lier, we argued that evidence pointing to the substantive meaningfulness
of a classification scheme might be derived from either internal or exter-
nal relational inferences. Because the existence of meaningful external
relationships is contingent on the application of operations giving rise
to meaningful internal and external relationships, we will begin this re-
view by considering the evidence bearing on the appropriateness of the
538 PERSONNEL PSYCHOLOGY

WRI'ITEN COMPREHENSION
This is the ability to understand written sentences and paragraphs.
How Written Comprehension is Different from Other Abilities:
THIS ABILITY OTHER ABILITIES
Understand written English words, sen- Oral Comprehension(1): Listen and un-
tences, and paragraphs. derstand spoken Englishwords and sen-
tences.
VS. Oral Expression (3): and Written Ex-
pression (4): Speak or write English
words and sentences so otherswill un-
derstand.

Requires understanding of complex or


detailed information inwriting contain- 7

F
ing unusual words and phrases and in- - Understand an instruction book
volving fine distinctions in meaning a- on repairing a missileguidance sys-
mong words. 6 tem.
I J I

t5
t: - Understand an apartment lease.

Requires understanding short, simple


written informationcontainingcommon
words and phrases.
l2
L1
- Read a road map.

(Adapted from Fleishman, 1975a, 1991)

Figure 1. Definition and Ability Scale for Written Comprehension

procedures used to address the four major operational issues involved


in any classification effort.

Operational Evaluations

Domain definition. The first step in any taxonomic effort is speci-


fication of the domain of objects that is to be classified (Fleishman &
Quaintance, 1984). Although this domain definition step may seem of
little importance, poor or inadequate specification of the nature of the
objects to be summarized may result in a useless set of categories and
poor description. Furthermore, many ambiguities may arise in formulat-
ing an adequate operational definition of the kind of objects to which a
classification does or does not apply (Sternberg, 1985). Thus, it was nec-
essary to obtain evidence indicative of the meaningfulness of Wheaton’s
(1973) definition of a task and the resulting specification of the domain
of objects to be classified.

STATIC STRENGTH

This is the ability to use muscle force to lift, push, pull, or carry objects. It is the maximum muscle force that one can exert for a brief period. This ability can involve arms, back, shoulders, or legs.

How Static Strength Is Different From Other Abilities:

THIS ABILITY: Use muscle to exert force against objects. vs. Dynamic Strength: Use muscle power repeatedly to hold up or move the body’s own weight.
THIS ABILITY: Use continuous muscle force, without stopping, up to the amount needed to lift, push, pull, or carry an object. vs. Explosive Strength: Gather energy to move one’s own body or to propel some object with short bursts of muscle force.
THIS ABILITY: Does not involve the use of muscle force over a long time. vs. Stamina: Does involve physical exertion over a long time.

Scale anchors (7-point scale):
High end: Requires use of all the muscle force possible to lift, carry, push, or pull a very heavy object. Example task: Reach over and lift a 70 lb. box onto a table.
Middle: Example task: Walk a few steps on flat terrain carrying a 50 lb. back pack.
Low end: Requires use of little muscle force to lift, carry, push, or pull a light object. Example task: Lift one package of bond paper.

(Adapted from Fleishman, 1975a, 1991)

Figure 2. Definition and Ability Rating Scale for Static Strength
Fleishman (1975b) indicates that task performance involves a guided,
goal-oriented response in relation to salient discriminative stimuli. Sup-
port for this domain definition has been obtained in job analysis ef-
forts explicitly concerned with specifying job tasks where incumbents

and supervisors are used as subject matter experts (SMEs). These stud-
ies with Army officers (Wallis, Korotkin, Yarkin-Levin, & Schemmer,
1985), heavy-equipment operators (Fine, 1988; Olson, Fine, Myers, &
Jennings, 1979), Federal Bureau of Investigation special agents (Cooper,
Schemmer, Jennings, & Korotkin, 1983), and automotive mechanics (Is-
raelski, 1988) indicate that this definition permits subject matter experts
to reach agreement in more than 90% of the cases as to whether a task
is, indeed, part of their job. Evidence of this sort is not unambiguous.
Nonetheless, it suggests that this domain definition reflects experts’ im-
plicit understanding of the nature of performance on their jobs.
Descriptive categories. The second step in constructing classification
systems is specification of the categories used to summarize the objects
specified by the domain definition. Thus, there is a need for evidence
indicating that a meaningful set of categories has been identified (Horn
& Knapp, 1973). We have already considered some evidence that points
to the meaningfulness of the ability categories included in the ability re-
quirement (Fleishman, 1972a, 1975b, 1982) taxonomy. First, the ability
constructs included in this taxonomy were expressly chosen because they
had been identified in multiple factor analytic investigations focusing on
certain subdomains of task performance. Second, these abilities were
reviewed for comprehensiveness and significance by panels of psycholo-
gists and psychometricians, who found that the proposed categories pro-
vided a reasonably comprehensive listing of the more significant ability
constructs found in the literature. Third, each of these ability constructs
could be linked to an extensive literature indicative of the construct’s
ability to account for performance differences.
Other ongoing research has also provided evidence for the meaning-
fulness of these categories. For instance, early factor analytic studies of
psychomotor tasks indicated that a common ability was related to per-
formance on both pursuit and compensatory tracking tasks (Fleishman,
1954, 1958; Fleishman & Hempel, 1954, 1955, 1956). To further estab-
lish the meaning of the construct underlying performance, other tasks
were developed that required the timing of control manipulations in re-
sponse to stimuli moving at different rates (Fleishman, 1966, 1967). In
accordance with initial interpretations of rate control, it was found that
this factor was related to performance on this new set of tasks. Later
studies involved the use of tasks intended to reveal whether this ability
stressed judgments of stimulus rate as opposed to response rate con-
trol. This was accomplished using motion picture tasks and other tasks
that required only a button press in response to the location of a moving
stimulus. These tasks were found to be unrelated to rate control. Thus,
definition of the rate control category was extended beyond tracking and

pursuit tasks, but restricted to tasks requiring the actual timing of adjus-
tive motor movements to a continuously changing stimulus.
Fleishman (1966, 1967, 1972a) describes a number of other stud-
ies intended to elucidate the character and meaning of the various psy-
chomotor constructs included in the taxonomy. Thus, reaction time abil-
ity was found to extend to simple reaction time responses to either visual
or auditory stimuli, but when two or more stimuli must be discrimi-
nated, or a choice must be made about which control to use, reaction
time is no longer measured. With such tasks, a different ability, called
response orientation, is measured. The physical capacities included in
the ability requirement taxonomy have been appraised in a similar re-
search program (Fleishman, 1964), while Fleishman and Quaintance
(1984) and Guilford and Hoepfner (1971) describe a series of studies
intended to establish the meaningfulness of the interpretations applied
to the various cognitive abilities. Because the results obtained in the ma-
jority of these investigations tend to support the substantive interpreta-
tions applied to the ability constructs, these studies provide evidence for
the meaningfulness of these categories and, therefore, the classification
system as a whole.
Similarity assessment and category assignment. The third and fourth
steps involved in classification efforts entail (a) assessments of object
similarity, and (b) specification of decision rules for assigning objects
to categories based on the observed degree of similarity. As Cronbach
and Gleser (1953) and Gregson (1975) point out, similarity may be as-
sessed in a number of ways, including correlation coefficients as well as
distance metrics. Further, a variety of decision rules might be applied
in determining when an object displays sufficient similarity to other cat-
egory members to permit assignment and common interpretation (An-
derberg, 1973). Fleishman and Quaintance (1984) and Mumford et al.
(1990) note that variations in the methodological procedures employed
in these operations may lead to marked differences in the content and
character of the resulting summary descriptions. This point has been un-
derscored by Hamer and Cunningham (1981), who have shown that the
use of different similarity metrics (e.g., correlation coefficients vs. gener-
alized distance metrics) can lead to different classifications. This leads to
another set of questions pertinent to the system's meaningfulness: Were
appropriate indices of similarity and viable decision rules applied in con-
structing this classification scheme?
In the case of the ability requirement taxonomy, a judgmental pro-
cedure is used in similarity assessment and category assignment. Judges
are asked to apply the category definition and associated rating scale to
reach a decision as to the level of that ability required for performance
on a task. 'Qpically, abilities receiving relatively high ratings (4 or above)
542 PERSONNEL PSYCHOLOGY

are held to provide a basis for summarizingtask performance (Fleishman


& Mumford, 1988).
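As a minimal sketch of this assignment rule (the ability names are drawn from Table 1, but the mean ratings below are hypothetical):

```python
def required_abilities(mean_ratings, cutoff=4.0):
    """Apply the assignment rule: abilities whose mean rated level
    reaches the cutoff are taken to summarize task performance."""
    return sorted(a for a, r in mean_ratings.items() if r >= cutoff)

# hypothetical mean ability ratings for a single task
task_ratings = {"oral comprehension": 4.6, "problem sensitivity": 4.8,
                "static strength": 1.8, "night vision": 1.1}
```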
At a minimum, this judgmental procedure can yield a meaningful
task classification only if judges can reliably identify the abilities that
summarize task performance. Earlier, we reviewed some initial evidence
that speaks to this issue. More compelling evidence for the reliability of
these evaluations has been provided by Hogan, Ogden, and Fleishman
(1978). In this study, 19 scales focusing on cognitive, physical, and psy-
chomotor abilities were used to describe the tasks performed on 15 civil
service jobs in San Bernardino County. The 864 incumbents, 350 su-
pervisors, and 79 job analysts serving as judges were asked to (a) review
the list of tasks performed on the relevant jobs, (b) determine whether
or not each ability was required for adequate performance, and (c) rate
the extent to which each ability was required for adequate performance.
For each type of judge (e.g., incumbents, supervisors, and job analysts),
cross-task interrater agreement coefficients typically lay in the low .90s,
and fell below .80 only in a single occupational field where a limited num-
ber of incumbents were available. Thus, this judgmental assignment pro-
cedure appears to yield a reliable assessment of the abilities related to
performance.
Evidence indicative of the reliability of these judgments has also been
obtained in studies examining a number of other occupations. Similar
results have been obtained in studies examining (a) 20 jobs at a large
public utility (Inn, Schulman, Ogden, & Sample, 1982), (b) a wide range
of Army military occupational specialties (Myers, Gebhardt, Price, &
Fleishman, 1979), (c) 31 Navy and Marine Corps occupations (Cooper,
Schemmer, Fleishman, Yarkin-Levin, Harding, & McNelis, 1987), (d)
lineman and maintenance jobs in the electric power industry (Cooper,
Schemmer, Gebhardt, Marshall-Mies, & Fleishman, 1982), (e) Naval
shipboard jobs (Gebhardt, Jennings, & Fleishman, 1981), and (f) court
security officers (Myers, Jennings, & Fleishman, 1981). Again, incum-
bents’ ratings of a single ability across tasks yielded interrater agreement
coefficients in the low- to mid-.80s, when 15 or more judges were avail-
able.
Cronbach (1971) has argued that agreement data can yield stronger
evidence indicative of the appropriateness of a descriptive system, if the
conditions contributing to agreement or reliability are investigated with
the intended applications of the system in mind. This generalizability
theory approach has also been used to assess the ability requirements
assessment strategy. This set of studies was intended to evaluate whether
raters drawn from different backgrounds would evaluate the abilities
required for task performance in a similar fashion.

To obtain evidence pertinent to this hypothesis, Hogan et al. (1978)
generated profiles of average ability ratings across job tasks for incum-
bents, supervisors, and job analysts. The similarity of the profiles ob-
tained from each group of judges on each job was then assessed using
Spearman’s Rho. The results obtained in this analysis are presented
in Table 2. As may be seen, the ability profiles obtained from supervi-
sors and incumbents were quite similar, yielding Spearman Rhos lying in
the .80-.90 range across jobs. The job analysts produced ability profiles
that evidenced somewhat less similarity, perhaps because they lacked
an intimate familiarity with job tasks. Nonetheless, their profiles
correlated with those obtained from incumbents and supervisors in the
.70-.80 range.
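Profile similarity here is ordinary Spearman rank correlation computed over two groups' mean ability ratings. A minimal sketch, using midranks so that tied ratings are handled conventionally (the profiles in the test are hypothetical):

```python
import math

def spearman_rho(profile_a, profile_b):
    """Spearman's rho between two ability-requirement profiles:
    Pearson correlation of the midranked ratings."""
    def midranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        ranks = [0.0] * len(values)
        for rank, i in enumerate(order, start=1):
            ranks[i] = float(rank)
        for v in set(values):                 # average ranks over ties
            idx = [i for i, x in enumerate(values) if x == v]
            mean_rank = sum(ranks[i] for i in idx) / len(idx)
            for i in idx:
                ranks[i] = mean_rank
        return ranks
    ra, rb = midranks(profile_a), midranks(profile_b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    den = math.sqrt(sum((x - ma) ** 2 for x in ra) *
                    sum((y - mb) ** 2 for y in rb))
    return num / den
```

Two profiles that rank the abilities identically yield rho = 1 regardless of differences in absolute rating level, which is why rank correlation suits this between-group comparison.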
Because the ability requirements classification relies on judges to
evaluate the abilities associated with task performance, the convergence
observed in the Hogan et al. (1978) study tends to provide some evi-
dence for the appropriateness of this operation. Further support for
this conclusion has been provided by Romashko, Brumbach, Fleish-
man, and Hahn (1974). They obtained a median correlation of .67
between ability profiles obtained from incumbents and supervisors on
New York City sanitation, firefighter, and police jobs. Similarly, in a
study of Philadelphia police officers, Romashko, Hahn, and Brumback
(1976) obtained ability profiles for incumbents and supervisors which
yielded a correlation of .75. Substantial correlations were also obtained
in comparing ability profiles generated by job analysts and incumbents
(r = .66), as well as job analysts and supervisors (r = .81). In a com-
parison of the ability profiles produced by incumbents and supervisors,
Zedeck (1975) obtained comparable relationships for telephone com-
pany installer-repairmen, splicers, and linemen. The median correlation
obtained in this study was .79.
It is possible that the convergence and interrater agreement coeffi-
cients obtained in these studies might be attributed to systematic rating
errors. This rating error issue has been addressed by Reilly and Zink
(1980, 1981). In their study, tasks performed on three outside craft jobs
in the telephone industry were rated on 26 abilities, primarily cognitive
in nature, by supervisors and incumbents. Their findings also indicated
substantial agreement between incumbents and supervisors, as well as
between more- and less-experienced incumbents. More centrally, how-
ever, they used Stanley coefficients to assess trait independence and halo
error. The results obtained in this analysis indicated some halo error.
On the other hand, this effect was not large, and raters were able to dis-
tinguish among the abilities. Furthermore, Reilly and Zink’s findings
indicate that when abilities have not been preselected for job relevance,
most abilities receive ratings near the scale midpoint and yield a normal

TABLE 2
Interrater Agreement Between Incumbents, Supervisors, and Job Analysts
on the Relative Rank Orders of the 19 Abilities for Each Job Category

Spearman's Rho (N = 19)

                               Incumbents with   Incumbents with   Supervisors with
Job category                   supervisors       job analysts      job analysts
1. Attorney                         .81               .46               .53
2. Registered nurse                 .82               .79               .70
3. Accountant                       .75               .25               .53
4. Eligibility worker               .95               .85               .84
5. Building inspector               .82               .85               .84
6. Nursing attendant                .47               .73               .37
7. Police officer                   .96               .85               .49
8. Firefighter                      .92               .76               .73
9. Clerical                         .97               .84               .84
10. Automotive mechanic             .89               .69               .69
11. Laborer                         .86               .81               .81
12. Heavy equipment operator        .86               .58               .60
13. Stores clerk                    .83               .69               .56
14. Custodian                       .79               .79               .78
15. Painter                         .85               .69               .64
(From Hogan, Ogden, &amp; Fleishman, 1978)

distribution across tasks. Some data pertinent to this conclusion may
be found in Table 3, which presents the means and standard deviations
Reilly and Zink (1980) obtained for the major task components involved
in station inspector jobs.
In a related vein, one might argue that because this assignment
procedure is intended to reflect task characteristics, raters’ evaluations
should not be much influenced by properties of the individual judge.
This issue has been addressed in a recent study by Fogli (1988). Us-
ing incumbents working as supermarket cashiers, Fogli obtained ability
requirement ratings. He then examined the relationship between these
ratings and characteristics of the rater, such as time with the company,
age, sex, and educational level. It was found that these rater character-
istics had little influence on judges’ ratings of task ability requirements.
Certainly, a number of other issues need to be addressed with re-
gard to the judgmental assignment of tasks to ability categories. For
instance, there is a need for research examining the relative merits of
quantitative, as opposed to judgmental, strategies in assessing similarity
and assigning tasks to ability categories. For example, Fleishman and
Stephenson (1972) and Malamud, Levine, and Fleishman (1980) exam-
ined the reliability and utility of binary (yes-no) decisions, organized into
TABLE 3
Means and Standard Deviations of Ability Requirement Ratings Obtainedfor Station InspectorJob Components

Job components: 1 = Organizing and planning; 2 = Installation; 3 = Analysis and repair;
4 = Reading technical material; 5 = Sales; 6 = Customer relations

                                          1         2         3         4         5         6
Ability                                 M   SD    M   SD    M   SD    M   SD    M   SD    M   SD
1. Oral comprehension 4.1 1.5 4.3 1.4 4.6 1.2 4.8 1.5 3.5 1.5 3.9 1.6
2. Written comprehension 3.7 1.4 4.1 1.3 4.1 1.3 5.0 1.4 3.0 1.5 3.0 1.5
3. Oral expression 3.8 1.5 3.8 1.5 4.3 1.3 4.2 1.6 4.1 1.7 4.3 1.7
4. Written expression 3.2 1.6 3.4 1.5 3.7 1.5 3.7 1.7 2.8 1.5 3.1 1.5
5. Memorization 4.2 1.3 4.4 1.5 4.5 1.3 4.5 1.6 3.3 1.4 3.2 1.5
6. Problem sensitivity 3.9 1.6 4.3 1.5 4.8 1.5 3.7 1.5 3.1 1.4 3.5 1.6
7. Math reasoning 2.8 1.5 3.1 1.5 3.3 1.5 3.2 1.5 2.7 1.5 2.2 1.2
8. Number facility 2.6 1.3 2.8 1.3 2.8 1.4 2.8 1.5 2.6 1.4 2.2 1.3
9. Deductive reasoning 3.9 1.5 4.0 1.3 4.5 1.4 4.1 1.6 2.9 1.4 3.1 1.4
10. Inductive reasoning 3.4 1.3 3.6 1.2 4.1 1.3 3.7 1.4 2.6 1.4 2.1 1.4
11. Information ordering 3.9 1.4 4.0 1.3 4.1 1.3 3.8 1.4 2.8 1.4 2.8 1.4
12. Category flexibility 3.3 1.4 3.4 1.2 3.6 1.4 3.4 1.5 2.5 1.3 2.6 1.3
13. Electrical knowledge 3.1 5.1 3.7 1.3 4.0 1.3 3.9 1.6 2.0 1.2 2.0 1.2
14. Mechanical knowledge 3.3 1.8 3.6 1.6 3.6 1.8 3.1 1.8 2.1 1.4 2.0 1.3
15. Knowledge of tools and their use 4.3 1.6 4.5 1.3 4.3 1.5 3.0 1.6 1.9 1.3 2.0 1.4
16. Map reading 3.4 1.7 2.9 1.7 3.1 1.8 3.0 1.8 2.0 1.4 2.1 1.4
17. Drafting 2.5 1.4 2.5 1.4 2.5 1.4 2.6 1.6 1.7 1.0 1.7 1.0
18. Reading plans 3.1 1.5 3.2 1.5 3.3 1.5 3.7 1.6 1.8 1.3 1.8 1.2
19. Selective attention 3.8 1.5 4.2 1.4 4.3 1.5 4.3 1.6 2.9 1.6 3.1 1.6
20. Time sharing 3.5 1.3 3.7 1.2 3.8 1.3 3.7 1.5 2.6 1.4 2.7 1.4
21. Spatial orientation 3.3 1.5 3.6 1.2 3.6 1.4 3.3 1.6 2.3 1.4 2.4 1.4
22. Visualization 3.9 1.6 4.3 1.4 3.8 1.4 3.3 1.6 2.6 1.5 2.5 1.5
23. Persuasion 3.0 1.6 3.3 1.6 3.2 1.6 2.6 1.5 4.4 1.6 3.9 1.7
24. Social sensitivity 3.0 1.7 3.2 1.7 3.2 1.7 2.4 1.6 4.1 1.6 4.7 1.7
25. Fact-finding ability 3.7 1.7 3.9 1.6 4.4 1.4 3.3 1.8 3.9 1.7 4.1 1.6
26. Flexibility of closure 3.7 1.5 3.9 1.4 4.2 1.5 4.0 1.6 2.7 1.5 3.0 1.5
(From Reilly & Zink, 1980)

decision-flow diagrams, as a method for assigning tasks to ability cate-
gories. While the method was found useful for reducing some false pos-
itives in identifying abilities required by tasks, the rating scales were still
needed to scale ability level requirements and to provide quantitative
distinctions within and between ability categories.
While research examining the cognitive processes that raters use in
making these category assignments might prove useful, the studies of
rater bias, rater convergence, and interrater agreement have provided
some support for the appropriateness of the ability requirements rating
approach. It now seems appropriate to consider evidence bearing on the
meaningfulness of the inferences derived from the resulting summary
descriptions of task performance.

Internal Validity

The methods used to define categories and assign tasks to these cat-
egories provide a basis for more complex relational inferences bearing
on the classification’s construct validity. In this section, we will consider
evidence derived from the relationships embedded in the classification.
Here we are concerned with the classification’s internal validity. Two
general kinds of relationships might be used to marshal evidence for a
classification’s internal validity: (a) the relationships observed between
categories and the objects to be classified, and (b) the relationships ob-
served between categories. We will consider the construct validity of the
ability requirement taxonomy with respect to these two kinds of relation-
ships.
Behavior-category relationships. The general goals and functions of
taxonomic efforts lead to certain paramount concerns, when one con-
siders behavior-category relationships. For instance, one might ask the
question as to whether the proposed categories are sufficient to account
for the behaviors under consideration, or alternatively, whether most be-
haviors can be accounted for by assignment to one or more categories.
In addressing this question, however, the issue of the system’s parsimony
should be borne in mind (Horn & Knapp, 1973). More specifically, re-
dundant categories should not be proposed, and the proposed categories
should be just sufficient to result in the assignment of most behaviors to
one or more categories.
With regard to the ability requirements approach, the procedures
used in category definition might be said to provide some evidence in-
dicative of the taxonomy’s comprehensiveness (Fleishman &amp; Quaintance,
1984). More direct evidence bearing on this question, however, has been
provided by certain empirical investigations. In a series of panel meet-
ings, Hogan, Ogden, and Fleishman (1979) found that 80% of the tasks
performed by warehouse workers could be assigned to one or more of
the ability categories. Similar findings have been obtained in studies of
Army commissioned and noncommissioned officers (Mumford, Yarkin-
Levin, Korotkin, Wallis, & Marshall-Mies, 1985), civil service workers
(Hogan et al., 1978), and Federal Bureau of Investigation special agents
(Cooper et al., 1983). Furthermore, Landy (1988) has shown that a sub-
stantial (81.1%) proportion of job activities performed by police officers
in New York City can be assigned to one of these ability categories.
In some studies, however, panel members have indicated that com-
prehensiveness might be enhanced by the addition of categories in-
tended to capture certain job-specific skills and knowledge. For in-
stance, pipeline workers mentioned knowledge of tools, Army officers
mentioned weapon systems knowledge, and FBI special agents stressed
the importance of acting skills. In these situations, roughly 15-30% of all
tasks tend to be assigned to these categories of job-specific knowledges
and skills. Thus, 70-85% of the tasks performed on these jobs were
ascribed to one of the proposed ability categories. Given this observation,
and the fact that any general, parsimonious system is unlikely to capture
all job-specific tasks, this pattern of findings tends to provide evidence
indicative of the internal validity of the ability requirement taxonomy.
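The comprehensiveness evidence reviewed above rests on simple coverage arithmetic: the proportion of tasks assigned to at least one ability category. A minimal sketch, with hypothetical task and category names rather than data from the cited studies:

```python
# Coverage check: what fraction of tasks received at least one
# ability-category assignment? (Illustrative data, not from the studies.)

def coverage(assignments):
    """assignments maps each task to the set of ability categories
    judges assigned it to (possibly empty)."""
    assigned = sum(1 for cats in assignments.values() if cats)
    return assigned / len(assignments)

tasks = {
    "lift pallet onto wheeler": {"static strength"},
    "drive tug into aisle": {"control precision", "spatial orientation"},
    "read equipment manual": {"written comprehension"},
    "operate proprietary tool": set(),  # job-specific, uncaptured
}

print(f"{coverage(tasks):.0%} of tasks covered")
```

On this toy data three of four tasks are covered; the studies above report comparable figures of roughly 70-85%.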
As noted above, a classification scheme should describe the domain
at hand without resorting to an unduly large number of dimensions.
Some evidence for the parsimony of this taxonomy was provided in our
discussion of the procedures used in category definition. However, in a
parsimonious taxonomy, the same set of postulated abilities should be
found useful in describing the tasks performed on a wide range of
different jobs. Thus, Hogan et al. (1978) studied jobs ranging from
attorneys and accountants to mechanics and equipment operators; Cooper,
Schemmer,
Fleishman, Yarkin-Levin, Harding, and McNelis (1987) studied a wide
range of military occupational specialties, including pilots, cryptogra-
phers, and maintenance personnel; and Mumford et al. (1985) and
Fleishman and Friedman (1990) studied military and industrial man-
agers at various organizational levels. In all of these investigations, it
was found that more than 80% of the postulated abilities were used by
incumbents or job analysts to describe the tasks performed on three or
more jobs, as indexed by tasks receiving ratings of 4 or above. Elsewhere,
Fleishman (1988) and Fleishman and Mumford (1988) have reviewed
the range of jobs in which the ability requirement scales have proven
useful in summarizing job tasks. The proposed ability categories appear
to have value in describing the tasks performed on a wide variety of oc-
cupational specialties.
Another piece of evidence bearing on the system’s internal validity
pertains to the structure of task assignments. A classification should
yield unambiguous assignments of tasks to categories, such that (a) tasks
are not equally likely to be assigned to the alternative categories (inverse
simple structure), and (b) tasks can be assigned to a single category (sim-
ple structure). Of course, the uniform or random assignment precluded by
inverse simple structure is always undesirable. In many domains, how-
ever, some degree of overlap in category assignments can be legitimately
expected (Annett & Duncan, 1967; Owens & Schoenfeldt, 1979). Simple
structure evaluations should, therefore, be tempered by an awareness of
the phenomenon at hand.
Unlike many other classifications, the ability requirement taxonomy
does not assume simple structure. Due to the complex nature of human
task performance, it is held that a task performance might be related
to a number of different abilities. Yet it still might be argued that tasks
should be ascribed to abilities in a nonrandom fashion consistent with
the notion of inverse simple structure. The interrater agreement data
presented earlier, of course, speaks to this issue by showing that different
judges tend to assign tasks to similar ability categories.
The notion of consistent nonrandom assignment implies that raters
will be able to agree on the abilities that best summarize a particular kind
of performance. Thus, one might extend this argument by showing that,
within certain performance domains (e.g., psychomotor, physical, cogni-
tive), raters can still agree on the tasks associated with different abilities.
In a series of studies expressly focusing on the ability requirements of
physically demanding tasks, interrater agreement coefficients have been
obtained for correctional officers (Gebhardt & Weldon, 1982), court se-
curity officers (Myers, Jennings, & Fleishman, 1981), telecommunica-
tions workers (Inn et al., 1982), gas pipeline repair and maintenance
personnel (Gebhardt, Cooper, Jennings, Crump, & Sample, 1983), and
Army occupational specialties (Myers, Gebhardt, Price, & Fleishman,
1981).
Table 4 presents a summary of the interrater agreement coefficients
obtained for each ability across the tasks examined in these studies.
Even with samples of 20 subject matter experts, interclass agreement
coefficients typically lay in the low .80s, and almost never fell below
.70. Because substantial agreement was still observed in a restricted
domain of physically demanding tasks, it seems reasonable to conclude that
raters’ assignment of tasks to abilities, while complex, is clearly nonran-
dom, even under conditions where discrimination is more difficult. Sim-
ilar high interrater agreement has been obtained in recent research by
Fleishman and Friedman (1990) on the cognitive abilities related to
leadership tasks. This evidence, in turn, suggests that the ability requirement
taxonomy satisfies the inverse simple structure standard.
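The agreement indices discussed above can be illustrated with a one-way intraclass correlation, one common way to index interrater agreement; the cited studies may have used a different estimator, and the ratings below are invented for illustration:

```python
# A one-way intraclass correlation, ICC(1,1), as one index of interrater
# agreement on ability ratings (illustrative data only).

def icc_oneway(ratings):
    """ratings: list of rows, one per task; each row holds the k raters' scores."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((x - m) ** 2 for row, m in zip(ratings, means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Three raters scoring five tasks on a 1-7 ability scale:
ratings = [[6, 6, 5], [2, 1, 2], [4, 5, 4], [7, 6, 7], [3, 3, 2]]
print(round(icc_oneway(ratings), 2))
```

High between-task variance relative to within-task (between-rater) variance drives the coefficient toward 1, which is what the low-.80s values in Table 4 reflect.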
TABLE 4
Interclass Reliability Coefficients for the Physical Ability Scales:
Scale Reliabilities by Study

Physical ability scales          S1a   S2a   S3a   S4a   S5a   S6a   S7a
Static strength                  .92
Upper body static strength             .81   .90   .97   .83   .93   .90
Lower body static strength             .89   .84   .91   .72   .91   .93
Dynamic strength                 .90
Upper body dynamic strength            .86   .82   .95   .82   .90   .90
Lower body dynamic strength            .88   .77   .87   .70   .91   .84
Explosive strength               .89
Upper body explosive strength          .82   .78   .95   .72   .84   .86
Lower body explosive strength          .85   .70   .93   .70   .88   .85
Flexibility                            .88   .80   .95
Extent flexibility               .76
Dynamic flexibility              .74
Stamina                          .90   .81   .75   .90   .56   .87   .87
Trunk strength                   .90   .82   .81   .95   .72   .87   .84
Gross body equilibrium           .77   .79   .95
Gross body coordination          .81   .82
Speed of limb movement           .81
Arm-hand steadiness              .84
Manual dexterity                 .87
Near vision                      .90
Far vision                       .90
Hearing                          .87
(Sample size)                     26    15    20    22    30    30    30
aS1-court security officers
S2-Bell System employees
S3-gas company employees
S4-correction officers
S5-Army motor transport specialists
S6-Army supply specialists
S7-Army medical specialists
(From Fleishman & Mumford, 1988)

Another source of evidence bearing on a taxonomy's internal validity
may be found in the evaluation of the content and coherence of the
objects assigned to the category. Hopefully, the objects assigned to a cat-
egory will yield substantively meaningful statements about the features
or attributes giving rise to object similarity. While a variety of strategies
might be used to address this evaluative issue, evidence bearing on this
issue is most often obtained by substantive interpretation of the features
common to category members. The value of this evaluation might be
strengthened, however, if existing theory is used to specify expected re-
lationships, and the existence of these relationships is verified in further
confirmatory studies (Joreskog & Sorbom, 1979; Messick, 1989).
The coherence or interpretability of task assignments to Fleishman’s
(1972a, 1975b, 1982) ability categories has been addressed in several
studies. Two examples will be provided here. One study was concerned
with the abilities required by workers in several large grocery warehouses
(Hogan et al., 1978). The other study focused on the abilities required in
a number of Navy and Marine Corps occupations (Cooper et al., 1987).
Table 5 presents the tasks assigned to the control precision and static
strength categories by job incumbents in the grocery warehouses and the
tasks assigned to the information ordering and written comprehension
categories by incumbent Navy aircraft avionics mates.
This table indicates that the assignment of tasks to abilities in the
Hogan et al. (1978) study produced a coherent set of relationships. Con-
trol precision, for instance, is defined as the ability to make highly con-
trolled, rapid, and accurate movements of the arms or legs (Fleishman,
1972b; Fleishman & Quaintance, 1984). Given this definition, it is not
surprising that warehouse tasks involving vehicle operation tended to
receive ratings of 4 or above on this dimension. On the other hand, static
strength has been defined as the exertion of maximal muscular force for
brief periods of time (Fleishman & Quaintance, 1984). Consequently,
the warehouse tasks that required lifting, carrying, and stacking objects
on a pallet were assigned to this dimension. It should also be noted that
spatial-visualization ability captured other tasks requiring the fitting and
stacking of boxes of different dimensions onto the wheelers and trucks.
A similar pattern of findings emerged in the Cooper et al. (1987)
study. As may be seen for Navy aircraft avionics mates, tasks such as
troubleshooting and component alignment tended to receive average
ratings above 4 on the information ordering dimension. This might be
expected, given the demands those tasks make for the systematic
acquisition and application of information in fault-finding. Similarly, tasks that
involved reading equipment manuals received high ratings on written
comprehension. This study yielded comparable findings in a number of
other Navy and Marine Corps occupations, ranging from pilots to cryp-
tographers. Thus, it appears reasonable to conclude that tasks can be
assigned to ability categories in an interpretable and meaningful fash-
ion.
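The assignment rule underlying these studies, in which a task is ascribed to an ability when its mean rating reaches 4 on the 7-point scale, can be sketched as follows (the task and ratings are invented for illustration):

```python
# Assigning a task to ability categories by the mean-rating threshold
# used in these studies (mean ratings of 4 or above). Data are illustrative.

def assign(task_ratings, threshold=4.0):
    """task_ratings maps ability name -> list of judges' ratings;
    returns the abilities whose mean rating meets the threshold."""
    return {ability for ability, rs in task_ratings.items()
            if sum(rs) / len(rs) >= threshold}

carry_item = {
    "static strength": [5, 6, 5],    # mean 5.33 -> assigned
    "control precision": [2, 3, 2],  # mean 2.33 -> not assigned
    "stamina": [4, 4, 3],            # mean 3.67 -> not assigned
}
print(sorted(assign(carry_item)))
```

Because the rule is a threshold rather than a forced single choice, a task can legitimately be assigned to several abilities, consistent with the taxonomy's rejection of strict simple structure.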
Category-level relationships. Taken as a whole, our foregoing obser-
vations argue for the construct validity of the ability requirement taxon-
omy with respect to behavior-category relationships. Based on our dis-
cussion of internal validity, however, one might also ask whether infer-
ences derived from this classification find support in relationships among
the categories themselves. More specifically, are the relationships ob-
served among measures of the categories consistent with the nature of
the categories and their substantive implications?
TABLE 5
Assignments of Tasks to Selected Ability Categories

Grocery warehouse workersa

Control precision                            Static strength
1) Drive tug into aisle and locate pallets   1) Lift pallets onto wheeler
2) Drive tug into aisle and slot where       2) Lift item from stock pallet
   order selecting will begin
3) Steer tug close to merchandise to be      3) Carry item to wheeler
   selected
4) Stop tug where most items can be          4) Carry several items to wheeler if
   selected with least walking distance         they are small and light
5) Drive tug slowly around corners and       5) Stack heavier and larger items on
   crossing main aisle                          bottom of pallet
6) Drive tug slowly over rough surfaces
   with full loads
7) Drive tug slowly to loading dock when
   order is full

Aviation electronics mateb

Information ordering                         Written comprehension
1) Align components of system at             1) Troubleshoot using technical
   organizational level                         publications
2) Turn up avionics systems for flight       2) Utilize schematics/blueprints in
   crews                                        troubleshooting
3) Perform connector plug assembly/          3) Test aircraft systems
   disassembly
4) Troubleshoot avionics systems             4) Perform wire repair
5) Apply electrical power plant              5) Perform step-by-step inspection of
                                                aircraft

aFrom Hogan, Ogden, and Fleishman (1978)
bFrom Cooper, Schemmer, Fleishman, Yarkin-Levin, Harding, & McNelis (1987)

In one investigation along these lines, Inn et al. (1982) obtained de-
scriptions of 830 tasks likely to be performed on 17 physically demanding
jobs in the telecommunications industry. Ratings of each task's physical
ability requirements were then obtained. Correlations among the ability
dimensions were then generated in a series of analyses, where tasks were
treated as the unit of analysis.
With regard to convergent validity, sizable positive correlations were
obtained among the physical ability categories. This pattern of relation-
ships was anticipated because, within this limited range of physical jobs,
tasks that demand one physical ability (dynamic strength) often demand
a number of other physical abilities (e.g., static strength and extent flex-
ibility). Highest correlations were found among the physical ability re-
quirement scales. These positive, task-level correlations argue for the
meaningfulness of descriptive information provided by the physical
ability scales. This convergence, however, did not preclude the emergence
of relationships indicative of the categories’ discriminant validity. For
instance, various strength abilities (e.g., static, dynamic, explosive, and
trunk) tended to be more closely related to each other than they were to
dimensions of flexibility, equilibrium, and coordination. Tasks involv-
ing flexibility, equilibrium, and coordination do not make substantial
strength demands. Hence, these relationships are consistent with both
the nature of the categories at hand and the broader literature (Fleish-
man, 1964; Myers, Gebhardt, Crump, & Fleishman, 1984).
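The convergent and discriminant patterns described here are ordinary task-level correlations: within-domain scales (e.g., two strength scales) should correlate more highly across tasks than cross-domain scales (strength versus flexibility). A sketch with invented ratings:

```python
# Task-level convergent/discriminant check on invented mean ratings:
# the within-domain correlation should exceed the cross-domain one.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Mean ratings of six tasks on three scales (hypothetical values):
static_strength  = [6, 5, 2, 7, 3, 1]
dynamic_strength = [6, 4, 2, 6, 3, 2]
extent_flex      = [2, 3, 5, 2, 6, 4]

convergent = pearson(static_strength, dynamic_strength)
discriminant = pearson(static_strength, extent_flex)
print(round(convergent, 2), round(discriminant, 2))
```

With tasks as the unit of analysis, the two strength scales track one another closely while strength and flexibility diverge, mirroring the pattern Inn et al. (1982) report.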
A recent extension of the ability requirement taxonomy into the do-
main of life tasks has been conducted by Mumford and Nickels (1990).
This study produced additional evidence indicative of the categories’
convergent and discriminant validity focusing on attributes concerned
with cognitive capacities, social skills, and personality characteristics.
Here, some 600 individuals rated the abilities likely to influence ac-
tions on 389 background data items. Using items as subjects, correla-
tions among the ability categories were obtained. In a subsequent con-
firmatory factor analysis, it was found that the pattern of relationships
evidenced by these abilities could be accounted for based on an a pri-
ori theoretical structure differentiating between related abilities. The
goodness-of-fit index obtained in this study was .95, while the associated
residual term was .09. Furthermore, these categories yielded a highly in-
terpretable pattern of relationships, such that memorization was related
to verbal comprehension, but not to empathy.
The internal validity of the ability requirement taxonomy might also
be assessed by considering the relationships among jobs produced by
these categories and their interrelationships. If the classification yielded
meaningful inferences about performance requirements, one would ex-
pect jobs making similar demands to cluster together. The results ob-
tained in one effort along these lines are presented in Table 6. This table
presents the mean physical ability ratings obtained for 15 jobs in San
Bernardino County, California (Fleishman & Hogan, 1978; Hogan et
al., 1978). The similarity observed between police officers and firefight-
ers tends to argue for the meaningfulness of the descriptive informa-
tion provided by these categories, as does the fact that these jobs yielded
markedly different patterns than those obtained for attorneys and clerks.
Weldon (1983) has provided quantitative evidence supporting this con-
clusion. In her study, 27 civil service jobs in the city of Pittsburgh were
evaluated on 12 ability requirements. The resulting dimensional pro-
files were then entered into a complete linkage clustering. In accordance
with our foregoing observations, jobs such as firefighter and emergency
medical services were assigned to one cluster. School crossing guards
TABLE 6
Partial Listing of Jobs Grouped According to Common Abilities Needed

[The original table arrays 15 San Bernardino County jobs (firefighter,
police officer, equipment operator, mechanic, laborer, painter, attendant,
custodian, inspector, nurse, clerk, clerical worker, social worker,
attorney, and accountant) against ten physical ability scales (static,
explosive, dynamic, and trunk strength; stamina; extent and dynamic
flexibility; speed of limb movement; gross body coordination; and gross
body equilibrium), ordering the jobs by their mean rating on each 7-point
scale. Physically demanding jobs such as firefighter appear near the top
(levels 6-7) of each column, while attorneys, accountants, and clerical
workers appear near the bottom (levels 1-3).]
(Adapted from Hogan, Ogden, & Fleishman, 1978.)
and parking meter patrol officers were assigned to another cluster. The
coherent clusters of jobs derived from these categories and their interre-
lationships provide an additional piece of evidence pointing to the mean-
ingfulness of the ability requirement taxonomy.
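Weldon's complete-linkage analysis can be sketched in miniature: each job is a vector of mean ability ratings, and clusters merge according to their farthest-pair distance. The jobs and profiles below are illustrative stand-ins, not her data:

```python
# A minimal complete-linkage clustering of jobs on ability-requirement
# profiles (illustrative 1-7 ratings on three physical abilities).

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def complete_linkage(profiles, n_clusters):
    """profiles: dict of job -> rating vector. Repeatedly merge the two
    clusters whose *farthest* members are closest (complete linkage)."""
    clusters = [{job} for job in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(euclid(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters.pop(j)
    return clusters

jobs = {
    "firefighter": (7, 6, 6), "paramedic": (6, 6, 5),
    "crossing guard": (2, 2, 3), "meter patrol": (2, 3, 2),
}
print(complete_linkage(jobs, 2))
```

Jobs with similar ability profiles fall into the same cluster, which is the pattern of physically demanding versus sedentary jobs described above.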
The ability requirement taxonomy was, of course, expressly designed
to maximize meaningfulness of the proposed categories (Fleishman &
Quaintance, 1984). It should, however, be recognized that a number
of other investigators have proposed taxonomies containing abilities in-
tended to summarize task performance. These alternative ability clas-
sifications are illustrated in the work of Drauden (1988), Lopez (1988),
and Primoff and Eyde (1988). The significant point, with regard to the
current discussion, is that these taxonomies also stress the importance
of abilities such as strength, stamina, visual acuity, hearing, memory,
oral expression, written expression, numerical ability, and perceptual
speed. Because these alternative worker-oriented classification schemes
were constructed under different assumptions and used different proce-
dures in category definition, this convergence in category content pro-
vides some additional evidence for the meaningfulness of the ability cat-
egories proposed by Fleishman and his colleagues (Fleishman, 1972a,
1975b, 1982; Fleishman & Mumford, 1988; Fleishman & Quaintance,
1984). With regard to the Position Analysis Questionnaire (PAQ) (Mc-
Cormick, Jeanneret, & Mecham, 1972), McCormick (1976) explicitly ac-
knowledged that the ability concepts in the PAQ methodology drew on
Fleishman’s (1975b) taxonomies.

External Validity

This varied evidence, derived from the characteristics of the ability
requirement taxonomy, tends to argue for the meaningfulness of the
classification system. In constructing classifications, however, we are not
concerned only with applying the classification in the domain of objects
used in initial category definition. Rather, we hope that the classification
can be used to describe, predict, and understand other forms of behavior.
Tests of external validity attempt to assess the meaningfulness and utility
of a classification in this regard. Broadly speaking, two general strategies
might be used to appraise the external validity of a classification: (a)
generality tests and (b) inferential tests. In the ensuing discussion, we
will examine the meaningfulness of the ability requirement taxonomy
with respect to each of these criteria.
Generality tests. Attempts to appraise the external validity of a clas-
sification are likely to begin with simple generality tests. Here, the clas-
sification is extended to other populations and situations. In many of
these tests, the concern at hand is replication of the initial pattern of
internal relationships in new populations or situations. In other cases,
however, known population characteristics or potential situational mod-
erators will lead to changes in the expected characteristics of the classi-
fication. These systematic changes in the characteristics of a taxonomic
system can also be used to obtain construct validity evidence (Cronbach,
1971).
The preceding discussion has touched on four types of studies that
provide evidence for the taxonomy’s generality. First, the ability re-
quirement taxonomy has been found capable of generating summary de-
scriptions of most tasks performed on a number of jobs. Second, across
job settings, high interrater agreement coefficients have been obtained.
Third, the ratings used to assign tasks to ability categories appear to gen-
eralize across rater types. Fourth, the tasks are assigned to ability cate-
gories in a coherent and interpretable fashion across job settings impos-
ing markedly different performance requirements. Because these stud-
ies indicate that the internal characteristics of the ability requirement
taxonomy hold in multiple job settings, they provide evidence for the
system’s generality and, therefore, its external validity.
Several other studies, however, have provided more direct evidence
for the generality of conclusions derived from Fleishman’s (1972a, 1975b,
1982) ability requirement taxonomy. In one study, Zedeck (1975) ob-
tained ratings of the cognitive and physical abilities required to perform
installer-repairman tasks in San Diego and Sacramento. He found a high
degree of cross-site agreement in the resulting ability profiles (r = .68).
Similarly, Hogan et al. (1978) established the cognitive, psychomotor,
and physical abilities required to perform warehouse workers’ tasks in
three different cities. When the mean ratings obtained from incumbents
on each ability at each site were compared, only one significant (p <
.05) difference emerged. Even this one difference was accounted for by
a difference in the task demands (e.g., shelf height) at one site. These
findings indicate that ability ratings obtained in one location can be gen-
eralized to other locations.
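Cross-site comparisons of this kind reduce to profile similarity: the mean rating each ability receives at one site is correlated with the mean it receives at another. A sketch with invented site profiles rather than the cited data:

```python
# Cross-site generality as profile similarity: correlate two sites' mean
# ratings over the same set of abilities (illustrative numbers).

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Mean ratings on six abilities at each site:
site_a = [5.1, 3.2, 6.0, 2.4, 4.5, 3.8]
site_b = [4.8, 3.5, 5.7, 2.9, 4.1, 4.0]
print(round(pearson(site_a, site_b), 2))
```

A high profile correlation indicates that the two sites order the abilities similarly, even if absolute rating levels shift with local task demands such as the shelf-height difference noted below.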
Bernardin (1988), drawing upon 51er (1984), has provided an impor-
tant extension of these initial generality tests. 51er reviewed four sepa-
rate ability requirement studies of police patrol jobs. He found that 11
abilities, including verbal comprehension, verbal expression, spatial ori-
entation, and flexibility of closure, influenced policemen’s performance
across job settings. Bernardin subsequently attempted to replicate these
findings in analyses of police patrol jobs in Las Vegas, Chicago, Williams-
burg, and at Old Dominion University. Despite some marked differ-
ences in work environment, he found the same abilities could be used to
summarize task performance across these locations. Similarly, Gebhardt
and Crump (1983) and Weldon (1983) conducted separate analyses of
the physical abilities related to tasks performed by paramedics in
Pittsburgh and Los Angeles. Once again, a virtually identical set of abilities
was identified. When taken in conjunction with our foregoing observa-
tions, the results obtained in these studies provide some compelling evi-
dence for the cross-site generality of the summary descriptions provided
by the ability requirement scales.
Although ability requirement ratings appear to generalize across
work sites, the one significant difference obtained in the Hogan et al.
(1978) study of warehouse workers is of interest. Here, the ratings on
the extent flexibility scale yielded a significant difference for warehouse
workers at one site. This difference was linked to a marked shift in
task demands: the shelves in this warehouse were higher than those at
the other sites, increasing the stretching movements (extent flexibility)
required. From the point
of view of the present effort, this finding is noteworthy because it un-
derscores the taxonomy’s ability to capture substantively meaningful dif-
ferences in the nature of the tasks performed at different sites, thereby
arguing for the system’s construct validity.
In a recent study, Fleishman and Friedman (1990) obtained addi-
tional evidence for the meaningfulness of the ability requirement taxon-
omy for managerial positions. In this study, three primary performance
dimensions (project management, personnel supervision, and strategic
planning) were found to account for the relations among criticality in-
dices on 244 tasks performed by 117 managers from 15 research and de-
velopment organizations. Ability requirement ratings were subsequently
obtained from managers concerning the level of 20 different abilities re-
quired to perform the tasks within each of the three general performance
dimensions. Consistent with earlier observations of Jaques (1978) and
Pelz (1952), it was found that information processing or cognitive abil-
ities, such as information ordering, fluency of ideas, and originality,
tended to be emphasized in strategic planning tasks, while interpersonal
social skills tended to be emphasized in personnel supervision. It was
possible to show the relative importance of these abilities at different
organizational levels of management. In accordance with the earlier
observations of Fleishman and Quaintance (1984), it appears that the
ability requirements approach can capture substantively meaningful dif-
ferences in position requirements, even within a job family.
Inferential tests. The evidence presented above argues for the gen-
erality of the ability requirement taxonomy. A more fundamental issue
is whether the resulting categories give rise to viable inferences about
how category status influences other relevant forms of behavior. If these
inferences find support in observed relationships, they provide power-
ful evidence indicative of the classification’s meaningfulness. Moreover,
the results obtained in these inferential tests provide a basis for theory
development and refinement (Cronbach, 1971; Landy, 1986), thereby
contributing description, prediction, and understanding, as well as vali-
dation.
Performance requirements. Given the nature of this taxonomy, abil-
ity categories should map onto empirically derived dimensions of task
performance. In an initial evaluative effort, Theologus and Fleishman
(1973) attempted to confirm this hypothesis. In this investigation, they
had a panel of 79 judges rate descriptions of 38 tasks using 37 cognitive,
psychomotor, and physical abilities. Some 200 subjects were asked to
perform the same 38 tasks. To identify empirical dimensions summariz-
ing task performance, the correlations among the performance scores on
the 38 tasks were factor analyzed. Of the eight dimensions identified in
this factor analysis, the tasks yielding high loadings on seven dimensions
also produced high ratings on the level of the ability required on these
same dimensions. Hence, ability ratings were found to yield descriptions
of task performance similar to those obtained in quantitative analyses of
observed performance differences.
If, as Theologus and Fleishman’s (1973) data suggest, ability ratings
are related to empirically derived dimensions capable of summarizing
task performance, then another question might be posed. This observa-
tion suggests that ability requirement ratings might predict absolute dif-
ferences in task performance requirements. Fleishman, Gebhardt, and
Hogan (1986) initiated a series of investigations intended to address this
issue.
In several studies, Hogan and Fleishman (1979) and Hogan, Ogden,
Gebhardt, and Fleishman (1980) obtained ability requirement ratings
for a number of job and recreational tasks whose metabolic requirements
were known. High positive correlations (.85) were observed between the
tasks’ known metabolic requirements and independent ratings of their
physical ability requirements.
In another study (Hogan, Ogden, Gebhardt, & Fleishman, 1979), in-
dividuals were asked to perform various material-handling tasks where
boxes of identical size, but different weights, were to be moved differ-
ent distances to establish foot-pounds of work. These individuals were
also asked to rate each task on physical abilities. When the actual foot-
pounds of work required by each task was correlated with task ability
ratings, a coefficient of .88 was obtained. This confirms that ability rat-
ings are indicative of objective performance demands.
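The criterion in the material-handling study is straightforward physics (foot-pounds of work equals weight moved times distance), so the analysis amounts to correlating that objective index with mean effort ratings. The loads and ratings below are invented for illustration:

```python
# Correlating an objective work criterion (foot-pounds = lb x ft) with
# physical ability ratings, in the spirit of the material-handling study.
# All numbers here are illustrative.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

loads = [(20, 10), (20, 30), (40, 10), (40, 30), (60, 20)]  # (lb, ft)
work = [w * d for w, d in loads]         # foot-pounds per task
ratings = [2.1, 3.0, 2.8, 4.9, 4.6]      # mean physical-effort ratings
print(round(pearson(work, ratings), 2))
```

The logic is the same as in the study: if raters are tracking real demands, the rating-criterion correlation should be large, as the reported .88 coefficient was.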
The relationship between abilities and performance requirements
suggests that various derivative features of job activities, such as knowl-
edge requirements, should also be related to ability requirements. This
issue has been addressed in a recent study by Landy (1988), who obtained
646 job knowledge items from a battery of seven tests used to assess
candidates for potential promotion to fire captain. A group of indus-
trial psychologists was then asked to rate each item in terms of the abil-
ities in the Manual for the Ability Requirement Scales (Fleishman, 1975b,
1991) found to be relevant to task performance in an earlier job analysis.
Landy’s findings indicated that roughly half of the job knowledge items
reflected abilities held to summarize task performance.
Performance prediction. As Landy (1988) points out, one common
application of ability requirement data is to allow us to draw inferences
concerning the kinds of tests likely to predict job performance. One im-
plication of our foregoing observations is that ability requirement eval-
uations can be used to identify tasks that tap certain abilities. Individual
differences observed on these tasks might then be used to draw infer-
ences concerning the individual’s performance on other tasks calling for
this ability. But can actual performance be predicted from ability rat-
ings? In an early study, Theologus and Fleishman (1973) used six ability
requirement scales to obtain judges’ ratings of 27 laboratory tasks and to
predict the actual performance levels of 400 subjects on these tasks. The
common performance metric for all these tasks was “number of units
produced per unit time.” A multiple correlation of .64 was obtained be-
tween ability ratings and this performance metric, indicating that ability
scale ratings were, indeed, correlates of task performance.
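A multiple correlation of this kind can be computed from the normal equations. A two-predictor sketch with invented ratings and output scores (the original study used six scales):

```python
# Multiple R for two ability ratings predicting a common performance
# metric, via the normal equations on centered data (illustrative data).

def multiple_r(x1, x2, y):
    n = len(y)

    def center(v):
        m = sum(v) / n
        return [vi - m for vi in v]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    a, b, yc = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    sy1, sy2, syy = dot(a, yc), dot(b, yc), dot(yc, yc)
    det = s11 * s22 - s12 ** 2
    b1 = (sy1 * s22 - sy2 * s12) / det   # regression weights
    b2 = (sy2 * s11 - sy1 * s12) / det
    r2 = (b1 * sy1 + b2 * sy2) / syy     # R-squared = explained / total
    return r2 ** 0.5

verbal = [3, 5, 2, 6, 4, 7]        # rated verbal comprehension demand
motor  = [4, 2, 5, 3, 6, 2]        # rated control precision demand
output = [10, 11, 8, 14, 12, 13]   # units produced per unit time
print(round(multiple_r(verbal, motor, output), 2))
```

The resulting R plays the same role as the .64 reported above: it indexes how well the set of ability ratings jointly reproduces the common performance metric.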
Later, Myers, Gebhardt, Price, and Fleishman (1981) identified the
physical abilities required to perform various Army tasks. Subsequently,
job sample tasks, such as grenade throwing (upper-body explosive
strength) and loading of 55-pound cartons onto a truck (upper-body
static strength) were developed to measure these constructs. When per-
formance on the job sample tasks was correlated with performance on
the physical ability marker tests of these abilities, coefficients on the
order of .50 were obtained, confirming the expected relationship between
abilities and performance. Furthermore, the performance tests identi-
fied using this approach evidenced some convergent and discriminant
validity across job sample tasks. Another study, by Hogan et al. (1978),
confirmed these results using a job sample developed to simulate the or-
der selection and loading operations in a large warehouse. Generic abil-
ity tests, selected on the basis of the ability requirement scales, yielded a
multiple R of .45 in predicting performance on the job sample. Finally,
in a recent study, Gebhardt and Schemmer (1985) found validities in the
.30s for generic tests of abilities in the taxonomy against job samples of
tasks performed by dockworkers.
Other investigations of performance prediction, based on the ability
taxonomy and rating methodology, have involved a broader set of crite-
ria and occupations. In these studies, the ability requirement taxonomy
was used to identify the abilities related to performance on the part of
warehouse workers (Hogan et al., 1978), correctional officers (Gebhardt
& Weldon, 1982), pipeline workers (Gebhardt et al., 1983), electrical
workers (Cooper et al., 1982), and Army enlistees (Myers, Gebhardt,
Price, & Fleishman, 1981). The ability requirement data obtained in each study were then used to formulate hypotheses concerning the marker
tests that would predict performance. These tests were then adminis-
tered to a sample of incumbents and measures of job performance were
obtained in a concurrent validation design. The results obtained in these
investigations have been summarized by Fleishman (1988) and Fleish-
man and Mumford (1988). Broadly speaking, they found that measures
of the physical abilities related to performance typically yielded multiple
Rs on the order of .50-.60 against various indices of job performance
obtained at the time of test administration.
Similar results have been obtained when a predictive, as opposed to
a concurrent, design was used. For instance, Hogan, Jennings, Ogden,
and Fleishman (1980) and Wunder (1981) assessed the physical ability
requirements of 11 apprentice-level occupations in an oil refinery. They
then administered marker tests to job applicants intended to capture
these abilities. Six months later, performance appraisal ratings were
obtained. When test scores were correlated with performance ratings,
multiple Rs on the order of .50 were obtained. Other work by Arnold,
Rauschenberger, Soubel, and Guion (1982), Arnold (1988), and Braith-
waite and Markos (1980), concerned with steel workers, has indicated
that physical ability measures specified by this technique will yield pre-
dictive validity coefficients on the order of .50. Finally, Reilly, Zedeck,
and Tenopyr (1979) found that performance and turnover criteria could
be predicted by tests specified on the basis of the ability requirements
derived from the analysis of telecommunication craft workers’ tasks.
The foregoing discussion has focused on physical ability require-
ments primarily because the largest and most diverse body of inferen-
tial evidence was available for these scales. This observation, however,
brings to the fore another question: Can valid conclusions about selection
tests be derived for other ability constructs included in the taxonomy?
Hogan et al. (1978) provided some initial evidence pertinent to this ques-
tion. Their ability requirements analysis revealed that spatial visualiza-
tion was an important determinant of task performance. When a marker
test of this ability was administered to 127 warehouse workers, a corre-
lation of .41 was obtained with the number of units loaded on carts from
warehouse shelves. Similarly, a study of pipeline workers by Gebhardt
et al. (1983) indicated that mechanical comprehension and spatial abil-
ity were important descriptors of task performance. Marker tests de-
veloped to measure these abilities yielded significant correlations with
supervisors’ ratings of job performance. Cooper et al. (1983) analyzed
the abilities required by Federal Bureau of Investigation special agents; a
number of cognitive constructs, including inductive reasoning and origi-
nality, were identified as relevant to task performance. A battery of tests,
including measures of these abilities, yielded a cross-validated multiple
R of .35 when supervisory ratings of job performance were used as a cri-
terion measure.
A final issue pertinent to predictive applications of the ability re-
quirement taxonomy pertains to the issue of test transportability. As
noted earlier, the ability requirement taxonomy was, in part, intended
to summarize tasks drawn from a variety of settings. Thus, one might
expect ability measures derived from the ability requirements analysis
to generalize across job settings requiring the same abilities. A study
by Schemmer and Cooper (1986), employing AT&T craft occupations,
identified nine clusters of job families containing occupations calling for
similar abilities. These abilities had been identified by use of the Manual
for the Ability Requirement Scales. Their analysis of test transportability
found that the same ability measures were likely to predict performance
across jobs within a job family. Where there was a shift in task demands,
however, different tests were found to predict performance in different
job families.
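The job-family logic behind such transportability analyses can be sketched as follows. The occupation names, ability-profile ratings, and distance threshold below are hypothetical illustrations (not values from the Schemmer and Cooper study); the point is simply that jobs with similar ability-requirement profiles are grouped, and tests are expected to transport within, but not across, those groups.

```python
import numpy as np
from itertools import combinations

# Hypothetical mean ability-requirement ratings (1-7 scale) on four
# abilities for four occupations; names and numbers are illustrative only.
profiles = {
    "cable splicer":  [6.1, 5.8, 2.0, 2.2],
    "line installer": [5.9, 5.5, 2.4, 2.0],
    "service clerk":  [1.8, 2.1, 5.9, 6.2],
    "records clerk":  [2.0, 2.3, 6.1, 5.8],
}

def profile_distance(a, b):
    """Euclidean distance between two ability-requirement profiles."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

# Pair up jobs whose profiles fall within an (assumed) similarity threshold;
# jobs in the same family would be expected to share predictive ability tests.
threshold = 2.0
pairs = [(j1, j2) for j1, j2 in combinations(profiles, 2)
         if profile_distance(profiles[j1], profiles[j2]) < threshold]
print(pairs)
# → [('cable splicer', 'line installer'), ('service clerk', 'records clerk')]
```

A full analysis would of course use a formal clustering algorithm over many occupations, but the pairwise-distance idea is the same.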
Influences on performance. Taken as a whole, the studies discussed
above indicate that the ability requirement taxonomy can be used to
draw valid inferences about the kinds of tests likely to predict job per-
formance. This taxonomy has also proven useful in drawing inferences
about other kinds of events likely to influence performance. For in-
stance, it has traditionally been difficult to identify the kinds of variables
that influence performance on long-term monitoring tasks. Such tasks,
referred to in the literature as vigilance tasks, involve the detection of
infrequent signals over time. In a review of the literature on vigilance
research, Levine, Romashko, and Fleishman (1973) applied the ability
requirement scales to the diverse tasks used in these earlier studies. A
major finding was that these diverse tasks could be classified into tasks
where perceptual speed was the major ability requirement and tasks
where flexibility of closure was the major ability requirement. According
to the taxonomy, perceptual speed tasks involve mainly rapid identifica-
tion of the target when it appears. Flexibility of closure tasks require
identification in the presence of distracting stimuli. First of all, it was
found that performance in flexibility of closure tasks showed less decre-
ment than performance in perceptual speed tasks. This difference was
obscured in the absence of task classification by ability requirement.
Not only can the ability requirement taxonomy be used to draw valid
inferences about cross-task performance differences, it can also be used
to draw inferences about the kinds of events likely to influence task
performance. For example, the Levine et al. study (1973) showed that
the effects of such factors as signal rate and stimulus modality depended
on the primary ability underlying task performance. Generalizations
about the effects of these factors were enhanced by first classifying the
tasks according to their ability requirement.
Later, Levine, Kramer, and Levine (1975) extended these findings in
reviewing the literature on the effects of drugs and alcohol on perfor-
mance. They found that the tasks used in these studies could be categorized
into those requiring selective attention, perceptual speed, and control
precision. They then examined task performance after administration
of different dosages of alcohol. In accordance with the hypothesis that
complex cognitive functions would be most affected by alcohol, it was
found that selective attention and perceptual speed tasks yielded more
pronounced performance decrements than control precision tasks. Sim-
ilarly, Elkin, Fleishman, Van Cott, Horowitz, and Freedle (1965), Fleishman (1975b), and Fleishman, Elkin, and Baker (1983) showed how generalizations about the effects of drug dosage on human performance could
be enhanced when the task performance was categorized by ability re-
quirement. The effect of a particular drug dosage on the magnitude of
the effect, time to reach this effect, and time to recover depended on the
ability category of the task.
In another series of studies, an attempt was made to explain task per-
formance in terms of the relationship between ability requirements and
specific task characteristics. In this research, variations in task character-
istics were induced, and subsequently, changes in the patterns of abilities
related to task performance were identified. For instance, Fleishman
(1957b) examined choice reaction performance under different condi-
tions of display-control compatibility. He found, for example, that pro-
gressive rotation of the display from an upright position shifted ability
requirements from perceptual speed to spatial orientation and spatial
visualization. A later study (Wheaton, Eisner, Mirabella, & Fleishman,
1976) confirmed these findings with an auditory-perceptual task, show-
ing systematic changes in ability requirements with changes in signal in-
tensity and signal-noise ratios.
Other work in this program, by Rose, Fingerman, Wheaton, Eisner,
& Kramer (1974), extended this work to cognitive tasks. Here, reference tests of various abilities were administered to individuals who subsequently performed on electronic troubleshooting tasks of increasing
difficulty and complexity of circuit connections in wiring diagrams. The
findings clearly showed changes in the relative contribution of the flexi-
bility of closure, deductive reasoning, inductive reasoning, and associa-
tive memory abilities. A follow-up study by Fingerman, Eisner, Rose,
Wheaton, and Cohen (1975) extended this paradigm to concept identi-
fication tasks involving aircraft or ship identification. Again, these same
four abilities were found related to performance, but their relative con-
tribution depended on the level of task complexity. This research pro-
vides additional support for the meaningfulness of the ability taxonomy in linking task characteristics to ability requirements. It appears that when one
has control of the criterion measure, variations in ability requirements
with changes in task characteristics can be shown.
Influences on performance acquisition. Investigations examining how
abilities influence skill acquisition have shown how inferences derived
from this ability taxonomy might be used to enhance our understanding
of human performance (Fleishman & Mumford, 1989a, 1989b). In early
studies, 200-300 subjects received a battery of reference tests known to
sample certain abilities in the taxonomy (Fleishman, 1957a; Fleishman
& Ellison, 1969; Fleishman & Hempel, 1954, 1955). These same subjects
then received practice on a more complex criterion task to be learned.
Through the use of factor analytic techniques applied to the correlations
between ability test and learning trial scores, the role of various abilities
at different stages of learning could be traced. In general, these studies
with a variety of practice tasks showed that the particular combinations
of abilities contributing to performance on a task may change as practice
continues, and that these changes are systematic and eventually become
stabilized at later stages of proficiency. The particular abilities that pre-
dict performance early in skill acquisition are different from those pre-
dictive of more advanced proficiency levels.
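The crossing pattern described here, with different abilities predicting early versus late practice performance, can be sketched with simulated data. The ability labels, weights, and sample size below are our assumptions chosen to mimic the qualitative finding, not values from the original studies.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250  # roughly the 200-300 subjects used in the early studies

# Hypothetical standardized scores: one cognitive and one psychomotor
# reference test, plus criterion-task performance at early and late trials.
cognitive = rng.normal(size=n)
psychomotor = rng.normal(size=n)
early = 0.6 * cognitive + 0.2 * psychomotor + rng.normal(scale=0.8, size=n)
late = 0.2 * cognitive + 0.6 * psychomotor + rng.normal(scale=0.8, size=n)

def r(x, y):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])

# Predicted pattern: cognitive ability correlates with performance most
# strongly early in practice, psychomotor ability most strongly late.
print(f"early trials: cognitive r={r(cognitive, early):.2f}, "
      f"psychomotor r={r(psychomotor, early):.2f}")
print(f"late trials:  cognitive r={r(cognitive, late):.2f}, "
      f"psychomotor r={r(psychomotor, late):.2f}")
```

In the actual research program the trial-by-test correlation matrices were factor analyzed rather than inspected pairwise, but the systematic shift in which abilities carry the prediction is the phenomenon at issue.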
It is of note that this general pattern of relationships has been repli-
cated on more complex job-related tasks, such as Morse code learn-
ing (Fleishman & Fruchter, 1960) and air intercept mission simulations
(Parker & Fleishman, 1960). Furthermore, this basic pattern of results
has been replicated in alternative designs, using separate cross-sectional
analyses of skilled and unskilled performances (Fleishman, 1957a), re-
gression techniques for predicting practice trial loadings on reference
factors (Fleishman & Fruchter, 1960; Parker & Fleishman, 1960), analy-
ses of the interrelations among component and total-task measures at
different stages of practice (Fleishman, 1965; Fleishman & Fruchter,
1965), and by experimental methods (Fleishman & Rich, 1963). Still
other studies have shown that this pattern of relationships may be used
to draw inferences about optimal training interventions, which were sub-
sequently verified in a series of experimental studies (Parker & Fleish-
man, 1960). Having replicated this pattern of results using multiple
methods, Fleishman (1966) proposed an explanation which held that,
in perceptual-motor tasks, general cognitive abilities may play an impor-
tant role in the initial stages of skill acquisition because they represent
capacities that guide the definition and generation of responses. Over
time, these responses become more automatic, and performance is con-
ditioned by one or more psychomotor abilities.
Not only is this explanation consistent with Anderson’s (1982) no-
tions concerning the major stages of skill acquisition, where declara-
tive knowledge, knowledge compilation, and procedural stages are pro-
posed, but it also finds support in some recent theoretical and empirical
work. Ackerman (1986, 1987), for instance, induced manipulations in a
skill acquisition paradigm that prohibited the emergence of automatic
processing through use of a variable mapping condition. In accordance
with Fleishman’s (1966) earlier explanation, it was found that certain
general cognitive abilities remained important determinants of perfor-
mances throughout practice under the variable mapping condition. In
the consistent mapping condition, however, they diminished in impor-
tance. Recognizing the importance of this explanation for understanding
the general predictive power of intelligence measures, Murphy (1989)
went on to argue that ongoing changes in job requirements might ac-
count for the generalized cross-situational prediction derived from many
cognitive measures.
Beyond the support these observations provide for Fleishman’s
(1966) interpretation of the relationship between abilities and skill ac-
quisition, they point to an important source of evidence arguing for the
meaningfulness of the ability requirement taxonomy. More specifically,
these abilities are related to skill acquisition in a systematic fashion that
served to reveal something about how skilled performance emerges. By
showing that the ability requirement taxonomy can be used to enhance
our understanding of human performance and to draw meaningful in-
ferences about the conditions leading to the development of skilled per-
formance (Fleishman & Mumford, 1989a, 1989b; Parker & Fleishman,
1960), this research appears to provide further evidence of the system’s
construct validity. More recently, Fleishman and Mumford (1989b) have
shown how this research can provide a basis for developing and testing
causal models indicating how abilities influence the skill acquisition pro-
cess.

Conclusions

Some time ago, Paul Fitts (1962) stated the requirements for a tax-
onomy of human performance:

The importance of an adequate taxonomy for skilled tasks is widely recognized in all areas of psychological theorizing today. A taxonomy should
identify important correlates of learning rates, performance level, and in-
dividual differences. It should be equally applicable to laboratory tasks
and to tasks encountered in industry and in military service [Fleishman &
Quaintance, 1984, p. 4].

It would appear that the evidence on the ability requirement taxonomy
provides some match with Fitts’ criteria. However, the present effort
cannot be said to represent any final statement about the construct va-
lidity of the ability requirement taxonomy. As we noted earlier, the very nature of a construct validation effort makes final, absolute statements about
the validity of a classification system impossible. The present review, like
any other construct validation effort, has met its objective to the extent
that it permits one to weigh available evidence bearing on the meaning-
fulness of this system.
The broader intent of the present review, however, was to move the
evaluation of classification systems beyond simple, utilitarian criteria.
We sought to set forth principles for evaluating classifications of job be-
havior based on the postulate that classifications represent a set of con-
structs intended to provide a meaningful organization of a set of objects
or object relationships. We then attempted to show how this principle
might be used to identify validation strategies and illustrated their appli-
cations with respect to Fleishman’s (1972a, 1975b, 1982) ability require-
ment taxonomy.
In fact, the nature of classification systems suggests that construct
validity efforts should consider a variety of different sorts of evidence
obtained using different methods and procedures. The nature of classi-
fication efforts, however, suggests that certain kinds of central inferential
issues need to be addressed in a coherent and systematic fashion. More
specifically, there is a need for research testing inferences derived from
(a) the operations used to construct the taxonomy, (b) the internal re-
lations characterizing the taxonomy, and (c) the implications of these
categories for other relevant forms of behavior.
In assessing a classification’s validity, one must begin by considering
the meaningfulness of the operations used to define the summarization
categories. As in any other measurement effort, there is no a priori guar-
antee that the particular procedures selected will, in fact, yield viable
summary descriptions. Thus, studies should be designed that provide
evidence that the major operations employed in formulating the classi-
fication are, indeed, appropriate and likely to lead to viable descriptive
inferences. In the case of the ability requirements approach, studies of
interrater reliability, literature reviews, and expert judgment provided
evidence indicative of the appropriateness of the domain definition, cat-
egory content, and procedures used in appraising similarity and assign-
ing tasks to ability categories. Attempts to validate other classifications
may utilize these, as well as other procedures. Nonetheless, systematic
validation efforts need to address these four inferential issues associated
with the major operations used to construct classifications in virtually any
domain; these include domain definition, category content, similarity as-
sessment, and assignment decision rules.
Evidence arguing for the meaningfulness of the operations used in
constructing a classification cannot provide unequivocal validation evi-
dence. Because internal validity tests examine the relationships actually
produced by the classification, they provide more compelling validation
evidence. A great deal of internal validity evidence has been accrued
for the ability requirement taxonomy (Fleishman, 1972a, 1975b, 1982;
Fleishman & Mumford, 1988; Fleishman & Quaintance, 1984). Indices
of internal validity derived from behavior-category relationships indi-
cated that task performances observed in a variety of job settings could
be accounted for in a relatively unambiguous and parsimonious fashion.
Further, the assignment of tasks to categories revealed a substantively
interpretable pattern of relationships. The categories themselves also
yielded an interpretable pattern of relationships, as indicated by their
correlations, the pattern of characteristic job abilities, and the support
for these categories obtained in the broader literature.
Although these internal validity assessments argue for the potential
meaningfulness of the ability requirement taxonomy, and any classifi-
cation lacking internal validity is unlikely to evidence any real external
validity, the limited utility of internal validation evidence should be rec-
ognized. The limited value of internal validation efforts derives from
the fact that relational inferences are embedded in, and completely de-
pendent on, the classification structure as it stands. Thus, the final step
in evaluating classifications involves assessments of meaningfulness with
respect to inferences about behaviors that were not considered in the
system’s initial definition. These tests of external validity are likely to in-
clude descriptive extensions of the classification (generality tests), along
with explicit inferences concerning the implications of category status
for other forms of behavior (inferential tests).
In both kinds of external validity tests, the ability requirement taxon-
omy produced meaningful inferences about behavior in the workplace.
This classification appears to be reasonably generalizable. It was also
found that differences in ability requirement ratings across locations
were consistent with salient changes in job characteristics. More cen-
trally, however, the inferences derived from these categories proved use-
ful in the description, prediction, and understanding of people’s job be-
havior. The information provided by this taxonomy has proven useful in
identifying task performance demands and specifying the kinds of mea-
sures likely to prove useful in performance prediction. As illustrated
in the skill acquisition studies, descriptive information provided by this
taxonomy has also led to valid inferences concerning optimal training
interventions, while contributing to our further understanding of skill
development.
Taken as a whole, then, the overall pattern of the evidence accrued
to date tends to provide a strong argument for the construct validity
of descriptive data provided by the ability requirement taxonomy. The
strength of this construct validation evidence seems particularly com-
pelling in light of two characteristics of the present effort. First, a large
number of derivative inferences have been tested using multiple meth-
ods, and the vast majority of these inferences have been confirmed. Sec-
ond, this evaluation has been systematic in the sense that the evaluative
strategies in use were expressly selected to consider the central infer-
ential issues arising in classification efforts as they applied to the ability
requirement taxonomy. Without systematic validation strategies of the
sort sketched out above, it is unclear where investigators should focus
their efforts. These strategies suggest the kind of inferences that need to
be tested to determine whether the summary descriptions provided by a
classification lead to meaningful inferences. By applying the kind of con-
struct validation principles described above, systematic and progressive
evaluation efforts become possible. As a result, it becomes possible to
draw conclusions about a classification’s construct validity without over-
looking key inferential issues.
In considering these foregoing observations, however, one needs to
consider certain inherent limitations on the strength of the validity infer-
ences that can be drawn, even when these construct validation principles
have been systematically applied. As Cronbach (1971) points out, a con-
struct validation effort is never really complete. There will always be
other inferences that might be drawn, and there is always the possibil-
ity that these inferences will not be confirmed. Certainly, in the case of
the ability requirement taxonomy, there is a need for further validation
efforts in a number of areas. As new categories are added to the tax-
onomy due to ongoing research, there will certainly be a need to estab-
lish the meaningfulness of these categories within the broader structure
sketched out above (Wexley, 1989). Thus, our conclusions concerning
the validity of the ability requirement taxonomy must be viewed as con-
tingent statements based on the evidence available at this point in time.
The tentative nature of such conclusions becomes particularly salient
when it is recognized that research itself is a dynamic, rather than a static,
enterprise. In recent years, we have seen substantial progress in our
understanding of human abilities (Snow & Lohman, 1989; Sternberg,
1986). This cumulative theoretical progress may itself dictate future re-
visions in the ability requirement taxonomy to further enhance the mean-
ingfulness and comprehensiveness of the resulting descriptive informa-
tion. Current efforts are being made to extend this taxonomy to cap-
ture dimensions of interpersonal skills, personality, and generic knowl-
edges that might contribute to the development of skilled performance
(Fleishman & Friedman, 1990; Mumford, Weeks, Harding, & Fleish-
man, 1988; Mumford & Nickels, 1990). The current Manual for the Abil-
ity Requirement Scales (Fleishman, 1975a, 1991) is being supplemented
with such additional scales to measure these requirements. Examples
of task-anchored scales developed for skill/knowledge requirements are mechanical knowledge, electrical/electronic knowledge, knowledge of tools and uses, map reading, drafting, reading plans, driving, typing, keypunching, teletypewriting, shorthand, spelling, and grammar. Examples of personality requirements and interpersonal skill requirements are social sensitivity, persuasion, persistence, behavior flexibility, dependability, emotional stability, and self-confidence. This progressive refinement should
be viewed as a desirable characteristic of systematic construct validation
efforts.
This article has focused solely on inferences derived from the ability
requirement taxonomy. The evidence obtained for the meaningfulness
of this approach in no way speaks to its merits relative to alternative
systems for generating summary descriptions of job behavior. There is,
however, a need for this kind of comparative evaluation research. We hope the evaluation principles described will facilitate such comparative research by demarcating certain crucial validation issues. Construct val-
idation focusing on the meaningfulness of the descriptive information
provided by a particular classification system represents a necessary first
step for conducting more complex comparative studies.
The need for further efforts along these lines becomes especially im-
portant when it is recognized that these schemes for generating sum-
mary descriptions of job behavior provide a foundation for many prac-
tical and theoretical efforts in the field of industriallorganizational psy-
chology (see Fleishman & Quaintance, 1984; Primoff & Fine, 1988). If
the classification schemes in use do not permit valid inferences about
job behavior, then much of this work will be based on an insecure foun-
dation. When we lack adequate construct validation evidence, the fail-
ure of an intervention might be plausibly attributed to poor description,
rather than to some inherent deficiency in the intervention technique
being evaluated. It would seem that substantially more effort should be
devoted to establishing the construct validity of other job analysis sys-
tems. We hope that the present effort will serve as an impetus to further
research along these lines, because it is only through the systematic ap-
plication of scientific principles that we can construct truly meaningful
descriptions of job behavior.
REFERENCES

Ackerman PL. (1986). Individual differences in information processing: An investigation
of intellectual abilities and task performance during practice. Intelligence, 10, 101-
139.
Ackerman PL. (1987). Individual differences in skill learning: An integration of the psychometric and information processing perspectives. Psychological Bulletin, 102, 3-37.
Anderberg JL. (1973). Clustering algorithms. New York: Academic Press.
Anderson JR. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.
Annett J, Duncan KD. (1967). Task analysis and training design. Occupational Psychology, 41, 211-221.
Arnold JD. (1988). Entry-level steelworkers. In Gael S (Ed.), The job analysis handbook for business, government, and industry (pp. 1287-1300). New York: Wiley.
Arnold JD, Rauschenberger JM, Soubel VG, Guion RM. (1982). Validation and utility of a strength test for selecting steel workers. Journal of Applied Psychology, 67, 588-604.
Ash RA. (1988). Job analysis in the world of work. In Gael S (Ed.), The job analysis handbook for business, government, and industry (pp. 3-13). New York: Wiley.
Ash RA, Levine EL. (1980). A framework for evaluating job analysis methods. Personnel, 57, 53-59.
Bernardin JH. (1988). Police officer. In Gael S (Ed.), The job analysis handbook for business, government, and industry (pp. 1242-1254). New York: Wiley.
Braithwaite DW, Markos VH. (1980). Preemployment physical ability tests for production and maintenance positions. Pittsburgh: United States Steel Corp.
Carroll JB. (1976). Psychometric tests as cognitive tasks: A new “structure of intellect.” In Resnick L (Ed.), The nature of intelligence (pp. 27-56). Hillsdale, NJ: Lawrence Erlbaum.
Chesler DJ. (1948). Reliability and comparability of different job evaluation systems. Journal of Applied Psychology, 32, 465-475.
Cook TD, Campbell DT. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand-McNally.
Cooper MA, Schemmer FM, Fleishman EA, Yarkin-Levin K, Harding FD, McNelis J. (1987). Task analysis of Navy and Marine Corps occupations: Taxonomic basis for evaluating CW antidote/pretreatment drugs (Tech. Rep. 3130). Bethesda, MD: Advanced Research Resources Organization.
Cooper MA, Schemmer FM, Gebhardt DC, Marshall-Mies J, Fleishman EA. (1982). Development and validation of physical ability tests for jobs in the electric power industry (Tech. Rep. 3056). Bethesda, MD: Advanced Research Resources Organization.
Cooper MA, Schemmer FM, Jennings M, Korotkin AL. (1983). Developing selection standards for Federal Bureau of Investigation special agents. Bethesda, MD: Advanced Research Resources Organization.
Cronbach LJ. (1971). Test validation. In Thorndike RL (Ed.), Educational measurement (pp. 443-507). Washington, DC: American Council on Education.
Cronbach LJ, Gleser GC. (1953). Assessing the similarity between profiles. Psychological Bulletin, 50, 456-473.
Cronbach LJ, Meehl PE. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.
Crowson RA. (1970). Classification and biology. New York: Atherton.
Davis LE, Wacker GJ. (1988). Job design. In Gael S (Ed.), The job analysis handbook for business, government, and industry (pp. 157-172). New York: Wiley.
Drauden GM. (1988). Task inventory analysis in industry and the public sector. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 1051-1071). New York: Wiley.
Dunnette MD. (1966). Personnel selection and placement. Monterey, CA: Brooks/Cole.
Ekstrom RB, French JW, Harmon HH. (1979). Cognitive factors: Their identification and replication. Multivariate Behavioral Research Monographs (No. 79-2), 1-84.
Elkin EH, Fleishman EA, Van Cott HP, Horowitz H, Freedle RO. (1965). Effects of drugs on human performance: Research concepts, test development, and preliminary studies (Report AIR-E-25-10/65-AR-1). Washington, DC: American Institutes for Research.
Farina AJ Jr, Wheaton GR. (1973). Development of a taxonomy of human performance: The task characteristics approach to performance prediction. JSAS Catalog of Selected Documents in Psychology, 3, 26-27 (Ms. No. 323).
Fine SA. (1988). Heavy equipment operators. In Gael S (Ed.), The job analysis handbook for business, government, and industry (pp. 1301-1310). New York: Wiley.
Fingerman P, Eisner E, Rose AM, Wheaton GR, Cohen E. (1975). Methods for predicting job-ability requirements: III. Ability requirements as a function of changes in the characteristics of a concept identification task (Tech. Rep. 75-4). Washington, DC: American Institutes for Research.
Fitts PM. (1962). Factors in complex skill learning. In Glaser R (Ed.), Training research and education (pp. 177-198). Pittsburgh: University of Pittsburgh Press.
Fleishman EA. (1954). Dimensional analysis of psychomotor abilities. Journal of Experimental Psychology, 48, 437-454.
Fleishman EA. (1957a). A comparative study of aptitude patterns in unskilled and skilled psychomotor performance. Journal of Applied Psychology, 41, 263-272.
Fleishman EA. (1957b). Factor structure in relation to task difficulty in psychomotor performance. Educational and Psychological Measurement, 17, 522-532.
Fleishman EA. (1958). Dimensional analysis of movement reactions. Journal of Experimental Psychology, 55, 438-453.
Fleishman EA. (1964). The structure and measurement of physical fitness. Englewood Cliffs, NJ: Prentice-Hall.
Fleishman EA. (1965). The prediction of total task performance from prior practice on task components. Human Factors, 7, 18-27.
Fleishman EA. (1966). Human abilities and the acquisition of skill. In Bilodeau EA (Ed.), Acquisition of skill. New York: Academic Press.
Fleishman EA. (1967). Individual differences and motor learning. In Gagné RM (Ed.), Learning and individual differences (pp. 165-191). Columbus, OH: Charles Merrill.
Fleishman EA. (1972a). On the relation between abilities, learning, and human performance. American Psychologist, 27, 1017-1032.
Fleishman EA. (1972b). Structure and measurement of psychomotor abilities. In Singer RN (Ed.), The psychomotor domain: Movement behavior. Philadelphia: Lea & Febiger.
Fleishman EA. (1975a). Manual for the Ability Requirement Scales. Bethesda, MD: Management Research Institute.
Fleishman EA. (1975b). Toward a taxonomy of human performance. American Psychologist, 30, 1127-1149.
Fleishman EA. (1982). Systems for describing human tasks. American Psychologist, 37, 1-14.
570 PERSONNEL PSYCHOLOGY

Fleishman EA. (1988). Some new frontiers in personnel selection research. PERSONNEL PSYCHOLOGY, 41, 679-701.
Fleishman EA. (1991). Manual for the Ability Requirement Scales (MARS, revised). Palo Alto, CA: Consulting Psychologists Press.
Fleishman EA, Cobb AT, Spendolini MJ. (1976). Development of ability requirement scales for the analysis of yellow page sales jobs in the Bell System (Final Report). Bethesda, MD: Management Research Institute.
Fleishman EA, Elkin EH, Baker WJ. (1983). Effects of drugs on human performance: The effects of scopolamine on representative tests of human performance. Bethesda, MD: Advanced Research Resources Organization.
Fleishman EA, Ellison GD. (1969). Prediction of transfer and other learning phenomena from ability and personality measures. Journal of Educational Psychology, 60, 300-314.
Fleishman EA, Friedman L. (1990). Cognitive competencies related to management performance requirements in R&D organizations (Tech. Rep.). Fairfax, VA: Center for Behavioral and Cognitive Studies, George Mason University.
Fleishman EA, Fruchter B. (1960). Factor structure and predictability of successive stages of learning Morse code. Journal of Applied Psychology, 44, 96-101.
Fleishman EA, Fruchter B. (1965). Component and total task relations at different stages of learning a complex tracking task. Perceptual and Motor Skills, 20, 1305-1311.
Fleishman EA, Gebhardt DL, Hogan JC. (1986). The perception of physical effort in job tasks. In Borg G (Ed.), Perception of exertion in physical exercise (pp. 225-242). London: Macmillan Press.
Fleishman EA, Hempel WE Jr. (1954). Changes in factor structure of a complex psychomotor test as a function of practice. Psychometrika, 19, 239-252.
Fleishman EA, Hempel WE Jr. (1955). The relation between abilities and improvement with practice in a visual discrimination reaction task. Journal of Experimental Psychology, 49, 301-312.
Fleishman EA, Hempel WE Jr. (1956). Factorial analysis of complex psychomotor performance and related skills. Journal of Applied Psychology, 40, 96-104.
Fleishman EA, Hogan JC. (1978). Taxonomic method for assessing the physical requirements of jobs: The physical abilities analysis approach (Tech. Rep. 3012/R78-6). Bethesda, MD: Advanced Research Resources Organization.
Fleishman EA, Mumford MD. (1988). Ability requirement scales. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 917-935). New York: Wiley.
Fleishman EA, Mumford MD. (1989a). Individual attributes and training performance. In Goldstein IL (Ed.), Training and development in organizations (pp. 183-255). San Francisco: Jossey-Bass.
Fleishman EA, Mumford MD. (1989b). Abilities as causes of individual differences at different stages of skill acquisition. Human Performance, 2, 201-222.
Fleishman EA, Quaintance MK. (1984). Taxonomies of human performance: The description of human tasks. Orlando, FL: Academic Press.
Fleishman EA, Reilly ME. (1991). Human abilities: Their definition, measurement, and job task requirements. Palo Alto, CA: Consulting Psychologists Press.
Fleishman EA, Rich S. (1963). Role of kinesthetic and spatial-visual abilities in perceptual-motor learning. Journal of Experimental Psychology, 66, 6-11.
Fleishman EA, Stephenson RW. (1972). Development of a taxonomy of human performance: A review of the third year's progress. JSAS Catalog of Selected Documents in Psychology, 2, 39-40 (Ms. No. 112).
Fogli L. (1988). Supermarket cashier. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 1215-1228). New York: Wiley.
French JW, Ekstrom RB, Price LA. (1963). Kit of reference tests for cognitive factors. Princeton, NJ: Educational Testing Service.
Gagné RM. (1962). Human functions in systems. In Gagné RM (Ed.), Psychological principles in system development (pp. 35-74). New York: Holt, Rinehart, & Winston.
Gebhardt DL, Cooper M, Jennings MC, Crump C, Sample RA. (1983). Development and validation of selection tests for a natural gas company (Final Report 30789). Bethesda, MD: Advanced Research Resources Organization.
Gebhardt DL, Crump CE. (1983). Development of physical performance selection tests for paramedics in the city of Los Angeles (Tech. Rep.). Bethesda, MD: Advanced Research Resources Organization.
Gebhardt DL, Jennings MC, Fleishman EA. (1981). Factors affecting the reliability of physical ability and effort ratings of Navy tasks (Tech. Rep. 3034). Bethesda, MD: Advanced Research Resources Organization.
Gebhardt DL, Schemmer FM. (1985). Development and validation of selection tests for longshoremen and marine clerks (Tech. Rep. 3113). Bethesda, MD: Advanced Research Resources Organization.
Gebhardt DL, Weldon LJ. (1982). Development and validation of physical performance tests for correctional officers (Final Report 3080). Bethesda, MD: Advanced Research Resources Organization.
Goldstein IL. (1986). Training in organizations: Needs assessment, development, and evaluation. Monterey, CA: Brooks/Cole.
Gregson RA. (1975). Psychometrics of similarity. New York: Academic Press.
Guilford JP. (1967). The nature of human intelligence. New York: McGraw-Hill.
Guilford JP, Hoepfner R. (1966). Structure of intellect factors and their tests. Los Angeles: Psychological Laboratory, University of Southern California.
Guilford JP, Hoepfner R. (1971). The analysis of intelligence. New York: McGraw-Hill.
Guion RM. (1978). "Content validity" in moderation. PERSONNEL PSYCHOLOGY, 31, 205-213.
Guion RM. (1980). On trinitarian doctrines of validity. Professional Psychology, 11, 385-398.
Hackman JR. (1968). Tasks and task performance in research on stress. In McGrath JE (Ed.), Social and psychological factors in stress. New York: Holt, Rinehart, & Winston.
Hamer RM, Cunningham JW. (1981). Cluster analyzing profile data confounded with interrater differences: A comparison of profile association measures. Applied Psychological Measurement, 5, 63-73.
Harvey RJ, Lozada-Larson SR. (1988). Influence of amount of job descriptive information on job analysis rating accuracy. Journal of Applied Psychology, 73, 457-461.
Henderson RI. (1988). Job evaluation, classification, and pay. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 90-118). New York: Wiley.
Hogan JC, Fleishman EA. (1979). An index of the physical effort required in human task performance. Journal of Applied Psychology, 64, 197-204.
Hogan JC, Jennings MC, Ogden GD, Fleishman EA. (1980). Determining the physical ability requirements of Exxon apprentice jobs (Final Report 3044). Bethesda, MD: Advanced Research Resources Organization.
Hogan JC, Ogden GD, Fleishman EA. (1978). Assessing the physical requirements in selected benchmark jobs (Final Report 3012). Bethesda, MD: Advanced Research Resources Organization.
Hogan JC, Ogden GD, Fleishman EA. (1979). The development and validation of tests for the order selector job at Certified Grocers of California, Ltd. (Tech. Rep. 3029). Bethesda, MD: Advanced Research Resources Organization.
Hogan JC, Ogden GD, Gebhardt DL, Fleishman EA. (1979). An index of physical effort required in human task performance. Journal of Applied Psychology, 65, 672-679.
Hogan JC, Ogden GD, Gebhardt DL, Fleishman EA. (1980). Reliability and validity of methods for evaluating perceived physical effort. Journal of Applied Psychology, 65, 672-679.
Horn JL, Knapp JR. (1973). On the subjective character of the empirical base for Guilford's structure of intellect model. Psychological Bulletin, 80, 30-43.
Horn JL. (1976). Human abilities: A review of research and theory in the early 1970s. In Rosenzweig MR, Porter LW (Eds.), Annual review of psychology (Vol. 27). Palo Alto, CA: Annual Reviews.
Imhoff DL, Levine JM. (1980). Development of a perceptual-motor and cognitive performance task battery for pilot selection (Tech. Rep.). Bethesda, MD: Advanced Research Resources Organization.
Inn A, Schulman DR, Ogden GD, Sample RA. (1982). Physical ability requirements of Bell System jobs (Final Report 3057/R82-1). Bethesda, MD: Advanced Research Resources Organization.
Israelski EW. (1988). Automobile mechanic. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 1311-1328). New York: Wiley.
James LR, Mulaik SA, Brett JM. (1984). Causal analysis: Assumptions, models, and data. Beverly Hills, CA: Sage.
Jaques E. (1978). Levels of abstraction in logic and human action. Exeter, NH: Heinemann Educational Books.
Jöreskog KG, Sörbom D. (1979). Structural equations analysis. Reading, MA: Abt Books.
Konz SA. (1988). Designing the work environment. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 731-748). New York: Wiley.
Kulik CT, Oldham GR. (1988). Job diagnostic survey. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 936-959). New York: Wiley.
Landy FJ. (1986). Stamp collecting versus science: Validation as hypothesis testing. American Psychologist, 41, 1183-1192.
Landy FJ. (1988). Selection procedure development and usage. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 271-287). New York: Wiley.
Latham GP, Wexley KN. (1980). Increasing productivity through performance appraisal. Reading, MA: Addison-Wesley.
Levine EL, Ash RA, Bennett N. (1980). Exploratory comparative study of four job analysis methods. Journal of Applied Psychology, 65, 524-535.
Levine JM, Kramer GG, Levine EN. (1975). Effects of alcohol on human performance: An integration of research findings based on an abilities classification. Journal of Applied Psychology, 60, 285-293.
Levine JM, Romashko T, Fleishman EA. (1973). Evaluations of an abilities classification system for integrating and generalizing findings about human performance: The vigilance area. Journal of Applied Psychology, 58, 147-149.
Lopez FM. (1988). Threshold traits analysis system. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 880-901). New York: Wiley.
Madden JM. (1962). What makes work difficult? Personnel Journal, 41, 341-344.
Malamud SM, Levine JM, Fleishman EA. (1980). Identifying ability requirements by decision flow diagrams. Human Factors, 22, 57-68.
Mayr E. (1969). Principles of systematic zoology. New York: McGraw-Hill.
McCormick EJ. (1976). Job and task analysis. In Dunnette MD (Ed.), Handbook of industrial and organizational psychology (pp. 651-696). Chicago: Rand McNally.
McCormick EJ. (1979). Job analysis: Methods and applications. New York: Amacom.
McCormick EJ, Jeanneret PR, Mecham RC. (1972). A study of job characteristics and job dimensions as based on the Position Analysis Questionnaire (PAQ). Journal of Applied Psychology, 56, 347-368.
Messick S. (1975). The standard problem: Meaning and values in measurement and evaluation. American Psychologist, 30, 955-966.
Messick S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, 1012-1027.
Messick S. (1989). Validity. In Linn RL (Ed.), Educational measurement (pp. 13-103). New York: Macmillan.
Miller EE. (1969). A taxonomy of response processes (Tech. Rep. 69-16). Fort Knox, KY: Human Resources Research Organization.
Miller RB. (1962). Task description and analysis. In Gagné RM (Ed.), Psychological principles in system development (pp. 187-230). New York: Holt, Rinehart, & Winston.
Miller RB. (1967). Task taxonomy: Science or technology? In Singleton WT, Easterby RS, Whitfield DC (Eds.), The human operator in complex systems. London: Taylor & Francis.
Miller RB. (1973). Development of a taxonomy of human performance: Design of a systems task vocabulary. JSAS Catalog of Selected Documents in Psychology, 3, 29-30 (Ms. No. 327).
Mulaik SA. (1986). Toward a synthesis of deterministic and probabilistic formulation of causal relations by the functional relation concept. Philosophy of Science, 53, 313-337.
Mumford MD, Nickels BJ. (1990). Making sense of people's lives: Applying principles of content and construct validity to background data. Forensic Reports, 3, 143-168.
Mumford MD, Stokes GS, Owens WA. (1990). Patterns of life history: The ecology of human individuality. Hillsdale, NJ: Lawrence Erlbaum.
Mumford MD, Weeks JL, Harding FD, Fleishman EA. (1987). Measuring occupational difficulty: A construct validation against training criteria. Journal of Applied Psychology, 72, 578-587.
Mumford MD, Weeks JL, Harding FD, Fleishman EA. (1988). Relations between student characteristics, course content, and training outcomes: An integrative modeling effort. Journal of Applied Psychology, 73, 443-456.
Mumford MD, Yarkin-Levin K, Korotkin AL, Wallis MR, Marshall-Mies J. (1985). Characteristics relevant to performance as an Army leader: Knowledges, skills, abilities, other characteristics, and generic skills (Tech. Rep.). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
Murphy KR. (1989). Is the relationship between cognitive ability and job performance stable over time? Human Performance, 2, 183-200.
Myers DC, Gebhardt DL, Crump CE, Fleishman EA. (1984). Factor analysis of strength, cardiovascular endurance, flexibility, and body composition measures (Tech. Rep. R83-9). Bethesda, MD: Advanced Research Resources Organization.
Myers DC, Gebhardt DL, Price SJ, Fleishman EA. (1979). Development of physical performance standards for Army jobs: The job analysis methodology (Report 3045/R79-10). Bethesda, MD: Advanced Research Resources Organization.
Myers DC, Gebhardt DL, Price SJ, Fleishman EA. (1981). Development of physical performance standards for Army jobs: Validation of the physical abilities analysis methodology (Final Report 3045). Bethesda, MD: Advanced Research Resources Organization.
Myers DC, Jennings MC, Fleishman EA. (1981). Development of job-related medical standards and physical tests for court security officer jobs (Final Report 3062). Bethesda, MD: Advanced Research Resources Organization.
Olson HC, Fine SA, Myers DC, Jennings MC. (1979). The use of functional job analysis in establishing performance standards for heavy equipment operators. PERSONNEL PSYCHOLOGY, 34, 351-364.
Owens WA, Schoenfeldt LF. (1979). Toward a classification of persons. Journal of Applied Psychology, 64, 570-607.
Parker JF Jr, Fleishman EA. (1960). Ability factors and component performance measures as predictors of complex tracking behavior. Psychological Monographs, 74 (No. 503).
Pelz DC. (1952). Influence: A key to effective leadership in first-line managers. Personnel, 29, 205-217.
Peterson NG, Bownas DA. (1982). Skill, task structure, and performance acquisition. In Dunnette MD, Fleishman EA (Eds.), Human performance and productivity: Human capability assessment. Hillsdale, NJ: Lawrence Erlbaum.
Primoff ES, Eyde LD. (1988). Job element analysis. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 807-824). New York: Wiley.
Primoff ES, Fine SA. (1988). A history of job analysis. In Gael S (Ed.), The job analysis handbook for business, industry, and government (pp. 14-29). New York: Wiley.
Reilly RR, Zedeck S, Tenopyr ML. (1979). Validity and fairness of physical ability tests for predicting performance in craft jobs. Journal of Applied Psychology, 64, 262-274.
Reilly RR, Zink DL. (1980). Analysis of three outside craft jobs (AT&T Research Report). New York: American Telephone and Telegraph Co.
Reilly RR, Zink DL. (1981). Analysis of four inside craft jobs (AT&T Research Report). New York: American Telephone and Telegraph Co.
Romashko T, Brumback GB, Fleishman EA, Hahn CP. (1974). Development of a procedure to validate physical tests (Tech. Rep.). Washington, DC: American Institutes for Research.
Romashko T, Hahn CP, Brumback GB. (1976). The prototype development of job-related physical testing for Philadelphia policeman selection (Tech. Rep.). Washington, DC: American Institutes for Research.
Rose A, Fingerman P, Wheaton G, Eisner E, Kramer G. (1974). Methods for predicting job ability requirements: II. Ability requirements as a function of changes in the characteristics of an electronic fault-finding task. Washington, DC: American Institutes for Research.
Rupe JC. (1956). Research into basic methods and techniques of Air Force job analysis: IV (AFPTRC-TN-56-51). Chanute AFB, IL: Air Force Personnel and Training Research Center, Air Research and Development Command.
Schemmer FM. (1982). Development of rating scales for selected visual, auditory, and speech abilities (Final Report 3064). Bethesda, MD: Advanced Research Resources Organization.
Schemmer FM, Cooper MA. (1986). Test transportability study for technician jobs in the telephone industry. Bethesda, MD: Advanced Research Resources Organization.
Simon HA. (1953). Causal ordering and identifiability. In Hood WC, Koopmans TC (Eds.), Studies in econometric methods (pp. 49-74). New York: Wiley.
Simon HA. (1957). Making management decisions. The Academy of Management Executive, 1(2), 57-64.
Simpson GG. (1961). Principles of animal taxonomy. New York: Columbia University Press.
Snow RE, Lohman DF. (1984). Toward a theory of cognitive aptitude for learning from instruction. Journal of Educational Psychology, 76, 347-376.
Snow RE, Lohman DF. (1989). Implications of cognitive psychology for educational measurement. In Linn RL (Ed.), Educational measurement (pp. 263-331). New York: Macmillan.
Sokal RR. (1974). Classification: Purposes, principles, progress, prospects. Science, 185, 1115-1123.
Sokal RR, Sneath PHA. (1963). Principles of numerical taxonomy. San Francisco: Freeman.
Sternberg RJ. (1985). Implicit theories of intelligence, creativity, and wisdom. Journal of Personality and Social Psychology, 44, 607-627.
Sternberg RJ. (1986). Synopsis of a triarchic theory of human intelligence. In Irvine SH, Newstead SE (Eds.), Intelligence and cognition. Boston: Nijhoff.
Theologus GC, Fleishman EA. (1973). Development of a taxonomy of human performance: Validation study of ability scales for classifying human tasks. JSAS Catalog of Selected Documents in Psychology, 3, 29 (Ms. No. 326).
Theologus GC, Romashko T, Fleishman EA. (1973). Development of a taxonomy of human performance: A feasibility study of ability dimensions for classifying human tasks. JSAS Catalog of Selected Documents in Psychology, 3, 25-26 (Ms. No. 321).
Tyler TA. (1984). Police officer job description. Flossmoor, IL: Merit Employment Assessment Services.
Wagner M. (1985, August). On the use of maintenance reports in job analysis. Paper presented at the 93rd annual convention of the American Psychological Association, Los Angeles, CA.
Wallis MR, Korotkin AL, Yarkin-Levin K, Schemmer FM. (1985). Leadership job dimensions and competency requirements for commissioned and non-commissioned officers: Remediation of inadequacies in existing databases (Report 3084, Vol. 2). Bethesda, MD: Advanced Research Resources Organization.
Weldon L. (1983). Recommendations for physical ability testing and medical guidelines for the city of Pittsburgh (Final Report 3075). Bethesda, MD: Advanced Research Resources Organization.
Wexley KN. (1989). Contributions to the practice of training. In Goldstein IL (Ed.), Training and development in organizations (pp. 487-500). San Francisco: Jossey-Bass.
Wheaton GR. (1973). Development of a taxonomy of human performance: A review of classificatory systems relating tasks and performance. JSAS Catalog of Selected Documents in Psychology, 3, 22-23 (Ms. No. 317).
Wheaton GR, Eisner E, Mirabella G, Fleishman EA. (1976). Ability requirements as a function of changes in the characteristics of an auditory signal identification task. Journal of Applied Psychology, 61, 663-676.
Wunder SR. (1981). Predictive validity of a physical abilities testing program for process apprentices. Baton Rouge, LA: Exxon Corp., Employee Relations Department.
Zedeck S. (1975). Validation of physical abilities tests for AT&T craft positions: Program report with special emphasis on detailed job analyses (Tech. Rep. 5). New York: American Telephone and Telegraph Co.
