valid score, i.e., a score correctly indicating what he can do rather than
overestimating or underestimating his abilities. Most persons will understand
that an incorrect decision, which might result from invalid test
scores, would mean subsequent failure, loss of time, and frustration for
them. This approach can serve not only to motivate the individual to
try his best on ability tests but also to reduce faking and encourage frank
reporting on personality inventories, because the examinee realizes that
he himself would otherwise be the loser. It is certainly not in the best
interests of the individual to be admitted to a course of study for which
he is not qualified or assigned to a job he cannot perform or that he
would find uncongenial.
TEST ANXIETY
When the teacher says she is going to find out how much you have learned,
does your heart begin to beat faster?
While you are taking a test, do you usually think you are not doing well?
Of primary interest is the finding that both school achievement and intelligence
test scores yielded significant negative correlations with test anxiety.
Similar correlations have been found among college students (I. G.
Sarason, 1961). Longitudinal studies likewise revealed an inverse relation
between changes in anxiety level and changes in intelligence or achievement
test performance (Hill & Sarason, 1966; Sarason, Hill, & Zimbardo,
1964).
Such findings, of course, do not indicate the direction of causal relationships.
It is possible that children develop test anxiety because they perform
poorly on tests and have thus experienced failure and frustration in
previous test situations. In support of this interpretation is the finding
that within subgroups of high scorers on intelligence tests, the negative
correlation between anxiety level and test performance disappears
(Denny, 1966; Feldhusen & Klausmeier, 1962). On the other hand, there
is evidence suggesting that at least some of the relationship results from
the deleterious effects of anxiety on test performance. In one study
(Waite, Sarason, Lighthall, & Davidson, 1958), high-anxious and
low-anxious children equated in intelligence test scores were given repeated
trials in a learning task. Although initially equal in the learning test, the
low-anxious group improved significantly more than the high-anxious.
Several investigators have compared test performance under conditions
designed to evoke “anxious” and “relaxed” states. Mandler and Sarason
(1952), for example, found that ego-involving instructions, such as telling
subjects that everyone is expected to finish in the time allotted, had a
beneficial effect on the performance of low-anxious subjects, but a deleterious
effect on that of high-anxious subjects. Other studies have likewise
found an interaction between testing conditions and such individual characteristics
as anxiety level and achievement motivation (Lawrence, 1962;
Paul & Eriksen, 1964). It thus appears likely that the relation between
anxiety and test performance is nonlinear, a slight amount of anxiety
being beneficial while a large amount is detrimental. Individuals who are
customarily low-anxious benefit from test conditions that arouse some
anxiety, while those who are customarily high-anxious perform better
under more relaxed conditions.
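
The interaction described above can be pictured with a toy inverted-U model.
The sketch below is an illustration only, not a model taken from the studies
cited: the function name, the quadratic form, the optimum of 5, and every
numeric value are hypothetical assumptions chosen to show how a single
nonlinear curve can produce opposite effects of the same testing condition
on customarily low-anxious and customarily high-anxious examinees.

```python
def performance(trait_anxiety: float, situational_arousal: float,
                optimum: float = 5.0) -> float:
    """Toy inverted-U: the score peaks when total anxiety equals `optimum`.

    The quadratic form and the optimum are assumptions for illustration only.
    """
    total_anxiety = trait_anxiety + situational_arousal
    return 100.0 - (total_anxiety - optimum) ** 2


# Hypothetical anxiety units; none of these values come from the cited studies.
LOW_TRAIT, HIGH_TRAIT = 2.0, 6.0      # customary (trait) anxiety level
RELAXED, EGO_INVOLVING = 0.0, 3.0     # anxiety added by the testing condition

for label, trait in (("low-anxious", LOW_TRAIT), ("high-anxious", HIGH_TRAIT)):
    relaxed_score = performance(trait, RELAXED)
    aroused_score = performance(trait, EGO_INVOLVING)
    print(f"{label}: relaxed={relaxed_score:.0f}, ego-involving={aroused_score:.0f}")
```

Under these assumed values, the low-anxious examinee moves toward the peak of
the curve when the instructions add some anxiety, while the high-anxious
examinee is pushed past the peak, which is the crossover pattern the
paragraph describes.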
It is undoubtedly true that a chronically high anxiety level will exert a
detrimental effect on school learning and intellectual development. Such
an effect, however, should be distinguished from the test-limited effects
with which this discussion is concerned. To what extent does test anxiety
make the individual’s test performance unrepresentative of his customary
performance level in nontest situations? Because of the competitive pressure
experienced by college-bound high school seniors in America today,
it has been argued that performance on college admission tests may be
unduly affected by test anxiety. In a thorough and well-controlled investigation
of this question, French (1962) compared the performance of high
school students on a test given as part of the regular administration of
the SAT with performance on a parallel form of the test administered at
a different time under “relaxed” conditions. The instructions on the latter
occasion specified that the test was given for research purposes only and
scores would not be sent to any college. The results showed that performance
was no poorer during the standard administration than during
the relaxed administration. Moreover, the concurrent validity of the test
scores against high school course grades did not differ significantly under
the two conditions.
Nature and Use of Psychological Tests 39
thal & Rosnow, 1969). An experiment conducted with the Rorschach will
illustrate this effect (Masling, 1965). The examiners were 14 graduate
student volunteers, 7 of whom were told, among other things, that experienced
examiners elicit more human than animal responses from the
subjects, while the other 7 were told that experienced examiners elicit
more animal than human responses. Under these conditions, the two
groups of examiners obtained significantly different ratios of animal to
human responses from their subjects. These differences occurred despite
the fact that neither examiners nor subjects reported awareness of any
influence attempt. Moreover, tape recordings of all testing sessions revealed
no evidence of verbal influence on the part of any examiner. The
examiners’ expectations apparently operated through subtle postural and
facial cues to which the subjects responded.
Apart from the examiner, other aspects of the testing situation may
significantly affect test performance. Military recruits, for example, are
often examined shortly after induction, during a period of intense readjustment
to an unfamiliar and stressful situation. In one investigation
designed to test the effect of acclimatization to such a situation on test
performance, 2,724 recruits were given the Navy Classification Battery
during their ninth day at the Naval Training Center (Gordon & Alf,
1960). When their scores were compared with those obtained by 2,180
recruits tested at the conventional time, during their third day, the 9-day
group scored significantly higher on all subtests of the battery.
The examinees’ activities immediately preceding the test may also affect
their performance, especially when such activities produce emotional
disturbance, fatigue, or other handicapping conditions. In an investigation
with third- and fourth-grade schoolchildren, there was some evidence
to suggest that IQ on the Draw-a-Man Test was influenced by the children’s
preceding classroom activity (McCarthy, 1944). On one occasion,
the class had been engaged in writing a composition on “The Best
Thing That Ever Happened to Me”; on the second occasion, they had
again been writing, but this time on “The Worst Thing That Ever Happened
to Me.” The IQ’s on the second test, following what may have
been an emotionally depressing experience, averaged 4 or 5 points lower
than on the first test. These findings were corroborated in a later investigation
specifically designed to determine the effect of immediately preceding
experience on the Draw-a-Man Test (Reichenberg-Hackett, 1953).
In this study, children who had had a gratifying experience involving the
successful solution of an interesting puzzle, followed by a reward of toys
and candy, showed more improvement in their test scores than those who
had undergone neutral or less gratifying experiences. Similar results were
obtained by W. E. Davis (1969a, 1969b) with college students. Performance
on an arithmetic reasoning test was significantly poorer when
preceded by a failure experience on a verbal comprehension test than it
was in a control group given no preceding test and in one that had taken
a standard verbal comprehension test under ordinary conditions.
Several studies have been concerned with the effects of feedback regarding
test scores on the individual’s subsequent test performance. In a
particularly well-designed investigation with seventh-grade students,
Bridgeman (1974) found that “success” feedback was followed by significantly
higher performance on a similar test than was “failure” feedback
in subjects who had actually performed equally well to begin with.
This type of motivational feedback may operate largely through the goals
the subjects set for themselves in subsequent performance and may thus
represent another example of the self-fulfilling prophecy. Such general
motivational feedback, however, should not be confused with corrective
feedback, whereby the individual is informed about the specific items he
missed and given remedial instruction; under these conditions, feedback
is much more likely to improve the performance of initially low-scoring
persons.
The examples cited in this section illustrate the wide diversity of test-related
factors that may affect test scores. In the majority of well-administered
testing programs, the influence of these factors is negligible for
practical purposes. Nevertheless, the skilled examiner is constantly on
guard to detect the possible operation of such factors and to minimize
their influence. When circumstances do not permit the control of these
conditions, the conclusions drawn from test performance should be
qualified.
made. Thus, it can be stated that a test score is invalidated only when a
particular experience raises it without appreciably affecting the criterion
behavior that the test is designed to predict.
Coaching. The effects of coaching on test scores have been widely investigated.
Many of these studies were conducted by British psychologists,
with special reference to the effects of practice and coaching on the
tests formerly used in assigning 11-year-old children to different types of
secondary schools (Yates et al., 1953-1954). As might be expected, the
extent of improvement depends on the ability and earlier educational
experiences of the examinees, the nature of the tests, and the amount and
type of coaching provided. Individuals with deficient educational backgrounds
are more likely to benefit from special coaching than are those
who have had superior educational opportunities and are already prepared
to do well on the tests. It is obvious, too, that the closer the resemblance
between test content and coaching material, the greater will
be the improvement in test scores. On the other hand, the more closely
instruction is restricted to specific test content, the less likely is improvement
to extend to criterion performance.
In America, the College Entrance Examination Board has been concerned
about the spread of ill-advised commercial coaching courses for
college applicants. To clarify the issues, the College Board conducted
several well-controlled experiments to determine the effects of coaching
on its Scholastic Aptitude Test and surveyed the results of similar studies
by other, independent investigators (Angoff, 1971b; College Entrance
Examination Board, 1968). These studies covered a variety of coaching
methods and included students in both public and private high schools;
one investigation was conducted with black students in 15 urban and
rural high schools in Tennessee. The conclusion from all these studies is
that intensive drill on items similar to those on the SAT is unlikely to
produce appreciably greater gains than occur when students are retested
with the SAT after a year of regular high school instruction.
On the basis of such research, the Trustees of the College Board issued
a formal statement about coaching, in which the following points were
made, among others (College Entrance Examination Board, 1968,
pp. 8-9):
The results of the coaching studies which have thus far been completed indicate
that average increases of less than 10 points on a 600 point scale can
be expected. It is not reasonable to believe that admissions decisions can be
affected by such small changes in scores. This is especially true since the tests
are merely supplementary to the school record and other evidence taken into
account by admissions officers. . . . As the College Board uses the term, aptitude
is not something fixed and impervious to influence by the way the child
lives and is taught. Rather, this particular Scholastic Aptitude Test is a measure
of abilities that seem to grow slowly and stubbornly, profoundly influenced
by conditions at home and at school over the years, but not responding to
hasty attempts to relive a young lifetime.
It should also be noted that in its test construction procedures, the College
Board investigates the susceptibility of new item types to coaching
(Angoff, 1971b; Pike & Evans, 1972). Item types on which performance
can be appreciably raised by short-term drill or instruction of a narrowly
limited nature are not included in the operational forms of the tests.
day to three years (Angoff, 1971b; Droege, 1966; Peel, 1951, 1952).
Similar results have been obtained with normal and intellectually gifted
schoolchildren, high school and college students, and employee samples.
Data on the distribution of gains to be expected on a retest with a parallel
form should be provided in test manuals and allowance for such gains
should be made when interpreting test scores.
Social and Ethical Implications of Testing
USER QUALIFICATIONS
PROTECTION OF PRIVACY
CONFIDENTIALITY