John and Jean Raven. 30 Great King Street. Edinburgh EH3 6QH. Scotland. United
Kingdom.
224 John and Jean Raven
[Figure: sample matrix test item with eight numbered response options (1 to 8); image not recoverable.]
on many different abilities were composited into total scores, and the
individual subscale scores were too unreliable to use. He therefore set
about developing tests that would be easy to administer, theoretically
based, and directly interpretable without the need to perform complex cal-
culations to arrive at scores on latent, or underlying, "factors" or variables.
Raven was a student of Spearman's, and Spearman's formulations of
intelligence influenced him. It is well known that Spearman (1927a, 1927b)
was the first to notice that measures of what had previously been thought
of as separate abilities tended to correlate relatively highly and to suggest
that the resulting pattern of intercorrelations could be largely explained by
positing a single underlying factor of general cognitive ability or g.
It is less well known that Spearman thought of g as being made up of
two very different abilities which normally work closely together. One he
termed eductive ability (meaning-making ability) and the other reproductive
ability (the ability to reproduce explicit information and learned
skills). He did not claim that these were separate factors. Rather, he
argued that they were analytically distinguishable components of g. Raven
developed his RPM tests as measures of eductive ability and his Mill Hill
Vocabulary (MHV) tests as measures of reproductive ability (Raven,
Raven, & Court, 1998d, 2000). These abilities are often, inappropriately,
referred to as "fluid" and "crystallized" intelligence.
Raven Progressive Matrices 225
Figure 11.2 More Difficult Standard Progressive Matrices Type Test Item.
Standardization
Original normative data for Britain for the SPM were assembled by
J. C. Raven and his colleagues immediately before and during World War II.
The way in which the samples were combined is described in Raven (2000c).
As also indicated in that article, sporadic (but, as it later turned out, inad-
equate) checks on the continuing appropriateness of these norms and their
relevance to the United States were made by a number of researchers until
the mid 1970s. Since then, many researchers, working largely with the
present authors, have assembled a vast pool of international, ethnic, and
time series norms that enable users to set the scores of groups and indi-
viduals they have tested in a variety of contexts. Many of these norms are
included in the main Sections of the Manual (Raven, 2000b; Raven, Raven,
& Court, 1998a, b, c, d; Raven et al., 2000), and others are summarized in
its Section 7: Research and References (Court & Raven, 1995).
Internal Consistency
The appropriate model for assessing the internal consistency of the
RPM and MHV is Item Response Theory. Given that the tests conform to
Test-Retest Reliability
Reports of retest reliability are contained in over 120 articles, sum-
marized in Court and Raven (1995). These studies differ widely in method-
ology, the populations studied (some being, for example, Native Americans
or Africans and others psychiatric patients), and the intervals between the
initial testing and retest. The retest intervals range from 1 week to 3 years.
As would be expected, the test-retest correlations vary markedly with
all these conditions. It would be misleading to report any single value.
A textual summary of the main studies is presented by Raven et al.
(2000) and concludes that well-conducted studies point to adequate retest
reliabilities (i.e., 0.85 and upward) for intervals up to a year.
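A retest reliability of the kind summarized here is ordinarily a correlation between scores at first testing and scores at retest. The sketch below assumes the Pearson product-moment coefficient is the statistic in question; the function name `pearson_r` and the eight pairs of raw scores are invented for illustration and are not taken from any of the studies cited.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each list")
    return sxy / sqrt(sxx * syy)


# Hypothetical raw scores for the same eight respondents on two occasions.
first_testing = [32, 41, 45, 28, 50, 37, 44, 39]
retest = [34, 43, 44, 30, 52, 36, 47, 40]

retest_r = pearson_r(first_testing, retest)
print(f"test-retest correlation: {retest_r:.2f}")
```

Because the coefficient depends on the spread of scores in the sample and on the retest interval, a value computed this way describes one study under one set of conditions, which is why no single figure is offered above.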
Validity
Establishing the validity of a test is somewhat difficult: How does one
validate a test claiming to measure an ability (like eductive ability) when:
(a) those concerned are only likely to display the ability when they are
undertaking activities they care about; (b) the effective performance of
such tasks as would lead them to display the ability also requires them to
engage in a number of other activities which they may lack the ability or
interest to perform; (c) the important contributions they are making to
their school, group or organization are likely to be overlooked by their
colleagues and their managers; (d) their behavior is constrained by orga-
nizational arrangements, the expectations which others have of them, and
by what others do; and (e) the effects of their actions, and especially
indices of their "profitability," are heavily dependent on what others do?
In many real-world settings not only do many important contributions
go unrecognized, but people are often not doing what others think they are
doing and have recruited them to do. Thus, according to Hogan (1990,
1991), some 50% of American managers apply their minds to achieving personal
goals (often personal advancement) which are to the detriment of
their subordinates and their organizations. The problem is not that they lack
intellectual ability, but the end to which they apply it. To find out what they
are doing and how effectively they "think" about how to achieve it, one would
have to interview them, and in a very nonjudgmental manner at that.
It follows that, as McClelland (1973) and Messick (1989, 1995) have
been almost alone in emphasizing, the only way to validate tests is by
undertaking a careful conceptual analysis of the nature of the qualities
being assessed by the test, the psychological nature of the various crite-
ria available for judging performance, the range of personal qualities
required for "success" in terms of each of these criteria, and the factors
that are likely to moderate the strength of observed relationships between
test scores and criterion measures, and by making a conceptual analysis
of a network of empirical relationships. In short, the way forward is to be
found through complex, almost ethnographic, studies and not via the
multivariate analyses that have such high prestige in psychology.
These problems are no less severe in school environments. Despite
the fact that the word "education" comes from the same Latin root as
"eductive" ability, and thus implies activities that "draw out the talents
of" the students, most of what is passed off as education in schools and
colleges involves "putting in" rather than drawing out. As Goodlad (1983)
and Raven (1994) have observed, the activities that dominate schools do
not merit description as "academic," "intellectual," or "educational."
In fact, to draw out the talents of the students, teachers would need
concepts to think about multiple talents, about diversity, and tools to
help them design endless individualized developmental programs, moni-
tor students' idiosyncratic development, take remedial action when nec-
essary, and give the students credit for the particular talents they had
developed. The concepts and tools required to do this are precisely those
that psychologists have so conspicuously failed to develop to enable par-
ents, teachers, managers, and other members of society to think about
and assess aspects of competence outside the domain in which the RPM
may be said to "work."
We have seen that people are only likely to display their eductive
ability (their ability to make meaning out of confusion) while they are
working at tasks they care about. A selection of activities that people may
be strongly motivated to undertake is listed across the top of Figure 11.3
in our Model of Competence.¹ Down the side are listed a series of components
of competence which, if the individual displays them, will help to
ensure that the chosen activity is carried out effectively. These components
of competence are cumulative and substitutable. The more of them
an individual deploys to carry out any given activity, the more likely they
are to carry out that activity effectively.
Three things follow directly from the theoretical model depicted in
Figure 11.3:
1. Effective behavior is dependent on engaging in a range of other activ-
ities besides making meaning out of the situation in which one finds
oneself.
2. It is only reasonable to seek to assess someone's meaning-making
ability (and their ability to engage in the other activities that make
for effective behavior) after one has identified the kinds of activity
they are strongly motivated to undertake.
¹ The way in which this model has been built up from the work of David McClelland and his
colleagues is described in Raven and Stephenson (2001).
[Figure: Model of Competence grid. The column headings (activities people may be strongly motivated to undertake) are not recoverable. The row labels follow.]
Examples of components of effective behavior
Cognitive
Thinking (by opening one's mind to experience, dreaming, and using
other subconscious processes) about what is to be achieved and how
it is to be achieved.
Consequence anticipated:
Personal: e.g., "I know there will be difficulties, but I know from my
previous experience that I can find ways round them."
Personal normative beliefs: e.g., "I would have to be more devious
and manipulative than I would like to be to do that."
Social normative beliefs: e.g., "My friends would approve if I did that";
"It would not be appropriate for someone in my position to do that."
Affective
Turning one's emotions into the task:
Admitting and harnessing feelings of delight and frustration;
using the unpleasantness of tasks one needs to complete as an
incentive to get on with them rather than as an excuse to avoid them.
Using one's feelings to initiate action, monitor its effects, and change
one's behavior.
Conative
Putting in extra effort to reduce the likelihood of failure.
and the Binet and Wechsler scales range from 0.54 to 0.86. Correlations
with verbal intelligence and vocabulary tests tend to be slightly lower, gen-
erally falling below 0.70. When compared with the British studies, the
intertest correlations obtained in cross-cultural research with non-English-
speaking children and adolescents tend to be lower although generally
moderate, ranging from about 0.30 to 0.68. Intertest correlations for adult
respondents are similar in magnitude and pattern to those for children.
Evidence of Fairness
Among young people in the 1979 British standardization (see Raven,
1981), the correlations between the item difficulties established separately
for eight socioeconomic groups ranged from 0.97 to 0.99, with the low of
0.97 being a statistical artifact. In the U.S. standardization (Raven, 2000b),
the correlations between the item difficulties established separately for
different ethnic groups (Black, Anglo, Hispanic, Asian, and Navajo) ranged
from 0.97 to 1.00. Jensen (1980) reported similar results for the CPM.
According to Owen (1992), the test has the same psychometric properties
among all ethnic groups in South Africa; that is, it scales in much the
same way, has a similar reliability, correlates in almost the same way with
other tests, and factor analysis of these correlations yields a similar factorial
structure. The correlations between the item difficulties established
separately in the United States, United Kingdom, Germany, New Zealand,
and Chinese standardizations range from 0.98 to 1.00. The test is therefore
extremely robust and works in the same way (measures the same
thing) in a wide range of cultural, socioeconomic, and ethnic groups
despite the (sometimes huge) variation in mean scores between these
groups.
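The comparison described in this section can be sketched in a few lines: establish the difficulty of each item separately within each group, then correlate the two sets of difficulties. The sketch assumes that item difficulty is indexed by the proportion of respondents answering the item correctly and that the correlation is a Pearson coefficient; the published analyses may have used a different index. The response data and the names `item_difficulties`, `pearson_r`, `group_a`, and `group_b` are invented for illustration.

```python
from math import sqrt


def item_difficulties(responses):
    """Proportion of respondents passing each item.

    `responses` holds one list per respondent, with 1 for a correct
    answer and 0 for an incorrect answer on each item.
    """
    n_respondents = len(responses)
    n_items = len(responses[0])
    return [
        sum(person[i] for person in responses) / n_respondents
        for i in range(n_items)
    ]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical responses of four people per group to five items
# (columns run from the easiest item to the hardest).
group_a = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
group_b = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]

difficulties_a = item_difficulties(group_a)
difficulties_b = item_difficulties(group_b)
between_group_r = pearson_r(difficulties_a, difficulties_b)
print(f"correlation between item difficulties: {between_group_r:.2f}")
```

In this invented example the second group passes fewer items overall, yet the items fall in nearly the same order of difficulty in both groups, so the correlation stays high. That is the pattern the paragraph above describes: similar scaling across groups despite differences in mean score.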
respondents would benefit from practice. Other advantages of the SPM are:
• Lower-scoring respondents encounter fewer problems that are too
difficult for them and, as a result, have a more positive experience.
• There are more research data for the SPM, including separate
norms for different subpopulations.
• As an untimed test, it is less stressful for respondents.
INTERPRETATION
Percentile equivalents of the raw scores can be read off from a wide
selection of reference data covering a range of age, nationality, ethnic group,
and occupational groups. There is a special supplement, Research
Supplement No. 3 (year 2000 edition; Raven, 2000b), devoted mainly to providing
norms for a range of North American school districts. Users should
select the norms they use carefully in relation to the uses to which the data
are to be put. Indeed, depending on the objectives of the assessment, it may
be desirable to adopt nonstandard procedures of administration.
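Reading a percentile equivalent off reference data amounts to a table lookup, and the choice of reference group changes the answer. The sketch below assumes a norm table that lists, for each reference group, the lowest raw score reaching each of a few percentile points. The group names, the threshold values, and the names `NORMS` and `percentile_band` are all invented for illustration and do not reproduce any published norms.

```python
# Hypothetical norm tables: for each reference group, the lowest raw score
# that reaches each listed percentile. All values are invented.
NORMS = {
    "reference group A": {95: 55, 90: 53, 75: 49, 50: 44, 25: 38, 10: 31, 5: 27},
    "reference group B": {95: 50, 90: 47, 75: 42, 50: 36, 25: 29, 10: 22, 5: 18},
}


def percentile_band(raw_score, group, norms=NORMS):
    """Return the highest tabled percentile that the raw score reaches.

    Returns None when the score falls below the lowest tabled threshold.
    """
    table = norms[group]
    reached = [
        percentile
        for percentile, threshold in table.items()
        if raw_score >= threshold
    ]
    return max(reached) if reached else None


# The same raw score is placed differently against different reference data.
score = 44
for group in NORMS:
    print(group, percentile_band(score, group))
```

The lookup returns a band rather than an exact percentile, which fits the caution in this section against attaching importance to small differences in score.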
Because of the criteria of eligibility for Special Education laid down in
U.S. Federal statutes, detailed U.S. norms are included in the Manual.
However, their use is discouraged on the grounds that they incline users
to attach importance to small differences in score, and too much importance
to "intelligence" in general. Neither the discriminative power of the
tests currently available, their reliability, nor the explanatory power of the
The RPM tests are perhaps the most researched of the measures of
cognitive ability that have items presented in a nonverbal format. Their
construct validity is, on the face of it, so well established that they are
widely used as the criterion against which to validate other tests.
Nevertheless, the tendency to describe them as "intelligence" tests has
made for endless confusion. The tests are easy to administer, and there
are a variety of tests and norms available for different purposes as well as
tables to facilitate conversion of scores from one test to another. However,
normative data for U.S. examinees are not as well developed as some
critics would like (e.g., see Sattler, 2001).
The intergenerational increase in scores that has occurred on all
measures of eductive ability has resulted in a ceiling effect among older
adolescents and young adults on the Classic SPM. This has sometimes
resulted in inappropriate conclusions being drawn in both research and
individual assessment. Although the test's discriminative power at this
level has been restored with the publication of the SPM Plus, the pool
of normative data that are available for the new test is nothing like as
extensive as that for the Classic SPM, and it is, in fact, important that the
pool of normative data for all the tests be extended.
The chief limitations of the suite of Raven tests stem from psycholo-
gists' collective failure to operationalize the McClelland/Raven framework
for thinking about competence and thus address Spearman's (1926)
observation that "Every normal man, woman, and child is a genius at
something ... it remains to discover at what." When the only tool one has
is a hammer, all problems are treated as if they were nails. In the absence
of an appropriate framework for thinking about and describing multiple
talents, it is not only impossible to contextualize RPM scores appropriately,
but it is also impossible to resist the pressures that result in measures
of eductive and reproductive ability (and especially the latter) being
used in ways that are altogether inappropriate.
SUMMARY
REFERENCES
Court, J. H., & Raven, J. (1995). Manual for Raven's Progressive Matrices and Vocabulary
Scales. Section 7: Research and references: Summaries of normative, reliability, and validity
studies and references to all sections. Oxford: Oxford University Press; San Antonio,
TX: The Psychological Corporation.
Crawford, J. R., Deary, I. J., Starr, J., & Whalley, L. J. (in press). The NART as an index of
prior intellectual functioning: A retrospective validity study covering a 66 year interval.
Psychological Medicine.
Deltour, J. J. (1993). Echelle de Vocabulaire Mill Hill de J. C. Raven: adaptation française et
normes comparées du mill hill et du standard progressive matrices (PM38). Manuel et
annexes. Braine le Château, Belgium: Editions L'Application des Techniques Modernes
SPRL.
Eysenck, H. J. (1953). Uses and abuses of psychology. Harmondsworth, UK: Penguin.
Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological
Bulletin, 95, 29-51.
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological
Bulletin, 101, 171-191.
Goodlad, J. (1983). A place called school. New York: McGraw-Hill.
Hogan, R. (1990). Unmasking incompetent managers. Insight, May 21, 42-44.
Hogan, R. (1991). An alternative model of managerial effectiveness. Mimeo. Tulsa, OK:
Institute of Behavioral Sciences.
Irvine, S. H. (1969). Factor analysis of African abilities and attainments: Continuities across
cultures. Psychological Bulletin, 71, 20-32.
Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.
Jensen, A. R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.
McClelland, D. C. (1973). Testing for competence rather than for "intelligence." American
Psychologist, 28, 1-14.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assess-
ment. Educational Researcher, 18(2), 5-11.
Messick, S. (1995). Validity of psychological assessment. American Psychologist, 50(9),
741-749.
Owen, K. (1992). The suitability of Raven's Standard Progressive Matrices for various groups
in South Africa. Personality and Individual Differences, 13, 149-159.
Penrose, L. S. (1938). A clinical and genetic study of 1280 cases of mental defectives (the
'Colchester Survey'). Medical Research Council. Republished by the Institute of Mental
and Multiple Handicaps, 1975.
Raven, J. (1981). Manual for Raven's Progressive Matrices and Vocabulary Scales. Research
supplement no. 1: The 1979 British standardisation of the standard progressive matrices
and mill hill vocabulary scales, together with comparative data from earlier studies in the
UK, US, Canada, Germany and Ireland. Oxford: Oxford University Press; San Antonio,
TX: The Psychological Corporation.
Raven, J. (1991). The tragic illusion: Educational testing. New York: Trillium Press; Oxford,
UK: Oxford Psychologists Press.
Raven, J. (1994). Managing education for effective schooling: The most important problem is to
come to terms with values. Unionville, NY: Trillium Press; Oxford, UK: Oxford
Psychologists Press.
Raven, J. (1997). Educational research, ethics and the BPS. Starter paper and peer reviews
by P. Mortimore, J. Demetre, R. Stainthope, Y. Reynolds, & G. Lindsay. Education
Section Review, 21(2), 3-26.
Raven, J. (2000a). Ethical dilemmas. The Psychologist, 13, 404-406.
Raven, J. (2000b). Manual for Raven's Progressive Matrices and Vocabulary scales. Research
supplement no. 3 (2nd ed.): A compendium of international and North American normative
and validity studies together with a review of the use of the RPM in neuropsychological
assessment by Court, Drebing, & Hughes. Oxford, UK: Oxford Psychologists Press; San
Antonio, TX: The Psychological Corporation.
Raven, J. (2000c). The Raven's Progressive Matrices: Change and stability over culture and
time. Cognitive Psychology, 41, 1-48.
Raven, J., Raven, J. C., & Court, J. H. (1998a). Manual for Raven's progressive matrices and
vocabulary scales. Section 1: General overview. Oxford, UK: Oxford Psychologists Press;
San Antonio, TX: The Psychological Corporation.
Raven, J., Raven, J. C., & Court, J. H. (1998b). Manual for Raven's progressive matrices and
vocabulary scales. Section 2: The coloured progressive matrices. Oxford, UK: Oxford
Psychologists Press; San Antonio, TX: The Psychological Corporation.
Raven, J., Raven, J. C., & Court, J. H. (1998c). Manual for Raven's progressive matrices and
vocabulary scales. Section 4: The advanced progressive matrices. Oxford, UK: Oxford
Psychologists Press; San Antonio, TX: The Psychological Corporation.
Raven, J., Raven, J. C., & Court, J. H. (1998d). Manual for Raven's progressive matrices and
vocabulary scales. Section 5: The Mill Hill vocabulary scale. Oxford, UK: Oxford
Psychologists Press; San Antonio, TX: The Psychological Corporation.
Raven, J., Raven, J. C., & Court, J. H. (2000). Manual for Raven's progressive matrices and
vocabulary scales. Section 3: The standard progressive matrices. Oxford, UK: Oxford
Psychologists Press; San Antonio, TX: The Psychological Corporation.
Raven, J., & Stephenson, J. (Eds.). (2001). Competence in the learning society. New York:
Lang.
Sattler, J. M. (2001). Assessment of children: Cognitive applications (4th ed.). San Diego,
CA: Jerome M. Sattler.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel
psychology: Practical and theoretical implications of 85 years of research findings.
Psychological Bulletin, 124(2), 262-274.
Silvey, J. (1972). Long-range predictions of educability and its determinants in East Africa
(Chapter 42). In L. J. Cronbach & P. J. Drenth (Eds.), Mental tests and cultural adaptation.
The Hague: Mouton.
Spearman, C. (1926). Some issues in the theory of g (including the law of diminishing returns).
Address to the British Association Section J-Psychology, Southampton, England, 1925.
London: Psychological Laboratory; University College: Collected Papers.
Spearman, C. (1927a). The abilities of man. London: Macmillan.
Spearman, C. (1927b). The nature of "intelligence" and the principles of cognition (2nd ed.).
London: Macmillan.