
RATER AGREEMENT FOR THE REY-OSTERRIETH COMPLEX FIGURE TEST

JOSHUA LIBERMAN AND WALTER STEWART


School of Hygiene and Public Health
The Johns Hopkins University

OLA SELNES AND BARRY GORDON


School of Medicine
The Johns Hopkins University

This report assesses the intrarater and interrater reliability of quantitatively
scoring the Rey-Osterrieth Complex Figure Test (RCF). The intrarater cor-
relation coefficients were .96, .99, and .96, and the interrater correlation
coefficients were .88, .97, and .96 for the Copy, Immediate Recall, and
Delayed Recall, respectively. However, statistically significant mean dif-
ferences in score were found between raters on the Copy, Immediate, and
Delayed Recall. Though the majority of structural units within the RCF are
reliably scored, several units had a greater magnitude of observed scoring
differences than the other units after adjusting for expected differences.
Overall, reliability estimates demonstrate high intrarater reliability and
acceptable interrater reliability, except for the potential for systematic
scoring differences.

The Rey-Osterrieth Complex Figure Test (RCF; Osterrieth, 1944; Rey, 1942), a
measure of short- and long-term visuographic memory, was developed originally as a
clinical tool and has been used to assess memory in clinic populations of normal and
emotionally disturbed children (Taylor, 1959; Waber, Bernstein, & Merola, 1989), head-
injured adults (Bigler, Rosa, Schultz, Hall, & Harris, 1989; Brooks, 1972), dementia
patients (Bigler et al., 1989), and epilepsy patients (Loring, Lee, & Meador, 1988; Milner,
1975; Taylor, 1979). As a clinical instrument, the RCF does not require a specific and
highly repeatable scoring system. “Minor” changes in a score do not alter clinical judg-
ment, and, thus, the score’s repeatability is not of concern. Recently, however, the RCF
has been used in population-based studies (Concha et al., 1992; Stewart et al., 1994)
to examine the relationship between suspected central nervous system impairment and
changes in memory function. In population studies, small differences between exposed
and non-exposed populations or small changes over time in the same individual are of
primary interest. An unreliable scoring method will hamper the ability to detect such
differences. Furthermore, without a standardized scoring procedure, it is difficult to com-
pare results across different studies.
Published interrater reliability estimates for RCF quantitative scoring systems
typically report correlation coefficients of .98 (Loring, Martin, Meador, & Lee, 1990;
Strauss & Spreen, 1990) with estimates for the various stages of the RCF that range
from .80 (Berry, Allen, & Schmitt, 1991) to .99 (Carr & Lincoln, 1988). However, these

This investigation was supported by Public Health Service grant number NS26450 awarded by the
National Institute of Neurological Disorders and Stroke and supported by the United States Olympic Foun-
dation. The authors gratefully acknowledge Roderick D. Randall for his contribution to the completion of
this study.
Reprint requests should be addressed to Joshua N. Liberman, The Johns Hopkins University, School
of Hygiene and Public Health, Department of Epidemiology, 615 N. Wolfe St., Room 6038, Baltimore, MD
21205.


reports simply present overall between-rater correlations and do not describe within-rater
correlations or the distribution of scoring discrepancies.
In this paper we describe both within-rater and between-rater variation in the quanti-
tative assessment of the RCF. We estimate the measurement error introduced by multi-
ple ratings and multiple raters, as well as the scoring reproducibility of individual com-
ponents within the Rey Figure. Finally, while applying a quantitative scoring system,
we assess the RCF’s reliability as a neuropsychologic instrument for population research.

METHOD
Subjects
RCFs were completed as part of a neuropsychology test battery by 486 male amateur
boxers enrolled in a longitudinal study. Details of the study have been published elsewhere
(Stewart et al., 1993). In brief, testing was completed between November 1986 and August
1988, and all active members of the United States Amateur Boxing Federation (ABF) between
13 and 21 years of age and residing in six cities were invited to participate in the study.
The six study sites were Washington, D.C., Houston, TX, Lake Charles, LA, Cleveland,
OH, St. Louis, MO, and New York, NY. Study participants ranged from individuals
who simply registered with the ABF, but never competed, to Olympic-class amateur
boxers.
Procedure
The RCF is administered in three stages: the Copy, Immediate Recall (IR), and
Delayed Recall (DR). First, the subject attempts to copy the RCF Figure (Figure 1; subse-
quently referred to as the “Figure”), referring to the Figure as necessary (Copy). After
completing the Copy, the subject is asked to reproduce the Figure again; this time from
memory (IR). Finally, after a 30-minute delay, the subject again is asked to reproduce
the Figure from memory (DR).
In September 1990, 2 years after testing was completed, a random sample of 60
RCFs was selected to examine scoring reliability. The 60 tests were scored independently
by two reviewers (raters 1 and 2), blind to subject identifiers. Each RCF was assigned
four scores: score 1 and score 2 from each of rater 1 and rater 2. The entire scoring process
required 2 months to complete, with a minimum interval of 1 week between repeated
scorings of an individual test. The two observers received identical training in the
administration, interpretation, and scoring of the RCF and had a minimum of 3 years’
experience scoring more than 500 individual RCF tests.
For scoring purposes, the Figure is partitioned into 18 structural units (Table 1).
Each unit is assigned a maximum of 2 points in half-point increments (Lezak, 1976;
Osterrieth, 1944) for a maximum score of 36. Each of the 18 units is evaluated in-
dependently for two characteristics, completeness and placement. A score of 2 points
is assigned if the unit is complete and properly placed within the Figure. However, an
error in either placement or in completeness results in a loss of 1 point. If both errors
occur, the structural unit is incomplete and misplaced, and 1.5 points are deducted from
the 2-point maximum. Finally, if the unit is missing completely, a score of zero is assigned.
The final score is the sum of the points assigned to the 18 units, independently scored
for each stage (Copy, IR, and DR).
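The scoring arithmetic above can be made concrete in a few lines of code. The following Python sketch is illustrative only: the function names and the (present, complete, placed) input layout are our assumptions, not part of any published RCF scoring software.

```python
# Illustrative sketch of the unit-scoring rule described above; names and
# data layout are assumptions, not part of any published scoring software.

def unit_score(present: bool, complete: bool, placed: bool) -> float:
    """Score a single structural unit on the 0-2 point scale."""
    if not present:
        return 0.0                 # unit missing entirely
    points = 2.0
    if not complete:
        points -= 1.0              # distorted or incomplete: lose 1 point
    if not placed:
        points -= 1.0              # poorly placed: lose 1 point
    # Both errors deduct 1.5 (not 2.0) from the 2-point maximum,
    # leaving a half point rather than zero.
    return max(points, 0.5)

def stage_total(units: list[tuple[bool, bool, bool]]) -> float:
    """Sum the 18 unit scores for one stage (Copy, IR, or DR); maximum 36."""
    assert len(units) == 18
    return sum(unit_score(*u) for u in units)
```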

Analysis
Within-rater and between-rater repeatability of the total score are assessed for the
Copy, IR, and DR. Within-rater reliability is assessed using the first and second rating
from both observers, while between-rater repeatability is assessed using only the first

FIG. 1. The Rey-Osterrieth Complex Figure.

rating from each observer. Repeatability is assessed for the total score and the individual
scores assigned to each structural unit.
Within- and between-rater repeatability for structural units is measured using an
observed-to-expected (O/E) ratio.

Table 1
The Structural Units and Scoring Criteria for the Rey Complex Figure

Unit  Structure within figure
 1.   Cross upper corner, outside of rectangle
 2.   Large rectangle
 3.   Crossing diagonals
 4.   Horizontal midline
 5.   Vertical midline
 6.   Small rectangle, within the large rectangle to the left
 7.   Small segment above the small rectangle
 8.   Four parallel lines within the large rectangle, upper left
 9.   Triangle above the large rectangle, upper right
10.   Small vertical line within the large rectangle, below 9
11.   Circle with three dots within the large rectangle
12.   Five parallel lines within the large rectangle crossing 3, lower right
13.   Sides of triangle attached to the large rectangle on the right
14.   Diamond attached to 13
15.   Vertical lines within triangle 13, parallel to the right vertical of the large rectangle
16.   Horizontal lines within 13, continuing 4 to the right
17.   Cross attached to 5 below the large rectangle
18.   Square attached to the large rectangle, lower left

Scoring criteria                              Points awarded
Correct, placed properly                      2 points
Correct, placed poorly                        1 point
Distorted or incomplete, placed properly      1 point
Distorted, placed poorly                      .5 point

The O/E ratio compares the observed differences between two ratings of an RCF to the difference expected given the distribution of scores.
An O/E ratio is calculated for each of the 18 structural units within the Figure. To
calculate the O/E ratio for a specific unit, the sum of observed differences between score
1 and score 2 across all subjects is divided by the expected differences between score
1 and score 2.
The observed difference ($O_i$) is simply the sum of the differences between Score
1 and Score 2 for unit $i$ across all 60 tests. In other words,

$$O_i = \sum_{j,k=1}^{60} \lvert R_{ik} - R_{ij} \rvert ,$$

where $R_{ik}$ is the score from score 2, and $R_{ij}$ is the score from score 1. The expected
difference between score 1 and score 2 is derived by multiplying the probability of each
scoring combination by the score differential for that combination. That is, if scores
were assigned randomly, it would be expected that 25% of the scores would be 0, 25%
would be .5, 25% would be 1, and 25% would be 2. However, scores are assigned to
units based on accuracy and completeness, not at random. Thus, after obtaining the
distribution of scores for score 1 and the distribution of scores for score 2, the prob-
ability of a specific scoring combination occurring “at random” is obtained by multiply-
ing the probability of score x occurring by the probability of score y occurring. The
score differential for each combination is simply the absolute difference between score
x and score y for score 1 and score 2, respectively. In other words, the expected difference
for structural unit $i$, $E_i$, is derived as:

$$E_i = \sum_{j,k} P_{jk} \, w_{jk} ,$$

where $P_{jk}$ is the expected probability of score $j$ from score 1 and score $k$ from score
2, and $w_{jk}$ is the absolute difference in score between $j$ and $k$. $E_i$ represents the expected
difference in the score for a specific unit on a single test. Because the expected difference
for unit $i$ does not differ between tests, multiplying $E_i$ by 60 provides the expected
difference for the sample of 60 tests. The entire ratio is multiplied by 100 to yield a percent.
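The O/E calculation can likewise be sketched directly from the definitions of $O_i$ and $E_i$ above. In the Python below, the function name and the input format (two parallel lists holding one unit's scores from the two ratings) are illustrative assumptions.

```python
# Sketch of the O/E ratio for one structural unit, following the
# definitions above; the function name and input format are illustrative.
from collections import Counter

def oe_ratio(score1: list[float], score2: list[float]) -> float:
    """Percent observed-to-expected ratio across n tests (n = 60 here)."""
    n = len(score1)
    # Observed difference O_i: sum of absolute score differences.
    observed = sum(abs(k - j) for j, k in zip(score1, score2))
    # Marginal distributions of the score values under each rating.
    p1 = {v: c / n for v, c in Counter(score1).items()}
    p2 = {v: c / n for v, c in Counter(score2).items()}
    # Expected difference per test: sum of P_jk * w_jk over all score pairs,
    # treating the two ratings as if assigned independently.
    e_single = sum(p1[j] * p2[k] * abs(j - k) for j in p1 for k in p2)
    expected = n * e_single    # identical expectation for each of the n tests
    if expected == 0:
        return 0.0             # degenerate case: no variation in either rating
    return 100.0 * observed / expected
```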
In practical terms, an O/E ratio of 0% means that two observers never disagreed
on the specific score assigned to that particular unit across all subjects. For example,
to attain an O/E ratio of 0% for unit #1 (Table 1), rater 1 and rater 2 would assign
the identical score to unit #1 for every subject. As the percent O/E ratio increases, the
amount of disagreement, i.e., the number of disagreements and the magnitude of the
disagreements, increases.

RESULTS

The distribution of within- and between-rater differences in scoring the RCF is
described first, followed by a description of the distribution of differences in scoring
the structural units of the Figure.
RCF Total Score, Intrarater Reliability
There is a strong ceiling effect for the Copy; on a 36-point scale, more than 50%
of the scores are greater than 33 points. The scores for the IR and DR exhibit a broader
range of scores. The IR scores range from 2 to 30 for rater 1 and from 4 to 31.5 for
rater 2. Delayed Recall scores for both rater 1 and rater 2 range from 4 to 30. The DR
scores for both observers are bimodal with modes near 15 and 22 points.
The Spearman Rank Correlation Coefficients (rs) for rater 1 range from .957 for
the Copy to .988 for the IR; the correlations for rater 2 are slightly lower and range
from .934 for the Copy to .979 for the IR. For both observers, the highest intrarater
correlation was found for the Immediate Recall and the lowest correlation for the Copy.
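The article does not state what software produced these correlations; as a sketch, a Spearman rank correlation between a rater's two sets of scores can be computed with scipy.stats.spearmanr. The input arrays below are hypothetical stand-ins for the 60 tests.

```python
# Sketch of the intrarater correlation: Spearman rank correlation between
# a rater's first and second total scores for one stage.
from scipy.stats import spearmanr

rating1 = [33.5, 29.0, 35.0, 24.5, 31.0]   # stand-in data; the study used 60 tests
rating2 = [34.0, 28.5, 35.0, 25.0, 30.5]

rs, p_value = spearmanr(rating1, rating2)
print(f"rs = {rs:.3f}, p = {p_value:.4f}")
```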

Table 2
Average Difference and 99% Confidence Intervals (CI) for Intrarater and Interrater Scores on
the RCF Tests

Test                        M difference   Range          99% CI        Spearman (rs)   p
Intrarater(a)
  Rater 1          Copy     .61            (-3.0, 4.0)    (.17, 1.05)   .957            < .0001
                   IR       .55            (-1.5, 3.5)    (.20, .90)    .988            < .0001
                   DR       .48            (-5.0, 5.0)    (-.06, 1.02)  .959            < .0001
  Rater 2          Copy     .18            (-3.0, 3.5)    (-.16, .52)   .934            < .0001
                   IR       -.06           (-2.5, 4.5)    (-.46, .34)   .979            < .0001
                   DR       -.02           (-4.0, 4.5)    (-.43, .40)   .964            < .0001
Interrater(b)
  Rater 2 - rater 1 Copy    1.42           (-4.0, 2.0)    (.65, 2.19)   .878            < .0001
                   IR       1.19           (-6.0, 3.0)    (.70, 1.68)   .966            < .0001
                   DR       1.35           (-11.0, 2.0)   (.85, 1.85)   .956            < .0001

(a) Difference is calculated as Score 1 subtracted from Score 2.
(b) Difference is calculated as Score 1 from rater 1 subtracted from Score 1 from rater 2.

The difference between repeated scores of individual tests for Rater 1 ranges
from -3.0 to 4.0 for the Copy, from -1.5 to 3.5 for the IR, and from -5.0 to 5.0
for the DR (Table 2). The average difference between score 1 and score 2 of rater 1
was near .5 for each of the three tests, and for rater 2 it was less than .2 point. The
mean difference between ratings for rater 1 is statistically significant (99% Confidence
Interval) for the Copy and IR tests, but the differences are not statistically significant
for rater 2, with differences in scores that range from -3.0 to 3.5 for the Copy, from
-2.5 to 4.5 for the IR, and from -4.0 to 4.5 for the DR.
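Table 2's mean differences and 99% confidence intervals can be approximated as follows. The article does not specify how the intervals were constructed; a paired t-based interval is assumed here as one conventional choice.

```python
# Sketch of a mean score difference with a 99% confidence interval,
# assuming a paired t interval (the article does not specify its method).
import numpy as np
from scipy import stats

def mean_diff_ci(first, second, level=0.99):
    """Mean of (second - first) and a two-sided t-based CI."""
    d = np.asarray(second, float) - np.asarray(first, float)
    n, m = d.size, d.mean()
    se = d.std(ddof=1) / np.sqrt(n)
    t = stats.t.ppf(1 - (1 - level) / 2, df=n - 1)
    return m, (m - t * se, m + t * se)
```

A CI that excludes zero corresponds to the "statistically significant mean difference" criterion used in Table 2.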
RCF Total Score, Interrater Reliability
Interrater reliability is estimated from the first score assigned by each rater, with
the difference calculated as rater 1's score subtracted from rater 2's score. The mean
score difference between rater 1 and rater 2 was 1.4 on the
Copy, 1.2 on the Immediate Recall, and 1.4 on the Delayed Recall (Table 2). The Spear-
man rank correlations that compared rater 1 to rater 2 range from .88 to .97 for the
Copy and IR, respectively. Score differences range from - 2.5 to 7.5 for the Copy, from
- 1.5 to 5.5 for the IR, and from -2.5 to 4.5 for the DR.
The Copy scores of rater 2 are systematically higher than the scores of rater 1 (Figure
2a). The absolute scoring differences are greater for lower score values and decrease
as the original score increases. That is, as a Figure score approaches the maximum of
36, the differences between rater 1 and rater 2 decrease. The IR and DR scores of rater
2 are also systematically higher than the scores of rater 1; however, the differences do
not vary by score level (Figures 2b and 2c).
RCF Structural Units
Unit scores for the Copy test rarely differed between the first and second rating
for rater 1. Only 7.2% of the paired ratings (n = 1,080) differed. Score 1 and score
2 were identical on 55 or more of the 60 Copy tests for 16 of the 18 units; units
#3 and #13 were scored identically on 51 tests. When differences in Copy score did oc-
cur, the magnitude of the difference was minimal; only one unit (#16) ever differed by
more than 1 point. In fact, scores for units #4 and #7 never differed by more than .5 point.
Scoring for several units was considerably less reliable for the IR and DR than for
the Copy, especially as the total score for the IR and DR decreased. For example, the
same observer scored units #6, #7, and #9 as completely correct on one rating and
completely incorrect or missing on the other rating. Furthermore, unit #3 had 9, 10,
and 12 scoring differences for the Copy, IR, and DR, respectively. Only unit #6 on the

FIG. 2. Plots of the interrater score differences from rating 1 of the RCF for the Copy (a), Immediate
Recall (b), and Delayed Recall (c), each plotted against rater 1's score. (The difference is calculated
as Rater 2 minus Rater 1.)

Only unit #6 on the DR has more discrepant scores, with differences on 14 of the 60 tests. Other than the noted
differences for these units, all intrarater scoring differences were minimal. Differences
of 1 or more points are found only for unit #8 for the IR test and units #6, #7, and
#9 for the DR test.
All of the intrarater O/E ratios for each of the 18 units were less than 100%, which
indicates that the differences between score 1 and score 2 from rater 1 were less than
expected. However, there were notable relative differences in the O/E ratio between struc-
tural units. For the Copy (Table 3), O/E ratios range from a low of 2.5% for unit #7
to a high of 83.3% for unit #5. IR ratios range from 0% to 16.2% and DR ratios range
from 2.3% to 41.7%; on average, both are substantially lower than the Copy ratios.

Table 3
Observed-to-Expected Ratios for Intrarater and Interrater Agreement of the 18 Structural Units
of the Rey-Osterrieth Complex Figure, Copy, IR, and DR Tests

        Expected difference(a)      Observed difference       Observed/expected (%)
Unit    Copy    IR      DR          Copy    IR      DR        Copy    IR      DR

Intraobserver
 1      22.1    39.5    40.7        1.0     5.0     4.0       4.5     12.7    9.8
 2      22.2    32.1    30.2        3.0     3.5     6.0       13.5    10.9    19.9
 3      13.8    47.6    48.4        7.5     5.0     7.5       54.3    10.5    15.5
 4      23.1    41.6    40.9        1.5     4.5     2.5       6.5     10.8    6.1
 5       3.6    54.1    48.8        3.0     4.5     3.5       83.3    8.3     7.2
 6      21.3    51.1    48.8        4.5     5.0     13.5      21.1    9.8     27.7
 7      40.7    44.7    43.2        1.0     2.5     7.5       2.5     4.5     17.4
 8      24.4    45.5    45.5        4.0     5.5     5.0       16.4    12.1    11.0
 9      16.0    49.9    43.6        4.0     3.0     5.5       25.0    6.0     12.6
10      12.0    11.0     4.8        5.0     1.5     2.0       41.7    13.6    41.7
11      10.2    40.0    40.1        2.5     6.5     5.0       24.5    16.2    12.5
12       7.5    48.7    47.6        2.0     5.5     4.0       26.7    11.3    8.4
13      23.1    45.8    45.3        6.0     5.5     7.0       26.0    12.0    15.5
14      10.1    47.4    50.3        2.0     3.0     3.0       19.8    6.3     6.0
15      10.3    51.0    48.3        4.0     5.0     3.0       38.8    9.8     6.2
16      20.3    59.8    58.8        5.5     2.0     6.5       27.1    3.3     11.1
17      14.2    36.9    43.1        5.0     0       1.0       35.2    0       2.3
18      11.0    39.4    37.4        2.0     4.5     4.5       18.2    11.4    12.0

Interobserver
 1      22.1    39.5    40.7        15.0    12.0    20.5      67.9    30.4    50.4
 2      22.2    32.1    30.2        4.5     5.5     4.0       20.3    17.1    13.2
 3      13.8    47.6    48.4        11.5    6.5     5.0       83.3    13.7    10.3
 4      23.1    41.6    40.9        5.5     9.0     9.5       23.8    21.6    23.2
 5       3.6    54.1    48.8        5.0     11.5    7.0       138.9   21.3    14.3
 6      21.3    51.1    48.8        7.0     8.0     11.5      32.9    15.7    23.6
 7      40.7    44.7    43.2        2.0     6.0     5.5       4.9     13.4    12.7
 8      24.4    45.5    45.5        5.5     9.5     9.5       22.5    20.9    20.9
 9      16.0    49.9    43.6        7.5     4.5     4.5       46.9    9.0     10.3
10      12.0    11.0     4.8        6.0     4.5     1.5       50.0    40.9    31.2
11      10.2    40.0    40.1        10.5    6.5     10.5      102.9   16.2    26.2
12       7.5    48.7    47.6        3.0     9.5     6.5       40.0    19.5    13.7
13      23.1    45.8    45.3        9.5     6.5     9.5       41.1    14.2    21.0
14      10.1    47.4    50.3        5.5     6.0     5.5       54.5    12.7    10.9
15      10.3    51.0    48.3        7.0     10.5    9.5       68.0    20.6    19.7
16      20.3    59.8    58.8        8.5     7.0     10.5      41.9    11.7    17.9
17      14.2    36.9    43.1        5.0     7.5     7.0       35.2    20.3    16.2
18      11.0    39.4    37.4        5.5     9.5     12.0      50.0    24.1    32.1

(a) See Method section for a description of "Expected Difference."


Interrater comparisons of unit scores show that 16 of the 18 units in the Copy and
IR tests and 14 of the 18 units in the DR received identical scores more than 80% of
the time. The most difficult units to score were units #1, #3, #6,#8, #13, and #18 with
the most striking differences shown for unit #1, the cross on the upper left side of the
large rectangle of the Figure. For unit #1 on the DR, rater 1’s score differs from rater
2’s score on 21 of the 60 tests, with the majority of disagreements differing by 1 point.
The interrater O/E ratios (Table 3) are again consistently higher for the Copy than
either the IR or DR tests. Unit #5 on the Copy has an O/E ratio of 138.9% and unit
#11 has a ratio of 102.9%, but no other ratios are greater than expected (100%). The
largest ratios for the IR and DR tests, respectively, are for unit #10 (40.9%) and unit
#1 (50.4%). O/E ratios for the IR range from 9.0% (unit #9) to 40.9% (unit #10), and
DR ratios range from 10.3% (units #3 and #9) to 50.4% (unit #1), with the next highest
ratio at 32.1% (unit #18). The only structure that differs from the others across all tests
is the cross on the upper left side of the large rectangle, unit #1. For the Copy test,
unit #1 has the fifth highest ratio, while in the IR and DR, it has the second highest
and highest ratios, respectively.

DISCUSSION
In the current analysis, the quantitative scoring of the RCF is highly repeatable.
Intrarater reliability is very high, and interrater reliability is good. Systematic differences
were detected on all three components of the test, and statistically significant mean
differences were found for the Copy, Immediate Recall, and Delayed Recall. Given these
interrater differences, it seems plausible that clinical or research judgment may be in-
fluenced by an aberrant score. It must be noted, however, that the RCF is rarely, if ever,
used as the sole measure in determining a clinical diagnosis, and poor repeatability in
scoring the RCF simply decreases its effectiveness as a clinical tool. Nonetheless, with
the possibility of differences in score approaching 20% of the measurement scale, a
modification of existing scoring procedures may improve its sensitivity in population-
based studies.
There are several methods for scoring the RCF. Each scoring method assesses one
dimension of the RCF, which may correspond to somewhat independent clinical/neuro-
psychologic disorders. Proposed methods include scoring of the drawing strategies (e.g.,
the order of each line drawn), the organization of the completed Figure, sequencing,
degree of fragmentation, directional tendencies, the attention given to the task, and a
recently developed scoring system based upon qualitative errors (Bennett-Levy, 1984;
Binder, 1982; Loring et al., 1988; Waber & Holmes, 1985). The most frequently used
scoring method is a quantitative measure of performance represented by an absolute
score assigned to the entire Figure based upon the presence, placement, and accuracy
of each unit within the Figure. This score, when compared to published age-specific nor-
mative data (Osterrieth, 1944) can reveal clinically significant neuropsychological deficits.
The current scoring guidelines allow flexibility in determining the accuracy of struc-
tural units within the Figure. Before scoring the RCF, a rater must decide on the ade-
quacy of the subject’s motor control and on the subject’s interpretation of the instruc-
tions. Poor motor control or misinterpretation of the instructions may result in a poor-
ly drawn Figure that does not necessarily reflect a neuropsychologic impairment. A poorly
drawn Figure may make accurate scoring of the Figure difficult by increasing the number
of careless mistakes, which are not addressed adequately in the scoring instructions.
The difficulty in interpreting a structure that is inaccurately drawn may differ substan-
tially from that required to score a slightly misplaced object. However, there are no
specific rules to define the distance or degrees of rotation required for an element to
RCF Reliability 623

be judged as misplaced, nor are rules provided to define the amount of distortion,
fragmentation, or omission that constitutes an incomplete structure. The current analysis
shows that units #1, #5, #6, and #10 frequently receive discrepant scores and that the
scoring differences are often greater than for the other units. These differences persist
for both intra- and interrater scores. The inability to score these units reliably while
others receive comparable scores suggests that these units may require more stringent
and objective scoring criteria than are currently available.
Without a specific and objective scoring criterion, the decision that determines a
correctly drawn and appropriately placed scoring unit will differ from observer to
observer. For example, it seems quite plausible in the absence of focal or severe brain
damage or behavioral disorders to expect a near-perfect drawing on the Copy given that
the Figure is presented for viewing and no time limits are imposed. Under this assump-
tion, there is little reason to expect a less than perfect drawing in the Copy. Therefore,
digressions from the Figure may be interpreted as true errors as opposed to poor draw-
ing skills. This scenario, however, dictates that even minor digressions from the Figure
be assessed subjectively for accuracy, thus actually increasing the number of borderline
decisions that must be made to score the Figure. In addition, fewer “obvious” errors occur
on the Copy stage; omitted or misplaced segments are less likely to occur if the Figure
can be referred to repeatedly. Thus, the errors, in the form of minor inaccuracies or
misplacements, require a subjective assessment for accuracy and precision, which are
difficult to standardize for even a single observer.
Overall, the RCF’s scoring system presents adequate repeatability for clinical pur-
poses. However, population studies with repeated measures require consistently high
repeatability, and the RCF’s scoring system shows the potential for systematic scoring
differences as well as repeatability that varies from unit to unit. The differences that
occur within-rater should have no impact upon clinical judgments and only a slight im-
pact upon population research estimates. RCF scores generated by different raters,
however, may not always be comparable, for relatively strong differences may occur
both in the raw score and in the assessment of the individual structures. A simple strategy
may diminish the impact of the lower interrater reliability without altering the current
scoring guidelines. First, each RCF should receive two independent scores from com-
parably trained raters. If a discrepancy exists after scoring, the two original scores
may be averaged for a pooled score, or a third independent score may be used
to resolve the disagreement.
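A minimal sketch of this two-rater strategy appears below. The zero-tolerance discrepancy rule and the adjudication step (averaging the third score with the closer of the two originals) are our own illustrative assumptions; the text specifies only that a third score resolves the disagreement.

```python
# Sketch of the proposed two-rater strategy; the zero-tolerance discrepancy
# rule and the adjudication logic are illustrative assumptions.
from typing import Optional

def reconcile_score(a: float, b: float,
                    third: Optional[float] = None) -> float:
    """Pool two independent RCF scores, optionally using a third rating."""
    if a == b:
        return a                     # raters agree outright
    if third is None:
        return (a + b) / 2           # pooled average of the two scores
    # With a third independent score, average it with the closer original.
    closer = min((a, b), key=lambda s: abs(s - third))
    return (closer + third) / 2
```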
REFERENCES

BENNETT-LEVY, J. (1984). Determinants of performance on the Rey-Osterrieth Complex Figure Test: An
analysis, and a new technique for single-case assessment. British Journal of Clinical Psychology, 23,
109-119.
BERRY, D. T. R., ALLEN, R. S., & SCHMITT, F. A. (1991). Rey-Osterrieth Complex Figure: Psychometric
characteristics in a geriatric sample. Clinical Neuropsychologist, 5, 143-153.
BIGLER, E. D., ROSA, L., SCHULTZ, F., HALL, S., & HARRIS, J. (1989). Rey-Auditory Verbal Learning and
Rey-Osterrieth Complex Figure design performance in Alzheimer's disease and closed head injury. Journal
of Clinical Psychology, 45, 277-280.
BINDER, L. M. (1982). Constructional strategies on complex figure drawings after unilateral brain damage.
Journal of Clinical Neuropsychology, 4, 51-58.
BROOKS, D. N. (1972). Memory and head injury. Journal of Nervous and Mental Disease, 155, 350-355.
CARR, E. K., & LINCOLN, N. B. (1988). Inter-rater reliability of the Rey Figure Test. British Journal of Clinical
Psychology, 27, 267-268.
CONCHA, M., GRAHAM, N. M. H., MUNOZ, A., VLAHOV, D., ROYAL III, W., UPDIKE, M., NANCE-SPROSON,
T., SELNES, O. A., & MCARTHUR, J. C. (1992). Effect of chronic substance abuse on the neuro-
psychological performance of intravenous drug users with a high prevalence of HIV-1 seropositivity.
American Journal of Epidemiology, 136, 1338-1348.
LEZAK, M. D. (1976). Neuropsychological assessment. New York: Oxford University Press.
LORING, D. W., LEE, G. P., & MEADOR, K. J. (1988). Revising the Rey-Osterrieth Complex Figure Test:
Rating right hemisphere recall. Archives of Clinical Neuropsychology, 3, 239-247.
LORING, D. W., MARTIN, R. C., MEADOR, K. J., & LEE, G. P. (1990). Psychometric construction of the
Rey-Osterrieth Complex Figure: Methodological considerations and interrater reliability. Archives of
Clinical Neuropsychology, 5, 1-14.
MILNER, B. (1975). Psychological aspects of focal epilepsy and its neurosurgical management. In D. P. Pur-
pura, J. K. Penry, & R. D. Walter (Eds.), Advances in neurology (pp. 299-321). New York: Raven Press.
OSTERRIETH, P. A. (1944). Le test de copie d'une figure complexe. Archives of Psychology, 30, 206-356.
REY, A. (1942). L'examen psychologique dans les cas d'encéphalopathie traumatique. Archives of Psychology,
28, 286-340.
STEWART, W. F., GORDON, B., SELNES, O., BANDEEN-ROCHE, K., ZEGER, S., TUSA, R. J., CELENTANO, D.
D., SHECHTER, A., LIBERMAN, J., HALL, C., SIMON, D., LESSER, R., & RANDALL, R. D. (1994). A pro-
spective study of CNS function in United States amateur boxers. American Journal of Epidemiology,
139, 573-588.
STRAUSS, E., & SPREEN, O. (1990). A comparison of the Rey and Taylor Figures. Archives of Clinical Neuro-
psychology, 5, 417-420.
TAYLOR, E. M. (1959). Psychological appraisal of children with cerebral defects. Cambridge, MA: Harvard
University Press.
TAYLOR, L. B. (1979). Psychological assessment of neurosurgical patients. In T. Rasmussen & R. Marino
(Eds.), Functional neurosurgery (pp. 165-180). New York: Raven Press.
WABER, D. P., BERNSTEIN, J. H., & MEROLA, J. (1989). Remembering the Rey-Osterrieth Complex Figure:
A dual-code cognitive neuropsychological model. Developmental Neuropsychology, 5, 1-15.
WABER, D. P., & HOLMES, J. M. (1985). Assessing children's copy productions of the Rey-Osterrieth Com-
plex Figure. Journal of Clinical and Experimental Neuropsychology, 7, 264-280.
