
This article was downloaded by: [Moskow State Univ Bibliote]

On: 29 January 2014, At: 19:31

Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Clinical and Experimental Neuropsychology


Development and Preliminary Standardization of the
Extended Complex Figure Test (ECFT)
Philip S. Fastenau

Michigan State University

Published online: 04 Jan 2008.

To cite this article: Philip S. Fastenau (1996) Development and Preliminary Standardization of the
Extended Complex Figure Test (ECFT), Journal of Clinical and Experimental Neuropsychology, 18:1,
63-76, DOI: 10.1080/01688639608408263



Journal of Clinical and Experimental Neuropsychology

1996, Vol. 18, No. 1, pp. 63-76

© Swets & Zeitlinger

Development and Preliminary Standardization of the
Extended Complex Figure Test (ECFT)*
Philip S. Fastenau
Michigan State University


Recognition and matching trials were designed for the Rey-Osterrieth Complex Figure Test (ROCFT).
Following pilot testing and expert review, they were standardized using 90 community-dwelling adults
(58% female, ages 30 to 88). Recognition has 30 multiple-choice items for different figural elements;
scores distributed normally with strong item-total correlations and with normally distributed item difficulties. Cronbach alphas were .84, .61, and .81 for the Total, Global, and Detail Scales. Recognition correlated
.81 with ROCFT recall and .65 with Visual Reproductions. Matching has 10 multiple-choice items; scores
were negatively skewed with a substantial ceiling effect. Alpha for Matching was .58, limited in part by
few items. Matching correlated .68 with Judgment of Line Orientation and .74 to .90 with copy trials. Both
Recognition and Matching discriminated 34 patients with intractable epilepsy from 34 matched controls.
Overall, Recognition appeared to be reliable and showed evidence of validity. By comparison, Matching
reliability and validity were less impressive and warrant further examination.

In addition to the psychometric standards that

apply to all psychological and educational tests
(American Psychological Association, 1985;
Anastasi, 1982), a visual-spatial memory test
should contain several standard features. First,
comparison between immediate and delayed
trials provides an index of consolidation (retention or rate of forgetting). Second, comparison
between free recall and recognition performances clarifies the relative contributions of
encoding and retrieval processes to total memory performance. Diagnostically, inclusion of
both recall and recognition measures increases
the sensitivity of the memory test:
Failure to recognize words [or figures] as
having been previously presented is a more
reliable sign of memory disorder than failure
to recall, and recognition failure denotes a
more severe disorder as well. The use of recall and recognition tests together makes it
possible to detect subtle, early signs of impairment (Squire, 1986, p. 280).
Third, constructional and perceptual deficits
must be ruled out as confounds in memory performance. Visual-spatial memory tests often
require patients to draw figures from memory.
Low memory scores may be indicative of motor
or praxis limitations only, thereby masking intact visual memorization skills. Thus, a copy
trial is essential to examine constructional ability for the memory stimuli. In addition, a matching trial using those same stimuli is desirable to
verify accurate perception, which is integral to
performance on all of the other trials (copy and

* This study was partially funded by the APA Science Directorate. The author acknowledges Al Manning and
Broughton Hospital, Morganton, NC, for supporting the pilot; Jane Holmes Bernstein for her expert input; Norm
Abeles, Lauren Harris, Neal Schmitt, and Bert Karon for suggestions on the standardization study; John Fisk,
Jeanne Bennett, and Henry Ford Hospital for supporting the clinical validation study; Natalie Denburg, Linda
Sloan, Eric Fertuck, Jennifer Winer, Sandy Scott, Katy Parcells, Lidia Domitrovic, and Mike Finton for assisting
with data collection and processing; Roger Halley for mobilizing material resources; and Dana Atkinson Fastenau
for her loving support. Address correspondence to the author at University of Michigan Medical Center, Neuropsychology Program, 480 Med Inn Building, Box 0840, 1500 East Medical Center Drive, Ann Arbor, Michigan,
48109-0840, USA.
Accepted for publication: June 15, 1995.

Therefore, any measure of visual-spatial memory should contain immediate and delayed trials, recall and recognition trials, and matching
and copy trials. Of the many tests of visual-spatial memory that exist in our field, few even approximate these standards. For example, the Recurring Figures Test (Kimura, 1963), the Recognition Memory Test (Warrington, 1984), and the
Delayed Recognition Span Test (DRST; Moss,
Albert, Butters, & Payne, 1986) measure only
recognition, without recall, and none of these
includes a matching trial.
The Benton Visual Retention Test (BVRT;
Benton, 1974) measures only free recall, which
depends on graphomotor output. It contains neither a recognition trial nor a matching trial. Furthermore, the BVRT has been criticized for the
simplicity of its stimuli (Hemsley, 1974; Zubrick & Smith, 1978, as reported in Lezak,
1983, pp. 450-451).
The Wechsler Memory Scale-Revised (WMS-R; Wechsler, 1987) contains two subtests that
measure visual-spatial memory. Visual Reproductions (VR) and Figural Memory (FM)
require recall and recognition, respectively, but
use separate stimuli, thereby precluding direct
comparisons between unfacilitated and facilitated retrieval for the same figures. Several investigators have expanded VR into a more comprehensive memory tool. Kaplan (1988) described adding recognition items, and data have been
reported for a recognition trial (Domitrovic,
Denburg, & Fastenau, 1995; Hanger, Montague,
& Smith, 1991) and for a matching trial
(Domitrovic et al., 1995). Fastenau and Sloan
(1993) added a copy trial.
Even with the recognition, matching, and
copy trials, VR may be an inadequate measure
of visual-spatial memory in many cases. First,
the figures are relatively simple, and most are
symmetrically organized. Consequently, they
are more likely to be encoded verbally (see
Reed, 1974) and they may not sufficiently tax
the upper range of visual-spatial memory capacities (see Palmer, 1977). Second, although the
four or five items in Kaplan's and Hanger's recognition trials may suffice as screening instruments, they may be too few for reliable and sensitive diagnostics.
The Rey-Osterrieth Complex Figure Test
(ROCFT; Osterrieth, 1944; Rey, 1941; Rey &
Osterrieth, 1993), like VR, measures free recall
only. As an advantage over VR, the ROCFT
uses an intricate stimulus that is asymmetrical in
its design. The complexity of this stimulus
seems to better tax the upper range of visual-spatial processing as compared to the VR stimuli. Furthermore, it appears to be more resistant
to verbal mediation (Casey, Winner, Hurwitz, &
DaSilva, 1991). As a product of this complexity,
patterns of fragmentation, neglect, rotation, and
distortion on the ROCFT correspond to some
degree with the location and type of neurological insult (e.g., Binder, 1982; Brouwers, Cox,
Martin, Chase, & Fedio, 1984; Kaplan, 1988;
Lezak, 1983; Milberg, Hebben, & Kaplan, 1986).
The ROCFT administration that is most popular (Knight, Kaplan, & Ireland, 1994; Lezak,
1983) includes a copy and immediate recall trial,
followed 20 to 60 min later by a delayed recall
trial. However, there is neither recognition nor
matching. In the three studies presented here,
recognition and matching trials were developed
to supplement the ROCFT. These trials were
designed to follow the copy, immediate free recall, and delayed free recall trials. This elaborated administration will be called the Extended Complex Figure Test (ECFT). This article describes the design of the ECFT, pilot results and expert review, preliminary standardization with a relatively healthy population, and
preliminary validation with a clinical sample.

Initially, 20 recognition items were designed for
the ROCFT using theory, findings in the literature, and patient records. Figure 1 exemplifies
the format. Each item consisted of a vertical array of five choices: one element from the original ROCFT stimulus and four distractors, or
foils, that contained common and/or clinically
significant errors.

Fig. 1. Sample recognition item featuring an Outer Configurational element in a left-specific design (Left Detail subscale item).

Because of its complexity, the Rey figure can
be processed in a variety of ways. However,
analysis of common errors (e.g., Binder, 1982;
Kaplan, 1988; Lezak, 1983) and analysis of patterns by which people organize the stimulus features while drawing them (Waber & Holmes,
1985, 1986) support the inference that people
tend to perceive and encode the Rey figure according to the 18 units identified by Osterrieth
(Osterrieth, 1944; Rey & Osterrieth, 1993).
The classification of the constructional elements by Waber and Holmes (1985, 1986) further guided the development of the instrument.
Base rectangle (BR) and main substructure (MS)
elements comprised the Global Scale; these included the large rectangle, the diagonal cross,
and the horizontal and vertical midlines. Outer
configuration (OC; e.g., cross at far left, diamond at far right) and internal detail (ID; e.g.,
circle with three dots, five horizontals in upper
left quadrant) elements comprised the Detail
Scale. The items sampled fairly representatively
from the different scorable constructional elements of the complete stimulus figure (Osterrieth, 1944; Rey & Osterrieth, 1993).
Within the Detail Scale, some items were designed to be left-specific for the assessment of
left-side neglect (Figure 1). For these items, the
discriminating features were concentrated on the
left side of the drawing for all of the choices so
that the patient would have to attend to that side
of the page and that side of each figure to discriminate between the choices.
For subjects 13 years of age and older, BR
and MS units are typically drawn first, followed
by the more detailed elements (Milberg et al.,
1986; Waber & Holmes, 1985, 1986). Consequently, for the recognition task, Global elements were presented before Detail elements. In
addition, different constructional elements were
alternated (e.g., BR, MS, BR, MS; OC, ID, OC,
ID) to prevent comparisons across consecutive
items and to limit the extent to which previous
choices could provide cues in successive responses.







As evidence of reliability, an index of internal
consistency for the new instrument indicated
fairly homogeneous content. With regard to construct validity, modest but significant relationships between the two WAIS-R visuospatial
subtests and the recognition task were indicative
of convergence in a common domain (visuospatial functions) yet without complete redundancy. In addition to reliability and validity, this
pilot study showed that most items effectively
discriminated between good and poor performers on the task. Therefore, gross psychometric
indices with a small sample of nonimpaired
adults justified further development of the recognition task.

Using the 20 items described above, Fastenau and
Manning (1992) conducted a pilot study to examine
the psychometric properties of the new recognition
items. The sample consisted of 42 nonimpaired volunteers. Twenty-nine subjects (69%) were hospital employees; 13 volunteers were solicited from an introductory psychology class at a nearby university. Most
of the participants were White (93%) and female
(76%). Age ranged from 18 to 55 years with a mean of
32.5 (SD = 12.2); education varied from 12 to 18
years with a mean of 13.6 (SD = 1.6). The group had
mean peer-equivalent scaled scores of 10 (SD = 2.2,
range 6-15) for the Picture Completion subtest and
10.5 (SD = 2.2, range 6-15) for the Block Design
subtest of the Wechsler Adult Intelligence Scale-Revised (WAIS-R; Wechsler, 1981). A demographic-based formula (Wilson, Rosenbaum, & Brown, 1979)
yielded a mean estimated IQ of 104.5 (SD = 5.0, range
91-116).
Subjects were tested individually. The ROCFT copy
trial was immediately followed by a free recall trial.
During a 20-min delay, subjects completed Picture
Completion and Block Design from the WAIS-R. A
delayed free recall trial for the ROCFT was then administered, followed by the recognition task.

Cronbach's alpha for the recognition task was
0.68 (p < .001). Three items detracted from the
overall reliability; deletion of those three items
would raise alpha slightly (0.70). Recognition
total scores (number correct) correlated moderately with raw scores on Picture Completion (r =
.62, p < .001) and Block Design (r = .60, p <
.001). For Recognition, most corrected item-total correlations ranged from .26 to .60 (p < .05);
four were not significant (r < .200, p > .05).
Recognition scores distributed fairly normally
between 12 and 20 with the exception of two
outliers in the lower tail (scores of 5 and 8) and
a slight ceiling effect. The mean was 15.7 (SD =
3.0) with the outliers and 16.2 (SD = 2.2) without the outliers.
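
The internal-consistency figures reported above (alpha, and the change in alpha when an item is deleted) can be illustrated with a minimal sketch. The function names and the small `toy` response matrix below are hypothetical and are not data from the pilot study; the sketch assumes dichotomous (0/1) item scores and the standard variance-based formula for Cronbach's alpha.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a subjects-by-items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                           # number of items
    item_vars = scores.var(axis=0, ddof=1)        # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of the total score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

def alpha_if_deleted(scores):
    """Alpha recomputed with each item removed in turn."""
    scores = np.asarray(scores, dtype=float)
    return [cronbach_alpha(np.delete(scores, j, axis=1))
            for j in range(scores.shape[1])]

# Hypothetical responses: 6 subjects by 4 items (1 = correct, 0 = incorrect).
toy = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])

print(cronbach_alpha(toy))
print(alpha_if_deleted(toy))
```

An item "detracts" from reliability in this sense when its entry in `alpha_if_deleted` is higher than the alpha for the full set.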


Some revisions were made based on the pilot
results and based on other research. One item
was dropped because the item-total correlation
was negative, indicating that those subjects with
good overall performances tended to fail that
item. Right-specific items were added based on
the results obtained by Ogden (1987). She found
that, when speech impairments could be controlled in studies of neglect, neglect in the right
visual field was as frequent as neglect in the left
visual field, although right neglect tended to be
less severe and less enduring. Because the recognition trial can circumvent language deficits,
it seemed especially important to add right-specific items to the Detail Scale. These revisions
expanded the set to 27 items.
Expert appraisal was solicited for initial evaluation of the instrument's content validity and
for suggestions to improve on its design. Holmes-Bernstein has examined the design qualities
of the ROCFT and has used it extensively with
children (see Waber, Bernstein, & Merola, 1989;
Waber & Holmes, 1985, 1986). She reviewed
the recognition items and suggested additional
modifications, including the addition of some
distractors that would reflect idiosyncrasies in
normal and neurological children (J. H.
Bernstein, personal communication, September
12, 1990). In addition to these revisions, some
original items were modified to better reflect
errors made by the normal pilot group. This was
expected to eliminate the ceiling effect observed
in the pilot study.
The final revised set contained 30 recognition
items, which formed the Total Scale. As further
evidence of content validity, the proportions of
BR items (.10), MS items (.13), OC items (.40),
and ID items (.37) were shown to approximate
the proportions in Osterrieth's criteria used for
scoring copy and recall constructions (.06, .17,
.44, and .33, respectively; Osterrieth, 1944; Rey
& Osterrieth, 1993). Seven items comprised the
Global Scale; 23 items comprised the Detail
Scale. Nine of the 23 detail items were left-specific (Left-Detail Subscale); 11 were right-specific (Right-Detail Subscale).
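
The content-sampling comparison above is simple arithmetic and can be checked with a short sketch. The item counts used here are not stated in the text; they are back-calculated from the reported proportions (30 recognition items, 18 Osterrieth scoring units) and should be read as an assumption that happens to reproduce every reported value.

```python
# Counts are back-calculated from the reported proportions (an assumption).
recognition_counts = {"BR": 3, "MS": 4, "OC": 12, "ID": 11}   # 30 recognition items
osterrieth_counts = {"BR": 1, "MS": 3, "OC": 8, "ID": 6}      # 18 scoring units

def proportions(counts):
    """Share of each element type, rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(n / total, 2) for name, n in counts.items()}

print(proportions(recognition_counts))
print(proportions(osterrieth_counts))

# Global Scale = BR + MS items; Detail Scale = OC + ID items.
global_items = recognition_counts["BR"] + recognition_counts["MS"]
detail_items = recognition_counts["OC"] + recognition_counts["ID"]
print(global_items, detail_items)
```

Under these assumed counts, the Global and Detail totals agree with the 7 and 23 items reported for the two scales.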

Matching items were created from 10 of the 30
recognition items by placing each vertical array
next to a reproduction of the standard (Figure 2).
The matching set included one base rectangle,
one main substructure, four left-detail, and four
right-detail items. All 10 items constituted the
Total Scale; the left-detail items and the right-detail items comprised the Left- and Right-Detail Subscales, respectively. The total administration of the ROCFT and the two supplementary trials will hereafter be referred to as the
Extended Complex Figure Test (ECFT;
Fastenau & Denburg, 1994).


ECFT Recognition Total scores were expected
to distribute normally in this normal sample,
with no ceiling or floor effects. Every item was
expected to correlate positively with the Total
score, indicating that each item discriminates
between good and poor performance overall. In
addition, there was an attempt to achieve a wide
range of item difficulties (easy items to foster a
sense of confidence in less-able learners and
difficult items to challenge the most skillful learners) with a mean and mode approaching .50
(50% of the subjects answering an item incorrectly), the point of maximum discrimination.
Cronbach's alpha reliability coefficients, indices of internal consistency, were expected to be
very good for the Total Scale and moderate for
the other scales and subscales.

Fig. 2. Sample matching item featuring a Main Substructure element.

As evidence of construct validity, ECFT Recognition was expected to correlate positively
with other measures of visual-spatial memory
(convergent validity). Discriminant validity
would be demonstrated by much smaller correlations with non-memory visual-spatial measures.
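
The choice of .50 as the target item difficulty can be illustrated with a brief sketch. The text does not spell out the rationale; the sketch assumes the usual one, namely that a dichotomous item with proportion p failing it has variance p(1 - p), which is largest at p = .50, so such an item has the most room to separate stronger from weaker performers. The difficulty values listed are hypothetical.

```python
def item_variance(p):
    """Variance of a 0/1 item when a proportion p of subjects fails it."""
    return p * (1.0 - p)

# Hypothetical difficulty indices (proportion answering incorrectly).
difficulties = [0.05, 0.25, 0.50, 0.75, 0.95]

for p in difficulties:
    print(p, item_variance(p))

# The difficulty with the largest item variance.
best = max(difficulties, key=item_variance)
print(best)
```

Very easy or very hard items contribute little variance, which is also why a ceiling effect restricts item-total correlations on an easy task.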

ECFT Matching was designed to be a much easier task. Total scores were expected to skew
negatively with a prominent ceiling effect. It
was predicted that difficulty indices would be
very low and that item-total correlations would
be limited by a restriction of range on the Total



scores. Cronbach's alpha was expected to be
good for the Total Scale and modest for the
subscales, limited substantially by the small number of items. ECFT Matching was expected to
correlate highly with another measure of visual-spatial perception, as evidence of convergent
validity. For discriminant validity, lower correlations were expected with visual-spatial memory measures.

The normative sample was comprised of 90 healthy
community-dwelling adults who reported no recent or
active central nervous system conditions and who
lived independently in the community. Volunteers
with uncorrected visual or hearing impairment or with
impaired use of the preferred hand were excluded.
Subjects were solicited from four religious organizations in a midwestern city of 150,000 residents. The
organizations received a monetary contribution for
each of the participants from their group, together
with a bonus for recruiting equal numbers of men and
women from each of ten 5-year age bands (30-34, 35-39, ... 70-74, 75-and-over). This incentive created an
age- and sex-stratified sample; furthermore, because
each organization was equally represented among men
and women and across age groups, potential socioeconomic differences between organizations were unlikely to confound age and sex analyses.
The total normative sample consisted of 38 men
and 52 women. Age ranged from 30 to 80 years with one
88-year-old (M = 55.9, Mn = 54.5, SD = 14.1). The
younger group (ages 30 to 54, n = 47) was 55% female, and the older group (ages 55 and beyond, n =
43) was 61% female. These sex ratios closely approximate the 1990 U.S. census (51% and 57%, respectively; U.S. Department of Commerce, 1990). Education ranged from 8 to 25 years (M and Mn = 15.2, SD =
3.0). In this sample, 97% had at least a high school
education, which is higher than the 75% observed nationally among people over age 25 (U.S. Department
of Commerce, 1990). Among the older adults, 88%
had 12 or more years of schooling, compared to 56%
nationwide (U.S. Department of Commerce, 1990).
Therefore, this sample was more educated than the
average U.S. citizen. Age-corrected WAIS-R Vocabulary scale scores ranged from 5 to 19 (M and Mn = 12.5, SD
= 2.3).
A structured interview assessed the past history of
potentially confounding health conditions. It addressed the following conditions (percent of the sample
with a positive history): closed-head injury with loss
of consciousness (16%), unexplained loss of consciousness (@%), cerebrovascular disease (4%), hydrocephalus (0%), seizures (1%), intracranial surgery
(1%), hypertension (31%, all well-controlled), coronary artery disease (18%), diabetes (10%, all well-controlled), pulmonary disease (10%), renal disease
(16%), and hepatic disease (2%).
Levels of depression were assessed using the Beck
Depression Inventory (BDI; Beck, 1978); the mean
and median scores were well within normal limits (5.3
and 4, respectively). Eight percent of the sample
scored in the mildly depressed range, and 2% scored
in the moderate to severe range; these percentages
correspond with national incidence rates, indicating
that the sample is representative on this dimension.
Levels of reactive and chronic anxiety were assessed
using the State-Trait Anxiety Inventory (Spielberger,
Gorsuch, Lushene, Vagg, & Jacobs, 1983); summary
indices for State scores (M = 30.8, Mn = 29, SD = 8.8)
and for Trait scores (M = 32.8, Mn = 31.5, SD = 8.9)
were virtually identical to those for the standardization sample.

Table 1 provides the descriptive data for all of the
measures analyzed in this study. The battery included
the Wechsler Adult Intelligence Scale-Revised Vocabulary subtest (Wechsler, 1981) and Judgment of
Line Orientation Test (JOLO; Benton, Hamsher, Varney, & Spreen, 1983). Wechsler Memory Scale-Revised (WMS-R; Wechsler, 1987) Visual Reproductions (VR) Immediate and Delayed Recall were also
included, followed by a copy trial for those same stimuli (Fastenau & Sloan, 1993).
The ECFT trials were administered in the following order: Copy, Immediate Recall (no latency after
the copy trial), Delayed Recall (20-min latency), Recognition, and Matching. Scoring criteria for ECFT
drawings were modeled after the WMS-R VR scoring
(Wechsler, 1987). Interrater reliability (two raters) on
23 sets of drawings that spanned a wide range of ability was good for the copy drawings (Pearson product-moment r = .90) and very good for immediate and
delayed recall drawings (r = .97 for both).
All subjects were tested individually. Most subjects
completed the exam in one session; several older subjects required two sessions for optimal testing. The
testing for all subjects consisted of two segments of
cognitive testing (each lasting 50 to 75 min), separated by a break during which they completed the
emotional inventories. The ECFT was administered in
one segment, and the WMS-R was administered in the
other segment. The order of the segments (ECFT first
or WMS-R first) was counterbalanced; subjects were
assigned to the two segment conditions blindly, stratified within each age-sex cell.



Table 1. Demographic and Test Data, by Group, and T Tests Comparing Patients to Controls.

[Table values lost in extraction. Surviving column headings: (n = 90), (n = 34), (n = 34). Surviving row labels: Age (Years); Education (Years); % Female; % Left-Handed; Test Scores: ECFT Delayed Recall, ECFT Recognition, ECFT Matching, VR Copy, VR Delayed.]

Note. T tests were two-tailed for demographics and one-tailed for test scores. ECFT = Extended Complex Figure
Test; VR = WMS-R Visual Reproductions; JOLO = Judgment of Line Orientation.

The analyses in this study were conducted using
SPSS (SPSS, Inc., 1990). Because of the relatively
large number of significance tests performed on
this data set, precautions were taken to control for
alpha inflation. Hypotheses were clearly articulated at the outset of the study. Also, a conservative alpha was adopted: Results significant at .05
were regarded as trends; only results with p < .01
were considered to be reliable.

ECFT Recognition
Total scores distributed normally. Descriptives
are presented in Table 2; where age or sex effects neared significance (p < .05), the results
were stratified accordingly.
Point-biserial correlations between each item
and the Total score were corrected by partialling
out the variance in the Total score that was due
to the item itself. All coefficients were positive:
26 were significant at p < .01 (rs = .24 to .60);
two approached significance (r = .22 and .18; p
< .05); and two others were not statistically significant (r = .12 and .09; p > .05). Item-difficulty indices (percent of the sample that responded incorrectly) ranged from 3% to 85% and
distributed roughly normally. The mean item
difficulty for the 30 items was 46.5%.
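
The two item statistics described above can be sketched as follows. The sketch assumes that "partialling out" the item's own contribution is equivalent to the common item-rest approach, in which each item is correlated with the total of the remaining items; the text does not state the exact computation. The `toy` matrix and function names are hypothetical. An item that everyone answers correctly has zero variance, so its correlation is undefined and is returned as NaN.

```python
import numpy as np

def item_difficulty(responses):
    """Proportion of subjects answering each item incorrectly."""
    responses = np.asarray(responses, dtype=float)
    return 1.0 - responses.mean(axis=0)

def corrected_item_total(responses):
    """Correlation of each 0/1 item with the total of the other items."""
    responses = np.asarray(responses, dtype=float)
    total = responses.sum(axis=1)
    correlations = []
    for j in range(responses.shape[1]):
        item = responses[:, j]
        rest = total - item            # total score with this item removed
        if item.std() == 0 or rest.std() == 0:
            correlations.append(float("nan"))   # undefined with zero variance
        else:
            correlations.append(float(np.corrcoef(item, rest)[0, 1]))
    return correlations

# Hypothetical responses: 6 subjects by 5 items; the last item is passed by all.
toy = np.array([
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
])

print(item_difficulty(toy))
print(corrected_item_total(toy))
```

Because the items are dichotomous, the Pearson coefficient computed here is the point-biserial correlation.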
Correlations among scales and subscales are
presented in Table 3. The Global Scale and Detail Scale correlated moderately but not perfectly (.75, corrected for unreliability). The
Left- and Right-Detail Subscales correlated almost perfectly with one another (.99, corrected). Cronbach's alpha reliabilities are presented on the
diagonal. Within the scales and subscales, no
item detracted from any of the reliabilities except for one item on the Left Detail Subscale,
and that alpha was reduced by less than .02.
Odd-even and split-half reliability coefficients
were similar to alpha (.81 and .78, respectively)
after the Spearman-Brown correction for length-related attenuation.
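
Two corrections are used in the paragraph above: the correction for attenuation due to unreliability and the Spearman-Brown correction for test length. Both can be sketched with the standard textbook formulas, which are assumed here to be the ones applied. Function names are hypothetical. As a rough back-calculation (not a reported value), a corrected Global-Detail correlation of .75 with alphas of .61 and .81 would correspond to an observed correlation near .53.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Correlation corrected for unreliability in both measures."""
    return r_xy / math.sqrt(rel_x * rel_y)

def spearman_brown(r_half, factor=2.0):
    """Projected reliability when test length is multiplied by `factor`."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

# Back-calculated illustration for the Global and Detail Scales.
print(disattenuate(0.527, 0.61, 0.81))

# Split-half correlation stepped up to full test length (hypothetical input).
print(spearman_brown(0.50))

# With low reliabilities, the corrected value can exceed 1.00
# (hypothetical inputs).
print(disattenuate(0.40, 0.30, 0.35))
```

The last call shows why a disattenuated correlation between two short, unreliable subscales can come out above 1.00: the denominator shrinks as the reliabilities fall.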
ECFT Recognition scores were correlated
with other measures of visual memory and with
measures of visual perception (Table 4). After
correcting for attenuation due to imperfect reliability, ECFT Recognition Total scores correlated positively and highly with ECFT Delayed
Recall and with VR Delayed Recall (.81 and .65,
respectively). Correlations with immediate trials
of ECFT and VR were virtually redundant with
those with delayed trials and, therefore, were not
tabulated. ECFT Recognition correlated with
ECFT Matching, ECFT Copy, and VR Copy (.26
to .46) more weakly than it correlated with memory measures (t > 2.37, p < .01, one-tailed), as determined using the t test for dependent correlations
(Cohen & Cohen, 1983). However, the correlation
of ECFT Recognition with JOLO was not significantly weaker than its correlation with VR Delayed Recall (t = 1.238, p > .05, one-tailed).

Table 2. ECFT Recognition Raw Scores.

[Table values lost in extraction. Surviving row labels: Total Scale (a); Detail Scale (a); Right Detail Subscale (c); Left Detail Subscale (a); Global Scale (b). One cell fragment survives: 5.0.]

Note. Sample sizes were: Total, 88; Younger, 47; Older, 41; Men, 37; Women, 51; Younger Men, 21; Younger
Women, 26; Older Men, 16; Older Women, 25.
a Age and sex effects (p < .05).
b No age, sex effects (p > .05).
c Age effect only (p < .05).

Table 3. Correlations Among ECFT Scales and Subscales by Trial.

[Table values lost in extraction. Recognition Trial rows: Global (GLO), Detail (DET), Left Detail (L-DET), Right Detail (R-DET), Total (TOT). Matching Trial rows: Left Detail (L-DET), Right Detail (R-DET), Total (TOT). Two cell fragments survive from the Recognition panel: .91 and .81.]

Note. Diagonal values (italicized) are Cronbach's alpha reliabilities. Values above the diagonals are corrected for
attenuation due to unreliability in the scales; values below the diagonal are not corrected. All correlations are
significant (p < .0005, one-tailed). GLO = Global Scale, DET = Detail Scale, L-DET = Left-Detail Subscale, R-DET = Right-Detail Subscale, TOT = Total Scale.

ECFT Matching
Total scores were negatively skewed. They correlated negatively with age (r = -.25, p = .01),
but not with sex or the interaction term (p > .05).
Scores for the total sample ranged from 6 to 10
(M = 9.4, Mn = 10.0, SD = 1.0). Scores for the
younger group ranged from 7 to 10 (M = 9.6, Mn
= 10.0, SD = 0.8), whereas scores for the older
group ranged from 6 to 10 (M = 9.3, Mn = 10.0,
SD = 1.2).
Corrected item-total correlations for the seven items with nonzero variance were positive and significant, ranging from .24 (p < .01)
to .41 (p < .001). Three items were answered
correctly by everyone in the sample, resulting in
zero variance. Difficulty indices were very low
(ranging from 0% to 18%).
Cronbach's alphas are presented in Table 3.
The Total Scale had a modest inter-item reliability (.58), suppressed in part by three items with
zero variance. None of the other items detracted
from the Total Scale alpha coefficient. Subscale
reliabilities were limited by too few items and
by items with zero variance.

Correlations among scales and subscales can
be found in Table 3. Correlations between the
Total Scale and the other scales are very large,
of course, because each subscale is nested
within the Total Scale. The correlation (corrected for attenuation) between the two unnested
subscales, Left Detail and Right Detail, exceeds
1.00 because of their unreliability.
The ECFT Matching Total score correlated
highly with JOLO (.68) and with perception-intensive copy trials (.74 to .90). With one exception, all of ECFT Matching's correlations with
memory measures (.26 to .64) showed at least a
trend toward being significantly smaller than all
of its correlations with perception measures (VR
Copy vs. VR Delayed, t > 1.66, p < .05, one-tailed; other comparisons, t > 2.37, p < .01, one-tailed). Only the comparison between Matching's
correlation with JOLO and its correlation with
VR Delayed failed to reach significance (t =
0.59, p > .05, one-tailed).
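
The comparisons above test whether one measure correlates more strongly with a second measure than with a third in the same sample. The text cites Cohen and Cohen (1983) for the test but does not give the formula. The sketch below uses one widely used form, Williams's t with n - 3 degrees of freedom; treating it as the computation actually applied in this study is an assumption. The input correlations are hypothetical, not values from Table 4.

```python
import math

def williams_t(r12, r13, r23, n):
    """t (df = n - 3) for H0: r12 == r13, where variable 1 is shared.

    r12, r13: correlations of the shared variable with variables 2 and 3.
    r23: correlation between variables 2 and 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1.0 - r12**2 - r13**2 - r23**2 + 2.0 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2.0
    numerator = (n - 1) * (1.0 + r23)
    denominator = (2.0 * ((n - 1) / (n - 3)) * det
                   + r_bar**2 * (1.0 - r23) ** 3)
    return (r12 - r13) * math.sqrt(numerator / denominator)

# Hypothetical example: a test correlates .65 with a perception measure and
# .40 with a memory measure; those two measures correlate .45; n = 88.
t_value = williams_t(0.65, 0.40, 0.45, 88)
print(t_value)
```

The resulting t would be compared with a one-tailed critical value at n - 3 degrees of freedom, matching the one-tailed tests reported in the text.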



Table 4. Correlations Among ECFT Trials and Related Measures.

[Table values lost in extraction. Measures: 1. ECFT Recognition; 2. ECFT Delayed; 3. VR Delayed Recall; 4. ECFT Matching; 5. (label missing; JOLO per the abbreviations in the note); 6. ECFT Copy; 7. VR Copy. One cell fragment survives: .61.]

Note. Diagonal values (italicized) are Cronbach's alpha reliabilities. Values above the diagonals are corrected for
attenuation due to unreliability in the measures; values below the diagonal are not corrected. ECFT = Extended
Complex Figure Test; VR = WMS-R Visual Reproductions; JOLO = Judgment of Line Orientation.
* p < .05. ** p < .01. *** p < .005. All other correlations, p < .0005, one-tailed.


ECFT Recognition
Predictions regarding the distribution of Total
scores and regarding item characteristics were
completely supported. Total scores distributed
normally; they correlated significantly with age,
so descriptives were tabulated for younger and
older subjects. Corrected item-total correlations
showed that all but two items effectively discriminated between higher and lower performance on the Total score. Item-difficulty indices
reflected that the test samples a broad range of
ability levels so that there were items within virtually everyone's capability and items to challenge people with even very good memory abilities. Yet, the set of items converges on an item
difficulty of 50%, where discrimination is maximized.
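The two item statistics discussed above can be sketched as follows, assuming item responses are scored 0 (incorrect) or 1 (correct). Item difficulty is taken here as the percentage of subjects answering incorrectly, and the corrected item-total correlation relates each item to the total of the remaining items. The function names and the toy data are hypothetical; items with zero variance return None because no correlation can be computed for them.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson r, or None when either variable has zero variance."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return None
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def item_analysis(responses):
    """responses: one list of 0/1 item scores per subject.

    Returns a list of (difficulty_percent, corrected_item_total_r) per item.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Remove the item from the total so it does not correlate with itself.
        rest = [t - i for t, i in zip(totals, item)]
        difficulty = 100 * (1 - mean(item))  # percent answering incorrectly
        results.append((difficulty, pearson(item, rest)))
    return results


# Toy data: 4 subjects x 3 items (hypothetical).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [1, 0, 0]]
for difficulty, r in item_analysis(toy):
    print(difficulty, r)
```

For a dichotomous item, variance is p(1 - p), which peaks at p = .50; this is why items near 50% difficulty leave the most room for discrimination.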
Reliability hypotheses were also well supported. The ECFT Recognition Total Scale had
very homogeneous content. Subscale alphas
ranged from moderate to high, in rough relation
to the number of items on each scale. Each of
the items contributed substantially to the integrity of its host scale. Correlations among scales
and subscales supported the uniqueness of the
Global and Detail Scales, but the Left Detail and
Right Detail Subscales did not measure any
unique variance in this healthy sample. These
findings were expected because Global-Detail
distinctions have been observed elsewhere with
nonneurological (albeit much younger) samples
(Waber et al., 1989; Waber & Holmes, 1985,
1986). By contrast, left and right hemi-inattention syndromes are rarely observed in healthy adults.
As predicted, preliminary evidence of validity
was obtained. As an index of convergent validity, ECFT Recognition correlated highly with
ECFT free recall and moderately with VR free
recall. Discriminant validity can be inferred
from weaker relationships between Recognition
and non-memory trials of the ECFT and VR,
although the correlation with JOLO was stronger than expected.

ECFT Matching
As predicted, Total scores were negatively
skewed. They declined with age, so descriptives
were provided for younger and older subjects.
Although the relationship with age was statistically significant, the actual raw score differences were minimal (approximately one-quarter
point). The Total sample descriptives should
serve the clinician well for preliminary norms.
Predictions regarding item characteristics
were supported. Seven coefficients were positive and moderate, indicating that the items effectively discriminated between higher and
lower performance on the task as a whole. Three
item-total correlation coefficients could not be
computed due to zero variance for those items.
Item-difficulty indices showed that, unlike
ECFT Recognition, ECFT Matching was very
easy for most healthy adults and did not discriminate among these individuals very well. Data
comparing clinical samples with healthy samples and comparing patients with different types
or degrees of impairment will be necessary to
demonstrate the utility of this measure for discriminating perceptual performance in lower
ranges of ability.
The Total Scale had modest internal consistency, suppressed in part by three items with
zero variance; all of the remaining items contributed to the integrity of the scale. The Left- and Right-Detail Subscales were not expected to
be reliable due to the very small number of items
on each subscale and due to the infrequency of
hemispace inattention among healthy adults.
This trial will need to be administered to a clinical sample, where more variability can be expected, in order to examine the reliabilities of
the scale and subscales more fully.
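The link between scale length and reliability noted above is commonly described by the Spearman-Brown prophecy formula. The sketch below applies that formula to hypothetical numbers only; it is not an analysis reported in the source, and it assumes that added items would be parallel to the existing ones.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a scale is lengthened (or shortened)
    by length_factor, assuming the added items are parallel to the old ones."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a short scale with reliability .50,
# doubled and then tripled in length.
for factor in (1, 2, 3):
    print(factor, round(spearman_brown(0.50, factor), 3))
```

Under these assumptions a subscale with only a handful of items would be expected to show lower alpha than the full scale, independent of any clinical characteristics of the sample.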
As expected, convergent and discriminant
validity were successfully demonstrated. There
was convergence between Matching and an established measure of visual-spatial perception
and with perception-intensive construction measures. As evidence of discriminant validity, correlations with memory measures were considerably lower, with the exception of a relatively
strong relationship with VR Delayed.


With regard to criterion-related validity, concurrent validation is considered the most appropriate evidence for diagnostic tests (American Psychological Association, 1985; Anastasi, 1982).
Consequently, the ECFT was administered to a
group of patients with intractable epilepsy. Intractable epilepsy of various etiologies frequently interferes with memory efficiency. Therefore,
it was predicted that a group of matched controls
would exceed patients on ECFT Recognition.
Although epilepsy has not been shown to produce
discrete or profound deficits in perceptual
abilities, it was expected that ECFT Matching
would nonetheless grossly differentiate neurological patients from nonneurological patients.


The clinical sample consisted of 34 patients who
were undergoing evaluation for surgical treatment of
intractable epilepsy. Data on the localization of the
foci were not available for this study. A subset of the
previously described standardization sample was
matched to the patient sample on age, education, sex,
and handedness. The descriptives and t-test values
(two-tailed) are presented in Table 1.

The patients were administered the ECFT as part of a
more comprehensive clinical test battery. Most were
tested presurgically; for 6 patients, however, only
postsurgical data were available. The testing of
matched controls is described in the standardization
study above.

The analyses in this study were conducted using
SPSS 6.1 (SPSS, Inc., 1994). Predictions were
directional, so one-tailed tests were used. Results
at p < .05 were considered reliable.
ECFT Recognition
In this mixed sample, corrected item-total correlations were positive: 26 were significant at p <
.05 (rs = .24 - .61); four were not statistically
significant (rs = .06 - .18; p > .05). Item-total
correlations were similar when each group was
analyzed separately. Item-difficulty indices
(percent of the sample that responded incorrectly) ranged from 9% to 81% and were distributed roughly normally. These ranged from 15% to
82% for patients and from 3% to 82% for controls.
The mean item difficulty for the 30 items was
46.0% for the mixed sample, with the items proving slightly
more difficult for patients (51.1% for patients,
41.9% for controls). Cronbach's alpha reliability
was .83-.84 when the groups were analyzed separately and .84 when they were analyzed
together. None of the items detracted from alpha
for either group.
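Cronbach's alpha and the check that no item "detracts from alpha" can be illustrated with a short sketch. It assumes a subjects-by-items matrix of 0/1 scores; the function names and the toy data are hypothetical and do not reproduce the study's data or its SPSS procedure.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """responses: one list of item scores per subject (at least 2 items)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(responses):
    """Alpha recomputed with each item removed in turn."""
    k = len(responses[0])
    return [
        cronbach_alpha([row[:j] + row[j + 1:] for row in responses])
        for j in range(k)
    ]


# Toy data: 4 subjects x 3 items (hypothetical).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
full_alpha = cronbach_alpha(toy)
print(round(full_alpha, 3))
# An item "detracts from alpha" if alpha rises when that item is removed.
print([round(a, 3) for a in alpha_if_item_deleted(toy)])
```

Note that an item answered identically by every subject contributes zero variance, which is one way a ceiling effect can hold alpha down.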

Comparisons of the distributions of the two
groups showed considerable overlap in scores.
Nonetheless, the patients scored slightly, and
significantly, lower than did matched controls.
The difference between the two groups is perhaps best reflected by the patients' median and
modal score of 14 in comparison to a median of
18 and mode of 21 for the matched controls. The
other descriptives and t-test values (one-tailed)
are presented in Table 1.

ECFT Matching
In the mixed sample, corrected item-total correlations were positive: Six were significant at p <
.05 (rs = .25 - .49); two were not statistically
significant (rs = .04 - .13; p > .05); two had zero
variance (answered correctly by all). Analyzed
by subgroup, similar results were obtained from
the patient group; for the controls, five items
were statistically significant while the other five
had zero variance. Item-difficulty indices ranged
from 0% to 22%, and their distribution was negatively skewed. The mean item difficulty for the
10 items was 9.4% (9.9% for patients, 5.4% for
controls). Cronbach's alpha reliability was .58
for the patient group and for the combined
group; the control group yielded an alpha of .47,
suppressed in part because half of the items were
answered correctly by all controls. None of the
items detracted significantly from alpha for either group.
Similar to the recognition trial, comparisons
of the matching trial distributions of the two
groups showed considerable overlap in scores.
The patients scored significantly lower than
matched controls, but the difference in scores was
smaller than that which could be detected clinically (one-half point). The median and mode for
both groups were 10, a perfect performance. The
other descriptives and t-test values (one-tailed)
are presented in Table 1.

For the patient sample, item characteristics and
alpha reliability coefficients were virtually identical to those obtained from the matched control
group and the standardization sample. As one
exception, item-difficulty indices were higher
for patients. Comparisons between patients and
matched controls showed both trials to be sensitive to the diffuse disruption of intractable epilepsy, although there was considerable overlap
between patient and control performances, especially for the matching task.

In this project, recognition and matching trials
were designed to supplement the Rey-Osterrieth
Complex Figure Test. Pilot test results were
used to refine the original test items. Data from
an age- and sex-stratified sample of relatively
healthy community-dwelling adults were used to
describe the psychometric properties of the instrument and to generate preliminary norms,
although it should be noted that this sample was
better educated than the average U.S. citizen.
For ECFT Recognition, reliability was observed
in the form of internal consistency. Content validity was supported by initial expert review and
by comparisons to Osterrieth's scoring system.
Construct validity was demonstrated by convergence with similar measures and by discrimination from dissimilar measures. As preliminary
evidence of concurrent validity, patients as a group scored lower than did their matched controls.
Psychometric evidence on ECFT Matching
was less impressive, perhaps due to the small
number of items on the measure. The ceiling
effect on the matching trial also limited the psychometric characteristics, especially with 30%
of the items being answered correctly by everyone in the sample. The distribution of scores did
not become any less skewed when patient data
were analyzed, and reliabilities remained modest. Nonetheless, ECFT Matching did correlate
highly with other perception tasks, while correlating less strongly with memory trials. Also, it did discriminate between patients and controls in the
present study, even if the differences were subtle.
Other forms of reliability and validity were
not demonstrated here. Test-retest reliability is
usually an important index, showing the degree
of temporal stability. For the ECFT, this may be
inappropriate because of the incidental nature of
the administration. That is, subjects are not told
that they are to remember the figure until after
the stimulus has been removed. Upon presentation of the stimulus in the second testing, subjects will likely approach the task differently
and encode information intentionally during the
copy trial. Another form of reliability, interscorer agreement, is less relevant to the new trials presented here because they use an objectively scored multiple-choice format.
With regard to validity, formal testing of the
construct validity of ECFT Recognition and
Matching via confirmatory factor analysis
(CFA) would be especially valuable. CFA could
be used to further verify that Recognition and
Matching assess memory and perceptual acuity,
respectively. In addition, with CFA one could
test whether the ECFT is a measure of visual-spatial functioning, as distinct from verbal functioning. Finally, CFA could be applied to confirm the a priori assignment of the recognition
items onto their respective scales and subscales.
The size of the present sample was insufficient
for those analyses. In addition, more compelling
evidence of concurrent validation is warranted.
More elaborate clinical trials for ECFT Recognition and ECFT Matching with different diagnostic groups are needed to examine diagnostic sensitivity.

American Psychological Association. (1985). Standards for educational and psychological testing.
Washington, DC: Author.
Anastasi, A. (1982). Psychological testing (5th ed.).
New York: Macmillan.
Beck, A. T. (1978). Beck Depression Inventory. San
Antonio, TX: Psychological Corp.
Benton, A. L. (1974). Revised Visual Retention Test
(4th ed.). New York: Psychological Corp.
Benton, A. L., Hamsher, K. deS., Varney, N. R., &
Spreen, O. (1983). Contributions to neuropsychological assessment: A clinical manual. New York:
Oxford University Press.
Binder, L. M. (1982). Constructional strategies on
Complex Figure drawings after unilateral brain
damage. Journal of Clinical Neuropsychology, 4,
51-58.
Brouwers, P., Cox, C., Martin, A., Chase, T., & Fedio,
P. (1984). Differential perceptual-spatial impairment in Huntington's and Alzheimer's dementias.
Archives of Neurology, 41, 1073-1076.
Casey, M. B., Winner, E., Hurwitz, I., & DaSilva, D.
(1991). Does processing style affect recall of the
Rey-Osterrieth or Taylor Complex Figures? Journal of Clinical and Experimental Neuropsychology, 13, 600-606.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral
sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Domitrovic, L. A., Denburg, N. L., & Fastenau, P. S.
(1995, February). Recognition and matching trials
for WMS-R Visual Reproductions: Psychometric
properties. Paper presented at the meeting of the
International Neuropsychological Society, Seattle, WA.
Fastenau, P. S., & Denburg, N. L. (1994, February).
Reliability and validity of the Extended Complex
Figure Test (ECFT). Paper presented at the meeting of the International Neuropsychological Society, Cincinnati, OH.
Fastenau, P. S., & Manning, A. A. (1992, February).
Development of a recognition task for the Complex
Figure Test. Paper presented at the meeting of the
International Neuropsychological Society, San
Diego, CA.
Fastenau, P. S., & Sloan, L. E. (1993, February). Copy
trial for the Wechsler Memory Scale-Revised Visual Reproductions subtest: Normative study. Paper presented at the meeting of the International
Neuropsychological Society, Galveston, TX.
Hanger, P. A., Montague, J. R., & Smith, M. (1991,
February). A recognition memory test for the Visual Reproduction subtest of the Wechsler Memory
Scale-Revised. Paper presented at the annual meeting of the International Neuropsychological Society, San Antonio, TX.
Hemsley, D. (1974). Relationship between two tests
of visual retention. Perceptual and Motor Skills,
39, 1132-1134.
Kaplan, E. (1988). A process approach to neuropsychological assessment. In T. Boll & B. K. Bryant
(Eds.), Clinical neuropsychology and brain function: Research, measurement, and practice (pp.
125-167). Washington, DC: American Psychological Association.
Kimura, D. (1963). Right temporal-lobe damage. Archives of Neurology, 8, 264-271.
Knight, J. A., Kaplan, E. F., & Ireland, L. D. (1994,
February). Survey findings of Rey-Osterrieth Complex Figure use among the INS membership. Paper
presented at the meeting of the International Neuropsychological Society, Cincinnati, OH.
Lezak, M. D. (1983). Neuropsychological assessment
(2nd ed.). New York: Oxford University Press.

Milberg, W. P., Hebben, N., & Kaplan, E. (1986). The
Boston Process Approach to neuropsychological
assessment. In I. Grant & K. M. Adams (Eds.),
Neuropsychological assessment of neuropsychiatric disorders (pp. 65-86). New York: Oxford University Press.
Moss, M. B., Albert, M. S., Butters, N., & Payne, M.
(1986). Differential patterns of memory loss
among patients with Alzheimer's disease, Huntington's disease, and alcoholic Korsakoff's syndrome.
Archives of Neurology, 43, 239-246.
Ogden, J. A. (1987). The neglected left hemisphere and its contribution to visuospatial neglect.
In M. Jeannerod (Ed.), Neurophysiological and
neuropsychological aspects of spatial neglect (pp.
215-233). Amsterdam: Elsevier Science.
Osterrieth, P. A. (1944). Le test de copie d'une figure
complexe. Archives of Psychology (Chicago), 30,
Palmer, S. E. (1977). Hierarchical structure in perceptual representation. Cognitive Psychology, 9, 441-474.
Reed, S. K. (1974). Structural descriptions and the
limitations of visual images. Memory and Cognition, 2(2), 329-336.
Rey, A. (1941). L'examen psychologique dans les cas
d'encéphalopathie traumatique. Archives of Psychology (Chicago), 28, 286-340.
Rey, A., & Osterrieth, P. A. (1993). Translations of
excerpts from André Rey's "Psychological examination of traumatic encephalopathy" and P. A.
Osterrieth's "The complex figure copy test" (J.
Corwin & F. W. Bylsma, Trans.). Clinical Neuropsychologist, 7, 3-21. (Original works published in
1941 and 1944, respectively).
Spielberger, C. D., Gorsuch, R. L., Lushene, R.,
Vagg, P. R., & Jacobs, G. A. (1983). Manual for
the State-Trait Anxiety Inventory. Palo Alto, CA:
Consulting Psychologists Press.
SPSS, Inc. (1990). Statistical Package for the Social
Sciences: Release 4.1 (Mainframe version, IBM
VM/CMS) [Computer program]. Chicago, IL: Author.
SPSS, Inc. (1994). SPSS 6.1 for Windows [Computer
software]. Chicago, IL: Author.
Squire, L. R. (1986). The neuropsychology of memory dysfunction and its assessment. In I. Grant &
K. M. Adams (Eds.), Neuropsychological assessment of neuropsychiatric disorders (pp. 268-299).
New York: Oxford University Press.
U.S. Department of Commerce. (1990). 1990 census
of population: Social and economic
characteristics--United States. Washington, DC:
Waber, D. P., Bernstein, J. H., & Merola, J. (1989).
Remembering the Rey-Osterrieth Complex Figure:
A dual-code, cognitive neuropsychological model.
Developmental Neuropsychology, 5(1), 1-15.
Waber, D. P., & Holmes, J. M. (1985). Assessing children's copy productions of the Rey-Osterrieth
Complex Figure. Journal of Clinical and Experimental Neuropsychology, 7, 264-280.
Waber, D. P., & Holmes, J. M. (1986). Assessing children's memory productions of the Rey-Osterrieth
Complex Figure. Journal of Clinical and Experimental Neuropsychology, 8, 563-580.
Warrington, E. K. (1984). Recognition Memory Test
manual. Windsor, Berkshire: NFER-NELSON
Publishing Company.
Wechsler, D. (1981). Wechsler Adult Intelligence
Scale-Revised manual. New York: Psychological Corporation.
Wechsler, D. (1987). Wechsler Memory Scale-Revised
manual. San Antonio, TX: Psychological Corporation.
Wilson, R. S., Rosenbaum, G., & Brown, G. (1979).
The problem of premorbid intelligence in neuropsychological assessment. Journal of Clinical Neuropsychology, 1, 49-53.
Zubrick, S., & Smith, A. (1978, February). Factors
affecting BVRT performance in adults with acute
focal cerebral lesions. Paper presented at the meeting of the International Neuropsychological Society, Minneapolis, MN.