
Journal of Educational Psychology 2000, Vol. 92, No. 2, 377-390

Copyright 2000 by the American Psychological Association, Inc. 0022-0663/00/$5.00 DOI: 10.1037//0022-0663.92.2.377

Assessment of Working Memory in Six- and Seven-Year-Old Children


Susan E. Gathercole and Susan J. Pickering
University of Bristol

Using cognitive methods to fractionate functionally distinct aspects of short-term memory, a test battery was designed to tap the A. D. Baddeley and G. Hitch (1974) model of working memory. The battery was administered to 87 children aged 6 and 7 years; other standardized attainment measures were obtained for a subgroup of the children concurrently and 1 year later. Correlations between subtest scores indicated high construct validity for the central executive and phonological loop measures, although not for the visuospatial measures. Central executive scores shared unique associations with performance on the vocabulary, literacy, and arithmetic tests, whereas phonological loop scores shared specific associations with vocabulary knowledge only. The substantial internal and external validity of the test battery establishes its suitability for guiding detailed theoretical analyses of working memory function.

One of the objectives of our recent research program was to develop a battery of tests capable of providing a fine-grained theoretical analysis of working memory function of children and adults. The test battery has now been administered to a large cohort of British 6- and 7-year-old children. In this article, we report data relating both to the internal validity of the test battery in terms of the patterns of interrelationships between test scores and to its external validity with respect to associations between test scores and children's achievements of standardized measures of scholastic and intellectual ability. The choice of the individual tests in the working memory test battery was guided largely by the specific theoretical account of short-term memory offered by Baddeley and Hitch's (1974) model of working memory and by the large body of research that this model has stimulated (for reviews, see Baddeley, 1986; Gathercole & Baddeley, 1993). In brief, the working memory model comprises three principal components: the central executive and two slave systems known as the phonological loop and the visuospatial sketchpad. Many roles have been ascribed to the central executive, including the deployment of flexible strategies for the storage and retrieval of information, control of the flow of information through working memory, the retrieval of knowledge from long-term memory, the control of action, planning, and the scheduling of multiple concurrent cognitive activities (e.g., Baddeley, 1986, 1996; Baddeley, Emslie, Kolodny, & Duncan, 1998; Baddeley & Hitch, 1974). Whether or not the central executive is a unitary system is

Susan E. Gathercole and Susan J. Pickering, Department of Experimental Psychology, University of Bristol, Bristol, England. This research was supported by a program grant from the Medical Research Council of Great Britain. We thank Simon Lloyd and Melanie Hall for their invaluable assistance in collecting data for this study. Thanks also go to the children and staff of Parson Street Primary School, Broomhill Infant School, and St. Michael on the Mount Church of England Primary School in Bristol, England for their time and assistance in this study. Correspondence concerning this article should be addressed to Susan E. Gathercole, Department of Experimental Psychology, University of Bristol, 8 Woodland Road, Bristol BS8 1TN, England. Electronic mail may be sent to sue.gathercole@bristol.ac.uk.

open to debate, but it is notable that a common feature of all of the putative roles of the central executive is that they are fueled by a resource that can be flexibly deployed but whose capacity is limited. This particular aspect of the central executive aligns closely with that of the parallel working memory framework advanced originally by Daneman and Carpenter (1980) and developed by these researchers and their collaborators in subsequent years (e.g., Daneman & Carpenter, 1983; Daneman & Merikle, 1996; Just & Carpenter, 1992; Turner & Engle, 1989). According to this view, a limited resource is available to support working memory, and this can be flexibly used to support storage or processing activities in a variety of domains. This notion has also been used as a basis for an account of the development of verbal short-term memory by Case, Kurland, and Goldberg (1982), who proposed that the total resource available to working memory does not change as children grow older but instead that processing efficiency increases and so releases more resource to support memory storage in older age groups.

Whereas the central executive is flexible and domain-independent in nature, the two slave systems are highly specialized for the processing and maintenance of materials within constrained domains. The phonological loop consists of a short-term store that retains verbal material in terms of its phonological characteristics and is subject to rapid decay. Decay of representations in the store can, however, be offset by a serial subvocal rehearsal process that may be articulatory in nature (Vallar & Baddeley, 1984). Developmentally, the phonological store appears to be in place by about 3 years of age (Ford & Silber, 1994; Gathercole & Adams, 1993), but the strategic rehearsal process does not emerge typically until about 7 years of age (for review, see Gathercole & Hitch, 1993).
The current model of the visuospatial sketchpad is less well-advanced than the phonological loop model described previously. There is, however, accumulating evidence that it is composed of two separable subcomponents, one of which is visual and one of which is spatial (Della Sala, Gray, Baddeley, Allamano, & Wilson, 1999; Farah, Hammond, Levine, & Calvanio, 1988; Hanley, Young, & Pearson, 1991; Logie, 1986, 1995; Logie & Pearson, 1997). Support for visuospatial short-term memory appears to be heavily dependent upon the central executive in at least some contexts (Miles, Morgan, Milne, & Morris, 1996; Phillips & Christie, 1977; Wilson, Scott, & Power, 1987).

For the present purposes, the term working memory is reserved to refer only to the current form of the original Baddeley and Hitch (1974) model of short-term memory described earlier, with its characteristic tripartite structure. However, the same term is frequently used by other theorists to denote a mental work space having limited resources that can be flexibly deployed in a variety of processing and storage activities (e.g., Daneman & Carpenter, 1980; Just & Carpenter, 1992). The working memory model adopted here locates this combined processing and storage aspect of short-term memory within the central executive, and indeed in developing the present test battery, we have been strongly informed by the empirical and conceptual advances of the North American group headed by Daneman and Carpenter in identifying appropriate measures of central executive function. It is, however, important to note that according to the working memory model, this aspect of working memory is quite distinct from the two slave systems that appear to serve much more specific memory functions, namely, the phonological loop and the visuospatial sketchpad (Baddeley, 1996).

The research base that has stimulated the successive refinement of the working memory model has broadened considerably since its original formulation. The original tripartite structure of the model was based on experimental studies of adult participants (e.g., Baddeley & Lieberman, 1980; Baddeley, Thomson, & Buchanan, 1975), and this tradition continues to be highly influential (e.g., Baddeley, Emslie, et al., 1998). In recent years, however, the detail of the working memory model has also been informed by other research traditions. The natural experiments provided by studying short-term memory function in individuals with highly specific neurological and neuropsychological deficits have proved invaluable in illuminating detailed aspects of the functioning of working memory that are often difficult to investigate using conventional experimental techniques (Baddeley, Logie, Bressi, Della Sala, & Spinnler, 1986; Baddeley, Papagno, & Vallar, 1988; Bishop & Robson, 1989; Della Sala et al., 1999; Hanley et al., 1991; Shallice & Butterworth, 1977; Vallar & Baddeley, 1984). The working memory model has also been stimulated by developmental studies. Current conceptualizations of all three major components of working memory owe much to investigations of fractionations and changes in short-term memory during the childhood years (e.g., Henry, 1991; Logie & Pearson, 1997; Pickering, Gathercole, Hall, & Lloyd, in press; Towse, Hitch, & Hutton, 1998). Finally, investigations of the areas of the brain that are active during short-term memory activities have lent considerable weight to claims of the functional independence of the major subcomponents of working memory. Positron emission tomography studies point strongly to separate neural circuitry for the storage and rehearsal components of both phonological and visuospatial short-term memory, with the phonological systems located mainly in the left hemisphere and the visuospatial systems located almost exclusively in the right hemisphere (Smith & Jonides, 1997; Smith, Jonides, & Koeppe, 1996). Executive control processes, in contrast, are associated primarily with increased levels of activity in the dorsolateral prefrontal cortex (D'Esposito et al., 1995; Smith & Jonides, 1997). The apparent distinctiveness of the brain systems underlying these main components of working memory fits extremely well with the claims for their relative functional independence.

The current model of working memory is therefore grounded in a body of convergent evidence drawn from a number of empirical perspectives and provides a substantial basis for the construction of a theoretically motivated battery of tests designed to selectively tap subcomponents of working memory. In selecting individual tests, we drew largely upon established experimental paradigms that are known to yield relatively pure assessments of particular aspects of working memory. This approach to creating a working memory test battery stands in marked contrast to the alternative, well-established psychometric tradition of developing large-scale ability test batteries. Whereas the cognitive psychological tradition that guided our test construction uses behavioral dissociations to analyze and fractionate functionally distinct components, the classic psychometric tradition leads to tests of development based principally on face validity and uses statistical techniques to synthesize and infer underlying structure rather than theoretical analysis grounded in empirical evidence. Although useful, the insights gained by the latter psychometric approach are limited by the theoretical validity of the tests initially constructed (Baddeley & Gathercole, 1999). In our test development, we aimed to capitalize on the secure theoretical and empirical foundations provided over the course of many years of research using cognitive psychological techniques rather than on sophisticated statistics alone as a means of ensuring that our measures provide valid assessments of key aspects of short-term memory. In this respect, the tests incorporated into this test battery are already theoretically grounded as experimental techniques, although at a more detailed level, in common with all new tests, they inevitably require psychometric refinement in order to optimize reliability and validity.

There are other grounds too for considering assessment of an individual's core cognitive skills on the basis of his or her performance on immediate memory tasks to be a useful complement to evaluations based on general ability test batteries. The verbal components of psychometric, norm-referenced ability tests depend largely on measures of crystallized ability, such as vocabulary knowledge, which are known to be extremely sensitive to the environmental and cultural environment of the individual. Ethnic minorities are particularly disadvantaged by the interpretation of their test scores with reference to norms collected mainly from individuals of nonethnic backgrounds. Campbell, Dollaghan, Needleman, and Janosky (1997) recently established that the degree of cultural and environmental bias in test performance is considerably diminished for information-processing measures, such as tests of short-term memory in which the material to be processed or stored is equally unfamiliar to all individuals, rather than for knowledge-based measures on which performance is strongly influenced by differential degrees of familiarity across individuals. Relatedly, short-term memory measures also appear to be even more impervious to environmental indicators, such as socioeconomic status, than are other information-processing measures (Haggard, Hinde, Wade, Bennett, & Greenwood, 1996). Working memory measures therefore appear to provide a relatively culture-fair method of assessment of cognitive abilities.

Two short-term memory measures already included in some standardized test batteries are forward and backward digit span. Although useful as preliminary indicators of short-term memory function, these measures alone are insufficient to provide a comprehensive assessment of short-term skills. Forward recall of digits taps the phonological loop, whereas backward digit recall draws upon both the phonological loop and the central executive (Morra, 1994). Neither test measures visuospatial memory. Also, the highly familiar and distinctive phonological forms of digit names may well yield measures of phonological storage capacity that are less sensitive than serial recall measures using less familiar stimuli (Gathercole, Willis, Baddeley, & Emslie, 1994). A final, more practical drawback to interpretation of digit span scores from large-scale test batteries concerns the method of calculation of standard scores for digit span. These are often obtained by combining the forward and backward digit recall scores to yield a single score, which is then standardized by age, a method that confounds assessment of two distinct subcomponents of working memory, the phonological loop and the central executive.

Several other research teams have also constructed multitask assessments of working memory in recent years (e.g., Dixon, LeFevre, & Twilley, 1988; Engle, Nations, & Cantor, 1990; Jurden, 1995; Shah & Miyake, 1996; Swanson, 1992, 1993, 1994; Swanson & Alexander, 1997; Turner & Engle, 1989). The findings from these studies have made key contributions in particular to the debate concerning whether the resources available to the central executive are domain-specific or not (see also Daneman & Merikle, 1996). These studies were largely designed to investigate the complex storage and processing capacities that the working memory model views as being the domain of the central executive.
In contrast, our motivation in developing a working memory test battery is to provide a set of tests that would provide a balanced and theoretically coherent assessment of the functioning of each of the principal components of the specific working memory model, which includes the phonological loop and the visuospatial sketchpad as well as the central executive. Why is it useful to provide a fine-grained analysis of working memory skills for a particular child? The answer is that individual differences in components of working memory appear to have important consequences for the acquisition and execution of a variety of complex cognitive skills that are of real importance in everyday life. Valuable insights into the child's potential to process information and to acquire new knowledge and skills can therefore be provided by assessing such individual differences. Central executive function has consistently been found to be associated with a variety of high-level abilities, including language and reading comprehension in both children and adults (e.g., Daneman & Carpenter, 1980; Dixon et al., 1988; Engle et al., 1990; Turner & Engle, 1989; Yuill, Oakhill, & Parkin, 1989), reading achievement more generally (Siegel, 1994; Swanson, 1994), arithmetic development during childhood (Bull, Johnson, & Roy, 1999; Siegel & Linder, 1984; Siegel & Ryan, 1989), the conceptual component of vocabulary acquisition (Daneman & Green, 1986), college entrance and achievement scores (Daneman & Carpenter, 1980; Jurden, 1995), and occupational success (Christal, 1991; Kyllonen & Christal, 1990). 
The phonological loop, on the other hand, appears to play a highly specific role in the acquisition of language and especially in supporting the long-term learning of the phonological forms of new words (Baddeley, Gathercole, & Papagno, 1998; Cheung, 1996; Gathercole & Baddeley, 1989; Gathercole, Willis, Emslie, & Baddeley, 1992; Papagno, Valentine, & Baddeley, 1991; Papagno & Vallar, 1995; Service, 1992; Service & Kohonen, 1995). The


loop may also mediate the acquisition of syntax during early childhood and be of particular value in building a storehouse of multiword utterances from which syntactic and morphological rules may be extracted (Adams & Gathercole, 1995; Blake, Austin, Cannon, Lisus, & Vaughan, 1994; Speidel, 1989). Severe deficits of the phonological loop are also associated with the impaired development of the phonological and morphological aspects of language characteristic of the developmental pathology Specific Language Impairment (Bishop et al., 1999; Bishop, North, & Donlan, 1996; Gathercole & Baddeley, 1990; Montgomery, 1995). Less is known about the consequences of poor visuospatial short-term memory for everyday cognition. There is, however, some preliminary evidence that it may play a key role in the learning of spatial routes and of faces (Hanley et al., 1991) and also that it supports the acquisition of arithmetic skills (Dark & Benbow, 1990).

The Working Memory Test Battery

The test battery consists of 13 measures selected on the basis of the substantial body of experimental research spanning the past 25 years on the working memory model. More measures of the phonological loop than of the central executive or of visuospatial memory were included in the battery as a consequence of the more extensive research literature and theoretical development on this particular component. Three of the six phonological loop measures used a serial-recall paradigm involving spoken presentation of a sequence of items followed by attempted spoken recall of the items in the original order. This paradigm is the most widely used method of assessing verbal short-term memory (for review, see Gathercole & Baddeley, 1993) and, in adults and older children at least, appears to call upon both components of the phonological loop: phonological storage and subvocal rehearsal.
Because children below about 7 years of age do not appear to spontaneously engage in covert rehearsal (e.g., Flavell, Beach, & Chinsky, 1966; Gathercole & Hitch, 1993; Johnston, Rugg, & Scott, 1987), serial recall performance in younger age groups probably reflects phonological storage capacity only. Measures of serial recall of digits, words, and nonwords were included. Digit span is the conventional measure of verbal short-term memory and is present in most standardized general ability batteries (e.g., Wechsler, 1986). As discussed earlier, however, this measure may actually be relatively insensitive to phonological storage capacity because the highly familiar and phonologically distinct set of digit names provides ample opportunity for the reconstruction of partial phonological memory traces on the basis of long-term knowledge (Baddeley, Gathercole, et al., 1998; Gathercole et al., 1994). Two further serial-recall measures (of words and nonwords) were therefore included that used stimuli drawn from an unrestricted stimulus population as a means of testing more stringently the accuracy of temporary phonological storage (Coltheart, 1993). In the case of nonwords, of course, there is little opportunity for long-term lexical support for recall because the items have not previously been encountered (Hulme, Maughan, & Brown, 1991). Serial recall of nonword stimuli therefore provides a highly sensitive measure of the ability to store the detailed phonological structure of an item in the phonological loop.

Another measure of the capacity to store unfamiliar phonological structures was provided by a nonword repetition task in which children hear and then attempt to repeat single, multisyllabic nonwords such as perplisteronk or woogalamic (Gathercole & Baddeley, 1996). The accuracy of nonword repetition is highly correlated with serial-recall measures of phonological short-term memory (Gathercole et al., 1994). This task is unlikely to require subvocal rehearsal and so is best interpreted as providing a measure of phonological short-term storage alone (Baddeley, Gathercole, et al., 1998).

Finally, immediate memory for words and nonwords was assessed using a serial-recognition or matching-span paradigm (Allport, 1984; Campbell & Butterworth, 1985; Martin & Breedin, 1992; Shallice & Warrington, 1970). In our tests, sequences of spoken memory items are presented twice, with the second sequence containing the same items as the first list, either in the same order or with two of the adjacent items transposed. The child's task is to judge whether the two lists are the same or different. Serial-recognition scores are highly correlated with other recall-based measures of phonological loop function (Gathercole & Pickering, in press; Gathercole, Service, Hitch, Adams, & Martin, 1999) but do not place the same demands on spoken output accuracy. Serial recognition therefore yields an assessment of the phonological loop that is largely uncontaminated by immature or disordered articulatory-phonological output skills found in many young children.

Three measures of central executive function were included in the battery, each characterized by simultaneous demands to store and manipulate information. In the listening-recall measure, a series of short sentences is presented auditorily, and the child has to judge whether each sentence is true or false. After the final sentence, the task is to recall the final words of each sentence, in the original sequence. This task was first developed for use with young adults by Daneman and Carpenter (1980). In our version, the sentences contained only high-frequency words and concrete concepts likely to be familiar to young children. In the second central executive measure, counting recall, the child counts the number of colored dots in a series of displays and then attempts to recall the number tallies in sequence. This task was developed by Case et al. (1982) and has since been used widely to test central executive capacity in children (e.g., Yuill et al., 1989). The final central executive measure was backward digit recall, part of the digit span assessment in standardized ability test batteries such as the Wechsler Intelligence Scale for Children-Revised (WISC-R; Wechsler, 1986). The task involves spoken presentation of a series of digits followed by an attempt to recall the sequence in reverse order. This subtest, like the other two central executive subtests, imposes simultaneous demands on both processing (in this case, reversing the sequence of the digits in memory) and storage (retention of the original sequence). Although the stimuli employed in these executive measures vary considerably (spoken sentences, dot displays, and digit names), performance in each case is likely to be mediated by verbal rather than nonverbal memory. A purely nonverbal measure of central executive function could not be included in the test battery as yet because no appropriate methods have been developed for use with young children. For this reason, the data collected in this study do not bear directly on the issue of whether common or distinct resources support processing and storage in the verbal and nonverbal domains (e.g., Jurden, 1995; Shah & Miyake, 1996).

Four measures of visuospatial memory were included in the battery. Two of the tests were computer versions of standard visuospatial memory tasks, Corsi blocks (De Renzi & Nichelli, 1975) and visual pattern span (Wilson et al., 1987), which are now believed to measure dissociable spatial and visual components of short-term memory (Della Sala et al., 1999). The final two tests tapped memory for mazes using methods developed by Pickering et al. (1999). In both tests, the child studies two-dimensional mazes and then attempts to recall a path through the maze, which is either shown in the studied maze as a printed line drawn through the maze (in the static version of the task) or a route traced by the examiner's finger (dynamic version). We have recently found distinctive patterns of association and dissociation between these four visuospatial tasks that suggest that visuospatial memory may have separable subcomponents specialized for the maintenance of static and dynamic stimuli (Pickering et al., 1999).

The working memory test battery was administered to a large group of children 6 and 7 years old. The study aimed to establish whether the tests were sensitive and appropriate for use with young children and to explore the construct validity of the individual subtests as measures of components of the working memory model. Standardized tests of reading, vocabulary knowledge, arithmetic, and nonverbal reasoning were administered to the majority of the children both at the time of initial testing and 16 months later, providing the opportunity to test the extent to which the test battery scores were related to achievements in these key ability domains.

Method

Participants
The participants in the study were 87 children attending three state schools in Bristol, South-West England. The percentage of children attending each school who received free school meals in 1997 was slightly higher than that of the national average: 22.3%, 31.2%, and 34.1%, compared with the national value of 21.1% (Bristol City Council 1997 Primary School Performance Tables). The number of children attending the schools for whom English is an additional language was, in contrast, lower than national levels: 1.4%, 1.2%, and 11.1%, compared with national levels of 7.8%. In total, 84 children were Caucasian and spoke English as their native language; of the remaining 3 children, 1 was of Asian origin and 2 were of Afro-Caribbean descent. At the time of initial testing on the working memory test battery (Time 1, in either the summer term of Year 2 or the autumn term of Year 3), the group had a mean age of 7 years 4 months (SD = 3.9 months, range = 6 years 9 months to 8 years 5 months) and comprised 53 girls and 34 boys. Children from two of the schools were also given standardized attainment tests at the initial time of testing (Time 1, n = 57) and 12 months later (Time 2, n = 54). At Time 2, the mean age was 8 years 5 months (SD = 4.2 months, range = 7 years 10 months to 9 years 6 months).

Procedure
All 13 working memory tests were administered to each child at Time 1; examples of stimuli and correct responses on each test are shown in Figure 1. Standardized attainment measures for literacy (reading recognition, reading comprehension, and spelling), arithmetic, vocabulary, and nonverbal reasoning were also given. The order of test administration was held constant across children to as great an extent as possible. Each child was tested individually in a quiet room of the school and completed all tests during four sessions.

Phonological loop
  Subtest: examples of stimuli; correct response
  Serial recall, words: chin, led, bag; "chin, led, bag"
  Serial recall, nonwords: tam, neb, gock; "tam, neb, gock"
  Serial recall, digits: 7, 1, 5; "7, 1, 5"
  Serial recognition, words: man, duck, leap ... man, leap, duck; "different"
  Serial recognition, nonwords: bool, jeck, chorg ... bool, jeck, chorg; "same"
  Nonword repetition: woogalamic; "woogalamic"

Central executive
  Subtest: examples of stimuli; correct response
  Listening recall: chairs have legs; "yes"
  Listening recall: bananas have teeth; "no ... legs, teeth"
  Counting recall: (dot displays, not recoverable from the extracted text); "3 ... 4,3"
  Backward digit recall: 9, 2, 5; "5, 2, 9"

Visuospatial memory
  Subtest: examples of stimuli; correct response
  Matrices: (pictorial, not recoverable from the extracted text)
  Mazes: (pictorial, not recoverable from the extracted text)

Figure 1. Example stimuli for each subtest of the Working Memory Test Battery. Stimuli printed in italics are spoken. Static versions of only the visuospatial subtests are shown. See text for details of dynamic versions.

Phonological Loop

Digit span. This test involves the presentation of spoken sequences of digits for immediate serial recall, using the method described by Pickering, Gathercole, and Peaker (1998). After a practice session, a maximum of four lists were presented at each length, starting with two-item sequences; if the first three lists at a particular sequence length were correctly recalled, the list length was increased by one. Items were presented at a rate of one every 750 ms. The maximum list length at which three lists are correctly recalled provided a measure of span. A test-retest reliability correlation coefficient for digit span of .68 was obtained in a study of seventy 4- and 5-year-old children (Gathercole, 1995).
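The span procedure described above can be illustrated with a minimal sketch. It is not the authors' scoring software; the function name `digit_span`, the callback `recall_correct`, the upper bound `max_length`, the return value of 0 when the criterion is never met, and the rule that testing stops once the three-list criterion is missed at a given length are all assumptions introduced for illustration.

```python
from typing import Callable


def digit_span(
    recall_correct: Callable[[int, int], bool],
    start_length: int = 2,
    max_length: int = 9,        # hypothetical ceiling, not stated in the text
    lists_per_length: int = 4,  # "a maximum of four lists ... at each length"
    criterion: int = 3,         # three correctly recalled lists are required
) -> int:
    """Return the longest list length at which `criterion` lists were correct.

    `recall_correct(length, trial)` is a hypothetical callback reporting
    whether the child recalled list number `trial` of the given length.
    """
    span = 0  # assumption: 0 if the criterion is never met
    for length in range(start_length, max_length + 1):
        correct = 0
        for trial in range(lists_per_length):
            if recall_correct(length, trial):
                correct += 1
            if correct == criterion:
                # Criterion met (e.g. first three lists correct):
                # move on to the next list length.
                break
        if correct < criterion:
            # Assumed stopping rule: criterion missed at this length.
            break
        span = length
    return span
```

For example, a child who recalls every list of up to four digits and none beyond would be given a span of 4 under these assumptions.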

Serial recall of words and nonwords. All lists were composed of one-syllable sthnuli and included only common words likely to be familiar to young children. No item was repeated more than once across trials within a test. After a practice session, four lists of three, four, and five items were presented. Because recall accuracy at the longest list length was very low, each child's recall protocol was scored on the basis of the trials containing three and four items only. The number of lists correctly recalled constituted the score, giving a maximum possible score of 8. Nonword repetitio~ The Children's Test of Nonword Repetition (Gathercole & Baddeley, 1996) was given to each child. In this test, 40 nonwords (4 sets of 10, each containing two, three, four, and five syllables, respectively) are presented to the child in spoken form using an audiocassette recorder. The child attempts to repeat each item following its presentation, and the repetition attempt is scored as correct if there is no phonological deviation from the target form. Maximum total score on this test is 40. Split-half reliability of this test of .66 is reported by Gathercole and

382

GATHERCOLE AND PICKERING listening-span task was constructed using sentences from the Silly Sentences task (Baddeley, Emslie, and Nimmo-Smith, 1992), such as "Frogs have long ears," with some additional sentences of a similar kind constructed for the purposes of the present study. Examples of sentences are shown in Figure 1. After a practice session, each child received four trials, each consisting of two sentences. The number of sentences in each trial increased by one every four trials until the final words were incorrectly recalled in two or more trials at a particular list length. The total number of correct trials was scored. The test-retest reliability coefficient for the listening-span task was .62. Counting recall. A practice session familiarizing the child with the nature of the task preceded the test session. On each test trial, the child was required to count the number of dots presented in each of a series of arrays (saying the total number aloud) and to recall subsequently the dot totals in the order that the arrays were presented. In the display booklet that was placed in front of each child, a rectangle containing either three, four, five, or six red dots was shown on each page. Testing began with trials containing two arrays of dots, with the number of arrays increasing by one every four trials until the child incorrectly recalled the number sequences in two or more trials at a particular level. The total number of trials in which the numbers of dots were recalled in the correct order was scored. Poor test-retest reliability was found for this measure, r = .15, indicating the need for further refinement. Backward digit recall. This test employed the same procedure as the digit span test except that the child attempted to recall the sequence of spoken digits in reverse order. Preceding practice trials were given to ensure that the child understood the concept of "reverse." 
Test trials commenced with four trials containing two digits, followed by lengthier sequences if three or more lists were correctly recalled. The total number of lists correctly recalled was scored. Split-half reliability for the WISC-III UK version of digit span, which includes both forward and backward recall of digits, is .85 (Golombok & Rust, 1992).
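The stop rule described for backward digit recall can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' scoring software: the function name `score_span_task`, its parameters, and the example data are hypothetical, and it assumes four trials per list length, a starting length of two, and progression only when three or more trials at a length are correct.

```python
from typing import Dict, List


def score_span_task(results: Dict[int, List[bool]],
                    start_length: int = 2,
                    trials_per_level: int = 4,
                    pass_mark: int = 3) -> int:
    """Return the total number of correctly recalled lists.

    `results` maps a list length to the outcomes (True = correct) of the
    trials given at that length. Testing stops at the first length where
    fewer than `pass_mark` trials are correct (assumed rule).
    """
    total = 0
    length = start_length
    while length in results:
        trials = results[length][:trials_per_level]
        correct = sum(trials)          # True counts as 1
        total += correct
        if correct < pass_mark:        # stop rule reached
            break
        length += 1
    return total


# Hypothetical child: passes lengths 2 and 3, fails at length 4.
example = {
    2: [True, True, True, True],
    3: [True, True, False, True],
    4: [True, False, False, False],
    5: [True, True, True, True],       # never administered under the rule
}
score = score_span_task(example)
print(score)
```

Trials at length 5 are ignored here because the stop rule is triggered at length 4; correct trials at the failed length still count toward the total.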

Baddeley (1996), and a test-retest reliability coefficient of .77 was obtained in a sample of seventy 4- and 5-year-old children (Gathercole, 1995).

Serial recognition of words and nonwords. In these tests, pairs of lists were spoken by the examiner. The lists in each pair contained the same items, either in identical sequence or with an adjacent transposition in the order of two of the items (Gathercole et al., 1999), and the child's task was to judge whether the two lists were the same or different. The items were presented at a rate of one item per 750 ms, with an extra 750-ms delay interpolated between the last item of the first list and the first item of the second list. Four sets of lists were presented at each list length of three, four, and five items; in each set, two list pairs were different and two pairs were the same. Items were drawn from the same stimulus pool as the corresponding serial-recall tests. A practice session preceded the test session in both cases. The number of correctly identified list pairs at each list length was scored, with a maximum possible score of 12.
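As an illustration of how a "same" or "different" list pair of this kind might be constructed, the following Python sketch swaps two adjacent items to create the "different" version. The function name `make_recognition_pair` and the example words are hypothetical (they are not taken from the actual stimulus pool), and the sketch assumes that all items within a list are distinct.

```python
import random
from typing import List, Tuple


def make_recognition_pair(items: List[str], same: bool,
                          rng: random.Random) -> Tuple[List[str], List[str]]:
    """Build one pair of lists for a serial-recognition trial.

    For a "different" pair, two adjacent items of the second list are
    transposed; all items are assumed to be distinct.
    """
    first = list(items)
    second = list(items)
    if not same:
        i = rng.randrange(len(items) - 1)      # position of the swap
        second[i], second[i + 1] = second[i + 1], second[i]
    return first, second


rng = random.Random(0)                          # fixed seed for repeatability
words = ["cat", "dog", "pen", "hat"]            # hypothetical stimuli
first, second = make_recognition_pair(words, same=False, rng=rng)
changed = [k for k in range(len(words)) if first[k] != second[k]]
print(first, second, changed)
```

The two lists always contain the same items, so a correct "different" judgment depends on memory for order rather than for item identity.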

Visuospatial Memory

Static and dynamic matrices. Both tests were presented using an Apple Macintosh Powerbook 5300cs. A familiarization and practice session preceded testing in each case. In the static matrices test, a 2 by 2 matrix containing two filled and two unfilled squares (see Figure 1) was displayed in the center of the computer screen for 2 s. An empty matrix then appeared on the screen, and the child identified the location of the filled squares by pointing to the appropriate squares on the screen. The examiner clicked each block selected by the child. In the dynamic matrices test, the two black squares were presented in sequence, with each square flashing on for 0.5 s, separated by an interstimulus interval of 0.5 s. The child was required to recall both the location and the order of the sequence of black squares by pointing. After correct recall of at least three trials at each level of difficulty, the matrix increased in size by two blocks. The number of trials on which a correct response was made was calculated. A satisfactory test-retest reliability coefficient of .79 was obtained for the static matrices subtest. For the dynamic matrices subtest, reliability was much lower, at .40, indicating the need for further test refinement to improve the psychometric properties of this measure.¹

Static and dynamic mazes. In each of these tests, the child studied a two-dimensional maze on each trial and attempted to recall an indicated path through the maze (for example, see Figure 1). In the static-mazes version, the child was presented with a response booklet showing mazes on each page consisting of a stick person in the center of nested walls, each of which had two entry points. On each maze, a red line extended from the outside of the maze to the central figure. Each page was displayed for 2 s before being removed from view and was then replaced by an identical maze in a response booklet that did not show the route. The task was to draw the route through the maze shown in the study item. After three or more correct responses in the four mazes with two sets of walls, the child progressed to mazes containing three walls, and so on. The dynamic-mazes version differed only in that the route to be recalled was not printed on the maze but was traced by the examiner's finger on the maze in full view of the child. Again, the task was to recall the route by drawing it in the response booklet. Different routes were constructed and assigned to the two mazes tasks. In both tests, the number of correct trials was scored. Test-retest reliability coefficients for the static and dynamic subtests were .53 and .56, respectively.

Central Executive

Listening recall. A series of spoken sentences was presented on each trial. Following each sentence, the child was required to judge whether or not the sentence made sense, by responding "yes" or "no." After the final sentence of the trial, the child then attempted to recall the final word of each of the sentences in the original sequence. This modified version of the

Standardized Attainment Measures

Vocabulary. Receptive vocabulary knowledge was assessed at Times 1 and 2 using the Long Form of the British Picture Vocabulary Scale (Dunn, Dunn, Whetton, & Pintilie, 1982). Raw scores were calculated for each child.

Arithmetic. Arithmetic ability was assessed at Time 1 using the Group Mathematics Test (Young, 1996), which was administered to participants individually. Raw scores were calculated for each child. At Time 2, children were tested on the Basic Number Skills subtest of the Differential Ability Scales (Elliot, 1990). Raw scores were calculated for each child.

Literacy. Four measures of literacy were administered to each child at Time 1. Word recognition was measured using two tests: the British Ability Scales (BAS) Test of Word Reading (Elliot, 1983) and the Neale Analysis of Reading Ability--Revised (Neale, 1989). In the former test, children read single words of increasing difficulty, whereas in the latter, they attempted to read meaningful passages of text. Reading ages were obtained from both measures. A reading comprehension score was also obtained from the Neale Analysis of Reading Ability--Revised (Neale, 1989). Finally, spelling performance was assessed using the BAS Spelling scale (Elliot, 1992). Raw scores and spelling age equivalent scores were calculated for each child. The BAS reading and spelling tests were also administered at Time 2.

1 Test-retest reliability was assessed in a subgroup of 27 of the children tested at Time 1 on the following subtests: static and dynamic matrices, static and dynamic mazes, listening span, and counting span. After administration of the full working memory battery at Time 1, these five subtests were administered again to the subgroup 3 weeks later.



Results


Descriptive and Correlational Analyses


Descriptive statistics for each of the subtests of the Children's Working Memory Test Battery are shown in the left panel of Table 1. A wide range of scores was found on each measure. With the exception of two subtests, scores fell within ranges that avoided attenuation resulting from either floor or ceiling effects. On both the nonword recall and listening-recall subtests, however, mean scores fell at the lower end of the range (1.26 for nonword recall and 2.98 for listening recall), indicating that the difficulty level may have been higher for this age range than is desirable. A significant sex difference was found only in counting recall, with females scoring higher than males, F(1, 85) = 4.345, p < .05.

The right panel of Table 1 provides information on performance on the attainment measures completed by a subgroup of the children both at Time 1, when the working memory measures were administered, and 16 months later at Time 2. Mean percentile points of the children are shown for all measures other than the Group Arithmetic Test, for which raw scores are shown. The sample generally achieved expected levels of performance on the measures for their age, as indicated by mean percentile points averaging at about 50. On the literacy measures, females consistently performed at higher levels than males. Significant sex differences were found on the following Time 1 measures: BAS reading, F(1, 54) = 4.091, p < .05; BAS spelling, F(1, 54) = 5.054, p < .05; and Neale comprehension, F(1, 53) = 4.411, p < .05.

Correlational analyses between principal measures were conducted, and the resulting correlation matrices are provided in Table 2. The lower triangle shows the zero-order correlation coefficients, and the upper triangle shows the partial correlation coefficients adjusted for chronological age in months. To minimize the number of spuriously significant correlations arising from inclusion of all 13 measures in the correlation matrix, the α criterion for significance was set to .01.

Table 1
Descriptive Statistics for Principal Working Memory and Standardized Attainment Measures by Times of Testing

                                        Girls                     Boys
Measure                           M       SD     n          M       SD     n

Working memory measures (Time 1)
Digit span                       4.01    0.69    53        4.06    0.60    34
Serial recall, words             4.13    1.84    53        4.12    1.77    34
Serial recall, nonwords          1.36    1.44    53        1.12    1.09    34
Nonword repetition              25.32    6.64    53       24.41    5.96    34
Serial recognition, words        9.62    1.86    53        9.15    1.67    34
Serial recognition, nonwords     8.04    1.97    53        8.21    2.04    34
Static matrices                 16.35    3.28    52       17.26    4.85    34
Dynamic matrices                 9.69    3.30    52        8.88    3.22    34
Static mazes                     7.94    3.19    53        7.29    2.91    34
Dynamic mazes                    8.35    2.55    53        8.71    3.23    34
Backward digit recall            6.81    2.32    53        6.97    2.62    34
Counting recall                  7.49ᵃ   2.68    53        6.21    2.99    34
Listening recall                 2.98    1.80    53        2.97    1.95    34

Attainment measures (percentile scores, unless otherwise indicated)
Vocabularyᵇ
  Time 1                        46.00   23.36    37       44.95   25.33    20
  Time 2                        47.80   21.12    35       49.68   24.19    19
Arithmetic
  Time 1ᶜ                       30.32    9.66    37       28.45    9.29    20
  Time 2ᵈ                       65.10   30.38    35       50.79   29.79    19
Readingᵉ
  Time 1                        61.40ᵃ  31.69    37       45.67   30.16    20
  Time 2                        73.66   29.16    35       58.47   33.34    19
Readingᶠ
  Time 1                        50.81ᵃ  30.00    37       34.17   21.57    18
Readingᵍ
  Time 1                        45.59   31.02    37       32.22   28.10    18
Spellingʰ
  Time 1                        64.61ᵃ  23.58    36       49.22   24.78    20
  Time 2                        65.83   27.79    35       52.37   27.19    19

ᵃ Females score significantly higher than males (p < .05), by univariate F test. ᵇ British Picture Vocabulary Scales. ᶜ Group Arithmetic Test (raw scores). ᵈ Basic Number Skills subtest. ᵉ British Abilities Scales Reading Test. ᶠ Neale Analysis of Reading Ability--Revised--Accuracy. ᵍ Neale Analysis of Reading Ability--Revised--Comprehension. ʰ British Ability Scales Spelling Test.

Table 2
Correlation Matrix for Principal Measures From the Working Memory Test Battery

Variable                          1     2     3     4     5     6     7     8     9     10    11    12    13    14
1. Age                            --
2. Digit span                     .01   --    .36*  .31*  .40*  .50*  .37*  .35*  .07   .16   .23   .31*  .24   .31
3. Serial recall, words           .05   .36*  --    .35*  .41*  .34*  .11   .13   .09   .27*  .27*  .31*  .23   .18
4. Serial recall, nonwords        .01   .31*  .36*  --    .26   .15   .13   -.06  -.04  .06   .08   .11   -.02  .12
5. Nonword repetition             .13   .39*  .43*  .27*  --    .23   .16   .01   -.02  .10   -.01  .11   .12   .16
6. Serial recognition, words      .28*  .49*  .34*  .15   .26   --    .54*  .31*  .02   .18   .08   .16   .04   .21
7. Serial recognition, nonwords   .14   .36*  .18   .13   .17   .55*  --    .17   .01   .12   .18   .26   .08   .27
8. Static matrices                .18   .35*  .14   -.05  .03   .34*  .19   --    -.10  .14   .30*  .21   .18   .26
9. Dynamic matrices               .20   .07   .10   -.04  .01   .07   .04   -.06  --    .26   .28*  .26   .27   .26
10. Static mazes                  .33*  .16   .28*  .07   .15   .25   .15   .16   .27*  --    .32*  .21   .36   .14
11. Dynamic mazes                 .24   .23   .27*  .07   .02   .14   .21   .33*  .32*  .36*  --    .25   .34   .31
12. Backward digit recall         .25   .30*  .32*  .12   .15   .22   .29*  .24   .30*  .28*  .30*  --    .29   .33
13. Counting recall               .28*  .24   .24   -.01  .15   .12   .11   .22   .31*  .40*  .38*  .34*  --    .44
14. Listening recall              .09   .31*  .19   .12   .17   .23   .28*  .27*  .27*  .15   .32*  .34*  .45   --

Note. Simple correlation coefficients are shown in the lower triangle; partial correlation coefficients adjusted for chronological age are shown in the upper triangle. *p < .01.

Intercorrelations between subtests provide an initial indication of the internal validity of the measures purportedly tapping the three subcomponents of the working memory model. Consider first the phonological loop subtests. The four recall-based measures (digit span, word recall, nonword recall, and nonword repetition) were significantly correlated with one another (partial rs > .30, p < .01, in each case), with the exception of the two nonword measures (partial r = .26, p > .01). The two serial-recognition tasks were highly correlated with one another (partial r = .54, p < .001), and word recognition correlated significantly with both digit span (partial r = .50, p < .001) and word recall (partial r = .34, p < .01). Scores on neither recognition measure were significantly associated with either recall or repetition of nonwords, providing preliminary evidence that individual differences in phonological loop function may be composed of separable recall and recognition components. All three central executive tasks were significantly correlated with one another. The highest degree of association was obtained for counting and listening recall, partial r = .44, p < .01. Backward digit recall was significantly correlated with both counting recall, partial r = .29, p < .01, and listening recall, partial r = .33, p < .01. There was, in contrast, little integrity between scores on the four visuospatial measures. Dynamic mazes correlated significantly with each measure: with static matrices, partial r = .30, p < .01; with dynamic matrices, partial r = .28, p < .01; and with static mazes, partial r = .32, p < .01. None of the remaining three visuospatial measures correlated significantly with one another.
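The relation between the zero-order and age-adjusted coefficients can be illustrated with the standard first-order partial correlation formula. The sketch below is an approximation under stated assumptions: it works from the rounded coefficients printed in Table 2 (digit span with serial recall of words, r = .36; their correlations with age, .01 and .05), whereas the reported partials were presumably computed from the raw scores. The function name `partial_r` is hypothetical.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Digit span (x) and serial recall of words (y), controlling for age (z).
r = partial_r(r_xy=0.36, r_xz=0.01, r_yz=0.05)
print(round(r, 2))
```

Because both measures are almost uncorrelated with age, the adjustment leaves the coefficient essentially unchanged; for measures with larger age correlations, values reconstructed from rounded coefficients may differ somewhat from the printed partials.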

Exploratory and Confirmatory Factor Analyses


The correlational analyses indicate a greater degree of construct validity and integrity for the phonological loop and central executive measures than for the four visuospatial tasks. To investigate the higher-order factor structure underpinning variations in scores in the various subtests, a principal-components analysis was performed on z scores for all 13 subtest scores. Three factors emerged with eigenvalues in excess of 1.00. Factor loadings greater than .30 on the rotated factor matrix are shown in the left panel of Table 3. All three central executive measures and three of the four visuospatial measures loaded on Factor 1. The two serial-recognition tasks, digit span, and static matrices loaded highly on Factor 2. The remaining phonological loop tasks (word and nonword recall, and nonword repetition) loaded highly on Factor 3.

Although the particular three-factor structure that emerges from this exploratory factor analysis does not correspond directly to the tripartite structure of the working memory model of Baddeley and Hitch (1974), it fits reasonably well with findings from experimental studies. There is considerable evidence that visuospatial short-term memory tasks place very significant demands upon the general processing and storage resources of the central executive (Phillips & Christie, 1977; Wilson et al., 1987). The high loadings of three of the visuospatial tasks on Factor 1, which is also strongly identified with the three central executive measures, support the notion that performance on these tasks may be mediated by the central executive.

The second and third factors appear to correspond to different aspects of phonological loop function. Factor 2 is identified with the three phonological loop tasks that place the lightest retrieval and output demands on the child. The two recognition tasks involve only a same-different judgment, and although the digit task requires retrieval and output of the digit sequence, the phonological forms of the digits have a high degree of phonological redundancy and are highly familiar to the children. In contrast, the three phonological loop tasks loading only on Factor 3 all place substantial demands on accuracy of retrieval from phonological storage and on adequacy of the planning and execution of accurate articulatory motor gestures matching the retrieved material. Interestingly, digit span also loaded on this factor, presumably because it too involves active retrieval and spoken output. The high degree of redundancy and familiarity of the phonological forms of the digits does, however, contrast markedly with the unfamiliarity of the verbal stimuli in the three other tasks; in each case, there was no repetition of stimuli across trials. In this respect, on a continuum of retrieval-output demands, digit span lies in an intermediate position relative to the serial-recognition tasks (low) and three other recall-based tasks (high). Its shared factor loadings reflect this intermediate status.

The loading of the static matrices measure on Factor 2 was unexpected and defies simple explanation. It should, however, be noted that there is accumulating evidence that memory for visual patterns and for spatial sequences calls upon dissociable memory capacities (Della Sala et al., in press; Pickering et al., 1999). The present weighting of the static and dynamic matrices tasks on different factors is therefore at least consistent with the notion of dissociable memory capacities for static visual and dynamic spatial material.

Table 3
Factor Loadings on Rotated Factor Solutions

All principal measures (Factors 1, 2, and 3)
Measures: Dynamic matrices; Static mazes; Dynamic mazes; Backward digit recall; Counting recall; Listening recall; Serial recognition, words; Serial recognition, nonwords; Static matrices; Digit span; Serial recall, words; Serial recall, nonwords; Nonword repetition
Loadings: .70 .63 .66 .55 .74 .54 | .33 .75 .71 .70 .59 | .48 .69 .74 .72

Central executive and phonological loop measures (Factors 1 and 2)
Measures: Serial recognition; Serial recall; Nonword repetition; Digit span; Backward digit recall; Counting recall; Listening recall
Loadings: .60 .75 .74 .73 .64 .81 .78

Phonological loop measures (Factors 1 and 2)
Measures: Serial recall, words; Serial recall, nonwords; Nonword repetition; Digit span; Serial recognition, words; Serial recognition, nonwords
Loadings: .75 .74 .71 .58 | .50 .85 .86

Note. Only loadings greater than .30 are shown.

Because the status of the visuospatial tasks with respect to the higher-level structure of working memory is at present unclear, the four visuospatial memory measures were omitted from subsequent attempts to build higher-order models of the data. A further factor analysis was conducted on the three central executive measures and four summary scores on the basis of the phonological loop tasks (recall z score based on average of word and nonword recall z scores, serial-recognition z score based on average of word and nonword recognition z scores, nonword repetition and digit span z scores). A two-factor rotated solution emerged with an eigenvalue cutoff of 1.00; factor loadings greater than .30 for the rotated solution are shown in the middle panel of Table 3. The pattern of loadings is very clear. The three central executive measures loaded on Factor 1, and the four phonological loop measures loaded on Factor 2.

This higher-order factor structure of the central executive and phonological loop measures was tested further using structural equation modeling, a statistical approach that allows us first to estimate more precisely the degree of fit between this two-factor model and the data and second to test the degree of association between the two factors (interpreted here as the central executive and the phonological loop). The EQS structural equation program (Bentler & Wu, 1995) was used to test the model shown in Figure 2. The input to the program was the partial correlation matrix computed between the seven measures, with chronological age in months partialed out. The program provides standardized coefficients for each path on the specified model that denote strength of association and correspond approximately to regression weights. This two-factor model provided an excellent fit to the data. The χ² value for the model, with 13 degrees of freedom, was 11.841, p = .50, establishing that the model does not significantly depart from the data. The Bentler-Bonett Normed Fit Index for the model was .90, the Bentler-Bonett Nonnormed Fit Index was 1.02, and the Comparative Fit Index was 1.00. All path coefficients were significant and are shown in Figure 2. The central executive and phonological loop constructs are significantly associated with one another, with a covariance coefficient of .55.

Figure 2. Latent construct model of the central executive and phonological loop subtest scores.

Both the exploratory and confirmatory factor analytic approaches reported earlier provide strong statistical support for the concept of distinct but associated central executive and phonological loop subcomponents of working memory. The detailed structure of each factor also accords well with our a priori theoretical assumptions and thus provides substantial construct validity for these seven measures.

In a final structural equation model, the presence of two higher-order factors within the phonological loop was further investigated. First, a factor analysis was performed on the six individual phonological loop measures. Loadings on the two-factor solution from the rotated factor matrix are shown in the right panel of Table 3. The two factors in this solution correspond to Factors 2 and 3 in the factor solution reported for all 13 test scores. The two serial-recall tasks load highly on Factor 1, with digit span showing a significant but lower weighting on this factor. The remaining two recognition tasks (word and nonword recognition) load highly on Factor 2. Digit span is also associated, although to a lesser degree, with Factor 2. The adequacy of this two-factor structure to the phonological loop was tested further in a structural equation model. Input to the model consisted of the partial correlation matrix for the six phonological loop measures, with chronological age in months covaried out. The resulting model, with standardized path coefficients, is shown in Figure 3. The two latent factors shared a covariance coefficient of .48. This model provided an extremely good fit to the data. The χ² value, with 7 degrees of freedom, was 3.851, with a nonsignificant probability value of .80. The Bentler-Bonett Normed Fit Index was .97, the Bentler-Bonett Nonnormed Fit Index was 1.07, and the Comparative Fit Index for the model was 1.00. All individual path coefficients were significant by t statistic, p < .05.

These analyses suggest two significant factors or mechanisms within the phonological loop that contribute to individual differences in children's performance on the tasks. One factor supports performance on tasks with low retrieval and output requirement, whereas tasks requiring substantial accuracy in retrieval and spoken output processes are mediated primarily by the other factor. One particular task, digit span, draws significantly on both factors, possibly reflecting the intermediate level of demands it places on retrieval and articulatory output processes.
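The fit indices reported for the seven-measure model can be related to one another with a short Python sketch. It assumes the conventional definitions of the Normed Fit Index, Nonnormed Fit Index, and Comparative Fit Index, which may differ in detail from what EQS computes. The null (independence) model χ² is not reported in the text, so it is back-calculated here from the rounded Normed Fit Index of .90 and is therefore only approximate; all function names are hypothetical.

```python
def null_chi2_from_nfi(chi2_model: float, nfi: float) -> float:
    """Solve NFI = (chi2_null - chi2_model) / chi2_null for chi2_null."""
    return chi2_model / (1.0 - nfi)


def nnfi(chi2_model: float, df_model: int,
         chi2_null: float, df_null: int) -> float:
    """Nonnormed Fit Index (conventional definition, assumed)."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


def cfi(chi2_model: float, df_model: int,
        chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index (conventional definition, assumed)."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model, 0.0)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0


# Seven-measure model: chi-square = 11.841 on 13 degrees of freedom.
chi2_m, df_m, n_vars = 11.841, 13, 7
df_0 = n_vars * (n_vars - 1) // 2            # independence model df = 21
chi2_0 = null_chi2_from_nfi(chi2_m, 0.90)    # approximate, from rounded NFI

print(round(nnfi(chi2_m, df_m, chi2_0, df_0), 2))
print(round(cfi(chi2_m, df_m, chi2_0, df_0), 2))
```

Under these assumptions the sketch returns values that agree with the reported 1.02 and 1.00. Whenever the model χ² is smaller than its degrees of freedom, as in both models here, the Comparative Fit Index is 1.00 and the Nonnormed Fit Index exceeds 1.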

Working Memory Test Scores and Educational Attainment


The interrelationships between test battery scores indicate high degrees of construct validity for our assessments of two of the three subcomponents of working memory, the phonological loop and the central executive. Earlier in this article, we reviewed evidence that the functioning of these two subcomponents of working memory is closely linked with a range of aspects of intellectual achievements in the school years. As a preliminary test of the validity of the factor structure of the working memory test battery, the relationships between working memory scores and children's achievements on the standardized tests of vocabulary, literacy, and arithmetic at ages 7 and 8 years were investigated. A single, weighted measure of phonological loop and central executive function was computed by applying the standardized path coefficients running from the latent construct to each measure in the structural equation model shown in Figure 2. Thus, for each child, the central executive score was the average of the following values: .58 × counting recall z score, .69 × listening recall z score, and .48 × backward digit recall z score. The corresponding phonological loop score was the average of the following values: .52 × nonword repetition z score, .52 × serial recall z score, .81 × digit span z score, and .54 × serial recognition z score. A literacy composite score at age 7 years was obtained by averaging the z scores on three measures: the single-word recognition score from the BAS (Elliot, 1983), the sentence reading accuracy score from the Neale Analysis of Reading Ability (Neale, 1989), and the spelling test from the BAS (Elliot, 1992). Another composite literacy score was constructed by averaging the z scores for the BAS spelling and reading test scores obtained at age 8 years. The correlational analyses performed on these scores are summarized in Table 4.
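The weighted composite scores described above can be sketched as follows. The weights are the path coefficients quoted in the text; the dictionary keys, the function name `weighted_composite`, and the example z scores are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict

# Path coefficients quoted in the text (Figure 2 model).
CE_WEIGHTS: Dict[str, float] = {
    "counting_recall": 0.58,
    "listening_recall": 0.69,
    "backward_digit_recall": 0.48,
}
PL_WEIGHTS: Dict[str, float] = {
    "nonword_repetition": 0.52,
    "serial_recall": 0.52,
    "digit_span": 0.81,
    "serial_recognition": 0.54,
}


def weighted_composite(z_scores: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Average of the weighted z scores for one child."""
    weighted = [weights[name] * z_scores[name] for name in weights]
    return sum(weighted) / len(weighted)


# Hypothetical child, with z scores already standardized across the sample.
child = {
    "counting_recall": 1.0,
    "listening_recall": -0.5,
    "backward_digit_recall": 0.0,
}
ce_score = weighted_composite(child, CE_WEIGHTS)
print(round(ce_score, 3))
```

Measures with larger path coefficients, such as digit span within the phonological loop score, contribute more to the composite than measures with weaker links to the latent construct.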
In each case, the α level was set to .01 to minimize spuriously significant correlations arising from the relatively large number of variables included in the matrix. A clear and distinctive pattern of association between the working memory subscores and the three domains of achievement emerged from these analyses. Phonological loop scores were significantly associated with performance on all three types of ability at age 7 years and with the vocabulary measure obtained at age 8 years. However, these scores shared a unique predictive relationship with only the two vocabulary scores once variance associated with age and central executive scores had been partialed out. The link between the phonological loop and vocabulary knowledge was undiminished over the 1-year period between the original working memory assessment and vocabulary test at 8 years. There was no evidence for any correspondingly specific links between the phonological loop and either literacy or arithmetic ability. Central executive scores, on the other hand, retained significant unique associations with performance in all three domains at the original time of testing, even after phonological loop scores had been taken into account. For the literacy and arithmetic scores, these associations remained significant over the year separating the original and second assessment, although the specific link with vocabulary test scores fell to a nonsignificant value when the children were tested 1 year later.

Figure 3. Latent construct model of the phonological loop subtest scores.

Table 4
Simple and Partial Correlations Between Weighted Working Memory Scores and Attainment Scores at Ages 7 and 8

                                  Controlled variables
Attainment measure        None     Age     Age and CE    Age and PL

Phonological loop
Vocabulary 7              .53*     .56*    .41*
Literacy 7                .40*     .40*    .23
Arithmetic 7              .41*     .43*    .25
Vocabulary 8              .50*     .51*    .41*
Literacy 8                .30      .30     .23
Arithmetic 8              .30      .31     .25

Central executive
Vocabulary 7              .61*     .56*                  .40*
Literacy 7                .52*     .47*                  .34*
Arithmetic 7              .56*     .50*                  .37*
Vocabulary 8              .52*     .48*                  .32
Literacy 8                .49*     .45*                  .37*
Arithmetic 8              .55*     .50*                  .42*

Note. CE = central executive; PL = phonological loop. *p < .01.

Discussion

The findings from this study establish a high degree of both internal and external validity to the working memory test battery. The statistical structure of the subtest scores corresponds closely to the standard model of two of the components, the phonological loop and the central executive, which guided the initial selection of the subtests. The data reinforce the widely held view that the phonological loop supports performance on tasks that involve temporary verbal storage, whereas the central executive is tapped by tasks that impose simultaneous demands on both processing and storage of information for brief periods of time.

Performance on the phonological loop and central executive measures showed significant associations, an important finding that may have a number of possible explanations. One possibility is that, in line with the view that the central executive may have housekeeping duties within the working memory system, the executive plays a key role in relaying and retrieving information to and from the phonological loop. It may therefore constrain the functioning of phonological storage-only tasks as well as the more complex working memory activities that are supported directly by the central executive. Alternatively, it may be that the predominantly verbal nature of the central executive subtests incorporated in the test battery led to some dependence on the phonological loop for storage of the verbal material in these complex span tasks. These two accounts are difficult to distinguish in the absence of central executive subtests that are nonverbal in nature, and as yet, no such tests have been developed that are appropriate for use with children. A final possibility is that the association between the phonological loop and central executive reflects the contribution to both systems of a higher-order construct that may possibly correspond to general intelligence (Jurden, 1995; Kyllonen & Christal, 1990).

There is relatively little evidence here that the visuospatial subtests included in the test battery genuinely tap a distinct component of working memory. The individual measures were not highly associated with one another and generally shared stronger links with central executive measures. This may simply reflect limitations in the specific methods and stimuli that will be remedied by further test development; certainly, the retest reliability of one of the measures (dynamic matrices) was very low. There are, however, at least two other potential explanations for this pattern of findings.
A number of other experimental studies of children and adults have also suggested central executive mediation of visuospatial memory performance (Phillips & Christie, 1977; Wilson et al., 1987), raising the possibility that at least some visuospatial memory activities may be supported by the general-purpose processing resources of the central executive rather than a highly specialized visuospatial system. Alternatively, the present findings may be restricted to young children. The capacity to use imagery to support processing and memory activities is only just beginning to develop during the early school years (for review, see Pressley, 1977), with its emergence closely linked to the child's developing short-term memory capacity (Cariglia-Bull & Pressley, 1990). It is therefore possible that the specific processing and storage capacities that characterize the visuospatial sketchpad are not fully developed in the 6- and 7-year-old children participating in the present study and that this is the root of the unexpected dependence on central executive capacities in this group.

The data obtained in this study have served not only to confirm the theoretical expectations that guided the selection of the individual subtests, but have also provided new insights into the detailed nature of the working memory model. In the present data, the phonological loop decomposes into two distinct but associated subcomponents that contribute to individual differences in children's scores on these tasks. One subcomponent supports performance on recognition-based tasks that place low demands on the fine-grained storage of phonological structure, as they require no spoken recall, whereas tasks requiring a high degree of accuracy in the retrieval and output of physical structure of the verbal stimuli are mediated primarily by the other subcomponent. Interestingly, the most commonly used measure of verbal short-term memory capacity, digit span, is the only phonological loop measure that


ing memory for the acquisition of the phonological and conceptual forms of new words.

In summary, the new test battery is capable of providing a fine-grained theoretical analysis of working memory function in children and may also potentially be extended to use with adults. Further development and refinement of materials and methods is required to improve the psychometric properties of some of the tests. In its present form alone, however, the validation of its internal structure for this cohort of unselected children provides a substantial platform for the use of the battery to investigate key issues relating to working memory function and dysfunction in developmental and neuropsychological pathologies and also to guide the exploration of the role of subcomponents of working memory in the acquisition of knowledge and skills during childhood.

References
Adams, A.-M., & Gathercole,S. E. (1995).Phonological working memory and speech production in preschool children. Journal of Speech and Hearing Research, 38, 403-414. Allport, D. A. (1984). Auditory-verbal short-term memory and conduction aphasia. In H. Bouma and D. G. Bouwhuis (Eds.), Attention and performance X: Control of language processes (pp. 313-326). London: Erlbaum. Baddeley, A. D. (1986). Working memory. Oxford, England: Oxford University Press. Baddeley, A. D. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology, 49A, 5-28. Baddeley, A. D., Emslie, H., Kolodny, J., & Duncan, J. (1998). Random generation"and the executive control of memory. Quarterly Journal of Experimental Psychology, 51A, 819-852. Baddeley, A. D., Emslie, H., & Nimmo-Smith, I. (1992). The Speed and Capacity of Language Processing (SCOLP) Test. Suffolk, England: Thames Valley Test Company. Baddaley, A. D., & Gathercole, S. E. (1999). Individual differences in learning and memory: Psychometrics and the single case. In P. L. Ackerman, P. C. Kyllonen, & R. Roberts (Eds.), The future of learning and individual differences research: Processes, traits, and content (pp. 31-54). Washington, DC: American Psychological Association. Baddeley, A. D., Gathercole, S. E., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158-173. Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. A. Bower (Ed.), Recent advances in learning and motivation (Vol. 8, pp. 47-90). New York: Academic Press. Baddeley, A. D., & Lieberman, K. (1980). Spatial working memory. In R. Nickerson (Ed.), Attention and performance VIII (pp. 521-539). Hillsdale, NJ: Erlbaum. Baddeley, A. D., Logic, R., Bressi, S., Della Sala, S., & Spinnler, H. (1986). Dementia and working memory. Quarterly Journal of Experi" mental Psychology, 38,4, 603-618. Baddeley, A. D., Papagno, C., & Vallar, G. (1988). When long-term learning depands on short-term storage. 
Journal of Memory and Language, 27, 586-595. Baddeley, A., Thomson, N., & Buchanan, M. (1975). Word length and the structure of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14, 575-589. Bentler, P. M., & Wu, E. J. C. (1995). EQS for Macintosh User's Guide. Encino, CA: Multivariate Software. Bishop, D. V. M., Bishop, S. J., Bright, P., James, C., Delaney, T., & Tallal, P. (1999). Different origin of auditory and phonological processing problems in children with language impairment: Evidence from a

loads significantly on both of these subcomponents, presumably because it places intermediate demands on the accuracy of phonological storage and output as a consequence of the physical redundancy and high degree of familiarity of the stimulus set. In contrast, tasks involving the recall of open sets of verbal stimuli are associated solely with the high maintenance component of the system. An intriguing feature of this partitioning of the phonological loop concerns its alignment, or rather lack of alignment, with the two-component model of the phonological loop, according to which there is a basic phonological storage device that is supplemented by a subvocal rehearsal process (Baddeley, 1986; Vallar & Baddeley, 1984). Could the recognition measures be tapping the phonological store alone, whereas the recall tasks involve rehearsal too? Although not impossible, this seems unlikely at present for two reasons. First, subvocal rehearsal appears to be used in serialrecognition as well as serial-recall paradigms (Gathercole & Brown, 1999). Second, the nonword repetition task was found in our study to be associated with the high storage load component of the phonological loop, but probably does not require rehearsal (Baddeley, Gathercole, et al., 1998). It could be the case that the two-component structure of the phonological loop that emerges from the present data set is simply a snapshot of the unique developmental phase of 6- and 7-year-old children and therefore will not generalize to other age groups. This age certainly appears to be one that features important qualitative shifts in working memory function, particularly with respect to the emergence of subvocal rehearsal (Gathercole & Hitch, 1993). Therefore, a priority for future research is to investigate whether the same factor structure for the phonological loop is present in older children and adults. 
The relationships between working memory test scores and the various attainment measures obtained at the initial time of testing at 7 years of age and 1 year later provide evidence for the external validity and likely future utility of the test battery. In line with findings from many previous studies, the central executive and phonological loop components of the test battery were found to have markedly distinct patterns of associations with different ability domains. The central executive measures shared significant unique links with children's test scores in all three domains (vocabulary, literacy, and arithmetic) at the time of initial testing, and with literacy and arithmetic achievements 1 year later. These results certainly fit well with previous links found between central executive function and reading (e.g., Daneman & Carpenter, 1980; Dixon et al., 1988), arithmetic (Siegal & Linder, 1984; Siegel & Ryan, 1989), and the conceptual component of vocabulary acquisition (Daneman & Green, 1986). Phonological loop scores, on the other hand, shared unique links with scores on the vocabulary measures only. These associations were strong and persisted undiminished over the 1-year longitudinal phase of the study. This pattern of findings is entirely consistent with the large body of evidence indicating that the phonological loop plays a crucial role in supporting the long-term learning of the phonological structure of new words (for review, see Baddeley, Gathercole, et al., 1998). The evidence for independent pathways running from the phonological loop and the central executive to vocabulary learning is new and may possibly reflect the differential support provided by the two components of work-

WORKING MEMORY IN CHILDREN twin study. Journal of Speech, Language, and Hearing Research, 42, 155-168. Bishop, D. V. M., North, T., & Donlan, C. (1996). Nonword repetition as a phenotypic marker for inherited language impairment: Evidence from a twin study. Journal of Child Psychology and Psychiatry, 33, 1-64. Bishop, D. V. M., & Robson, J. (1989). Unimpaired short-term memory and rhyme judgment in congenitally speechless individuals: Implications for the notion of "articulatory coding." Quarterly Journal of Experimental Psychology, 41A, 123-140. Blake, J., Austin, W., Cannon, M., Lisus, A., & Vaughan, A. (1994). The relationship between memory span and measures of imitative and spontaneous language complexity in preschool children. International Journal of Behavioral Development, 17, 91-108. Bull, R., Johnson, R. S., & Roy, J. A. (1999). Exploring the roles of the visuo-spatial sketchpad and central executive in children's arithmetical skills: Views from cognition and developmental neuropsychology. Developmental Neuropsychology, 15, 421-442. Campbell, R., & Butterworth, B. (1985). Phonological dyslexia and dysgraphia in highly literate subjects: A developmental case with associated deficits of phonemic processing and awareness. Quarterly Journal of Experimental Psychology, 37A, 435-476. Campbell, T. F., DoUaghan, C., Needleman, H., & Janosky, J. (1997). Reducing bias in language assessment: Processing-dependent measures. Journal of Speech, Language, and Hearing Research, 40, 519-525. Cariglia-Bull, T., & Pressley, M. (1990). Short-term memory differences between children predict imagery effects when sentences are read. Journal of Experimental Child Psychology, 49, 384-398. Case, R. D., Kurland, M., & Goldberg, J. (1982). Operational efficiency and the growth of short-term memory span. Journal of Experimental Child Psychology, 33, 386-404. Cheung, H. (1996). Nonword span as a unique predictor of secondlanguage vocabulary learning. 
Developmental Psychology, 32, 867-873. Christal, R. E. (1991). Comparative validities of ASVAB and lamp tests for logic gates learning (Armstrong Laboratory Technical Report). Coltheart, V. (1993). Effects of phonological similarity and concurrent irrelevant articulation on short-term-memory recall of repeated and novel word lists. Memory & Cognition, 21, 539-545. Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behaviour, 19, 450-466. Daneman, M., & Carpenter, P. A. (1983). Individual differences in integrating information within and between sentences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 561-584. Daneman, M., & Green, I. (1986). Individual differences in comprehending and producing words in context. Journal of Memory and Language, 25, 1-18. Daneman, M., & Merikle, P. M. (1996). Working memory and language comprehension: A meta-analysis. Psychonomic Bulletin & Review, 3, 422-433. Dark, V. J., & Benbow, C. P. (1990). Enhanced problem translation and short-term memory: Components of mathematical talent. Journal of Educational Psychology, 82, 420-429. Della Sala, S., Gray, C., Baddeley, A. D., Allamano, N., & Wilson, L. (1999). Pattern span: A tool for unwelding visuo-spatial memory. Neuropsychologia, 37, 1199-1199. De Renzi, E., & Nichelli, P. (1975). Verbal and nonverbal short term memory impairment following hemispheric damage. Cortex, 11, 341353. D'Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M. (1995). The neural basis of the central executive system of working memory. Nature, 378, 279-281. Dixon, P., LeFevre, J., & Twilley, L. C. (1988). Word knowledge and working memory as predictors of reading skill. Journal of Educational Psychology, 80, 465-472.


Dunn, L. M., Dunn, L. M., Whetton, C., & Pintilie, D. (1982). The British Picture Vocabulary Scale. Windsor, England: NFER Nelson.
Elliot, C. D. (1983). The British Ability Scales. Windsor, England: NFER Nelson.
Elliot, C. D. (1990). Differential Ability Scales. New York: Psychological Corporation.
Elliot, C. D. (1992). The British Ability Scales Spelling scale. Windsor, England: NFER Nelson.
Engle, R. W., Nations, J. K., & Cantor, J. (1990). Is "working memory capacity" just another name for word knowledge? Journal of Educational Psychology, 82, 799-804.
Farah, M. J., Hammond, K. M., Levine, D. N., & Calvanio, R. (1988). Visual and spatial mental imagery: Dissociable systems of representation. Cognitive Psychology, 20, 439-462.
Flavell, J. H., Beach, D. R., & Chinsky, J. M. (1966). Spontaneous verbal rehearsal in a memory task as a function of age. Child Development, 37, 283-299.
Ford, S., & Silber, K. P. (1994). Working memory in children: A developmental approach to the phonological coding of pictorial material. British Journal of Developmental Psychology, 12, 165-175.
Gathercole, S. E. (1995). The assessment of phonological memory skills in preschool children. British Journal of Educational Psychology, 65, 155-164.
Gathercole, S. E., & Adams, A. (1993). Phonological working memory in very young children. Developmental Psychology, 29, 770-778.
Gathercole, S. E., & Baddeley, A. D. (1989). Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study. Journal of Memory and Language, 28, 200-213.
Gathercole, S. E., & Baddeley, A. D. (1990). The role of phonological memory in vocabulary acquisition: A study of young children learning new names. British Journal of Psychology, 81, 439-454.
Gathercole, S. E., & Baddeley, A. D. (1993). Working memory and language. Hove, England: Erlbaum.
Gathercole, S. E., & Baddeley, A. D. (1996). The Children's Test of Nonword Repetition. New York: Psychological Corporation.
Gathercole, S. E., & Brown, L. (1999). Evidence for phonological storage and subvocal rehearsal in serial recognition. Manuscript in preparation.
Gathercole, S. E., & Hitch, G. J. (1993). Developmental changes in short-term memory: A revised working memory perspective. In A. Collins, S. E. Gathercole, M. A. Conway, & P. E. Morris (Eds.), Theories of memory (pp. 189-210). Hove, England: Erlbaum.
Gathercole, S. E., & Pickering, S. J. (in press). Estimating the capacity of phonological short-term memory. International Journal of Psychology.
Gathercole, S. E., Service, E., Hitch, G. J., Adams, A.-M., & Martin, A. J. (1999). Phonological short-term memory and vocabulary development: Further evidence of the nature of the relationship. Applied Cognitive Psychology, 13, 65-77.
Gathercole, S. E., Willis, C., Baddeley, A. D., & Emslie, H. (1994). The Children's Test of Nonword Repetition: A test of phonological working memory. Memory, 2, 103-127.
Gathercole, S. E., Willis, C., Emslie, H., & Baddeley, A. D. (1992). Phonological memory and vocabulary development during the early school years: A longitudinal study. Developmental Psychology, 28, 887-898.
Golombok, S., & Rust, J. (1992). Manual of the Wechsler Intelligence Scale for Children (3rd ed., UK). Kent, England: Psychological Corporation.
Haggard, M. P., Hinde, S. E., Wade, A. R., Bennett, K. E., & Greenwood, D. C. (1996, February). Optimising an outcome battery for OME in 5-7 year-olds. Paper presented at ARO, Florida.
Hanley, J. R., Young, A. W., & Pearson, N. A. (1991). Impairment of the visuo-spatial sketch pad. Quarterly Journal of Experimental Psychology, 43A, 101-125.
Henry, L. A. (1991). The effects of word length and phonemic similarity in young children's short-term memory. Quarterly Journal of Experimental Psychology, 43A, 35-52.
Hulme, C., Maughan, S., & Brown, G. D. A. (1991). Memory for familiar and unfamiliar words: Evidence for a long-term memory contribution to short-term memory span. Journal of Memory and Language, 30, 685-701.
Johnston, R. S., Rugg, M. D., & Scott, T. (1987). Phonological similarity effects, memory span, and developmental reading disorders: The nature of the relationship. British Journal of Psychology, 78, 205-211.
Jurden, F. H. (1995). Individual differences in working memory and complex cognition. Journal of Educational Psychology, 87, 93-102.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122-149.
Kyllonen, P. C., & Christal, R. E. (1990). Reasoning ability (is little more than) working memory capacity. Intelligence, 14, 389-433.
Logie, R. H. (1986). Visuo-spatial processing in working memory. Quarterly Journal of Experimental Psychology, 38A, 229-247.
Logie, R. H. (1995). Visuo-spatial working memory. Hove, England: Erlbaum.
Logie, R. H., & Pearson, D. G. (1997). The inner eye and the inner scribe of visuo-spatial working memory: Evidence from developmental fractionation. European Journal of Cognitive Psychology, 9, 241-257.
Martin, R. C., & Breedin, S. D. (1992). Dissociations between speech perception and phonological short-term memory deficits. Cognitive Neuropsychology, 9, 509-534.
Miles, C., Morgan, M. J., Milne, A. B., & Morris, E. D. M. (1996). Developmental and individual differences in visual memory span. Current Psychology, 15, 53-67.
Montgomery, J. (1995). Examination of phonological working memory in specifically language-impaired children. Applied Psycholinguistics, 16, 355-378.
Morra, S. (1994). Issues in working memory measurement: Testing for M capacity. International Journal of Behavioural Development, 17, 143-159.
Neale, M. D. (1989). Neale Analysis of Reading Ability (rev. British ed.). Windsor, England: NFER Nelson.
Papagno, C., Valentine, T., & Baddeley, A. D. (1991). Phonological short-term memory and foreign-language vocabulary learning. Journal of Memory and Language, 30, 331-347.
Papagno, C., & Vallar, G. (1995). Short-term memory and vocabulary learning in polyglots. Quarterly Journal of Experimental Psychology, 48A, 98-107.
Phillips, W. A., & Christie, D. F. M. (1977). Interference with visualization. Quarterly Journal of Experimental Psychology, 29, 637-650.
Pickering, S. J., Gathercole, S. E., Hall, M., & Lloyd, S. (in press). Development of memory for pattern and path: Further evidence for the fractionation of visuo-spatial short-term memory. Quarterly Journal of Experimental Psychology.
Pickering, S. J., Gathercole, S. E., & Peaker, M. (1998). Verbal and visuo-spatial short-term memory in children: Evidence for common and distinct mechanisms. Memory & Cognition, 26, 1117-1130.
Pressley, M. (1977). Imagery and children's learning: Putting the picture in developmental perspective. Review of Educational Research, 47, 585-622.
Service, E. (1992). Phonology, working memory, and foreign-language learning. Quarterly Journal of Experimental Psychology, 45A, 21-50.
Service, E., & Kohonen, V. (1995). Is the relation between phonological memory and foreign-language learning accounted for by vocabulary acquisition? Applied Psycholinguistics, 16, 155-172.
Shah, P., & Miyake, A. (1996). The separability of working memory resources for spatial thinking and language processing: An individual differences approach. Journal of Experimental Psychology: General, 125, 4-27.
Shallice, T., & Butterworth, B. (1977). Short-term memory impairment and spontaneous speech. Neuropsychologia, 15, 729-735.
Shallice, T., & Warrington, E. K. (1970). Independent functioning of verbal memory stores: A neuropsychological study. Quarterly Journal of Experimental Psychology, 22, 261-273.
Siegel, L. S. (1994). Working memory and reading: A life-span perspective. International Journal of Behavioural Development, 17, 109-124.
Siegel, L. S., & Linder, B. A. (1984). Short-term memory processes in children with reading and arithmetic disabilities. Developmental Psychology, 20, 200-207.
Siegel, L. S., & Ryan, E. B. (1989). The development of working memory in normally achieving and subtypes of learning disabled children. Child Development, 60, 973-980.
Smith, E. E., & Jonides, J. (1997). Working memory: A view from neuroimaging. Cognitive Psychology, 33, 5-42.
Smith, E. E., Jonides, J., & Koeppe, R. A. (1996). Dissociating verbal and spatial memory using PET. Cerebral Cortex, 6, 11-20.
Speidel, G. E. (1989). A biological basis for individual differences in learning to speak. In G. E. Speidel & K. E. Nelson (Eds.), The many faces of imitation in language learning (pp. 199-229). New York: Springer-Verlag.
Swanson, H. L. (1992). Generality and modifiability of working memory among skilled and less skilled readers. Journal of Educational Psychology, 84, 473-488.
Swanson, H. L. (1993). Working memory in learning disability subgroups. Journal of Experimental Child Psychology, 56, 87-114.
Swanson, H. L. (1994). Short-term memory and working memory: Do both contribute to our understanding of academic achievement in children and adults with learning disabilities? Journal of Learning Disabilities, 27, 34-50.
Swanson, H. L., & Alexander, J. E. (1997). Cognitive processes as predictors of word recognition and reading comprehension in learning-disabled and skilled readers: Revisiting the specificity hypothesis. Journal of Educational Psychology, 89, 128-158.
Towse, J. N., Hitch, G., & Hutton, U. (1998). A reevaluation of working memory capacity in children. Journal of Memory and Language, 39, 195-217.
Turner, M. L., & Engle, R. W. (1989). Is working memory capacity task-dependent? Journal of Memory and Language, 28, 127-154.
Vallar, G., & Baddeley, A. D. (1984). Fractionation of working memory: Neuropsychological evidence for a short-term store. Journal of Verbal Learning and Verbal Behavior, 23, 151-161.
Wechsler, D. (1986). Wechsler Intelligence Scale for Children-Revised. New York: Psychological Corporation.
Wilson, J. T. L., Scott, J. H., & Power, K. G. (1987). Developmental differences in the span of visual memory for pattern. British Journal of Developmental Psychology, 5, 249-255.
Young, D. (1996). Group Mathematics Test. Southend, England: Hodder & Stoughton.
Yuill, N., Oakhill, J., & Parkin, A. (1989). Working memory, comprehension ability and the resolution of text anomaly. British Journal of Psychology, 80, 351-361.

Received January 22, 1999
Revision received July 21, 1999
Accepted July 21, 1999
