Working Memory Performance in Children With and Without Specific Language Impairment in Two Nonmainstream Dialects of English

Applied Psycholinguistics, page 1 of 23, 2017
doi:10.1017/S0142716417000509
JANET L. MCDONALD, CHRISTY M. SEIDEL,
REBECCA HAMMARLUND, and JANNA B. OETTING
Louisiana State University
Received: June 15, 2016 Accepted for publication: May 9, 2017
ADDRESS FOR CORRESPONDENCE

Janet L. McDonald, Department of Psychology, Louisiana State University, Baton Rouge,
LA 70803. E-mail: psmcdo@lsu.edu
ABSTRACT
Using speakers of either African American English or Southern White English, we asked whether
a working memory measure was linguistically unbiased, that is, equally able to distinguish between
children with and without specific language impairment (SLI) across dialects, with similar error profiles
and similar correlations to standardized test scores. We also examined whether the measure was affected
by a child’s nonmainstream dialect density. Fifty-three kindergarteners with SLI and 53 typically
developing controls (70 African American English, 36 Southern White English) were given a size
judgment working memory task, which involved reordering items by physical size before recall, as
well as tests of syntax, vocabulary, intelligence, and nonmainstream density. Across dialects, children
with SLI earned significantly poorer span scores than controls, and made more nonlist errors. Span and
standardized language test performance were correlated; however, they were also both correlated with
nonmainstream density. After partialing out density, span continued to differentiate the groups and
correlate with syntax measures in both dialects. Thus, working memory performance can distinguish
between children with and without SLI and is equally related to syntactic abilities across dialects.
However, the correlation between span and nonmainstream dialect density indicates that processing-
based verbal working memory tasks may not be as free from linguistic bias as often thought. Additional
studies are needed to further explore this relationship.
Keywords: dialects; nonmainstream dialect density; specific language impairment; working memory
Children with specific language impairment (SLI) have normal intelligence, but
have more difficulty with various aspects of language including syntax and mor-
phology, vocabulary, and phonological processing than typically developing (TD)
children (Leonard, 2014; Schwartz, in press). They also tend to have deficits in
verbal working memory (Briscoe & Rankin, 2009; Ellis Weismer, Plante, Jones,
& Tomblin, 2005; Frizelle & Fletcher, 2015; Lum, Conti-Ramsden, Page, &
Ullman, 2012; Mainela-Arnold & Evans, 2005; Mainela-Arnold, Evans, & Coady,
© Cambridge University Press 2017 0142-7164/17
2010; Montgomery, 2000a, 2000b; Montgomery & Evans, 2009; Montgomery,
Evans, & Gillam, 2009; Vugs, Hendriks, Cuperus, & Verhoeven, 2014). Working
memory, the ability to hold and simultaneously process information, is measured
by tasks that require an individual to perform some kind of operation while also
remembering information for recall (Just & Carpenter, 1992). In TD children,
performance on working memory tasks correlates to standardized tests of syntax
(Engel de Abreu, Gathercole, & Martin, 2011; Haake, Hansson, Gulz, Schötz, &
Sahlén, 2014; Magimairaj & Montgomery, 2012; but see Lum et al., 2012) and
vocabulary (Adams, Bourke, & Willis, 1999; Engel de Abreu et al., 2011), as well as
to grammaticality judgments (McDonald, 2008) and often, but not always, to
sentence comprehension (Montgomery, 2000a, 2000b; Montgomery & Evans, 2009;
Montgomery et al., 2009). The picture is murkier for children with SLI, with
working memory scores and sentence comprehension sometimes positively correlating
(Montgomery & Evans, 2009; Montgomery et al., 2009), sometimes not corre-
lating (Montgomery, 2000b), and once even negatively correlating (Montgomery,
2000a).
In the current work, we further explored working memory as an important cor-
relate to children’s language abilities and the language deficits of children with
SLI using speakers of two nonmainstream dialects of English: African American
English (AAE) and Southern White English (SWE). Our focus on AAE and SWE
was strategic because both are spoken in the rural south and in both of these di-
alects, a child’s language aptitude is difficult to assess with traditional measures.
For morphology in particular, utterances with omitted grammatical morphemes
(e.g., She Ø walking) are both a hallmark feature of these dialects and fit the lin-
guistic profile of SLI (Oetting, Lee, & Porter, 2013; Oetting & McDonald, 2001;
Seymour, Bland-Stewart, & Green, 1998). This overlap in features across dialects
and the SLI condition makes grammatical morphology extremely difficult to eval-
uate in speakers of these nonmainstream dialects. Given this, working memory
measures, if they are free from linguistic bias, could be extremely useful for help-
ing to identify childhood SLI within AAE and SWE and perhaps within other
groups of linguistically diverse learners.
In the current work, we examined the dialect neutrality of a verbal working
memory measure by asking if across dialects the measure is sensitive to variation
in language aptitude to the same degree, generates similar types of error profiles
in children, and correlates similarly to other ability measures. Between and within
the dialects, we also asked whether the measure is influenced by the density with
which children produce nonmainstream forms, which should not be the case if the
measure is free from bias. Between dialects, AAE and SWE are ideally suited to
examine this question because AAE tends to have much higher nonmainstream
density than SWE (Oetting, 2015; Oetting & McDonald, 2001). Within the dialects,
speakers also differ in the densities of their nonmainstream forms, with some
speakers producing low levels, others producing moderate levels, and still others
producing high levels (Oetting & McDonald, 2002; Terry, Connor, Thomas-Tate,
& Love, 2010; Washington & Craig, 1994). Below we review the literature on
working memory deficits in children with SLI, the identification of SLI within
AAE and SWE, and dialect density as an important metric for studies of linguistic
bias.
WORKING MEMORY DEFICITS IN CHILDREN WITH SLI

As detailed below, poor verbal working memory capacity as a possible deficit
in children with SLI has been investigated in a number of studies using several
different verbal working memory tasks. It is generally found that children with
SLI do worse than age-matched controls regardless of the task.
Backward digit span tasks require children to repeat lists of digits that vary in
length in reverse order. Results show that children with SLI, aged 4 to 12 years,
have lower backward digit span scores than age-matched controls (Briscoe &
Rankin, 2009; Frizelle & Fletcher, 2015; Lum et al., 2012; Vugs et al., 2014; for
studies that found trends but not significant differences, see Petruccelli, Bavin,
& Bretherton, 2012; Quail, Williams, & Leitão, 2009). Counting span tasks ask
children to count the number of objects in successive arrays, and then recall the
number of objects in each array at the end of the sequence. Sequences increase in
length until children are no longer able to recall the counts. As with the backward
digit span, studies show that children with SLI, aged 4 to 12 years, have lower spans
than their age-matched counterparts (Frizelle & Fletcher, 2015; Lum et al., 2012;
Vugs et al., 2014). Finally, listening span tasks require children to make true/false
judgments of sentences they hear while simultaneously retaining the final word
of each for later recall. Across various versions of this task, children with SLI,
aged 4 to 14 years, generally do not differ from controls in their ability to judge
the truth value of the sentences; however, in terms of words recalled, children
with SLI score lower than age- but not language-matched controls (Briscoe &
Rankin, 2009; Ellis Weismer, Evans, & Hesketh, 1999; Ellis Weismer et al., 2005;
Frizelle & Fletcher, 2015; Laing & Kamhi, 2003; Lum et al., 2012; Mainela-Arnold
& Evans, 2005; Mainela-Arnold et al., 2010; Marton & Eichorn, 2014; Marton,
Kelmenson, & Pinkhasova, 2007; Marton & Schwartz, 2003; Marton, Schwartz,
Farkas, & Katsnelson, 2006; Montgomery & Evans, 2009; Rodekohr & Haynes,
2001; Vugs et al., 2014). In addition to their listening span scores, children with
SLI and their age-matched controls differ in the type and amount of nonlist words
produced, with children with SLI more likely to give words that had occurred on
previous lists, or had occurred elsewhere in the sentence (Ellis Weismer et al.,
1999; Marton & Eichorn, 2014; Marton & Schwartz, 2003; Marton et al., 2006,
2007).
The task used in this study, size judgment (Cherry, Elliott, & Reese, 2007; Cherry
& Park, 1993), is a verbal working memory measure that does not require children
to count or process syntax or morphology. It involves hearing a list of nouns, and
then upon recall reordering them in terms of the physical size of the referent, from
smallest to largest. Performance on this task is highly correlated to performance
on both the backward digit span and listening span tasks in adults (Cherry et al.,
2007), validating its use as a working memory measure. Montgomery (2000a,
2000b; Montgomery et al., 2009) developed a version of the task appropriate for
children that involves three levels of processing difficulty. Their lists contain items
of various sizes from two different semantic categories (e.g., clothing and animals).
In the easiest level, children say the words back in any order; in the intermediate
level, they repeat the items back in order of size regardless of semantic category;
and in the hardest level, they must do two reorderings upon recall, one by semantic
category and then within each category, by size. Montgomery (2000a, 2000b;
Montgomery et al., 2009) found that children with SLI, aged 7 to 10 years, did not
differ from their age-matched controls on the easy or intermediate levels of this
task, but they scored lower than the controls for the hardest level of the task.
IDENTIFICATION OF CHILDHOOD SLI WITHIN NONMAINSTREAM DIALECTS OF ENGLISH

It is difficult to identify children with SLI in the context of different nonmainstream
dialects such as SWE and AAE. Relative to mainstream dialects of English, the
study of AAE and SWE in children has been minimal, and linguistic milestones to
benchmark typical development (or flag impairment) within these dialects have yet
to be fully established (for recent work, see Newkirk-Turner, Oetting, & Stockman,
2016; Stockman, Guillory, Seibert, & Boult, 2013; Stockman, Newkirk-Turner,
Swatzlander, & Morris, 2016). As mentioned earlier, dialects such as AAE and
SWE allow grammatical omissions, including those that mark tense and agreement
morphemes (e.g., auxiliaries BE and DO, third-person morphemes, and past-tense
morphemes) that are well known to be characteristic of the SLI condition. Although
there is now some evidence showing that children with and without SLI can be
differentiated within these dialects by the frequency of their omissions for at least
some grammatical morphemes (Cleveland & Oetting, 2013; Garrity & Oetting,
2010; Oetting & McDonald, 2001; Oetting & Newkirk, 2008; Rivière & Oetting,
2017), much of this evidence has come from labor-intensive analyses of language
samples that have included over 200 utterances per child. Within clinical practice,
detailed analyses of language samples of this length are likely not feasible.
Linguistic biases within standardized tests also contribute to the difficulty of
identifying children with SLI within AAE and SWE. These biases have been
shown to surface not only when the children being tested speak a nonmainstream
dialect but also when they come from a minority cultural background and/or are
economically disadvantaged (Wyatt, 2015). Given this, multiple researchers ad-
vocate for assessments to include processing-based measures such as working
memory tasks to circumvent language-related and/or experience-related test biases
(Campbell, Dollaghan, Needleman, & Janosky, 1997; Craig & Washington, 2000;
Engel, Santos, & Gathercole, 2008; Laing & Kamhi, 2003; Oetting & Cleveland,
2006; Rodekohr & Haynes, 2001; Washington & Craig, 2004). However, there
is some evidence that working memory tasks are not completely free from bias.
A closer look at Engel et al. (2008) shows that while no differences were found
by socioeconomic status on backward digit span or counting span when using a
Bonferroni correction, the counting span results were at the p = .03 level, hinting
at possible bias. This trend was confirmed in further work by
Engel de Abreu, Puglisi, Cruz-Santos, Befi-Lopes, and Martin (2014), who found
that children with poor schooling, and presumably less practice at counting, had
significantly lower counting spans than children with better schooling.
Moreover, only a very limited number of studies have examined whether work-
ing memory tasks are good at discriminating between children with and without
language impairments across different races and dialects. Laing and Kamhi (2003)
found clinical status effects (African American language impaired < African
American controls) but not race effects (African American TD = White TD) in
third and fourth graders using a listening span task. Rodekohr and Haynes (2001)
also found a clinical status effect (language impaired < controls) but not a
dialect/race effect (African American children who were confirmed AAE speakers = White
children whose dialect was not confirmed) in children, aged 7 years, using a listening span
task. Nevertheless, in this study there was a marginal interaction (p = .067) be-
tween clinical status and dialect, which related to a larger clinical effect for the
White children than for the AAE-speaking children. Ideally a test designed to
detect differences between children with and without SLI would show the same
magnitude of effect across different races and dialects.
In addition to identifying children with SLI with the same degree of accuracy
across dialects, an unbiased measure of working memory should also show similar
correlations to other measures of ability across dialects. Specifically, we should find
similar correlations between working memory scores and standardized language
and intelligence measures in both AAE and SWE. In both dialects we should
replicate the previously mentioned findings that children’s working memory scores
correlate with standardized measures of syntax (Engel de Abreu et al., 2011; Haake
et al., 2014; Magimairaj & Montgomery, 2012; but see Lum et al., 2012) and
vocabulary (Adams et al., 1999; Engel de Abreu et al., 2011); they should also
correlate with standardized measures of nonverbal intelligence (Adams et al., 1999;
Alloway, Gathercole, Willis, & Adams, 2004; Ellis Weismer et al., 1999; Engel de
Abreu, Conway, & Gathercole, 2010; Engel de Abreu et al., 2011).
NONMAINSTREAM DIALECT DENSITY AS AN IMPORTANT METRIC FOR STUDIES OF BIAS

Besides being a speaker of a nonmainstream dialect, another factor that appears
to have important implications for investigating SLI in diverse groups of language
learners is a child’s nonmainstream dialect density. There are multiple ways to
measure a child’s nonmainstream dialect density, but all are correlated to each
other and involve calculating the relative frequency (i.e., rate) with which a child
produces nonmainstream forms (Horton & Apel, 2014; Oetting & McDonald,
2002). Both internal variables such as gender, age, and socioeconomic status and
external variables such as type of task, modality of task, and speaking partner
contribute to variation in density (for some examples of child studies, see Barbu, Martin,
& Chevrot, 2014; Craig, Kolenic, & Hensel, 2014; Craig & Washington, 2004;
Craig, Zhang, Hensel, & Quinn, 2009; Ivy & Masterson, 2011; Mills, 2015; Van
Hofwegen & Wolfram, 2010; Washington & Craig, 1998).
In recognition of this variation, a growing number of researchers now include
nonmainstream dialect density metrics in their studies. In studies of mainstream
English ability, a child’s nonmainstream dialect density impacts performance. For
example, high nonmainstream dialect density child speakers have more difficulty
than low density speakers in identifying aurally presented mainstream words as
words (Brown, 2011), and identifying words in mainstream contexts that are am-
biguous in their nonmainstream dialect (Edwards et al., 2014). Children’s nonmain-
stream densities have also been found to correlate, often negatively, to children’s
standardized language test scores (Charity, Scarborough, & Griffin, 2004; Connor
& Craig, 2006; Craig, Thompson, Washington, & Porter, 2004; Terry et al., 2010).
For example, in Craig et al. (2004) the Gray Oral Reading Tests were identified as
containing biases because children’s densities were negatively correlated to their
reading rates. In Terry et al. (2010), children with high nonmainstream densities
also tended to have lower vocabulary and phonological awareness test scores in the
early grades (but, for a study showing no differences between density groups using
a median split procedure and raw test scores, see Moyle, Heilmann, & Finneran,
2014).
Of concern here, children’s nonmainstream dialect densities have been shown
to influence their performance on nonword repetition, another processing-based
task that is often viewed as less culturally and linguistically biased than traditional
language tests. Nonword repetition involves having children repeat nonwords of
increasing syllable lengths; children with SLI are poorer at the task than TD chil-
dren, especially at long syllable lengths (Dollaghan & Campbell, 1998). In two
early studies, Campbell et al. (1997) and Oetting and Cleveland (2006) found no
race or dialect effects for nonword repetition, although in both of these studies,
race and/or dialect was treated as a nominal variable and they did not look at dialect
density. In contrast to these studies, when Moyle et al. (2014) examined children’s
dialect use using a nonmainstream density measure, children with higher densities
were found to earn lower nonword repetition scores than those with lower den-
sities. Results from the Moyle et al. study, as well as the marginally significant
interaction of race/dialect with diagnostic group observed on a working memory
measure in Rodekohr and Haynes (2001), call into question the unbiased nature
of processing tasks in general.
SUMMARY, RESEARCH QUESTIONS, AND PREDICTIONS

Studies have found children with SLI evidence weaknesses in working mem-
ory and produce more nonlist errors relative to controls. Correlations between
children’s working memory and measures of their syntax, vocabulary, and intelli-
gence also have been found. In the current work, we extended the study of working
memory to children who spoke either AAE or SWE, two nonmainstream dialects
within which it has proven difficult to identify children with SLI with traditional
assessment tools. We were motivated to do this work because working memory
tasks, since they are processing based, are often viewed as free from cultural
and linguistic biases, and if this is the case, these tasks would be ideally suited
for the identification of childhood SLI in nonmainstream dialects, such as AAE
and SWE.
Using a size judgment task, we first asked whether this working memory mea-
sure was equally able to distinguish between children with and without SLI across
dialects, with similar child error profiles and similar correlations to other measures
in the two dialects. We also asked whether this working memory measure was af-
fected by a child’s nonmainstream dialect density. We hypothesized that regardless
of the children’s nonmainstream dialect type, those with SLI would have lower
span scores than the controls and that the magnitude of the effect would be the same
across the dialects. We also expected to find more nonlist errors on the working
memory task made by the children with SLI than by the TD controls and similar
correlations between the children’s span scores and their scores on other measures
of syntax, vocabulary, and intelligence across dialects. Finally, we expected the
children’s nonmainstream dialect densities to be unrelated to their span scores.
METHOD
Participants
Participants were 106 kindergarteners (M = 66.24 months, SD = 3.78, range =
59–74 months) who were classified by dialect as a speaker of either SWE (n = 36)
or AAE (n = 70) and clinical status as either SLI (n = 53, 18 SWE, 35 AAE) or TD
(n = 53, 18 SWE, 35 AAE). Details related to each child’s classification by dialect
and clinical status can be found in Oetting, McDonald, Seidel, and Hegarty (2016)
as these children also participated in that study. For convenience, a summary of
their testing profiles is presented in this paper as well.
As confirmed by listener judgments and a screening test, children’s dialect cor-
responded to their race, with SWE speakers being non–African American and
AAE speakers being African American (AA). Specifically, the listener judgment
task involved two out of three judges, blind to the child’s race and gender, agreeing
on the dialect classification from 1-min samples of conversation using a holistic
impression (Oetting & McDonald, 2002); using this method, 94% of non-African
American children were classified as SWE speakers and 90% of African American
children were classified as AAE speakers. The nonmainstream dialect status of the
remaining children was verified from longer speech samples, and from the presence
of nonmainstream responses on the language variation portion of the Diagnostic
Evaluation of Language Variation—Screening Test (DELV-S; Seymour, Roeper, & de
Villiers, 2003).
The DELV-S was also used to classify the children’s nonmainstream dialect
densities as low, medium, and high using the three-level classification system
provided by the test developers. The rating system considers each child’s number
of nonmainstream and mainstream responses compared to age-delimited criteria.
For speakers of SWE, there were 14 classified as low variation from mainstream
English (SLI: 0; TD: 14), 6 classified as medium with some variation (SLI: 4; TD:
2), and 16 classified as high with strong variation (SLI: 14; TD: 2). For speakers
of AAE, there were 6 classified as low (SLI: 0; TD: 6), 12 classified as medium
(SLI: 6; TD: 6), and 52 classified as high (SLI: 29; TD: 23). As mentioned earlier,
while there are many ways to measure a child’s nonmainstream dialect density,
such as DELV-S scores, listener judgments, or calculating the number of types or
tokens of nonmainstream structures produced, they all are correlated to each other
(Horton & Apel, 2014; Oetting & McDonald, 2002). While the test developers
of the DELV-S categorize children into three dialect density groups based on
their responses, we and others have also used the DELV-S items to calculate the
more continuous measure of each child’s percentage of nonmainstream responses
over the sum of the child’s nonmainstream and mainstream responses (Oetting
et al., 2016; Terry et al., 2010). For the children studied here, both these DELV-S
indices are highly correlated (r = .86, p < .001), and the results presented here
do not differ significantly as a function of the DELV-S metric used.¹ We chose the
DELV-S three-level score as recommended by the test developers because of its
ease of use and relevance to clinical practice.
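The two DELV-S indices described above can be illustrated with a minimal sketch. It assumes that the continuous index is the count of nonmainstream responses divided by the sum of nonmainstream and mainstream responses, as stated in the text, and that the three-level score is coded low = 1, medium = 2, high = 3 (a coding that is not stated explicitly but is consistent with the group means in Table 1). The function names and the single-child response counts are hypothetical.

```python
from typing import Mapping


def percent_nonmainstream(nonmainstream: int, mainstream: int) -> float:
    """Percentage of nonmainstream responses out of all scorable responses."""
    total = nonmainstream + mainstream
    if total == 0:
        raise ValueError("no scorable responses")
    return 100.0 * nonmainstream / total


def mean_density_level(counts: Mapping[int, int]) -> float:
    """Mean three-level density score from a {level: number_of_children} tally."""
    n_children = sum(counts.values())
    if n_children == 0:
        raise ValueError("no children in tally")
    return sum(level * n for level, n in counts.items()) / n_children


# Hypothetical child with 6 nonmainstream and 4 mainstream responses.
print(percent_nonmainstream(6, 4))

# Tallies reported in the text, assuming low = 1, medium = 2, high = 3.
swe_sli = {1: 0, 2: 4, 3: 14}
swe_td = {1: 14, 2: 2, 3: 2}
print(round(mean_density_level(swe_sli), 2), round(mean_density_level(swe_td), 2))
```

Under the assumed coding, the SWE tallies give means of 2.78 (SLI) and 1.33 (TD), matching the DELV-S dialect density row of Table 1.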
The children’s clinical status was determined through a battery of standard-
ized tests. Children in both groups passed a hearing screening, performed at or
above –1.2 SD of the normative mean on the Primary Test of Nonverbal
Intelligence (PTONI; Ehrler & McGhee, 2008), and at or above –1 SD of the
normative mean on the Goldman–Fristoe Test of Articulation (Goldman & Fristoe,
2000). Children in the SLI group performed at or below –1 SD of the normative
mean on the syntax portion of the Diagnostic Evaluation of Language Variation—
Norm Referenced (DELV-NR; Seymour, Roeper, & de Villiers, 2005), while those
in the TD group performed above this cutoff. Although not used to exclude or
classify the participants, all of the children also completed the Peabody Picture
Vocabulary Test IV (PPVT; Dunn & Dunn, 2007), and 52 completed the gram-
mar subtests of the Test of Language Development—Primary 4 (TOLD; New-
comer & Hammill, 2008), which was added in the final years of data collection.
Each child with SLI was matched to a TD control based on dialect spoken, age,
and nonverbal IQ (PTONI), and then as much as it was possible, maternal edu-
cational level. Maternal education was gathered via parental questionnaire, and
measured by number of years of formal education (e.g., 12 = completion of high
school).
Materials
The size judgment task consisted of three lists at each of five list lengths, ranging
from lengths of 2 to 6 words. Thus, there were a total of 60 words across the
lists; each word was assigned to a particular list of a particular list length. All
words were one- or two-syllable concrete nouns whose size should be known to
kindergarteners (e.g., penny, book, and coat). Since the task involved reordering
the lists in terms of size, lists were devised so that the correct ordering of small-
est to largest size was fairly obvious, and this was validated by the researchers
who scored the task. With the exception of one list of length 2, none of the lists
was presented in the smallest to largest order of the objects; thus, reordering was
necessary to give the lists back in order of size. The one list of length 2 that
was given in smallest to largest order was used to be sure children did not think
they should simply give the words back in reverse order. Words were digitally
recorded by a southern African American female native speaker of English and
presented via a laptop computer, with a 500-ms transition between each word in
a list.
Procedure
The study was approved by the Louisiana State University Institutional Review
Board, and parental consent and child assent were obtained for all participants.
Children were tested across multiple days, with the size judgment task given af-
ter standardized testing and a language sample. As part of a larger study, the
children also completed grammar probes and a sentence recall task. The size
judgment task, because it took less than 10 min, was fit in around these other
tasks, generally at the end or near the end of testing. All standardized tests were
administered as recommended. The size judgment task was administered accord-
ing to a script that explained that the task was to listen to the list, and then say
it back to the experimenter in order of the size of the physical object, starting
with the smallest, and proceeding to the largest. There were three practice lists
of length 2, where the experimenter explicitly asked the children which item was
smaller, and then which one was larger. The child was then asked to put the two
words together in that order. After these three explicitly guided practice lists, the
children did three additional practice lists of length 2 with corrective feedback.
Across the six practice lists, four had items arranged largest first, and two had
the smallest first. After the practice lists, children started the experimental lists,
starting with three lists of length 2, with lists increasing to a maximum of length
6. Lists were given to all children in the same order. The children’s responses
were recorded online by the experimenter, as well as digitally recorded for later
checking.
Scoring
The size judgment task was scored in two ways: an all-or-none method
and a partial credit method. Previous comparisons between these scoring
systems have shown that partial credit scoring generally demonstrates more sensitivity
and shows higher correlations to other variables of interest (Conway et al., 2005;
Friedman & Miyake, 2005; Giofrè & Mammarella, 2014; St. Clair-Thompson
& Sykes, 2010). Greatest list length, an all-or-none method used by Montgomery
(2000a, 2000b), awarded children credit for the highest list length in which two
of the three lists had all the items recalled in the correct size order. If children
failed to correctly recall and reorder the lists at length 2, they earned a score of
1. Scores ranged from 1 to 3 for those with SLI and 1 to 4 for the TD children.
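The greatest list length rule can be sketched as follows. This is an illustration under stated assumptions, not the scoring script used in the study: the function name and the input layout (a mapping from list length to three pass/fail outcomes) are hypothetical, and the sketch assumes that the highest passed length is credited even if a shorter length was failed, a case the text does not address.

```python
from typing import Mapping, Sequence


def greatest_list_length(results: Mapping[int, Sequence[bool]], floor: int = 1) -> int:
    """Highest list length at which at least two of the three lists were
    recalled completely and in the correct size order.

    `results` maps a list length to the pass/fail outcome of each list at
    that length. A child who passes no length, including length 2, earns
    the floor score of 1.
    """
    passed = [length for length, lists in results.items() if sum(lists) >= 2]
    return max(passed, default=floor)


# Hypothetical child: passes lengths 2 and 3, fails lengths 4 through 6.
child = {
    2: [True, True, True],
    3: [True, False, True],
    4: [False, False, True],
    5: [False, False, False],
    6: [False, False, False],
}
print(greatest_list_length(child))
```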
Total links, the partial credit scoring method developed for this paper, awarded 1
point every time the order of recall for a word pair from the list went from small
to large; no points were awarded when it went from large to small. For exam-
ple, if children were asked to reorder the list “pony, ring, wolf, ocean, chicken,
house” and said “ring, ocean, chicken, pony, wolf, house,” they earned 3 points
(ring to ocean, chicken to pony, wolf to house). If they recalled “ocean, wolf,
ring, chicken, house, pony,” they earned 2 points (ring to chicken and chicken to
house). If they did not recall all the words, links were still scored; for example, if
they said “ocean, pony, house,” they earned 1 point (pony to house). Consonant
with the high score on the greatest list length measure being 4, inspection of the
results showed that performance on the task tended to fall after list length 4. We
therefore computed the total links score considering only list lengths 2 through
4.² Scores ranged from 1 to 13 for children with SLI and from 1 to 17 for the TD
children.
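The total links rule can be checked against the worked examples above with a short sketch. The size ranking below is an assumption inferred from those examples (ring < chicken < wolf < pony < house < ocean), and the function name is hypothetical. The sketch also assumes that a pair containing a nonlist word earns no point, which the text does not specify.

```python
from typing import Mapping, Sequence


def total_links(recalled: Sequence[str], size_rank: Mapping[str, int]) -> int:
    """Award 1 point for each adjacent pair of recalled words that goes
    from a smaller referent to a larger one."""
    points = 0
    for first, second in zip(recalled, recalled[1:]):
        # Assumption: pairs that include a nonlist word earn no credit.
        if first in size_rank and second in size_rank:
            if size_rank[first] < size_rank[second]:
                points += 1
    return points


# Assumed size ranking for the example list, smallest (1) to largest (6).
SIZE_RANK = {"ring": 1, "chicken": 2, "wolf": 3, "pony": 4, "house": 5, "ocean": 6}

print(total_links(["ring", "ocean", "chicken", "pony", "wolf", "house"], SIZE_RANK))
print(total_links(["ocean", "wolf", "ring", "chicken", "house", "pony"], SIZE_RANK))
print(total_links(["ocean", "pony", "house"], SIZE_RANK))
```

The three calls reproduce the 3, 2, and 1 points of the examples in the text. Under this scheme a list of n words can earn at most n − 1 points, so three lists at each of lengths 2 through 4 allow a maximum total of 18.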
Finally, nonlist words that the children produced during recall were classified
into three subtypes: words from previous lists, words that rhyme with current list
words, and other errors (this included repeating a word from the current list more
than once, and words that never appeared in the size judgment stimuli). These were
totaled across all five list lengths.
Table 1. Characteristics and scores on standardized tests of SLI and TD groups by dialect

                              SWE                             AAE
                              SLI             TD              SLI             TD
Male/female ratio             11/7            9/9             14/21           18/17
                              M       SD      M       SD      M       SD      M       SD
Matching variables
  Age (months)                65.72   3.89    66.61   4.18    66.94   3.74    65.60   3.55
  PTONI                       96.50   8.35    98.28   8.14    93.69   9.62    98.09   8.87
  Maternal education (years)  12.33   2.87    13.17   3.05    11.67   2.27    13.27   2.62
Language measures
  DELV-NR syntax              4.78    1.66    10.39   1.72    4.83    1.01    10.00   1.55
  TOLD (n = 52)               80.92   5.38    109.00  9.54    79.74   6.48    104.85  7.69
  PPVT                        85.78   7.01    105.56  5.62    82.34   9.42    101.06  9.32
Dialect density measure
  DELV-S dialect density      2.78    0.43    1.33    0.67    2.83    0.38    2.49    0.78

Note: SLI, specific language impairment; TD, typically developing; SWE, Southern White
English; AAE, African American English; PTONI, Primary Test of Nonverbal Intelligence;
DELV-NR, Diagnostic Evaluation of Language Variation—Norm Referenced; TOLD, Test
of Language Development; PPVT, Peabody Picture Vocabulary Test; DELV-S dialect
density, DELV Screening Test language variation subsection.
Reliability
Reliability of scoring the size judgment task was checked by having a second person
independently score 20% of the data, and individual link scoring and nonlist word
types were compared. Agreement was high between the two scorers (97%).
RESULTS
Clinical status and dialect effects on matching variables and standardized tests
Group profiles of the children by dialect and clinical status are presented in Table 1.
As reported in Oetting et al. (2016), we first analyzed the matching variables of age,
PTONI, and maternal education in a 2 (clinical status: SLI vs. TD) × 2 (dialect:
SWE vs. AAE) between-subjects analysis of variance (ANOVA). There were no
main effects of clinical status or dialect or their interaction for age or PTONI. For
maternal education, there was a main effect of clinical status, with the level of the
SLI group less than that of the TD group, F (1, 98) = 4.96, p = .028, ηp2 = 0.05.
Thus, matching for maternal education level was not completely successful, but
the effect size was small.
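The effect sizes reported alongside each F test can be checked from the F value and its degrees of freedom. The sketch below is an illustration only, not the authors' analysis code: it assumes the standard identity for a between-subjects effect, ηp2 = (F × df_effect) / (F × df_effect + df_error), and the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic.

    Assumes the identity eta_p^2 = F*df_effect / (F*df_effect + df_error),
    which follows from eta_p^2 = SS_effect / (SS_effect + SS_error).
    """
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Maternal education, main effect of clinical status: F(1, 98) = 4.96
print(round(partial_eta_squared(4.96, 1, 98), 2))  # 0.05
```

Because the inputs are the rounded F values printed in the text, the result should agree with the reported effect sizes only to about two decimal places.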
All of the standardized tests also showed main effects of clinical status with the
SLI group earning lower scores than the TD group: DELV-NR, F (1, 102) = 328.80,
p < .001, ηp2 = 0.76; TOLD, F (1, 48) = 118.61, p < .001, ηp2 = 0.71; and PPVT,
F (1, 102) = 122.32, p < .001, ηp2 = 0.55. The PPVT also showed a main effect for
dialect (AAE < SWE), F (1, 102) = 5.20, p = .03, ηp2 = 0.05. Knowledge-based
tests, such as vocabulary, often show such cultural, socioeconomic, or linguistic
group differences (see Engel et al., 2008; see also Qi, Kaiser, Milan, & Hancock,
2006; Restrepo et al., 2006), and we confirmed such findings here.
Next we applied the same analysis to the variable of nonmainstream dialect
density. Here there were main effects for clinical status, F (1, 102) = 52.59,
p < .001, ηp2 = 0.34, and dialect, F (1, 102) = 23.83, p < .001, ηp2 = 0.19,
and an interaction, F (1, 102) = 19.98, p < .001, ηp2 = 0.16. The interaction was
due to a density difference for the TD groups (AAE > SWE), F (1, 51) = 28.01, p <
.001, ηp2 = 0.36, but not the SLI groups, F (1, 51) = 0.19, p = .66, ηp2 = 0.004. This
finding in the TD groups was not unexpected because rate of use is
one of the primary ways in which AAE and SWE differ from each other (Oetting,
2015; see Cleveland & Oetting, 2013; Oetting & Newkirk, 2008). Failure to find a
density difference in the SLI groups may be because both SWE and AAE speakers
were near ceiling on this measure (see Note 3). Examining the interaction from within each
dialect, we see there is an effect of clinical status for both the SWE speakers, F
(1, 34) = 57.46, p < .001, ηp2 = 0.63, and the AAE speakers, F (1, 68) = 5.44,
p = .023, ηp2 = 0.07, although it was stronger in the SWE speakers. This clinical
status difference in dialect density is unexpected. We therefore examine the effects
of dialect density in the correlational analyses reported later, and detail possible
reasons for differences in dialect density by clinical status in the discussion.
Clinical status and dialect effects on working memory

Next we turned to our main question: whether the working memory task showed
an effect of clinical status that was equivalent across the dialects. For both greatest
list length and total links scoring methods, we performed a 2 (clinical status) × 2
(dialect) between-subjects ANOVA and examined how well the measure allowed
us to classify the children into the two clinical groups.
Greatest list length. There was a main effect for clinical status, with the SLI
group earning lower scores (M = 1.60, SD = 0.60) than the TD group (M = 2.15,
SD = 0.74), F (1, 102) = 20.99, p < .001, ηp2 = 0.17. The interaction between
clinical status and dialect, while not reaching conventional levels of significance,
F (1, 102) = 3.54, p = .063, ηp2 = 0.03, echoed similar tendencies found by
Rodekohr and Haynes (2001). When tested separately within dialects, clinical
status remained significant for both dialects, although the effect tended to be larger
in SWE, F (1, 34) = 14.70, p < .001, ηp2 = 0.30; SLI M = 1.44, SD = 0.51; TD M
= 2.33, SD = 0.84, than in AAE, F (1, 68) = 5.58, p = .021, ηp2 = 0.08; SLI M =
1.69, SD = 0.63; TD M = 2.06, SD = 0.68.
In classifying the children into those with and without SLI, a score of 2 on the 1
to 4 scale was identified as the optimal cut point. It classified 63% of the children
correctly, with sensitivity (94%) being excellent, but specificity (32%) being poor.
Classification was better for SWE speakers (sensitivity 100%, specificity 44%)
than for AAE speakers (sensitivity 91%; specificity 26%).
Total links. Again, there was a main effect of clinical status, with the SLI group
earning lower scores (M = 6.70, SD = 2.53) than the TD group (M = 9.87, SD =
3.23), F (1, 102) = 35.25, p < .001, ηp2 = 0.26. Although the interaction between
clinical status and dialect was not statistically reliable, F (1, 102) = 2.93, p = .09,
ηp2 = 0.03, the effect again tended to be larger in SWE speakers, F (1, 34) = 30.01,
p < .001, ηp2 = 0.47; SLI M = 6.50, SD = 1.92; TD M = 11.00, SD = 2.91, than
in AAE speakers, F (1, 68) = 11.62, p = .001, ηp2 = 0.15; SLI M = 6.80, SD =
2.82; TD M = 9.29, SD = 3.27. In addition, independent t tests showed that the
SLI group earned lower link scores than the TD group at each list length (all ts ≤
–3.86, all ps < .001). Thus, even at the short list length of two words, children with
SLI were different from TD children.
In terms of classification, the total links measure was superior to the greatest list
length measure, and it correctly classified 75% of the children by clinical status.
A cut point at 8 total links yielded a sensitivity of 77% and a specificity of 72%.
Classification again was better for SWE speakers (sensitivity 83%, specificity 83%)
than for AAE speakers (sensitivity 74%, specificity 66%).
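The classification figures above follow from counting correct and incorrect decisions at a cut point. The sketch below is a minimal illustration under stated assumptions, not the authors' procedure: it assumes that scores at or below the cut point are flagged as SLI (the text does not state whether the cut is inclusive), the function name `classification_rates` is hypothetical, and the scores are made up for illustration.

```python
def classification_rates(scores, has_sli, cut):
    """Return (sensitivity, specificity, accuracy) for one cut point.

    Assumption: a child is flagged as SLI when the score is <= cut.
    """
    tp = fn = tn = fp = 0
    for score, sli in zip(scores, has_sli):
        flagged = score <= cut
        if sli and flagged:
            tp += 1          # child with SLI correctly flagged
        elif sli:
            fn += 1          # child with SLI missed
        elif flagged:
            fp += 1          # TD child incorrectly flagged
        else:
            tn += 1          # TD child correctly passed
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(scores)
    return sensitivity, specificity, accuracy


# Made-up total links scores, for illustration only (not study data).
scores = [5, 7, 8, 10, 6, 9, 11, 12, 8, 13]
has_sli = [True, True, True, True, False, False, False, False, False, False]
sens, spec, acc = classification_rates(scores, has_sli, cut=8)
print(sens, round(spec, 2), acc)
```

With equal group sizes, as in this study (53 children per group), overall accuracy is the mean of sensitivity and specificity. The reported values are consistent with this: (77% + 72%) / 2 is approximately 75% for total links, and (94% + 32%) / 2 = 63% for greatest list length.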
Nonlist words
Nonlist words produced during the working memory task were analyzed in a 2
(clinical status) × 2 (dialect) × 3 (error type) ANOVA. There was a main effect
of clinical status; as predicted, the SLI group produced more nonlist words (M =
8.38, SD = 6.21) than the TD group (M = 6.04, SD = 5.23), F (1, 102) = 4.65,
p = .033, ηp2 = 0.04. Although the AAE speakers tended to produce more nonlist
words (M = 7.89, SD = 6.26) than the SWE speakers (M = 5.89, SD = 4.70), the
main effect of dialect did not reach statistical significance, F (1, 102) = 2.91, p =
.091, ηp2 = 0.03. There was also a main effect of error type, F (2, 204) = 39.91,
p < .001, ηp2 = 0.28. Rhyming errors (M = 0.66, SD = 0.91) were less frequent
than the other two types of errors, other errors (M = 2.92, SD = 3.25) and words
from a prior list (M = 3.62, SD = 3.30); the latter two did not differ statistically
by a Bonferroni corrected post hoc test. This pattern of results occurred for both
dialects; there were no significant interactions.
Correlations with standardized tests

Next, we examined the correlations between the two scoring methods for the working
memory task and the standardized test measures, as well as the correlations of both
sets of measures with nonmainstream density. These are shown below the diagonal in
Table 2. As expected from previous research (e.g., Friedman & Miyake, 2005), the two
scoring methods were significantly intercorrelated. In examining correlations of working
memory to the standardized language and intelligence measures, we see that
both working memory scoring methods showed significant positive correlations to measures
of syntax (DELV-NR and TOLD) and vocabulary (PPVT) but not intelligence, as
measured by the PTONI.
However, note that children’s nonmainstream densities also were significantly
negatively correlated to the two working memory scoring methods. This indicates
that the higher the child’s density, the lower the working memory score. In order
to check that this was not due to the higher densities found in the children with SLI
as compared to TD children, we ran a partial correlation. The relationship between
density and working memory held true with clinical status category partialled out
(greatest list length r = –.21, p = .028; total links r = –.27, p = .005). Conversely,
the relationship between clinical status group and working memory held true with
density partialled out (greatest list length r = .26, p = .007; total links r = .36,
p < .001).

Table 2. Correlations between two scoring methods for working memory task and
standardized test measures and among these two measures and nonmainstream density

                            GLL       Links     Syntax    TOLD      PPVT      PTONI
All (n = 106)
  Greatest list length      —         .80***    .29**     .30*      .16       .00
  Total links               .83***    —         .38***    .38**     .28**     −.01
  DELV-NR syntax            .40***    .50***    —         .76***    .65***    .10
  TOLD (n = 52)             .36**     .45***    .79***    —         .64***    .11
  PPVT                      .31***    .44***    .74***    .68***    —         .30**
  PTONI                     .05       .05       .15       .11       .33***    —
  DELV-S dialect density    −.35***   −.43***   −.46***   −.38**    −.52***   −.15
Southern White English Only (n = 36)
  Greatest list length      —         .84***    .24       .29       .32       .22
  Total links               .89***    —         .34*      .33       .51**     .24
  DELV-NR syntax            .49**     .59***    —         .63*      .52***    −.11
  TOLD (n = 16)             .32       .29       .74***    —         .73**     −.11
  PPVT                      .54***    .69***    .76***    .79***    —         .06
  PTONI                     .26       .28       .02       −.19      .14       —
  DELV-S dialect density    −.48**    −.55***   −.70***   −.52*     −.72***   −.14
African American English Only (n = 70)
  Greatest list length      —         .79***    .27*      .29       .10       −.10
  Total links               .80***    —         .38***    .42*      .21       −.10
  DELV-NR syntax            .34**     .45***    —         .79***    .70***    .19
  TOLD (n = 36)             .37*      .50**     .80***    —         .63***    .22
  PPVT                      .19       .31**     .74***    .66***    —         .37**
  PTONI                     −.06      −.06      .22       .23       .39***    —
  DELV-S dialect density    −.27*     −.34**    −.31**    −.33*     −.38***   −.12

Note: Correlations below the diagonal are the measures of working memory, language and
intelligence measures, and dialect density. Correlations above the diagonal are with dialect
density partialled out. GLL, Greatest List Length; DELV-NR, Diagnostic Evaluation of
Language Variation—Norm Referenced; TOLD, Test of Language Development; PPVT,
Peabody Picture Vocabulary Test; PTONI, Primary Test of Nonverbal Intelligence; DELV-S
dialect density, DELV Screening Test language variation subsection.
*p ≤ .05. **p ≤ .01. ***p ≤ .001.
In addition, the children’s nonmainstream dialect densities correlated to their
syntax and vocabulary measures. These correlations were also negative, show-
ing that children with higher densities earned lower scores on these measures.
Here again we checked these relationships with clinical status partialled out, and
found only the relationship between dialect density and PPVT remained significant
(r = –.30, p = .002). Conversely, the relationship between clinical status and the
DELV-NR syntax measure (r = .85, p < .001), TOLD (r = .86, p < .001), and the
PPVT (r = .67, p < .001) remained when dialect density was partialled out.
We next calculated the correlations between the working memory scores and the
standardized tests of language and intelligence with nonmainstream dialect density
partialled out to see if correlations between working memory and the standardized
tests still held. These are shown above the diagonal in Table 2. Even with density
partialled out, there was a significant positive correlation between both working
memory scoring methods and the measures of syntax (DELV-NR and TOLD) and
between total links and the vocabulary measure (PPVT). Thus, the relationships
between working memory, syntax, and vocabulary were not solely attributable to
the children’s nonmainstream dialect densities.
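To make the partialling step concrete, the sketch below applies the standard first-order partial correlation formula to the rounded zero-order correlations printed below the diagonal in Table 2. This is an illustration, not the authors' analysis: they presumably computed partial correlations from the raw data, and the function name `partial_corr` is hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# All children (n = 106), zero-order values from below the diagonal of Table 2.
r_links_syntax = 0.50     # total links with DELV-NR syntax
r_links_density = -0.43   # total links with DELV-S dialect density
r_syntax_density = -0.46  # DELV-NR syntax with DELV-S dialect density

r_partial = partial_corr(r_links_syntax, r_links_density, r_syntax_density)
print(round(r_partial, 2))  # 0.38
```

The result agrees with the .38 shown above the diagonal for total links and DELV-NR syntax. Because the inputs are rounded to two decimals, other cells may differ from the table in the last digit.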
We then divided the participants by dialect and examined the correlations in
these subgroups (see Table 2); we had predicted similar patterns of correlations
across the two dialects. In the correlations shown below the diagonal, we see sig-
nificant positive correlations between one or both of the working memory scoring
methods and the DELV-NR syntax measure, and the PPVT vocabulary measure
for speakers of both dialects. The correlation with working memory was not sig-
nificant for the TOLD in the SWE group, probably due to the low number of
participants on this test. Dialect density was generally negatively correlated with
both working memory scores and standardized tests scores within these subpopula-
tions. Looking above the diagonals, we see the correlations with density partialled
out in each dialect group. For the SWE speakers, a significant connection was
still seen between the total links working memory score and the DELV-NR syntax
measure as well as the PPVT measure of vocabulary. For AAE speakers, signif-
icant partial correlations were seen between both working memory scores and
the DELV-NR syntax measure and between the total links measure and the TOLD.
Thus when considering the partial correlations, both dialect groups showed connec-
tions between working memory scores and syntax measures of a similar strength;
the relationship between working memory and vocabulary held for only SWE
speakers.
DISCUSSION
We examined if a verbal working memory measure would prove to be linguisti-
cally unbiased by looking at the performance of speakers of two nonmainstream
dialects. AAE and SWE were chosen as their dialectal forms strongly overlap with
those produced by children with SLI, making it hard to tell dialect from disorder,
and because they also differ in the density with which such forms are produced. If
verbal working memory proves to be linguistically unbiased, it should be able to
distinguish equally well between children with SLI and TD children in each dialect
as well as yield similar patterns of nonlist errors and show similar correlations to
standardized tests of language and nonverbal intelligence in both dialects. In addi-
tion to dialect type, we also examined whether the density with which a dialect is
spoken affects performance. For a measure to be linguistically unbiased, nonmainstream dialect
density should have no effect on performance either between or within dialects.
Scores on the working memory measure, size judgment, were significantly lower
for children with SLI than age-matched TD children, and this was true for speakers
of both dialects. This held for both the all or none scoring method (greatest list
length) and the partial credit method (total links), although consonant with previous
studies investigating scoring methods (Conway et al., 2005; Friedman & Miyake,
2005; Giofrè & Mammarella, 2014; St. Clair-Thompson & Sykes, 2010), the
partial scoring method was generally superior. Specifically, it accounted for more
variance in the ANOVA (links ηp2 = 0.26; greatest list length ηp2 = 0.17), and
it correctly classified more children (links 75%; greatest list length 63%). The
ability of the size judgment test to distinguish nonmainstream dialect speaking
children with SLI from TD children is consistent with the two other studies which
examined working memory in terms of listening span in minority populations
(Laing & Kamhi, 2003; Rodekohr & Haynes, 2001).
We can also compare our results to those of Montgomery (2000a, 2000b; Mont-
gomery et al., 2009), who used a size judgment task in mainstream English speak-
ers, aged 7 to 10 years. They found that the task could differentiate between SLI
and TD groups only when two levels of reordering (size and semantic category)
were involved. With our younger age group, one level of reordering was sufficient
to show group differences. We note that we did not have a group of mainstream
English-speaking children in our study, so we cannot say whether or not our non-
mainstream dialect-speaking children would score differently than mainstream
speakers on the size judgment task, but we did find that the size judgment task
can distinguish between children with SLI and TD children as Montgomery et al.
found.
Across both nonmainstream dialects, we also found that children with SLI gave
more nonlist words in the size judgment task than did TD children. Previous
research using listening span tasks with mainstream English-speaking children
(Ellis Weismer et al., 1999; Marton & Eichorn, 2014; Marton & Schwartz, 2003;
Marton et al., 2006, 2007) also found a higher number of nonlist words given by
children with SLI than TD children, and thus we replicated this finding.
Finally, we found good evidence that working memory correlated with measures
of syntax and vocabulary; this held true in each dialect as well. This is consistent
with the findings of others using mainstream English speakers (Adams et al., 1999;
Magimairaj & Montgomery, 2012). However, we did not find good evidence for this
relationship with nonverbal intelligence. Because all of our participants
had a score on the PTONI of at least 82, it is possible that restriction of range prevented
us from finding this correlation. Using a group of children with a broader range
of ability, such a correlation was found for SWE-speaking children in our lab
(McDonald, Seidel, Porter, Oetting, & Hegarty, 2011).
Unique to our study was the inclusion of children who spoke one of two nonmain-
stream dialects. As expected, nonmainstream dialect density differed across these
dialects, with the AAE speakers producing higher rates than the SWE speakers.
Nonmainstream dialect density also differed within each dialect, with individuals
varying in use, from low to moderate to high rates. We found, contrary to our
hypothesis, that nonmainstream density was significantly negatively correlated to
working memory performance; across dialects this was true when SLI status was
partialled out, and it was also true within each dialect. Because of the intercorrela-
tions that were observed between the children’s nonmainstream dialect densities,
working memory scores, and standardized test scores, we reran the correlations
with dialect density partialled out. When both dialect groups were considered to-
gether, significant partial correlations were found between working memory scores
and measures of syntax and vocabulary. When each dialect group was considered
separately, significant partial correlations were found between working memory
scores and syntax, and the magnitude was similar across the dialects (e.g., partial
correlation between total links and DELV-NR Syntax was .34 for SWE and .38 for
AAE). However, we only found evidence of a correlation with vocabulary in SWE
children. Recall, however, that the PPVT did not prove itself to be free of linguistic
biases either in terms of type of dialect or nonmainstream dialect density, and this
may account for this lack of correlation in the AAE group.
Our working memory measure was not neutral with respect to nonmainstream dialect density.
This finding parallels that of Moyle et al. (2014), who also found an effect
of children’s nonmainstream dialect density on a different processing-based task,
that of nonword repetition. Both size judgment and nonword repetition involve
perceiving and repeating phonological strings, and may therefore be influenced by
phonological or phonotactic factors that may differ in speakers with higher dialect
densities (Brown, 2011; Edwards, Beckman, & Munson, 2004; Edwards et al.,
2014).
Nonmainstream dialect density could also be negatively related to verbal work-
ing memory performance for reasons beyond phonology, because it is likely cor-
related to the amount of exposure to formal education and to the speed of access to
mainstream lexical items. In AAE-speaking children, nonmainstream dialect den-
sity has been shown to decrease as formal education increases, especially between
kindergarten and first grade (Craig & Washington, 2004). Recall that our partici-
pants were kindergarteners who lived in a rural community. As such, differences
in the children’s nonmainstream dialect densities could partially reflect how much
exposure they have had to mainstream English, most likely in a formal educational
setting. At least in adults, the size judgment task has been shown to be sensitive to
differences in formal educational levels (Cherry et al., 2007). In addition, children
with high dialect density could be having some of the same difficulties a bilingual
has with linguistically based working memory measures that necessitate lexical
access. Hansen et al. (2016) showed that even when tested in their first language,
children who were immersed in a second language in elementary school showed
deficits compared to monolinguals on a first language reading span task early on in
the immersion experience, and these deficits were correlated to their speed of lexi-
cal access. They did not show this deficit on another measure of working memory,
the n-back task, which is not highly linguistically loaded, indicating the deficit was
specific to tests with high lexical access demands. Thus, demands of mainstream
lexical access or other working memory demands may be impeding children with
high dialect density from performing as well as lower dialect density children on
the size judgment task.
Recall also that within both dialect groups, nonmainstream dialect density scores
were highest for the children with SLI, and density differences between those with
and without SLI were larger for the SWE than AAE groups. While this finding
needs to be confirmed with other groups of nonmainstream English child speakers,
it is possible that children with SLI, relative to their TD peers, are less able to shift
their dialects to a more mainstream variety when engaged in school-based tasks
(Craig & Washington, 2004). Alternatively, or in addition, it is possible that the
language variation portion of the DELV-S is not ability neutral across dialects. In
support of this possibility, the dialect portion of the DELV-S includes eight items
that target third-person marking, and at least one study has shown children with
and without SLI to differ in their marking of this grammatical structure in SWE
but not AAE (Cleveland & Oetting, 2013).
Are processing-based measures free of cultural and/or linguistic bias?

Although the working memory task differentiated between children with and with-
out SLI in the two nonmainstream dialects studied here, there were several findings
that suggested that this processing-based measure was not entirely free of cultural
and/or linguistic biases. First, although not statistically significant, clinical status
effects tended to be larger for SWE than AAE speakers, a finding similar to that
of Rodekohr and Haynes (2001). Finding such a tendency across studies, even
though none reached the conventional level of significance, should at least give
us pause before we confidently conclude that processing-based measures are free
of linguistic biases. Second, there were strong correlations between the children’s
working memory scores and their nonmainstream dialect densities, with higher
densities corresponding to poorer working memory scores. It is possible that the
tendency for stronger differences between TD children and children with SLI in
SWE speakers than in AAE speakers is actually an effect of nonmainstream dialect
density rather than dialect type. This would not be surprising, as one of the major
differences between AAE and SWE is the frequency with which dialectal forms
are used. To test this idea, we reran the ANOVA analyses on working memory
span with dialect density as a covariate. There were no longer any tendencies for
interactions between clinical status and dialect type (ps > .31), indicating that the
children’s nonmainstream dialect density rather than their dialect type may have
been the operational factor.
Taken together, the current study and these previous studies suggest that
processing-based tasks involving verbal materials may not be free from all cultural
and/or linguistic biases. In terms of practical use of such tests for diagnosis and
research, it may be fruitful to go beyond tests of verbal working memory such as
listening span or size judgment, and look at tests of nonverbal working memory.
There is evidence that children with SLI also show deficits in working memory
using nonverbal items (Henry, Messer, & Nash, 2012; Marton, 2008); it would be
important to assess such measures for bias in speakers of nonmainstream dialects.
Finally, it is clear that in looking for possible linguistic biases in testing mate-
rials, either standardized or processing based, it is important to look not only at
the type of dialect spoken, but also at the nonmainstream dialect density, both be-
tween dialects and by individuals within any dialect. This is especially important
as effects of nonmainstream density have been found on phonological process-
ing and lexical access (Brown, 2011; Edwards et al., 2014), word reading and
literacy (Connor & Craig, 2006; Craig et al., 2004; Terry et al., 2010) as well as
processing-based tasks (Moyle et al., 2014). We would therefore advocate that
researchers include a measure of nonmainstream dialect density when investigat-
ing linguistic bias. It need not be a labor-intensive measure, as we found our results
with the easily administered and scored language variation portion of the DELV-S
(Seymour et al., 2003). It is also important that more research be done on measures
of nonmainstream dialect density to be sure they are ability neutral.
In summary, this study adds to a body of literature that finds working memory
deficits in children with SLI when compared to age-matched controls, by repli-
cating the findings of lower span scores and higher number of nonlist words in
speakers of two nonmainstream English dialects. In addition, it is the first study
to look at the effect of children’s dialects on a verbal working memory measure
by measuring children’s dialects in two ways, by type of dialect (AAE vs. SWE)
and by nonmainstream dialect density (low vs. medium vs. high). While dialect
type was not found to affect the children’s span scores at a statistically reliable
level, nonmainstream dialect density did. We found a complex relationship be-
tween children’s nonmainstream dialect density, working memory capacity, and
performance on standardized tests. The results raise the possibility that like many
standardized language measures, processing-based measures (at least those in-
volving verbal stimuli) may not be as free from cultural and/or linguistic bias as
is often purported. Similarly, the results raise the possibility that nonmainstream
dialect density measures such as the DELV-S may also not be as ability neutral
as many who have used this tool to index dialect differences between and within
groups assume.
ACKNOWLEDGMENTS
Funding for this study was provided through NIDCD RO1DC009811. We appreciate the
assistance of Jessica Berry, Kyomi Gregory, Ryan James, Christy Moland, Karmen Porter,
Andrew Rivière, Tina Villa, and a number of others who helped create the stimuli and collect
the data. We also thank the teachers, families, and children who participated in the study.
NOTES
1. Specifically, while the correlation coefficients vary slightly depending on scoring
method used, they have similar levels of significance to each of the other variables
in the correlation matrix later given in Table 2.
2. Similar results in the analyses were generally found for total link scores over list
lengths 2 to 6. However, scores were less skewed when only including list lengths 2
to 4, possibly because some children were giving up or getting frustrated at the higher
list lengths.
3. This explanation of ceiling performance is confirmed when looking at the continu-
ous rather than categorical scoring of dialect density. The continuous scoring method
still showed the main effects for clinical status, F (1, 102) = 50.23, p < .001,
ηp2 = 0.33, and dialect, F (1, 102) = 44.67, p < .001, ηp2 = 0.30, and an interaction,
F (1, 102) = 26.95, p < .001, ηp2 = 0.14. But when examining the children with SLI
alone, the SWE speakers with SLI (M = 0.79, SD = 0.17) did show a lower dialect
density than AAE speakers with SLI (M = 0.89, SD = 0.13), F (1, 51) = 6.06, p =
.017, ηp2 = 0.11.
REFERENCES
Adams, A., Bourke, L., & Willis, C. (1999). Working memory and spoken language comprehension in
young children. International Journal of Psychology, 34, 364–373.
Alloway, T. P., Gathercole, S. E., Willis, C., & Adams, A. (2004). A structural analysis of working mem-
ory and related cognitive skills in young children. Journal of Experimental Child Psychology,
87, 85–106.
Barbu, S., Martin, N., & Chevrot, J. (2014). Maintenance of regional dialects: A matter of gender?
Boys, but not girls, use local varieties in relation to their friend’s nativeness and local identity.
Frontiers in Psychology, 5, 1–11.
Briscoe, J., & Rankin, P. M. (2009). Exploration of a “double-jeopardy” hypothesis within work-
ing memory profiles for children with specific language impairment. International Journal of
Language & Communication Disorders, 44, 236–250.
Brown, M. C. (2011). Dialect and lexical access: An investigation into dialect density, dialect en-
vironment and word knowledge (Unpublished doctoral dissertation, University of Wisconsin,
Madison).
Campbell, T., Dollaghan, C., Needleman, H., & Janosky, J. (1997). Reducing bias in language as-
sessment: Processing-dependent measures. Journal of Speech and Hearing Research, 40,
519–525.
Charity, A. H., Scarborough, H. S., & Griffin, D. (2004). Familiarity with “school English” in African-
American children and its relation to early reading achievement. Child Development, 75, 1340–
1356.
Cherry, K. E., Elliott, E. M., & Reese, C. M. (2007). Age and individual differences in working memory:
The size judgment span task. Journal of General Psychology, 134, 43–65.
Cherry, K. E., & Park, D. C. (1993). Individual difference and contextual variables influence spatial
memory in younger and older adults. Psychology and Aging, 8, 517–526.
Cleveland, L. H., & Oetting, J. B. (2013). Children’s marking of verbal –s by nonmainstream English
dialect and clinical status. American Journal of Speech-Language Pathology, 22, 604–614.
Connor, C. M., & Craig, H. K. (2006). African American preschoolers’ language, emergent literacy
skills, and use of African American English: A complex relation. Journal of Speech, Language,
and Hearing Research, 49, 771–792.
Conway, A. R. A., Kane, M. J., Bunting, M. F., Hambrick, D. Z., Wilhelm, O., & Engle, R. W. (2005).
Working memory span tasks: A review and a user’s guide. Psychonomic Bulletin & Review,
12, 769–786.
Craig, H. K., Kolenic, G. E., & Hensel, S. L. (2014). African American English-speaking students: A
longitudinal examination of style shifting from kindergarten through second grade. Journal of
Speech, Language, and Hearing Research, 57, 143–157.
Craig, H. K., Thompson, C. A., Washington, J. A., & Porter, S. L. (2004). Performance of elementary-
grade African American students on the Gray Oral Reading Tests. Language, Speech, and Hearing
Services in Schools, 34, 141–154.
Craig, H. K., & Washington, J. A. (2000). An assessment battery for identifying language impairments
in African American children. Journal of Speech, Language, and Hearing Research, 43, 366–
379.
Craig, H. K., & Washington, J. A. (2004). Grade-related changes in the production of African American
English. Journal of Speech, Language, and Hearing Research, 47, 450–463.
Craig, H. K., Zhang, L., Hensel, S. L., & Quinn, E. J. (2009). African American English-speaking
students: An examination of the relationship between dialect shifting and reading outcomes.
Journal of Speech, Language, and Hearing Research, 52, 839–855.
Dollaghan, C., & Campbell, T. F. (1998). Nonword repetition and child language impairment. Journal
of Speech, Language, and Hearing Research, 41, 1136–1146.
Dunn, L. M., & Dunn, D. M. (2007). Peabody Picture Vocabulary Test (4th ed.). Toronto: Pearson
Education.
Edwards, J., Beckman, M. E., & Munson, B. (2004). The interaction between vocabulary size and
phonotactic probability effects on children’s production accuracy and fluency in nonword rep-
etition. Journal of Speech, Language, and Hearing Research, 47, 421–436.
Edwards, J., Gross, M., Chen, J., MacDonald, M. C., Kaplan, D., Brown, M., & Seidenberg, M.
S. (2014). Dialect awareness and lexical comprehension of mainstream American English
in African American English-speaking children. Journal of Speech, Language, and Hearing
Research, 57, 1883–1895.
Ehrler, D. J., & McGhee, R. L. (2008). Primary Test of Nonverbal Intelligence. Austin, TX: PRO-ED.
Ellis Weismer, S., Evans, J., & Hesketh, L. J. (1999). An examination of verbal working memory
capacity in children with specific language impairment. Journal of Speech, Language, and
Hearing Research, 42, 1249–1260.
Ellis Weismer, S., Plante, E., Jones, M., & Tomblin, J. B. (2005). A functional magnetic resonance
imaging investigation of verbal working memory in adolescents with specific language impairment.
Journal of Speech, Language, and Hearing Research, 48, 405–425.
Engel, P. J., Santos, F. H., & Gathercole, S. E. (2008). Are working memory measures free of
socioeconomic influence? Journal of Speech, Language, and Hearing Research, 51, 1580–1587.
Engel de Abreu, P., Conway, A. A., & Gathercole, S. E. (2010). Working memory and fluid intelligence
in young children. Intelligence, 38, 552–561.
Engel de Abreu, P., Gathercole, S. E., & Martin, R. (2011). Disentangling the relationship between
working memory and language: The roles of short-term storage and cognitive control. Learning
and Individual Differences, 21, 569–574.
Engel de Abreu, P. M. J., Puglisi, M. L., Cruz-Santos, A., Befi-Lopes, D. M., & Martin, R. (2014).
Effects of impoverished environmental conditions on working memory performance. Memory,
22, 323–331.
Friedman, N. P., & Miyake, A. (2005). Comparison of four scoring methods for the reading span test.
Behavior Research Methods, 37, 581–590.
Frizelle, P., & Fletcher, P. (2015). The role of memory in processing relative clauses in children with
specific language impairment. American Journal of Speech-Language Pathology, 24, 47–59.
Garrity, A. W., & Oetting, J. B. (2010). Auxiliary BE production by AAE-speaking children with and
without specific language impairment. Journal of Speech, Language, and Hearing Research,
53, 1307–1320.
Giofrè, D., & Mammarella, I. C. (2014). The relationship between working memory and intelligence
in children: Is the scoring procedure important? Intelligence, 46, 300–310.
Goldman, R., & Fristoe, M. (2000). Goldman-Fristoe Test of Articulation (2nd ed.). Circle Pines, MN:
American Guidance Services.
Haake, M., Hansson, K., Gulz, A., Schötz, S., & Sahlén, B. (2014). The slower the better? Does
the speaker’s speech rate influence children’s performance on a language comprehension test?
International Journal of Speech-Language Pathology, 16, 181–190.
Hansen, L. B., Macizo, P., Duñabeitia, J. A., Saldaña, D., Carreiras, M., Fuentes, L. J., & Bajo, M.
T. (2016). Emergent bilingualism and working memory development in school aged children.
Language Learning, 64(Suppl. 2), 51–75.
Henry, L. A., Messer, D. J., & Nash, G. (2012). Executive functioning in children with specific language
impairment. Journal of Child Psychology and Psychiatry, 53, 37–45.
Horton, R., & Apel, K. (2014). Examining the use of spoken dialect indices with African American
children in the Southern United States. American Journal of Speech-Language Pathology, 23,
448–460.
Ivy, L. J., & Masterson, J. J. (2011). A comparison of oral and written English styles in African
American students at different stages of writing development. Language, Speech, Hearing
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in
working memory. Psychological Review, 99, 122–149.
Laing, S. P., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally and
linguistically diverse populations. Language, Speech, and Hearing Services in Schools, 34,
44–55.
Leonard, L. (2014). Children with specific language impairment. Cambridge, MA: MIT Press.
Lum, J. G., Conti-Ramsden, G., Page, D., & Ullman, M. T. (2012). Working, declarative and procedural
memory in specific language impairment. Cortex, 48, 1138–1154.
Magimairaj, B. M., & Montgomery, J. W. (2012). Children’s verbal working memory: Role of processing
complexity in predicting spoken sentence comprehension. Journal of Speech, Language,
Mainela-Arnold, E., & Evans, J. L. (2005). Beyond capacity limitations: Determinants of word recall
performance on verbal working memory span tasks in children with SLI. Journal of Speech,
Language, and Hearing Research, 48, 897–909.
Mainela-Arnold, E., Evans, J. L., & Coady, J. (2010). Beyond capacity limitations: II. Effects of
lexical processes on word recall in verbal working memory tasks in children with and without
specific language impairment. Journal of Speech, Language, and Hearing Research, 53,
1656–1672.
Marton, K. (2008). Visuo-spatial processing and executive functions in children with specific language
impairment. International Journal of Language & Communication Disorders, 43, 181–200.
Marton, K., & Eichorn, N. (2014). Interaction between working memory and long-term memory:
A study in children with and without language impairment. Zeitschrift für Psychologie, 222,
90–99.
Marton, K., Kelmenson, L., & Pinkhasova, M. (2007). Inhibition control and working memory capacity
in children with SLI. Psychologia, 50, 110–121.
Marton, K., & Schwartz, R. G. (2003). Working memory capacity and language processes in children
with specific language impairment. Journal of Speech, Language, and Hearing Research, 46,
1138–1153.
Marton, K., Schwartz, R. G., Farkas, L., & Katsnelson, V. (2006). Effect of sentence length and
complexity on working memory performance in Hungarian children with specific language impairment
(SLI): A cross-linguistic comparison. International Journal of Language & Communication
Disorders, 41, 653–673.
McDonald, J. L. (2008). Grammaticality judgments in children: The role of age, working memory and
phonological ability. Journal of Child Language, 35, 247–268.
McDonald, J. L., Seidel, C. M., Porter, K. L., Oetting, J. B., & Hegarty, M. (2011). Size judgment:
Working memory and standardized test performance. Poster presented at the 2011 meeting of
the American Speech-Language-Hearing Association, San Diego, CA.
Mills, M. T. (2015). The effects of visual stimuli on the spoken narrative performance of school-age
African American children. Language, Speech, and Hearing Services in Schools, 46, 337–351.
Montgomery, J. W. (2000a). Relation of working memory to off-line and real-time sentence processing
in children with specific language impairment. Applied Psycholinguistics, 21, 117–148.
Montgomery, J. W. (2000b). Verbal working memory and sentence comprehension in children with
specific language impairment. Journal of Speech, Language, and Hearing Research, 43, 293–308.
Montgomery, J. W., & Evans, J. L. (2009). Complex sentence comprehension and working memory
in children with specific language impairment. Journal of Speech, Language, and Hearing
Research, 52, 269–288.
Montgomery, J. W., Evans, J. L., & Gillam, R. B. (2009). Relation of auditory attention and complex
sentence comprehension in children with specific language impairment: A preliminary study.
Applied Psycholinguistics, 30, 123–151.
Moyle, M. J., Heilmann, J. J., & Finneran, D. A. (2014). The role of dialect density in nonword
repetition performance: An examination with at-risk African American preschool children.
Clinical Linguistics & Phonetics, 28, 682–696.
Newcomer, P. L., & Hammill, D. D. (2008). Test of Language Development—Primary (4th ed.). Austin,
TX: PRO-ED.
Newkirk-Turner, B. L., Oetting, J. B., & Stockman, I. J. (2016). Development of auxiliaries by young
children learning African American English. Language, Speech, and Hearing Services in Schools,
47, 209–224.
Oetting, J. B. (2015). Dialect differences between African American English and Southern White
English in children. In S. Lanehart (Ed.), The Oxford handbook of African American language
(pp. 512–518). New York: Oxford University Press.
Oetting, J. B., & Cleveland, L. H. (2006). The clinical utility of nonword repetition for children living
in the rural south of the US. Clinical Linguistics & Phonetics, 20, 553–561.
Oetting, J. B., Lee, R., & Porter, K. (2013). Evaluating the grammars of children who speak
nonmainstream dialects of English. Topics in Language Disorders, 33, 140–151.
Oetting, J. B., & McDonald, J. L. (2001). Nonmainstream dialect use and specific language impairment.
Journal of Speech, Language, and Hearing Research, 44, 207–223.
Oetting, J. B., & McDonald, J. L. (2002). Methods for characterizing participants’ nonmainstream
dialect use in child language research. Journal of Speech, Language, and Hearing Research,
45, 508–518.
Oetting, J. B., McDonald, J. L., Seidel, C. M., & Hegarty, M. (2016). Sentence recall by children with
SLI across two nonmainstream dialects of English. Journal of Speech, Language, and Hearing
Research, 59, 183–194.
Oetting, J. B., & Newkirk, B. L. (2008). Subject relatives by children with and without SLI across
different dialects of English. Clinical Linguistics & Phonetics, 22, 111–125.
Petruccelli, N., Bavin, E. L., & Bretherton, L. (2012). Children with specific language impairment and
resolved late talkers: Working memory profiles at 5 years. Journal of Speech, Language, and
Hearing Research, 55, 1690–1703.
Qi, C. H., Kaiser, A. P., Milan, S., & Hancock, T. (2006). Language performance of low-income
African American and European American preschool children on the PPVT-III. Language,
Speech, and Hearing Services in Schools, 37, 5–16.
Quail, M., Williams, C., & Leitão, S. (2009). Verbal working memory in specific language impairment:
The effect of providing visual support. International Journal of Speech-Language Pathology,
11, 220–233.
Restrepo, M. A., Schwanenflugel, P. J., Blake, J., Neuharth-Pritchett, S., Cramer, S. E., & Ruston,
H. P. (2006). Performance on the PPVT-III and the EVT: Applicability of the measures with
African American and European American preschool children. Language, Speech, and Hearing
Rivière, A. M., & Oetting, J. B. (2017). Marking of infinitival TO is influenced by a child’s dialect and
clinical status. Unpublished manuscript.
Rodekohr, R. K., & Haynes, W. O. (2001). Differentiating dialect from disorder: A comparison of two
processing tasks and a standardized language test. Journal of Communication Disorders, 34,
255–272.
Schwartz, R. G. (Ed.). (in press). Handbook of child language disorders (2nd ed.). New York:
Psychology Press.
Seymour, H. N., Bland-Stewart, L., & Green, L. J. (1998). Difference versus deficit in child
African American English. Language, Speech, and Hearing Services in Schools, 29,
96–108.
Seymour, H. N., Roeper, T., & de Villiers, J. G. (2003). Diagnostic Evaluation of Language Variation
Screening Test. San Antonio, TX: Psychological Corporation.
Seymour, H. N., Roeper, T., & de Villiers, J. G. (2005). Diagnostic Evaluation of Language Variation:
Norm-Referenced Test. San Antonio, TX: Psychological Corporation.
St. Clair-Thompson, H., & Sykes, S. (2010). Scoring methods and the predictive ability of working
memory tasks. Behavior Research Methods, 42, 969–975.
Stockman, I. J., Guillory, B., Seibert, M., & Boult, J. (2013). Toward validation of a minimal competence
core of morphosyntax for African American children. American Journal of Speech-Language
Pathology, 22, 40–56.
Stockman, I. J., Newkirk-Turner, B. L., Swartzlander, E., & Morris, L. R. (2016). Comparison of
African American children’s performances on a minimal competence core for morphosyntax
and the Index of Productive Syntax. American Journal of Speech-Language Pathology, 25,
80–96.
Terry, N. P., Connor, C. M., Thomas-Tate, S., & Love, M. (2010). Examining relationships among
dialect variation, literacy skills, and school context in first grade. Journal of Speech, Language,
Van Hofwegen, J., & Wolfram, W. (2010). Coming of age in African American English: A longitudinal
study. Journal of Sociolinguistics, 14, 427–455.
Vugs, B., Hendriks, M., Cuperus, J., & Verhoeven, L. (2014). Working memory performance and
executive function behaviors in young children with SLI. Research in Developmental Disabilities,
35, 62–74.
Washington, J. A., & Craig, H. K. (1994). Dialectal forms during discourse of poor, urban, African
American preschoolers. Journal of Speech and Hearing Research, 37, 816–823.
Washington, J. A., & Craig, H. K. (1998). Socioeconomic status and gender influences on children’s
dialectal variations. Journal of Speech, Language, and Hearing Research, 41, 618–626.
Washington, J. A., & Craig, H. K. (2004). A language screening protocol for use with young African
American children in urban settings. American Journal of Speech-Language Pathology, 13,
329–340.
Wyatt, T. A. (2015). Assessing the language skills of African American English child speakers. In
S. Lanehart (Ed.), The Oxford handbook of African American language (pp. 526–546). New
York: Oxford University Press.
