
Research Note

Objective and Subjective Clustering Methods for Verb Fluency Responses From Individuals With Alzheimer’s Dementia and Cognitively Healthy Older Adults
Madison N. Fisher, Devin M. Casenhiser, and Eun Jin Paek

Department of Audiology and Speech Pathology, College of Health Professions, The University of Tennessee Health Science Center, Knoxville

ARTICLE INFO

Article History: Received September 12, 2022; Revision received March 13, 2023; Accepted June 27, 2023
Editor-in-Chief: Michael de Riesthal
Editor: Sarah Elizabeth Wallace
https://doi.org/10.1044/2023_AJSLP-22-00290

ABSTRACT

Purpose: The verb fluency task has been researched using a variety of analysis methods and shown its sensitivity to declines in executive functioning and lexical retrieval abilities in various neurogenic populations. Few studies to date, however, have analyzed clusters and switches in the task, and there is a lack of robust analysis methods that preclude subjectivity and potential rater bias. The purpose of this study was to investigate the reliability when using subjective clustering methods and to determine the feasibility of using an objective clustering method to determine verb fluency performance in individuals with Alzheimer’s dementia (IwDs) and cognitively healthy older adults (CHOAs).

Method: Responses from a verb fluency task were obtained from IwDs and CHOAs. Group differences were examined using an objective clustering method for multiple variables regarding clustering and switching. We also calculated the intrarater, interrater, and intermethod reliability using intraclass coefficients.

Results: Significant group differences were found when utilizing the objective clustering method in all variables except the average cluster size, with IwDs performing poorer than CHOAs. Intrarater reliability was excellent. Interrater reliability between two authors and intermethod reliability between the objective and subjective methods were variable ranging from moderate to good.

Conclusions: The results from using the objective clustering method in this study are consistent with the previous literature, making it a viable option for clustering analyses on the verb fluency task, which naturally minimizes subjectivity and rater bias. Alternatively, employing a thoroughly validated and reliable subjective approach can also mitigate potential rater bias and improve replicability across studies.

Supplemental Material: https://doi.org/10.23641/asha.24061017

Correspondence to Eun Jin Paek: epaek@uthsc.edu. Publisher Note: This article is part of the Special Issue: Select Papers From the 51st Clinical Aphasiology Conference. Disclosure: The authors have declared that no competing financial or nonfinancial interests existed at the time of publication.

American Journal of Speech-Language Pathology • Vol. 32 • 2589–2601 • October 2023 • Copyright © 2023 The Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.

Verbal fluency tasks require spontaneous generation of words in a specific amount of time, typically 30 or 60 s, with a predetermined set of criteria (e.g., producing words in a semantic category such as animals). Verbal fluency tasks, including phonemic or letter fluency (producing words beginning with a specified sound or letter) and category or semantic fluency (producing words within a specific category such as animals or supermarket items), have been used to assess and monitor executive functioning and lexical access in a variety of neurogenic populations (Alegret et al., 2018; Henry et al., 2004; Macoir et al., 2019; Östberg et al., 2005; Paek & Murray, 2021; Rodrigues et al., 2015). Category fluency tasks in which the participant is asked to produce verbs of a specific semantic category (as opposed to nouns of a specific category) are a more recent addition to the set of verbal fluency tasks and were first studied by Piatt, Fields, Paolo, and Troster (1999). The “verb fluency tasks” have been
shown to be valid and reliable measures of multiple domains of cognition and predictors of future cognitive declines in the assessment of cognitive–linguistic skills in adults (Macoir et al., 2019; Östberg et al., 2005; Paek & Murray, 2021; Zhao et al., 2013; Piatt, Fields, Paolo, Koller, & Troster, 1999; Piatt, Fields, Paolo, & Troster, 1999). For example, those with neurological disorders such as aphasia or Alzheimer’s dementia (AD) produced fewer verbs in a given time than cognitively healthy older adults (CHOAs; i.e., adults with no history of cognitive decline, cerebrovascular accidents, or other neurological disorders; Alegret et al., 2018; Beber et al., 2015; Davis et al., 2010; Forlenza et al., 2012; Paek et al., 2019) due to the decreased cognitive–linguistic abilities observed in those populations.

Verb fluency can be classified as a type of semantic fluency, because it assesses a participant’s ability to generate words related to a specific semantic category, namely, verbs that represent actions that people can perform. However, it differs from other types of semantic fluency, such as noun or category fluency (e.g., animals or supermarket items), because it may involve distinct aspects of language processing, including more complex semantics and syntax. For example, verb fluency tasks may engage processes related to motor aspects of language production, as well as syntactic processes that involve verb-argument structure (Qiu & Johns, 2021; Williams et al., 2021). The constraint in the instructions to produce actions that could be performed by people specifically highlights this motor aspect of language and sets it apart from other forms of semantic fluency. Moreover, verbs generally possess higher semantic complexity compared with nouns, making them a potentially more sensitive measure for detecting semantic declines associated with AD, which could contribute to early detection and better management of the disease (Kim & Thompson, 2004; Williams et al., 2021). Therefore, verb fluency tasks offer important and distinct insights into language abilities in individuals with AD that go beyond what semantic and phonemic fluency measures can capture. Despite its potential benefits, there is currently a dearth of information on verb fluency in AD, in contrast to other types of verbal fluency such as animal fluency or phonemic fluency.

Previous studies indicated that decreased verb fluency is associated with reduced lexical access to verbs and executive dysfunction (Davis et al., 2010; Macoir et al., 2019; Östberg et al., 2005), and thus, the tasks are particularly useful during diagnostic procedures and evaluations of symptoms for people with AD, including the earlier stages of the progressive disorder such as subjective cognitive decline (SCD) and mild cognitive impairment (MCI; Östberg et al., 2005; Zhao et al., 2013). The task is also a useful tool to indicate the progression from SCD to MCI and MCI to AD (Alegret et al., 2018; Östberg et al., 2005) or from Parkinson’s disease to Parkinson’s disease with dementia (Piatt, Fields, Paolo, Koller, & Troster, 1999). Other studies have investigated verb fluency task performance in different types of dementia (e.g., Davis et al., 2010; Delbeuck et al., 2013). Given the increasing number of people diagnosed with Alzheimer’s disease and related dementias (Alzheimer’s Association, 2022) and the lack of cures for the diseases, there is a critical need for early detection and prevention of the disease and related disorders, as well as intervention and management strategies that are appropriate for the level of cognition in each individual. Therefore, it is imperative to have reliable and valid measures to assist in the evaluation of cognitive and linguistic symptoms, as well as distinguish characteristics of CHOA, SCD, MCI, and AD (Östberg et al., 2005).

There are a variety of methods to analyze verbal fluency performance in addition to calculating the number of correct responses that can further capture cognitive declines. For example, analyses of word clusters (i.e., the size or number of clusters of words that are semantically or phonologically related) may provide additional insights into the cognitive–linguistic characteristics of individuals with Alzheimer’s dementia (IwDs). Clusters are consecutive responses grouped together based on predetermined categorization criteria. The average cluster size has been suggested to reflect lexical retrieval skills and semantic networks, whereas the number of switches (i.e., change from one cluster to a different cluster) reflects executive functioning skills such as organization in verbal fluency tasks (Troyer et al., 1997). On the other hand, the number of clusters is thought to reflect both lexical retrieval and executive function abilities (Troyer et al., 1997). Although clustering and switching can drive overall productivity during verbal fluency and are strongly correlated with the total number of correct responses (e.g., Gordon & Chen, 2022), research has demonstrated their potential diagnostic and clinical value in predicting disease progression and differential diagnosis, especially in the context of neurodegenerative disorders such as AD. Previous studies have shown that these measures (switching and clustering) offer unique information about the underlying cognitive processes, highlighting their significance in evaluating individuals with AD (Beatty et al., 1997; Fagundo et al., 2008; Haugrud et al., 2011; Tröster et al., 1998; Weakley & Schmitter-Edgecombe, 2014).

Despite the research analyzing word clusters in verbal fluency tasks, studies investigating clustering and switching have reported a substantial amount of rater bias (Troyer et al., 1997; Vonk et al., 2019). Moreover, verb fluency in particular has less data on cluster analyses and the data available involves methods of clustering that are heterogeneous, making comparison and validation of the



method difficult. Furthermore, the lack of research on developing reliable and valid criteria for verb fluency responses lends itself to discrepancies involving issues with interrater and intrarater reliability as well as replication problems. Troyer et al. (1997) and more recent studies (e.g., LeDoux et al., 2014) have developed reliable and valid criteria for the cluster analysis of some types of verbal fluency responses, but it should be noted that these criteria were not specifically designed for verb-based fluency tasks. For example, words with the same first two letters, homophones, rhymes, and so forth are grouped as a cluster in letter fluency tasks (LeDoux et al., 2014). In category fluency, words that belong to the same subcategory or words that obviously hold strong associations in the real world are grouped as a cluster (LeDoux et al., 2014). Category fluency tasks such as animal and supermarket fluency have predetermined categories that are outlined in Troyer et al. (1997), which provide categories for possible correct responses to fit into (e.g., water animals, bovine, African animals). Conversely, there are no predetermined criteria for verbs that have been established, as there are many ways to categorize verbs (e.g., purpose, part of the body that performs the action). Even within these categories, there are no concrete subcategories where a verb will fall into only one category. For example, with no predetermined criteria, the verb “drive” could be categorized as performed by the hands or the feet depending on the rater’s opinion. The lack of criteria replicable within or across studies and researchers for verb fluency tasks remains a critical gap in establishing cluster analysis of verb fluency tasks as a reliable and valid diagnostic tool for dementia and other populations.

As mentioned above, previous studies have reported a heterogeneous variety of clustering methods for the verb fluency task. For example, Faroqi-Shah and Milman (2017) grouped responses of 29 individuals with aphasia and 40 neurotypical adults on animal, verb, and letter fluency tasks by using phonemic and semantic relatedness. They defined phonemic relatedness by words that began with the same letter, differed only by a vowel, rhymed, or were homonyms. Semantic relatedness was defined as “a paradigmatically or syntagmatically related response” (Faroqi-Shah & Milman, 2017, p. 374). Although examples specific to verbs were not available from this study, paradigmatically similar words were defined as those from the same subcategory such as dog and cat (pets) or lion and tiger (zoo). Syntagmatically similar words were defined as having a “semantic association such that they often co-occur” (Faroqi-Shah & Milman, 2017, p. 374) such as salt and shaker. Each participant’s responses were inspected and analyzed twice by independent raters who had frequent discussions on how they chose to cluster responses. There was strong interrater reliability (Cohen’s κ = .92), and differences were resolved by a third research assistant.

Macoir et al. (2019) analyzed clusters on the verb fluency and object (noun) fluency responses from 20 older adults with SCD, 20 adults with MCI, and 20 healthy older adults. In this article, the cluster analysis was reported to follow the criteria from Troyer et al. (1997), and the clusters for verbs were determined based on semantic subcategories (e.g., speak, scream, and whisper) and semantic script (e.g., gardening: dig, plant, and rake). There was no specific information provided regarding reliability for verbal fluency tasks.

Another study by McDowd et al. (2011) completed cluster analyses on letter, category, and verb fluency task responses from 36 young adults, 30 healthy older adults, 23 adults with AD, and 30 individuals with Parkinson’s disease. For verb fluency, clusters were defined based on subcategories. There were different trials with different prompts for producing verbs such as asking participants to name “things you can do to an egg.” For the “things you can do to an egg” trial, the following categories were used to cluster responses: “ways to cook an egg, ways to destroy an egg, and ways to decorate an egg” (McDowd et al., 2011, p. 11). This was the only criterion described for clustering the verb fluency responses.

The studies mentioned above provide an initial foundation and strong support for the need for replicable criteria for cluster and switch analyses on the verb fluency task. These studies reported heterogeneous criteria that are not easily replicated across studies and appear to rely on subjective methods. Methods used for clustering by Faroqi-Shah and Milman (2017) and Macoir et al. (2019) are similar in that they clustered responses based on semantic features; however, the minimal examples and subjectivity of semantic relatedness make it difficult to determine whether the two methods will yield similar results. In order to address these difficulties, this study sought to use a predetermined set of norms to examine the feasibility of using objective clustering criteria that can be used to expand upon the work that the previous studies have reported on in order to potentially use the verb fluency task as a reliable evaluation tool. Specifically, we used the Lancaster Sensorimotor Norms (Lynott et al., 2020) as a measure to categorize the verbs. The Lancaster Sensorimotor Norms are a set of semantic norms for English that measure how strongly words are associated with different sensory modalities and action effectors. Thus, the Norms include six perceptual modalities (touch, hearing, smell, taste, vision, and interoception) and five action effectors (mouth/throat, hand/arm, foot/leg, head excluding mouth/throat, and torso). These Norms represent the largest set of semantic norms gathered from 3,500



individual participants for 40,000 words. In Lynott et al. (2020), participants were presented with words and asked to rate their familiarity with the word (e.g., did they know the meaning of the word). Once familiarity was established, participants were asked to categorize the perceptual modalities (“to what extent do you experience WORD”) and action effectors (“to what extent do you experience WORD by performing an action with the. . .”) for each word when provided with the previously mentioned perceptual modalities and action effectors. High interrater reliability was reported for the perceptual modalities (Cronbach’s alpha .90–.96 for each modality) and action effectors (Cronbach’s alpha .85–.93 for each effector; Lynott et al., 2020). The dominant dimension was identified as the single highest rating dimension among the dominant perceptual modalities, dominant action effectors, and dominant sensorimotor dimensions (perceptual and action combined).

Using objective norms such as the Lancaster Sensorimotor Norms to cluster verbs has several advantages. It is a large multidimensional data set that can be used to study various aspects of language processing, such as word recognition, categorization, memory, and embodiment (Lynott et al., 2020). Additionally, it allows for quantifying the strength of verb–action effector associations, which are important dimensions of verb meaning. Using the Lancaster Norms in clustering verbs also has the advantage of being more reliable and consistent than subjective methods that rely on human ratings, which may vary depending on individual experience and subjective judgment. Moreover, the embodied cognition framework suggests that language is shaped by our bodily experiences and interactions with the world (Gallese & Cuccio, 2018). Thus, using sensorimotor norms to cluster verbs may provide further insights into the cognitive–linguistic strategies that people use when generating verbs in a fluency task.

In a recent study by Qiu and Johns (2021), the Lancaster Norms were used as an objective measure to determine sensorimotor similarity of responses from healthy adults on verbal fluency tasks including verb and noun fluency. They found that the produced verbs were more widely distributed across multiple modalities and sensorimotor effectors (i.e., different sensory and motor systems, which include vision, hearing, smell, taste, touch, and various motor actions) compared with the nouns generated during the fluency tasks. This indicates a broader array of sensory experiences and motor actions associated with the generated verbs. Conversely, the nouns produced demonstrated more homogeneity, suggesting that they were more likely to share similar sensory experiences and actions. Thus, noun and verb fluencies reflect different aspects of lexical knowledge and retrieval processes, and using sensorimotor norms can provide valuable insights into how verbs are stored and retrieved during verb fluency.

By using the Lancaster Norms, which measure how much words evoke different sensations and actions, we will be able to explore how to cluster verbs based on their sensorimotor features. However, there are also potential limitations to using these Norms. For example, they may not capture all the nuances and variability of sensorimotor experiences across individuals and contexts. Additionally, although the Lancaster Norms can capture some aspects of word associations based on sensorimotor experiences, they might not capture other aspects like phonological similarities. Nevertheless, given the lack of objective criteria and understanding related to different approaches to measuring verb fluency performance, this study used the Lancaster Norms as the objective clustering method considering the large participant sample and word bank, and high reliability with its potential contribution to the development of more effective methods.

In addition to needing predetermined clustering criteria to increase the reliability and validity of verb fluency tasks, the previous literature highlights the need for well-established interrater and intrarater reliability within the studies. Faroqi-Shah and Milman (2017) reported very strong interrater reliability (Cohen’s κ = .92) on all verbal fluency tasks collapsed across animals, verb, and letter categories. This indicates that raters who participated in the clustering method had a very strong agreement in the clusters on a randomly selected 20% of the sample. There was no intrarater reliability reported in either of the other two studies highlighted in this article for clustering analyses on the verb fluency task. For example, no specific information for the action category was provided by McDowd et al. (2011), although they reported that Cronbach’s alpha was greater than .90 for clusters, switches, and cluster size of all fluency tasks including letter, semantic, and action fluency tasks.

Therefore, this study also aimed to compare (a) the interrater reliability between subjective analyses of the two raters (first and last author) to determine the reliability of traditional methods and (b) the intermethod reliability between subjective and objective analyses to measure the degree to which these two methods converge. The subjective clustering methods used in this article were an attempt to replicate the methods used by Macoir et al. (2019), McDowd et al. (2011), and Faroqi-Shah and Milman (2017) by instructing raters to cluster words with semantic relationships. The objective clustering method utilized the Lancaster Norms (Lynott et al., 2020) as a predetermined categorization of verbs that removes a potential for rater bias (Vonk et al., 2019) or inconsistency within and across raters. This study also investigated intrarater reliability in human raters to further establish reliability within the subjective method before comparing it with the objective method.



Based on the current literature, we hypothesized that the objective clustering method would reveal a significant difference in the performance on the verb fluency task between IwDs and CHOAs for the number of clusters, average cluster size, and number of switches in that semantic impairments are commonly observed in IwDs (Balthazar et al., 2008; Bayles et al., 1990; Henry et al., 2004). We also hypothesized that the interrater reliability between two raters using the subjective clustering methods would be weak given the ambiguous clustering instructions and possible rater bias. Similarly, we hypothesized that the intermethod reliability for the objective and subjective clustering methods would also be weak given the differences in clustering methods.

Method

Twenty IwDs and 22 CHOAs participated in this study. Participants were recruited from a large, community-based study of aging and neurodegenerative diseases. The inclusionary criteria for participants with dementia included a clinical diagnosis of amnestic phenotype of AD meeting the criteria of McKhann et al. (2011) based on results of neuropsychological and neurological assessments, mild or moderate severity (mild > 21; 11 < moderate < 20; Perneczky et al., 2006) demonstrated by MMSE-2 scores (Folstein et al., 2010), and no history of other neurological or psychiatric diseases that might affect communication abilities (e.g., schizophrenia, stroke-induced aphasia, traumatic brain injury, epilepsy, and clinical depression) as well as no history of alcohol or drug abuse. Additionally, the inclusionary criteria for the CHOA group included no history of neurological, psychiatric, or developmental disorders (e.g., schizophrenia, stroke, traumatic brain injury, epilepsy, and clinical depression) or substance abuse that may result in communication disorders. All participants were native speakers of English and right-handed, with normal hearing and vision (corrected or uncorrected). The Speech Discrimination Screening subtest from Arizona Battery for Communication Disorders of Dementia (ABCD; Bayles & Tomoeda, 1993) was used for hearing screening. Vision was screened through a picture-matching task. This study was approved by the institutional review board at the University of Tennessee Health Science Center and was part of a larger research project.

Table 1 shows demographic information for the two groups including age, gender, years of education, and MMSE scores. The education level did not differ between the two groups (p = .971). Also, the gender distribution between the two groups did not differ (p = .43). As expected, the MMSE-2 scores were significantly lower in the dementia group compared with the CHOA group (p < .001). A statistically significant group difference was also discovered for age (p = .015). To ensure the reliability of our analysis, we conducted secondary analyses on a subset of data, which excluded three outliers (two CHOAs aged 64 years and one IwD aged 94 years). With this secondary data set with no group difference in age (t = −1.88, df = 37, and p = .069), the results for all analyses remained consistent (i.e., t tests and intraclass correlation coefficients [ICCs]). The findings of these supplementary analyses enhance the reliability of our statistical results, thus substantiating that the observed distinctions between the control and the AD groups can be attributed to cognitive impairments. All participants were instructed to do the following for the verb fluency task, as first introduced by Piatt, Fields, Paolo, and Troster (1999):

I’d like you to tell me as many different things as you can think of that people do. I don’t want you to use the same word with different endings, like eat, eating, eaten. Also, just give me single words such as eat, or smell, rather than a sentence. Can you give me an example of something that people do?

If the response was unacceptable, participants were asked to provide a different example. If the response was an acceptable verb, participants were further instructed:

That’s the idea. Now in 30 seconds, tell me as many different things as you can think of that people do.

Table 1. Demographic information on the individuals with dementia (IwDs) and cognitively healthy older adults (CHOAs).

Variable     IwDs (n = 20)    CHOAs (n = 22)   Group difference        p
Age          78.15 (8.02)     72.27 (6.94)     t(40) = −2.55*          .015
Education    18.13 (13.41)    18.27 (12.43)    t(40) = 0.04            .971
MMSE-2       22.79 (4.79)     27.73 (1.49)     t(20.99) = 4.32*ᵃ       < .001
Gender       F = 15, M = 5    F = 14, M = 8    χ²(1, N = 42) = .63     .43

Note. MMSE-2 = Mini-Mental State Examination-2; F = female; M = male.
ᵃ Equal variances not assumed.
*p < .01.
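The age comparison in Table 1 can be checked from the reported summary statistics alone. The sketch below computes a pooled-variance independent-samples t statistic, assuming that the parenthesized values in the table are standard deviations and that the difference is taken as CHOA minus IwD. The helper name is hypothetical, and this is an illustration of the standard formula rather than the authors' analysis script.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Age row of Table 1: CHOA 72.27 (6.94), n = 22; IwD 78.15 (8.02), n = 20.
t, df = pooled_t(72.27, 6.94, 22, 78.15, 8.02, 20)
print(f"t({df}) = {t:.2f}")  # prints "t(40) = -2.55"
```

The result agrees with the t(40) = −2.55 reported for age. The MMSE-2 row uses a test that does not assume equal variances, so it would need a different formula and is not reproduced here.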



Responses were then transcribed verbatim via video and audio recording of the responses to ensure correct transcription, and interrater reliability for transcription and scoring on 20% of the sample was 100%.

From the recorded responses, only correct responses from this task were used in all analyses to investigate the lexical retrieval and cognitive processes used strictly for verbs. Incorrect responses (i.e., words belonging to a different grammatical class), perseverations, or extraneous comments were removed before analyses were completed. Two types of clustering methods, objective and subjective, were utilized to analyze the data in this study. Both methods used a task congruent method to cluster responses. A task congruent method of clustering includes clustering either semantically related words for a category fluency task (i.e., verb fluency or animal fluency) or phoneme-relatedness criteria for letter fluency tasks (Lehtinen et al., 2021). The objective clustering method utilized the Lancaster Norms database (Lynott et al., 2020) to analyze clusters and switches. The Lancaster Norms classify over 40,000 English words into one of five categories of action effectors: hand/arm, head excluding mouth/throat, foot/leg, mouth/throat, and torso. Based on the Lancaster Norms, one dominant action effector was distinguished as having the highest rating within the action effectors. Then, the two groups (AD and CHOAs) were compared using independent-samples t tests for the following variables: total number of words (correct responses), number of clusters, average cluster size, and number of switches.

The number of clusters was calculated by counting the total number of clusters (consecutive responses in the same category) for each individual participant. The average cluster size for each participant was calculated by dividing the sum of cluster sizes for individual participants by the total number of clusters for that participant. Cluster sizes were calculated by counting correct responses starting at the second word in the cluster, so there were no clusters with a cluster size of zero. That is, a cluster size was determined by subtracting 1 from the total number of words in each cluster. For example, the cluster size of a cluster consisting of two words run and jump would be coded as 1. The number of switches was calculated by counting the number of shifts between each action effector from the Lancaster Norms (or in the case of subjective methods, different parts of the body that perform the verb), which would include responses that did not have consecutive congruent action effectors (or in the subjective rating, congruent parts of the body that completed the action). That is, single word responses not belonging to any clusters were counted in calculating switches to capture transitions between each word or cluster (Macoir et al., 2019; McDowd et al., 2011; Troyer et al., 1997).

The subjective clustering method utilized two raters categorizing the responses for each group (IwD and CHOA) based on the previous studies (Faroqi-Shah & Milman, 2017; Macoir et al., 2019; McDowd et al., 2011). Both raters analyzed the responses for each participant using the following instructions on an Excel spreadsheet.

A cluster is defined as a sequential group of words with a commonality (semantics). Based on this definition of a cluster, you will be clustering (grouping) responses on a verb fluency task where participants were asked to name as many things as possible that “people do” (verbs) within a time frame of 30 s. Start with the first response and mark it in Cluster 1. Then move to the next response and determine if there is a relationship between the two words. If there is, put it in Cluster 1 and write a word or short phrase that describes the commonality. If there is not a commonality, move on to the following response and assess its commonality with the response that appears next. Repeat this for all of the responses. There is no limit on how small or large the cluster can be.

An example for an animal fluency task was provided to avoid priming any categories for verbs and to capture the raters’ initial categorization methods more naturally. Raters were reminded to complete this independently and to be as consistent as possible between each participant. To analyze the interrater reliability between the two raters, an addition was made to the above instructions, which is to categorize the verbs based on the part of the body used to complete the verb. The entire data set was analyzed by the first and third authors. In order to establish intrarater reliability, the first author completed the clustering analysis twice, 2 weeks apart. ICCs (Shrout & Fleiss, 1979) using a single-measurement, absolute-agreement, two-way random-effects model were calculated for (a) intrarater reliability (first author’s); (b) interrater reliability (between first and third authors) for subjective methods; (c) intermethod reliability (between first author’s subjective method and objective method); and (d) intermethod reliability (between third author’s subjective method and objective method) on the number of clusters, average cluster size, and number of switches. The level of reliability was interpreted according to Koo and Li (2016) benchmarks: poor (less than .5), moderate (.5–.75), good (.75–.90), and excellent (greater than .90). Bivariate correlational analyses were also conducted to show the relationships between our measures (see Supplemental Materials S1–S3). All analyses were conducted using SPSS 27 software.

Results

The responses for both the AD and CHOA groups were first analyzed using the objective clustering method

2594 American Journal of Speech-Language Pathology • Vol. 32 • 2589–2601 • October 2023


(i.e., using Lancaster Norms), and the means were compared using independent-samples t tests on the following variables: number of clusters, average cluster size, number of switches, and total number of words (see Table 2). Compared with CHOAs, IwDs produced significantly fewer words (CHOA M = 12.27, SD = 1.89; IwD M = 7.65, SD = 3.28; t[40] = 5.66, p < .001), significantly fewer clusters (CHOA M = 2.64, SD = 1.14; IwD M = 1.60, SD = 1.10; t[40] = 3.00, p = .005), and significantly fewer switches (CHOA M = 6.50, SD = 1.63; IwD M = 4.10, SD = 2.13; t[40] = 4.13, p < .001). There was no significant difference between the CHOAs (M = 1.90, SD = .77) and IwDs (M = 1.41, SD = 1.14) regarding cluster size, t(40) = 1.65, p = .108. These findings were mostly consistent with those obtained from human raters who used the subjective clustering method (see Table 2 and Supplemental Material S4).

Additionally, all ICCs for intrarater reliability resulted in moderate to excellent correlations (see Table 3), indicating strong reliability within the first author when subjectively categorizing verb fluency responses. For the number of clusters, the ICC for intrarater reliability was .92, with the confidence interval (the level of confidence for all intervals was 95%) ranging from moderate to excellent [.68, .98]. The ICC for interrater reliability (number of clusters) between the two authors was .81, with the confidence interval ranging from moderate to good [.68, .89] (Koo & Li, 2016). The ICC for intermethod reliability (number of clusters) between the objective and subjective clustering methods was .80 (Rater 1 and objective) and .77 (Rater 2 and objective), with the confidence interval ranging from moderate to good on both ([.69, .89] for Rater 1; [.61, .89] for Rater 2). For the number of switches, the intrarater reliability was .93, with the confidence interval ranging from moderate to excellent [.69, .99]. The ICC for interrater reliability (number of switches) between the two authors was .69, with the confidence interval ranging from poor to good [.49, .82]. The ICC for intermethod reliability (number of switches) between the objective and subjective clustering methods was .76 (Rater 1 and objective) and .70 (Rater 2 and objective) with the confidence intervals ranging from moderate to good ([.60, .87] for Rater 1; [.50, .83] for Rater 2). For the average cluster size, the ICC for intrarater reliability was .98, with the confidence interval ranging from good to excellent [.89, .99]. The ICC for the interrater reliability (average cluster size) between the two authors was .51, with the confidence interval ranging from poor to moderate [.24, .70]. The ICC for the intermethod reliability (average cluster size) between the objective and subjective clustering methods was .64 (Rater 1 and objective) and .53 (Rater 2 and objective), with confidence intervals ranging from poor to good ([.42, .79] for Rater 1; [.27, .72] for Rater 2).

Discussion

We investigated whether objective clustering methods can be used to compare verb fluency performance in IwDs and CHOAs and examined the intrarater, interrater, and intermethod reliability within and across raters and for different clustering methods. Using the objective clustering method, the IwDs produced significantly fewer clusters and switches. Although the IwDs produced smaller clusters on average, there was no significant difference with the CHOA group. The subjective clustering method yielded strong intrarater reliability and variable interrater and intermethod reliability between the authors as well as between the objective and subjective clustering methods.

Table 2. Group comparison of verb fluency responses using the objective and subjective clustering methods.

Variable                    IwDs, M (SD)   CHOAs, M (SD)   t (df = 40)   p
Total number of words(a)    7.65 (3.28)    12.27 (1.89)    5.53*         < .001
Number of clusters
  Objective (LN)            1.60 (1.10)    2.64 (1.14)     3.00*         .005
  Subjective (Rater 1)      1.50 (1.05)    2.91 (1.27)     3.90*         < .001
  Subjective (Rater 2)      1.60 (.99)     3.09 (1.34)     4.06*         < .001
Cluster size
  Objective (LN)            1.41 (1.14)    1.90 (0.77)     1.65          .108
  Subjective (Rater 1)      1.51 (1.10)    1.77 (1.04)     0.78          .440
  Subjective (Rater 2)      1.33 (1.83)    1.96 (0.77)     2.56*         .014
Number of switches
  Objective (LN)            4.10 (2.13)    6.50 (1.63)     4.13*         < .001
  Subjective (Rater 1)      4.20 (2.19)    6.41 (2.11)     3.33*         .002
  Subjective (Rater 2)      4.20 (1.74)    5.45 (1.38)     2.61*         .013

Note. IwDs = individuals with Alzheimer's dementia; CHOAs = cognitively healthy older adults; LN = Lancaster Norms.
(a) Equal variances not assumed (df = 29.66).
*p < .05.
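
The statistics in Table 2 can be approximately reproduced from the reported means and standard deviations. The sketch below is illustrative only: the group sizes (20 IwDs and 22 CHOAs) are an assumption inferred from df = 40 and are not stated in this section, the function names are hypothetical, and the analyses reported here were run in SPSS. Under that assumption, the pooled-variance statistic for total words comes out near the 5.66 reported in the text and the unequal-variance (Welch) statistic near the 5.53 shown in Table 2, with small differences attributable to rounding of the summary values.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t statistic assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t statistic without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# ASSUMPTION: group sizes are inferred from df = 40, not reported in this section.
N_CHOA, N_IWD = 22, 20

# Total number of words (means and SDs taken from Table 2).
t_pooled, df_pooled = pooled_t(12.27, 1.89, N_CHOA, 7.65, 3.28, N_IWD)
t_welch, df_welch = welch_t(12.27, 1.89, N_CHOA, 7.65, 3.28, N_IWD)
print(f"pooled: t({df_pooled}) = {t_pooled:.2f}")
print(f"Welch:  t({df_welch:.2f}) = {t_welch:.2f}")

# Number of clusters, objective method.
t_clusters, _ = pooled_t(2.64, 1.14, N_CHOA, 1.60, 1.10, N_IWD)
print(f"clusters: t = {t_clusters:.2f}")
```

The p values are omitted because they require the t distribution, which is not in the Python standard library.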

Fisher et al.: Objective and Subjective Clustering 2595


Table 3. Intraclass correlation coefficients for the variables on verb fluency.

Variable and comparison                            ICC    95% CI
Number of clusters
  Intrarater (within Rater 1)                      .92*   [.68, .98]
  Interrater (between Raters 1 and 2)              .81*   [.68, .89]
  Intermethod (between objective and Rater 1)      .80*   [.69, .89]
  Intermethod (between objective and Rater 2)      .77*   [.61, .89]
Average cluster size
  Intrarater (within Rater 1)                      .98*   [.89, .99]
  Interrater (between Raters 1 and 2)              .51*   [.24, .70]
  Intermethod (between objective and Rater 1)      .64*   [.42, .79]
  Intermethod (between objective and Rater 2)      .53*   [.27, .72]
Number of switches
  Intrarater (within Rater 1)                      .93*   [.69, .99]
  Interrater (between Raters 1 and 2)              .69*   [.49, .82]
  Intermethod (between objective and Rater 1)      .76*   [.60, .87]
  Intermethod (between objective and Rater 2)      .70*   [.50, .83]

Note. ICC = intraclass correlation coefficient; CI = confidence interval.
*p < .005. According to Koo and Li (2016), the level of reliability was determined with the following criteria: below .5 = poor, between .50 and .75 = moderate, between .75 and .90 = good, and above .90 = excellent.
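
For readers who wish to see how the coefficients in Table 3 are defined, the sketch below computes a single-measurement, absolute-agreement, two-way random-effects ICC, that is, ICC(2,1) in the notation of Shrout and Fleiss (1979), and maps a coefficient onto the Koo and Li (2016) benchmarks. It is a minimal illustration rather than the SPSS procedure used in this study: the function names and the example ratings are hypothetical, confidence intervals are not computed, and the handling of values that fall exactly on a benchmark boundary is an assumption.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per participant and one column per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-participant mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def koo_li_label(icc):
    """Benchmark label; boundary handling (.50, .75, .90) is an assumption."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# HYPOTHETICAL cluster counts from two raters for five participants.
example = [[2, 3], [1, 1], [4, 3], [0, 1], [3, 3]]
value = icc_2_1(example)
print(f"ICC(2,1) = {value:.2f} ({koo_li_label(value)})")
```

Because the benchmarks apply equally to the bounds of a confidence interval, the same labeling function can be used to describe an interval such as [.49, .82] as ranging from poor to good.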

Looking at the interrater and intrarater reliability analyzed in this study, subjective clustering methods produced excellent intrarater reliability on all variables. This indicates that one rater can repeatedly categorize the responses similarly on different occasions, an important consideration for assessment, particularly when tracking disease progression or treatment effectiveness over time. The subjective clustering methods revealed questionable interrater reliability, especially on the average cluster size variable in comparison with other variables. Such potential for inconsistency is problematic for a reliable and valid assessment. Previous studies that have investigated the interrater reliability between two raters for clustering/switching analyses for other verbal fluency tasks (i.e., animal fluency) reported strong correlations overall (LeDoux et al., 2014; Ross et al., 2007). These studies that have reported interrater reliability for clustering and switching analyses on animal fluency or letter fluency tasks have utilized criteria from Troyer et al. (1997) and LeDoux et al. (2014), establishing widely accepted and replicable criteria, which could contribute to the higher interrater reliability observed in those cluster analyses. Given the variability of the interrater reliability between two raters reported in this study during a verb fluency task, there appear to be increased opportunities for rater inconsistency, especially when the clustering criteria are vague.

Additionally, the ICCs between the subjective and objective clustering methods were classified as moderate (for the cluster size) or good (for the number of clusters and switches) according to Koo and Li (2016). This indicates that there might be a commonality between the two clustering methods, lending some support to the validity of the objective clustering method, but the ICCs are not strong enough to claim that both methods would yield the same results. This also suggests that the two methods may capture different aspects of verb semantics compared with human raters. Upon comparison, we observed that the disparities between the two methods can be ascribed to the more elaborate categorization used by either the objective or subjective clustering approach (although these differences do not encompass all possible explanations for the observed disparities; thus, further research is warranted).

For example, although human raters would cluster talk, laugh, and cry together as a cluster associated with face, the Lancaster Norms used a more detailed categorization, with talk and laugh clustered for mouth and cry for head (see Table 4). On the other hand, garden, swim, and plant were all clustered together by the Lancaster Norms into the category of hand (arm), whereas human raters agreed that swim belongs to the whole body, and the other two (garden and plant) belong either to arms or hands. However, the relatively weak concordance between objective and subjective methods is comparable with the degree of reliability observed between two human raters (i.e., ICCs between the human raters). That is, the discrepancies between objective and subjective methods are not unique to the comparison between objective and subjective methods. This underscores the need for further research on a comprehensive and reliable set of criteria for cluster analysis and how sensorimotor features can contribute to verb processing and semantics in establishing the criteria. Because the use of Lancaster Norms contributed to differentiating the two groups in terms of the number of clusters and switches, which is consistent with subjective clustering methods, this method may be another viable option in clinical diagnostic processes to investigate cognitive processes underlying verbal fluency impairment.



Table 4. Example of participant responses and clustering and switch analyses.

Example from a CHOA

Response / variable     Score   Lancaster Norms(a)   Rater 1      Rater 2
Talk                    1       mouth                face         mouth
Smile                   1       mouth                face         mouth
Cry                     1       head                 face         mouth
Laugh                   1       mouth                face         mouth
Run                     1       foot_leg             legs         legs
Jump                    1       foot_leg             legs         legs
Walk                    1       foot_leg             legs         legs
Talk(b)                 0       n/a                  n/a          n/a
Blink                   1       head                 face         face
Swallow                 1       mouth                face         mouth
Sneeze                  1       head                 face         mouth
Clap                    1       hand_arm             arms         hand
Count                   1       head                 mind         hand
Swim                    1       hand_arm             whole body   whole body
Number of clusters              2                    3            4
Average cluster size            1.5                  2.33         1.75
Number of switches              9                    5            5

Example from a participant with AD

Response / variable     Score   Lancaster Norms(a)   Rater 1      Rater 2
Garden                  1       hand_arm             arms         hands
Swim                    1       hand_arm             whole body   whole body
Plant                   1       hand_arm             arms         hands
Walk                    1       foot_leg             legs         legs
Talk                    1       mouth                face         face
Laugh                   1       mouth                face         face
Cry                     1       head                 face         face
Number of clusters              2                    1            1
Average cluster size            1.5                  2            2
Number of switches              3                    4            4

Note. Responses in the italicized areas indicate that they were clustered together. The bold text symbolizes clusters per each rater. CHOA = cognitively healthy older adult; AD = Alzheimer’s dementia.
(a) The dominant action effector was determined using the Lancaster Norms database.
(b) This is an incorrect response and therefore was not included in the analysis.
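
The scoring rules described in the Method can be written compactly. The following is a minimal sketch, not the authors' implementation: it assumes each correct response has already been assigned a single category label (a dominant action effector from the Lancaster Norms or a rater's body-part label), that incorrect responses have been removed beforehand, and that a participant with no clusters receives an average cluster size of 0. The function name is hypothetical. Applied to the Lancaster Norms labels of the AD example in Table 4, it yields the two clusters, average cluster size of 1.5, and three switches shown there.

```python
from itertools import groupby


def score_fluency(labels):
    """Score clusters and switches from an ordered list of category labels."""
    # Length of each run of consecutive identical labels.
    run_lengths = [len(list(group)) for _, group in groupby(labels)]
    # A cluster is a run of two or more responses; its size is its length minus 1.
    cluster_sizes = [length - 1 for length in run_lengths if length >= 2]
    n_clusters = len(cluster_sizes)
    return {
        "n_clusters": n_clusters,
        # ASSUMPTION: average cluster size is 0 when no cluster was produced.
        "avg_cluster_size": sum(cluster_sizes) / n_clusters if n_clusters else 0.0,
        # Every transition between runs counts, including single-word runs.
        "n_switches": max(len(run_lengths) - 1, 0),
    }


# Lancaster Norms labels for the AD example in Table 4
# (garden, swim, plant, walk, talk, laugh, cry).
ad_lancaster = ["hand_arm", "hand_arm", "hand_arm", "foot_leg", "mouth", "mouth", "head"]
print(score_fluency(ad_lancaster))
```

Because the function only sees labels, the same call scores a human rater's categorization; differences between methods then arise entirely from how the labels were assigned.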

Although clustering and switching analyses from category fluency tasks have been extensively studied in neurological populations, verb clustering and switching analyses have received limited attention in research. Because these measures provide valuable insight into cognitive processes underlying impaired verbal fluency performance, exploring verb fluency abilities through cluster and switch analyses can significantly enhance our understanding of AD beyond traditional accuracy-based scoring approaches. Similar to the findings of Macoir et al. (2019), this study reports no significant difference between people with neurodegenerative disorders and CHOAs regarding the average cluster size in verb fluency responses. With other verbal fluency tasks such as animal fluency, there are variable results regarding the group differences with respect to average cluster size in IwDs compared with CHOAs. Some studies have indicated that IwDs (or those with MCI) do not produce significantly smaller cluster sizes in comparison with healthy controls (e.g., Peter et al., 2016; Raoux et al., 2008; Tröger et al., 2019; Weakley et al., 2013). On the other hand, other studies have reported a significant difference between the two groups regarding average cluster size with IwDs producing significantly smaller cluster sizes (e.g., Murphy et al., 2006; Oh et al., 2019). Tröger et al. (2019) reported a lack of statistically significant differences on the average cluster size during the animal fluency task although CHOAs produced the largest cluster sizes, followed by MCI participants, and IwDs produced the smallest cluster sizes (i.e., AD < MCI < CHOA). They concluded that these results could highlight semantic memory retrieval processes in IwDs being inefficient, but still effective, which could be present in this study considering similar results on a different verbal fluency task.

Another possible explanation for the lack of significance in group differences for the average cluster size is the heterogeneous presentation of AD. The large standard deviations in the IwDs, especially for the total number of words (SD = 3.28) and number of switches (SD = 2.13), reflect the heterogeneity of verb fluency performance in the dementia population even within the mild and moderate stages of the disorder and precipitating factors, which has been reported in previous studies (Delbeuck et al., 2013). As the heterogeneity of AD is commonly observed and has been reported previously, the objective clustering method used is a viable option despite the lack of group differences on the average cluster size variable.

With the objective clustering method, there were significantly fewer switches in IwDs compared with CHOAs. These results are consistent with previous category fluency studies using animal fluency (Raoux et al., 2008; Rofes



et al., 2020; Tröger et al., 2019), suggesting that objective clustering methods focusing on action semantics are applicable to analyses of verb fluency responses. For example, Tröger et al. (2019) reported a significant difference between groups on an animal fluency task where IwDs switched significantly less than MCI and CHOA groups (IwDs < MCI = CHOA), highlighting the different cognitive characteristics of CHOAs and IwDs. Switching is considered to be a more cognitively demanding skill than clustering, as it involves “higher cognitive functions, such as cognitive flexibility and strategic search processes” (Lehtinen et al., 2021, p. 3). Troyer et al. (1998) reported that participants with frontal lobe lesions demonstrated less switching than healthy controls, indicating that executive functioning difficulties could be a factor in the decreased performance on verbal fluency tasks. Paek et al. (2020) reported that verb retrieval involves activation in parts of the brain including the hippocampus and frontal lobes, with the frontal lobe typically having associations with executive function abilities. In the early stages of AD, there is typically damage to the temporal regions of the brain extending to the parietal and frontal lobes and ultimately leading to global neurodegeneration (Koenig et al., 2018). Since neuropathological changes to the frontal lobe are commonly observed in IwDs, there could be a decrease in the executive function skills that are used during the verb fluency task observed in this population. These executive function skills such as switching, inhibition, and organizing may be reflected in the number of clusters and number of switches variables. However, it should be noted that the extent to which clustering and switching provide information about AD over and above the total number of items produced is currently unclear. Some studies have claimed that these measures offer unique predictive value (Raoux et al., 2008; Tröster et al., 1998), although others have found that they do not significantly improve prediction once total output is taken into account (Gordon & Chen, 2022).

Limitations and Future Research

Although this study investigated the use of the Lancaster Norms as an objective method for clustering verbs in a verb fluency task, there are other potential methods for verb clustering that may also prove to be reliable. An additional or novel method that could be utilized for clustering the verb fluency task includes why someone performs the verb (e.g., work, leisure, hygiene). Although task-congruent clustering can provide insight into the categorization and organization during generative naming, it is possible that other clustering methods (i.e., task-discrepant clusters) could be used as a strategy during analyses of verbal fluency responses. An example of a task-discrepant cluster for a category fluency task would be using a phonemic relatedness criterion for clustering methods (e.g., words that start with the same letter and then switching to another letter) or using a semantic relatedness criterion for a letter fluency task (e.g., words that belong to the same subcategory), which was reported as part of Faroqi-Shah and Milman’s (2017) criteria (Lehtinen et al., 2021). By choosing to use a task-congruent method, the results of this study can be reported in the context of the clustering method used but cannot be generalized to all clustering methods as they were not analyzed. To effectively evaluate clustering in the verb fluency task, it may be beneficial to compare different clustering criteria with task-congruent and task-discrepant methods in future studies to better capture the cognitive processes utilized by participants during the verb fluency task. Using both task-congruent and task-discrepant clustering methods would impact the number of clusters, average cluster size, number of switches, and interrater and intrarater reliability for both IwDs and CHOAs. A combination of different clustering criteria for the verb fluency responses could provide insight on the connectedness of different methods and the methods that most accurately highlight the lexical retrieval and executive function processes used by participants. Additionally, the instructions, which were intended to replicate those of Piatt, Fields, Paolo, and Troster (1999), restricted responses to actions that can be done by people and may therefore have excluded actions that can be done by other agents.

Another future direction that could highlight group differences between IwDs and CHOAs is using temporal analyses to determine whether there are any patterns or differences in response time (e.g., initial response time, interword response time, using intervals to analyze how many words were produced in specific intervals). This method could provide more insight into the executive functioning deficits, which has been investigated in other studies where IwDs tend to take a longer time to produce words in animal and letter fluency (Tröger et al., 2019). The temporal analysis could be used in conjunction with clustering and switching analyses to investigate any relationships between temporal results and clustering and switching variables during the verb fluency task. Additionally, in order to gain deeper insights into the source of discrepancies between the two methods (as well as among human raters), it would be beneficial to conduct item-based and qualitative analyses.

Other limitations include a small sample size, as a larger sample size would strengthen the results and make them more generalizable to the participant population. It would also be beneficial for the verb fluency clustering analyses to be used with other populations that experience neuropathological changes such as Parkinson’s disease,



primary progressive aphasia, and other types of dementia with a standardized method for clustering.

Overall, the verb fluency task continues to provide insight into lexical retrieval and executive function abilities in IwDs and CHOAs; however, there is a need for a stricter protocol when clustering responses to the verb fluency task to eliminate the subjectivity that hinders replication across studies. Subjective clustering methods should be used with caution, and researchers will need to consider methodological validity in cluster analyses and/or continue to investigate objective clustering methods, with the Lancaster Norms being a viable and replicable method for cluster analyses in future studies.

Data Availability Statement

The data sets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

Research reported in this research note was supported in part by the National Institute on Aging of the National Institutes of Health under Grant R03AG072236 awarded to Eun Jin Paek. The authors would like to thank all our participants and their care partners for participating in the study. This research was conducted as the first author’s master’s thesis project at the University of Tennessee Health Science Center in the Adult Language and Brain Lab directed by Eun Jin Paek.

References

Alegret, M., Peretó, M., Pérez, A., Valero, S., Espinosa, A., Ortega, G., Hernández, I., Mauleón, A., Rosende-Roca, M., Vargas, L., Rodríguez-Gómez, O., Abdelnour, C., Berthier, M. L., Bak, T. H., Ruíz, A., Tárraga, L., & Boada, M. (2018). The role of verb fluency in the detection of early cognitive impairment in Alzheimer’s disease. Journal of Alzheimer’s Disease, 62(2), 611–619. https://doi.org/10.3233/JAD-170826
Alzheimer’s Association. (2022). 2022 Alzheimer’s disease facts and figures. Alzheimer’s Dementia, 18, 1–122.
Balthazar, M. L. F., Cendes, F., & Damasceno, B. P. (2008). Semantic error patterns on the Boston Naming Test in normal aging, amnestic mild cognitive impairment, and mild Alzheimer’s disease: Is there semantic disruption? Neuropsychology, 22(6), 703–709. https://doi.org/10.1037/a0012919
Bayles, K. A., & Tomoeda, C. K. (1993). Arizona Battery for Communication Disorders of Dementia. Canyonlands Publishing.
Bayles, K. A., Tomoeda, C. K., & Trosset, M. W. (1990). Naming and categorical knowledge in Alzheimer's disease: The process of semantic memory deterioration. Brain and Language, 39(4), 498–510. https://doi.org/10.1016/0093-934x(90)90158-d
Beatty, W. W., Testa, J. A., English, S., & Winn, P. (1997). Influences of clustering and switching on the verbal fluency performance of patients with Alzheimer’s disease. Aging, Neuropsychology, and Cognition, 4(4), 273–279. https://doi.org/10.1080/13825589708256652
Beber, B. C., Cruz, A. N., & Chaves, M. L. (2015). A behavioral study of the nature of verb production deficits in Alzheimer’s disease. Brain and Language, 149, 128–134. https://doi.org/10.1016/j.bandl.2015.07.010
Davis, C., Heidler-Gary, J., Gottesman, R. F., Crinion, J., Newhart, M., Moghekar, A., Soloman, D., Rigamonti, D., Cloutman, L., & Hillis, A. E. (2010). Action versus animal naming fluency in subcortical dementia, frontal dementias, and Alzheimer’s disease. Neurocase, 16(3), 259–266. https://doi.org/10.1080/13554790903456183
Delbeuck, X., Debachy, B., Pasquier, F., & Moroni, C. (2013). Action and noun fluency testing to distinguish between Alzheimer’s disease and dementia with Lewy bodies. Journal of Clinical and Experimental Neuropsychology, 35(3), 259–268. https://doi.org/10.1080/13803395.2013.763907
Fagundo, A. B., López, S., Romero, M., Guarch, J., Marcos, T., & Salamero, M. (2008). Clustering and switching in semantic fluency: Predictors of the development of Alzheimer’s disease. International Journal of Geriatric Psychiatry, 23(10), 1007–1013. https://doi.org/10.1002/gps.2025
Faroqi-Shah, Y., & Milman, L. (2017). Comparison of animal, action and phonemic fluency in aphasia. International Journal of Language & Communication Disorders, 53(2), 370–384. https://doi.org/10.1111/1460-6984.12354
Folstein, M. F., Folstein, S. E., McHugh, P. R., & Fanjiang, G. (2010). Mini-Mental State Examination–Second Edition. Psychological Assessment Resources.
Forlenza, O., Mirandez, R., & Radanovic, M. (2012). P3-266: Verbal fluency in multiple categories in mild cognitive impairment (MCI). Alzheimer’s Dementia, 8(4S, Pt. 15), P553. https://doi.org/10.1016/j.jalz.2012.05.1489
Gallese, V., & Cuccio, V. (2018). The neural exploitation hypothesis and its implications for an embodied approach to language and cognition: Insights from the study of action verbs processing and motor disorders in Parkinson’s disease. Cortex, 100, 215–225. https://doi.org/10.1016/j.cortex.2018.01.010
Gordon, J. K., & Chen, H. (2022). How well does the discrepancy between semantic and letter verbal fluency performance distinguish Alzheimer’s dementia from typical aging? Aging, Neuropsychology, and Cognition, 1–30. https://doi.org/10.1080/13825585.2022.2079602
Haugrud, N., Crossley, M., & Vrbancic, M. (2011). Clustering and switching strategies during verbal fluency performance differentiate Alzheimer’s disease and healthy aging. Journal of the International Neuropsychological Society, 17(6), 1153–1157. https://doi.org/10.1017/S1355617711001196
Henry, J. D., Crawford, J. R., & Phillips, L. H. (2004). Verbal fluency performance in dementia of the Alzheimer’s type: A meta-analysis. Neuropsychologia, 42(9), 1212–1222. https://doi-org.ezproxy.uthsc.edu/10.1016/j.neuropsychologia.2004.02.001
Kim, M., & Thompson, C. K. (2004). Verb deficits in Alzheimer’s disease and agrammatism: Implications for lexical organization. Brain and Language, 88(1), 1–20.
Koenig, A. M., Nobuhara, C. K., Williams, V. J., & Arnold, S. E. (2018). Biomarkers in Alzheimer’s, frontotemporal, Lewy body, and vascular dementias. Focus, 16(2), 164–172. https://doi.org/10.1176/appi.focus.20170048
Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability



research. Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012
LeDoux, K., Vannorsdall, T. D., Pickett, E. J., Bosley, L. V., Gordon, B., & Schretlen, D. J. (2014). Capturing additional information about the organization of entries in the lexicon from verbal fluency productions. Journal of Clinical and Experimental Neuropsychology, 36(2), 205–220. https://doi.org/10.1080/13803395.2013.878689
Lehtinen, N., Luotonen, I., & Kautto, A. (2021). Systematic administration and analysis of verbal fluency tasks: Preliminary evidence for reliable exploration of processes underlying task performance. Applied Neuropsychology: Adult. https://doi.org/10.1080/23279095.2021.1973471
Lynott, D., Connell, L., Brysbaert, M., Brand, J., & Carney, J. (2020). The Lancaster sensorimotor norms: Multidimensional measures of perceptual and action strength for 40,000 English words. Behavior Research Methods, 52(3), 1271–1291. https://doi.org/10.3758/s13428-019-01316-z
Macoir, J., Lafay, A., & Hudon, C. (2019). Reduced lexical access to verbs in individuals with subjective cognitive decline. American Journal of Alzheimer’s Disease & Other Dementias, 34(1), 5–15. https://doi.org/10.1177/1533317518790541
McDowd, J., Hoffman, L., Rozek, E., Lyons, K. E., Pahwa, R., Burns, J., & Kemper, S. (2011). Understanding verbal fluency in healthy aging, Alzheimer’s disease, and Parkinson’s disease. Neuropsychology, 25(2), 210–225. https://doi.org/10.1037/a0021531
McKhann, G. M., Knopman, D. S., Chertkow, H., Hyman, B. T., Jack, C. R., Jr., Kawas, C. H., Klunk, W. E., Koroshetz, W. J., Manly, J. J., Mayeux, R., Mohs, R. C., Morris, J. C., Rossor, M. N., Scheltens, P., Carrillo, M. C., Thies, B., Weintraub, S., & Phelps, C. H. (2011). The diagnosis of dementia due to Alzheimer’s disease: Recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimer’s & Dementia, 7(3), 263–269. https://doi.org/10.1016/j.jalz.2011.03.005
Murphy, K. J., Rich, J. B., & Troyer, A. K. (2006). Verbal fluency patterns in amnestic mild cognitive impairment are characteristic of Alzheimer’s type dementia. Journal of the International Neuropsychological Society, 12(4), 570–574. https://doi.org/10.1017/S1355617706060590
Oh, S. J., Sung, J. E., Choi, S. J., & Jeong, J. H. (2019). Clustering and switching patterns in semantic fluency and their relationship to working memory in mild cognitive impairment. Dementia and Neurocognitive Disorders, 18(2), 47–61. https://doi.org/10.12779/dnd.2019.18.2.47
Östberg, P., Fernaeus, S. E., Hellström, Å., Bogdanović, N., & Wahlund, L. O. (2005). Impaired verb fluency: A sign of mild cognitive impairment. Brain and Language, 95(2), 273–279. https://doi.org/10.1016/j.bandl.2005.01.010
Paek, E. J., & Murray, L. L. (2021). Quantitative and qualitative analysis of verb fluency performance in individuals with probable Alzheimer’s disease and healthy older adults. American Journal of Speech-Language Pathology, 30(1S), 481–490. https://doi.org/10.1044/2019_AJSLP-19-00052
Paek, E. J., Murray, L. L., & Newman, S. D. (2020). Neural correlates of verb fluency performance in cognitively healthy older adults and individuals with dementia: A pilot fMRI study. Frontiers in Aging Neuroscience, 12, Article 73. https://doi.org/10.3389/fnagi.2020.00073
Paek, E. J., Murray, L. L., Newman, S. D., & Kim, D.-J. (2019). Test-retest reliability in an fMRI study of naming in dementia. Brain and Language, 191, 31–45. https://doi.org/10.1016/j.bandl.2019.02.002
Perneczky, R., Wagenpfeil, S., Komossa, K., Grimmer, T., Diehl, J., & Kurz, A. (2006). Mapping scores onto stages: Mini-mental state examination and clinical dementia rating. The American Journal of Geriatric Psychiatry, 14(2), 139–144. https://doi.org/10.1097/01.JGP.0000192478.82189.a8
Peter, J., Kaiser, J., Landerer, V., Köstering, L., Kaller, C. P., Heimbach, B., Hüll, M., Bormann, T., & Klöppel, S. (2016). Category and design fluency in mild cognitive impairment: Performance, strategy use, and neural correlates. Neuropsychologia, 93(Pt. A), 21–29. https://doi.org/10.1016/j.neuropsychologia.2016.09.024
Piatt, A. L., Fields, J. A., Paolo, A., Koller, W. C., & Troster, A. I. (1999). Lexical, semantic, and action verbal fluency in Parkinson’s disease with and without dementia. Journal of Clinical and Experimental Neuropsychology, 21(4), 435–443. https://doi.org/10.1076/jcen.21.4.435.885
Piatt, A. L., Fields, J. A., Paolo, A. M., & Troster, A. I. (1999). Action (verb naming) fluency as an executive function measure: Convergent and divergent evidence of validity. Neuropsychologia, 37(13), 1499–1503. https://doi.org/10.1016/S0028-3932(99)00066-4
Qiu, M., & Johns, B. (2021). A distributional and sensorimotor analysis of noun and verb fluency. PsyArXiv. https://doi.org/10.31234/osf.io/e9n8w
Raoux, N., Amieva, H., Le Goff, M., Auriacombe, S., Carcaillon, L., Letenneur, L., & Dartigues, J. F. (2008). Clustering and switching processes in semantic verbal fluency in the course of Alzheimer’s disease subjects: Results from the PAQUID longitudinal study. Cortex, 44(9), 1188–1196. https://doi.org/10.1016/j.cortex.2007.08.019
Rodrigues, I. T., Ferreira, J. J., Coelho, M., Rosa, M. M., & Castro-Caldas, A. (2015). Action verbal fluency in Parkinson’s patients. Arquivos de Neuro-Psiquiatria, 73(6), 520–525. https://doi.org/10.1590/0004-282X20150056
Rofes, A., de Aguiar, V., Jonkers, R., Oh, S. J., DeDe, G., & Sung, J. E. (2020). What drives task performance during animal fluency in people with Alzheimer’s disease? Frontiers in Psychology, 11, Article 1485. https://doi.org/10.3389/fpsyg.2020.01485
Ross, T. P., Calhoun, E., Cox, T., Wenner, C., Kono, W., & Pleasant, M. (2007). The reliability and validity of qualitative scores for the controlled oral word association test. Archives of Clinical Neuropsychology, 22(4), 475–488. https://doi.org.ezproxy.uthsc.edu/10.1016/j.acn.2007.01.026
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420–428. https://doi.org/10.1037/0033-2909.86.2.420
Tröger, J., Linz, N., König, A., Robert, P., Alexandersson, J., Peter, J., & Kray, J. (2019). Exploitation vs. exploration-computational temporal and semantic analysis explains semantic verbal fluency impairment in Alzheimer’s disease. Neuropsychologia, 131, 53–61. https://doi.org/10.1016/j.neuropsychologia.2019.05.007
Tröster, A. I., Fields, J. A., Testa, J. A., Paul, R. H., Blanco, C. R., Hames, K. A., Salmon, D. P., & Beatty, W. W. (1998). Cortical and subcortical influences on clustering and switching in the performance of verbal fluency tasks. Neuropsychologia, 36(4), 295–304. https://doi.org/10.1016/S0028-3932(97)00153-X
Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11(1), 138–146. https://doi.org/10.1037/0894-4105.11.1.138
Troyer, A. K., Moscovitch, M., Winocur, G., Alexander, M. P., & Stuss, D. (1998). Clustering and switching on verbal fluency:

2600 American Journal of Speech-Language Pathology • Vol. 32 • 2589–2601 • October 2023


The effects of focal frontal- and temporal-lobe lesions. Neuropsychologia, 36(6), 499–504. https://doi.org/10.1016/s0028-3932(97)00152-8
Vonk, J. M., Flores, R. J., Rosado, D., Qian, C., Cabo, R., Habegger, J., & Manly, J. J. (2019). Semantic network function captured by word frequency in nondemented APOE ɛ4 carriers. Neuropsychology, 33(2), 256–262. https://doi.org/10.1037/neu0000508
Weakley, A., & Schmitter-Edgecombe, M. (2014). Analysis of verbal fluency ability in Alzheimer’s disease: The role of clustering, switching and semantic proximities. Archives of Clinical Neuropsychology, 29(3), 256–268. https://doi.org/10.1093/arclin/acu010
Weakley, A., Schmitter-Edgecombe, M., & Anderson, J. (2013). Analysis of verbal fluency ability in amnestic and non-amnestic mild cognitive impairment. Archives of Clinical Neuropsychology, 28(7), 721–731. https://doi.org/10.1093/arclin/act058
Williams, E., McAuliffe, M., & Theys, C. (2021). Language changes in Alzheimer’s disease: A systematic review of verb processing. Brain and Language, 223, Article 105041. https://doi.org/10.1016/j.bandl.2021.105041
Zhao, Q., Guo, Q., & Hong, Z. (2013). Clustering and switching during a semantic verbal fluency test contribute to differential diagnosis of cognitive impairment. Neuroscience Bulletin, 29(1), 75–82. https://doi.org/10.1007/s12264-013-1301-7



Copyright of American Journal of Speech-Language Pathology is the property of American
Speech-Language-Hearing Association and its content may not be copied or emailed to
multiple sites or posted to a listserv without the copyright holder's express written permission.
However, users may print, download, or email articles for individual use.
