Journal of Affective Disorders 338 (2023) 358–364

Neuropsychological instruments for bipolar disorders: A systematic review on psychometric properties

Maria Gloria Rossetti a,b, Francesca Girelli c, Cinzia Perlini d,*, Paolo Brambilla a,e, Marcella Bellani b

a Department of Neurosciences and Mental Health, Fondazione IRCCS Ca’Granda Ospedale Maggiore Policlinico, Milan, Italy
b Section of Psychiatry, Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, Italy
c UOC Psichiatria, Azienda Ospedaliera Universitaria Integrata (AOUI), Verona, Italy
d Section of Clinical Psychology, Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, Italy
e Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy

ARTICLE INFO

Keywords: Bipolar disorder; Affective disorders; Psychosis; Cognition; Assessment; Psychometric properties

ABSTRACT

Background: Cognitive deficits are a core feature of bipolar disorder (BD) that persist during the euthymic phase and affect global functioning. However, there is currently no consensus on the optimal tool to capture cognitive deficits in BD. Therefore, this review aims to examine the psychometric properties of tools commonly used to assess cognitive functioning in BD.
Methods: A literature search was conducted on the PubMed and Web of Science databases on August 1, 2022 and on April 20, 2023, yielding 1758 de-duplicated records. Thirteen studies fulfilled the inclusion criteria and were included in the review.
Results: All tools examined showed acceptable-to-good psychometric properties, suggesting that both brief cognitive screeners and comprehensive batteries may be appropriate for detecting or monitoring cognitive changes in BD.
Limitations: Methodological differences between the included studies precluded a direct comparison of the results. Further research is needed to investigate the psychometric properties of cognitive tools that also assess affective and social cognition.
Conclusions: The tools examined appear sensitive enough to distinguish between BD patients with versus without cognitive deficits; however, an optimal tool has not yet been identified. The applicability and clinical utility of the tools may depend on multiple factors such as available resources. That said, web-based instruments are expected to become the first-choice instrument for cognitive screening as they can be applied on a large scale and at an affordable cost. As for second-level assessment instruments, the BACA shows robust psychometric properties and tests both affective and non-affective cognition.

1. Background

Cognitive deficits are well established in bipolar disorder (BD) and represent a core feature present in both acute and remitted states and even before the first manifestation of the disease (Bora and Pantelis, 2015; Bora et al., 2010). Typically, the most affected domains are long-term verbal and visual memory, working memory, attention, executive functions, psychomotor speed, and language (Bourne et al., 2013; Kurtz and Gerraty, 2009). The extent and severity of cognitive deficits vary among BD patients, including (i) global impairment across several domains (22 %); (ii) selective deficits in one or two domains (38 %); or (iii) preserved cognition (40 %) (Martino et al., 2008). Notably, several studies have demonstrated a relationship between neurocognitive impairment and psychosocial functioning (Baune et al., 2013), pointing to cognitive evaluation as an essential component of the clinical management of BD patients.

There is a variety of instruments to assess cognition in BD (Bakkour et al., 2014). In clinical practice, screening scales like the Mini-Mental State Examination (MMSE) (Folstein, 1975) or the Montreal Cognitive Assessment (MoCA) (Nasreddine et al., 2005) are often administered.

* Corresponding author at: Section of Clinical Psychology, Department of Neurosciences, Biomedicine and Movement Sciences, University of Verona, Verona, Italy.
E-mail address: cinzia.perlini@univr.it (C. Perlini).

https://doi.org/10.1016/j.jad.2023.06.026
Received 19 September 2022; Received in revised form 26 May 2023; Accepted 15 June 2023
Available online 17 June 2023
0165-0327/© 2023 Elsevier B.V. All rights reserved.

However, these tests are global cognition screeners generally used to exclude cognitive impairment or dementia in elderly populations and may have poor sensitivity and specificity for cognitive assessment in BD. To address this issue, in 2010 the International Society for Bipolar Disorders (ISBD) proposed an adaptation of the MATRICS Consensus Cognitive Battery (MCCB), originally developed for schizophrenia, to assess cognition in BD. The ISBD endorsed the applicability of the majority of MCCB subtests, with recommendations made for the inclusion of additional executive function tests (core: Colour–Word Stroop and the Trail Making Test–Part B (TMT-B); optional: the Wisconsin Card Sorting Task) and the substitution of verbal learning measures (i.e., the California Verbal Learning Test (CVLT) in place of the Hopkins Verbal Learning Test–Revised (HVLT-R)) that form the ISBD-Battery for the Assessment of Neurocognition (ISBD-BANC) (Yatham et al., 2010). More recently, the ISBD published consensus-based recommendations indicating the Screen for Cognitive Impairment in Psychiatry (SCIP) and the Cognitive Complaints in Bipolar Disorder Rating Assessment (COBRA) as the most feasible tools for the screening of objective and subjective cognition, respectively (Miskowiak et al., 2018). Yet, other batteries such as the Brief Assessment of Cognition in Affective Disorders (BAC-A) have also been proposed for use in BD (Keefe et al., 2014). However, to date there is no consensus on the optimal test to capture cognitive changes in people with BD.

The quality of a cognitive instrument depends on multiple factors, including the robustness of its psychometric properties. For instance, a tool should be validated in the target population of interest and be sensitive enough to distinguish between people with preserved cognition and those experiencing subtle cognitive deficits. Also, it needs to be practical for implementation in both clinical and research settings.

Therefore, to investigate what might be the most appropriate tools to evaluate cognition in BD, we reviewed the studies that measured the psychometric properties of cognitive tests in bipolar versus control groups.

2. Methods

The systematic literature search, the screening, and the selection of the studies were reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021), as outlined in Fig. 1.

Fig. 1. A flow diagram of the articles screening and selection process.

2.1. Search strategy

The data search was conducted on August 1, 2022, and replicated on April 20, 2023, using PubMed and Web of Science databases. No publication date filter was set for the search. The following keywords were used for the search: ‘Bipolar Disorder’ AND (cognit* OR neuropsychol*) AND (sensitivity OR validity OR reliability OR feasibility OR ‘psychometric properties’). The inclusion criteria were: (i) original articles published in peer-reviewed journals; (ii) English language; (iii) a case-control study design; (iv) the use of tests measuring multiple cognitive domains; (v) the application of statistical methods focusing on psychometric properties. Studies including samples with multiple diagnoses (unless data for BD was reported separately), preclinical studies, and case reports were excluded. Studies focusing only on affective cognition, social cognition or cognitive insight were also excluded. Overall, the literature search retrieved 1758 de-duplicated records. After title and abstract screening, 1743 articles were excluded because they clearly did not meet the inclusion criteria. A total of 15 articles were selected for full-text review. Among them, 13 studies were included in the review while the remaining two were excluded because they had no control group or evaluated the wrong outcomes. Table 1 shows the sample characteristics and main findings from each study.

2.2. Study quality assessment

MGR and FG assessed the methodological quality of the reviewed studies using the National Institutes of Health, National Heart, Lung, and Blood Institute Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies (https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools). The tool comprises 14 items that cover six standard quality domains, i.e., selection, exposure, outcome assessment, loss-to-follow-up, confounding, and others (refer to Table S1 for details) (Wang et al., 2019). Here, we defined the exposure as the diagnosis of BD (yes/no), and the outcome as the psychometric properties of the cognitive tests as defined by each study. Therefore, only 9 out of the 14 items were assessable. The other 5 items (blinding to exposure status, loss to follow-up, measuring exposure prior to outcome, timeframe between exposure and outcome, and levels of exposure) do not apply to the exposure and outcome defined here.

The quality of each study was assessed using the following ratings: ‘Yes’, ‘No’ or ‘Not reported/Cannot determine’. The proportion of affirmative ratings across all studies for each item provided a measure of the literature’s item-wise methodological quality. For each study, the proportion of affirmative ratings across the 9 items was computed and used as a proxy of study-wise methodological quality.

3. Results

Most of the studies (11 out of 13) explored the psychometric properties of tools that are commonly used for screening purposes or to monitor cognition over time, i.e., the COBRA, the SCIP, the THINC-integrated tool (THINC-it), and the Internet-Based Cognitive Assessment Tool (ICAT) (Cuesta et al., 2011; Guilera et al., 2009; Jensen et al., 2015; Lima et al., 2018; Miskowiak et al., 2021; Rojo et al., 2010; Rosa et al., 2013; Xiao et al., 2015; Yoldi-Negrete et al., 2018; Zhang et al., 2020; Zhu et al., 2023). The remaining studies investigated the psychometric properties of two more comprehensive batteries that can be used for a thorough neuropsychological evaluation, namely the MCCB and the BACA (Keefe et al., 2014; Van Rheenen and Rossell, 2014).

Overall, there was a substantial overlap in the patient cohort used across some of the included studies. Specifically, Guilera et al. (2009), Rojo et al. (2010) and Cuesta et al. (2011) used partially overlapping samples, as did Jensen et al. (2015) and Miskowiak et al. (2021).

The COBRA is a self-report questionnaire measuring subjective cognitive difficulties experienced by BD patients in their everyday life related to executive function, processing speed, memory, attention and concentration. It is available in Spanish, English, French, Chinese, Danish, Japanese, and Brazilian Portuguese. The questionnaire exhibited satisfactory psychometric properties including high internal consistency (Jensen et al., 2015; Lima et al., 2018; Rosa et al., 2013; Xiao et al., 2015; Yoldi-Negrete et al., 2018), good discriminant validity with control groups and good concurrent validity with other instruments used to assess cognitive complaints (Jensen et al., 2015; Lima et al., 2018; Rosa et al., 2013). Notably, COBRA’s psychometric properties were consistent across countries. Cut-off points to discriminate between BD and healthy control (HC) scores were similar between languages (i.e., >10 in the Brazilian and Spanish validations; >11 in the Chinese validation). In addition, most authors agreed in considering COBRA as a one-factor tool, indicating its better ability to detect global dysfunction rather than discriminate deficits in different cognitive domains (Lima et al., 2018; Rosa et al., 2013; Xiao et al., 2015). Nonetheless, the questionnaire showed only moderate sensitivity for the detection of objective cognitive impairment (Jensen et al., 2015) and unclear concurrent validity with measures of objective cognition. Specifically, the COBRA had significant but weak correlations with measures of executive function, and verbal and working memory (Jensen et al., 2015; Rosa et al., 2013), but not global cognition (Xiao et al., 2015) in BD patients. Overall, the evidence suggests that the COBRA is a feasible instrument for subjective cognition assessment in BD. Nonetheless, the COBRA should only be used in combination with other performance-based screening tests as the association between subjective and objective cognitive deficits remains controversial (Miskowiak et al., 2016).

The SCIP assesses verbal learning and memory, delayed memory, working memory, verbal fluency, and processing speed domains. It is one of the performance-based tools most commonly used to screen cognitive deficits in psychiatric disorders (Guilera et al., 2009; Rojo et al., 2010; Schmid et al., 2021). The SCIP is suitable for test-retest evaluations as it includes three alternative forms and is available in many languages including English, Spanish, French, Italian, and German, among others (Miskowiak et al., 2018). Only a few studies have investigated the psychometric properties of this instrument in BD samples, showing good validity, reliability, and sensitivity (Cuesta et al., 2011; Guilera et al., 2009; Jensen et al., 2015; Rojo et al., 2010). Interestingly, the SCIP total score was significantly associated with the global cognition composite scores measured by neuropsychological tests (Cuesta et al., 2011; Jensen et al., 2015). Overall, preliminary data show that the SCIP might be a feasible and valid instrument for use in BD for screening purposes or to monitor cognitive changes over time.

In recent decades, the use of web-based, self-administered cognitive screeners has grown exponentially as such tools allow for large-scale, low-cost patient assessment (Miskowiak et al., 2021). Among them, the THINC-it and the ICAT have been recently validated in bipolar populations. The THINC-it (THINC-integrated tool) is a computerized screening battery originally designed for patients with major depressive disorder (McIntyre et al., 2017). It combines measures of both subjective (i.e., attention and concentration, prospective memory, retrospective memory, planning, and organization) and objective cognition (i.e., attention and executive functions, working memory and processing speed), but lacks tests targeting verbal learning and memory, which are commonly affected in BD. The battery has been extensively validated in multiple languages; however, its use requires an internet connection and is limited to individuals who can operate a computer independently. To date, two studies have investigated the psychometric properties of THINC-it for BD, in patients with BD-type 2 in a depressive (Zhang et al., 2020) or euthymic state (Zhu et al., 2023). The preliminary results suggest that THINC-it can accurately distinguish between the cognitive performance of BD patients and controls, but has low-to-moderate concurrent validity and test-retest reliability in depressed/euthymic patients with BD-type 2 (Zhang et al., 2020; Zhu et al., 2023). Further studies in larger samples of both BD types 1 and 2 are needed to test the feasibility of this cognitive screener for BD.

The ICAT is a web-based cognitive test battery designed to resemble the cognitive tasks of the SCIP, measuring verbal learning, working memory, delayed verbal learning and psychomotor speed (Miskowiak et al., 2021; Purdon and Psych, 2005). Findings from recent studies showed good validity and sensitivity of the ICAT as a cognitive screener for BD (Hafiz et al., 2019; Miskowiak et al., 2021). Specifically, the ICAT revealed high discriminant validity between BD and HC and high concurrent validity with the SCIP total score. Notably, the ICAT total scores did not correlate with subjective cognitive complaints (i.e., COBRA scores) but there was a moderate correlation between lower ICAT total scores and greater functional impairments of BD patients, as measured by the Functioning Assessment Short Test (Rosa et al., 2007). Lastly, the ICAT showed good feasibility and user-friendliness.

In general, preliminary data suggest that web-based cognitive batteries may offer valid and reliable remote testing of patients’ objective cognitive functions and do not require resources in terms of time and space. However, larger studies are warranted to investigate whether these tools have adequate validity and reliability in remote


Table 1
Sample characteristics and main findings from the included studies.
Fields reported for each study: Sample N (F); Age [mean (SD)]; Patient characteristics; Psychometric properties analysed; Results.

COBRA

Yoldi-Negrete et al. (2018)
  Sample N (F): BD = 80 (20); HC = 92 (33)
  Age [mean (SD)]: 48.2 (11.99); 46.9 (17.40)
  Patient characteristics: DSM-5 diagnosis of BD (86 % BD-I); euthymic state for at least 1 month
  Psychometric properties analysed: item discriminant analysis; construct validity; discriminative validity; internal consistency; cross-cultural comparison
  Results:
  ▪ Good internal consistency (α = 0.91) and good discriminative validity with HC (t = −2.7, p < .01)
  ▪ Cultural congruence was high between the Mexican and the Spanish version (0.96, p = .01) and acceptable between the Mexican and the Japanese version (0.80, p = .01)

Jensen et al. (2015)a
  Sample N (F): BD = 84 (35); HC = 68 (43)
  Age [mean (SD)]: 36 (10); 34 (12)
  Patient characteristics: ICD-10 diagnosis of BD (61 % BD-I); partial or full remission
  Psychometric properties analysed: concurrent validity (between COBRA and CPFQ); internal consistency; sensitivity and specificity for detection of objective cognitive impairment
  Results:
  ▪ High concurrent validity (r = 0.68, p < .01)
  ▪ Good internal consistency (α = 0.87)
  ▪ Moderate sensitivity (68 %) and specificity (74 %) for detection of objective cognitive impairment

Rosa et al. (2013)
  Sample N (F): BD = 91 (43); HC = 124 (62)
  Age [mean (SD)]: 41.83 (11.28); 39.4 (11.59)
  Patient characteristics: DSM-IV diagnosis of BD (77 % BD-I); euthymic state for at least 3 months
  Psychometric properties analysed: internal consistency; reliability; concurrent validity; discriminative validity; explorative factorial analyses and feasibility
  Results:
  ▪ High internal consistency (α = 0.91)
  ▪ High concurrent validity with the FCQ (rho = 0.89, p < .001)
  ▪ Significant correlations with objective measures of executive function, verbal learning and memory (all p < .05), working memory (p < .001)
  ▪ Discriminative validity (i.e., BD obtained a higher total score than HC) and high feasibility (100 % of participants answered all items)
  ▪ One-factor structure

Xiao et al. (2015)
  Sample N (F): BD = 125 (64); HC = 130 (67)
  Age [mean (SD)]: 27.26 (10.02); 28.74 (10.70)
  Patient characteristics: DSM-5 diagnosis of BD (77 % BD-I); euthymic state for at least 1 month
  Psychometric properties analysed: internal consistency; retest reliability; discriminative validity; convergent validity; content validity; item analysis; factorial and feasibility analysis
  Results:
  ▪ Very high internal consistency (α = 0.91) and retest reliability (ICC = 0.90)
  ▪ Good content validity (high relevance and good comprehensiveness of the items)
  ▪ Good discriminative validity (BD > HC for COBRA total score, p < .001)
  ▪ High feasibility (99 % of participants answered all items)
  ▪ One-factor structure

Lima et al. (2018)
  Sample N (F): BD = 85 (61); HC = 65 (46)
  Age [mean (SD)]: 49.60 (12.88); 45.85 (15.68)
  Patient characteristics: DSM-5 diagnosis of BD; euthymic state for at least 1 month
  Psychometric properties analysed: internal consistency; concurrent validity; discriminative validity
  Results:
  ▪ High internal consistency (α = 0.89)
  ▪ High concurrent validity (correlation with cognitive domains of the FAST: r = 0.811, p < .001)
  ▪ Good discriminative validity with HC (p < .001)
  ▪ One-factor structure

SCIP

Rojo et al. (2010)
  Sample N (F): BD = 75 (42); HC = 79 (33)
  Age [mean (SD)]: 40.5 (8.9); 38.2 (8.6)
  Patient characteristics: DSM-IV-TR diagnosis of BD; euthymic state
  Psychometric properties analysed: decision validity; sensitivity and specificity of each subtest
  Results:
  ▪ Adequate decision validity (i.e., all the subtests yielded adequate values for sensitivity and specificity with the proposed cut-off points)
  ▪ The total score of the SCIP (<70) was associated with a sensitivity of 87.9 and specificity of 80.6

Jensen et al. (2015)a
  Sample N (F): BD = 84 (35); HC = 68 (43)
  Age [mean (SD)]: 36 (10); 34 (12)
  Patient characteristics: ICD-10 diagnosis of BD (61 % BD-I); partial or full remission
  Psychometric properties analysed: concurrent validity with established NP tests; sensitivity and specificity; discriminative validity (t-tests)
  Results:
  ▪ Good discriminative validity (all subtests BD < HC, p < .05)
  ▪ High concurrent validity with established NP tests (r = 0.55–0.79, p < .01)
  ▪ High sensitivity (84 %) and specificity (87 %) for detection of objective cognitive impairment
  ▪ Proposed cut-off for SCIP: ≤ 70

Cuesta et al. (2011)
  Sample N (F): BD = 65 (33); HC = 76 (32)
  Age [mean (SD)]: 41.23 (10.07); 37.91 (8.59)
  Patient characteristics: DSM-IV-TR diagnosis of BD-I; euthymic state
  Psychometric properties analysed: internal consistency; discriminative validity (BD vs SZC and HC); concurrent validity relative to a GCCS
  Results:
  ▪ Good internal consistency for GCCS (α = 0.82)
  ▪ Moderate reliability (α = 0.79)
  ▪ Moderate concurrent validity with GCCS (0.76)

Guilera et al. (2009)
  Sample N (F): BD = 76 (42); HC = 45 (20)
  Age [mean (SD)]: 40.30 (8.98); 37.69 (8.20)
  Patient characteristics: DSM-IV-TR diagnosis of BD-I; euthymic state (HAMD <8, YMRS <6)
  Psychometric properties analysed: internal consistency; discriminative validity (BD vs HC); concurrent validity with established NP tests; feasibility; test-retest reliability measured after 7 and 14 days (ICC)
  Results:
  ▪ Good internal consistency (α = 0.74)
  ▪ Good internal consistency of subtest (α = 0.74–0.78)
  ▪ Discriminative validity (i.e., HC > BD) and high feasibility (97.37 % of participants reported to all visits)
  ▪ Good concurrent validity with global cognition scores estimated from established NP tests (p < .05)
  ▪ High test-retest reliability for total score (ICC = 0.87)
  ▪ Good equivalence between the 3 parallel forms (α = 0.74)
  ▪ One-factor structure

THINC-it

Zhang et al. (2020)
  Sample N (F): BD-II = 58 (36); HC = 61 (35)
  Age [mean (SD)]: 26.40 (8.59); 28.80 (9.18)
  Patient characteristics: DSM-5 diagnosis of BD; current episode of depression (HAMD ≥ 17)
  Psychometric properties analysed: reliability; internal consistency; concurrent validity; correlation with clinical symptoms; test-retest reliability measured after 1 week (ICC)
  Results:
  ▪ Good discriminant validity (BD < HC in 4 out of 5 tests)
  ▪ Heterogeneous concurrent validity of the single tests (range: 0.36–0.81, p < .01)
  ▪ Moderate concurrent validity of the overall test (0.66, p < .001)
  ▪ Poor internal consistency (α = 0.4–0.6)
  ▪ Poor-acceptable ICC (range: 0.44–0.76)
  ▪ Positive correlation between PDQ-5-D and HAMD (r = 0.43, p < .001)

Zhu et al. (2023)
  Sample N (F): BD-D = 120 (80); HC = 100 (77)
  Age [mean (SD)]: 31.0 (10.8); 31.6 (6.0)
  Patient characteristics: DSM-5 diagnosis of BD-D; euthymic state (HAMD <14)
  Psychometric properties analysed: reliability; internal consistency; discriminative validity (BD-D vs HC); concurrent validity; correlation with clinical symptoms; test-retest reliability measured after 1 week (ICC); explorative factorial analyses and feasibility
  Results:
  ▪ Moderate internal consistency (α = 0.69)
  ▪ High internal consistency of subtest (α = 0.82)
  ▪ Good discriminative validity (BD > HC for 2/5 subtests and for THINC-it total score, p < .001)
  ▪ High concurrent validity between THINC-it objective score and measures of attention, executive function, and working memory (all p < .001)
  ▪ Acceptable-good test-retest ICC (range 0.57–0.85, all p < .001)
  ▪ Two-factor structure

ICAT

Miskowiak et al. (2021)
  Sample N (F): BD = 35 (26); HC = 35 (21)
  Age [mean (SD)]: 31.7 (9.0); 28.5 (6.0)
  Patient characteristics: ICD-10 diagnosis of BD (39 % BD-I); full or partial remission (HAMD ≤ 14, YMRS ≤ 14)
  Psychometric properties analysed: sensitivity; discriminant validity; concurrent validity with the SCIP; correlation with COBRA and measures of functional capacity (FAST); usability (PSSUQ)
  Results:
  ▪ Good discriminant validity (BD < HC, p < .001)
  ▪ High concurrent validity with SCIP Total Score (r = 0.72, p < .001)
  ▪ Moderate correlations between ICAT and SCIP subtest scores (range 0.48–0.55)
  ▪ Negative correlation between ICAT total scores and FAST scores (r(30) = −0.32, p = .04)
  ▪ Tentative cut-off for ICAT: <68

MCCB

Van Rheenen and Rossell (2014)
  Sample N (F): BD = 50 (34); HC = 52 (32)
  Age [mean (SD)]: 38.4 (13.02); 34.00 (14.27)
  Patient characteristics: DSM-IV-TR diagnosis of BD; current episode depressive
  Psychometric properties analysed: sensitivity of the MCCB and TMT-B and Colour–Word Stroop in BPD
  Results:
  ▪ Good discriminant validity for 3 out of 7 MCCB domains (working memory, visual learning, and processing speed), ES range: 0.8–1.0

BACA

Keefe et al. (2014)
  Sample N (F): BD = 309 (159); HC = 309 (159)
  Age [mean (SD)]: 44.1 (12.1); 44.4 (12.5)
  Patient characteristics: DSM-IV-TR diagnosis of BD-I; most recent episode depressed
  Psychometric properties analysed: discriminative validity; relative sensitivity of the subtests; test-retest reliability
  Results:
  ▪ Good discriminant validity (BD < HC, p < .001)
  ▪ No differences between the 2 forms of the APT. Test-retest reliability was high for summary composite score (ICC = 0.85) but low for affective subtests (ICC = 0.50, 0.51)

APT = Affective Processing Test; BACA = Brief Assessment of Cognition in Affective Disorders; BD = Bipolar Disorder; BD-D = Bipolar Depression; COBRA = Cognitive Complaints in Bipolar Disorder Rating Assessment; CPFQ = Massachusetts General Hospital Cognitive and Physical Functioning Questionnaire; DSM-IV-TR/5 = Diagnostic and Statistical Manual of Mental Disorders, 4th edition, text revision/5th edition; FAST = Functioning Assessment Short Test; FCQ = Frankfurt Complaint Questionnaire; GCCS = Global Cognitive Composite Score (obtained by averaging standardized scores of 10 neuropsychological measures, see Cuesta et al., 2011); HAMD = Hamilton Depression Rating Scale; HC = Healthy Controls; α = Cronbach’s α; ICAT = Internet-Based Cognitive Assessment Tool; ICC = Intraclass correlation coefficient; ICD-10 = International Statistical Classification of Diseases and Related Health Problems, 10th Revision; NP = neuropsychological; MCCB = MATRICS Consensus Cognitive Battery; PDQ-5-D = Perceived Deficits Questionnaire for Depression; PSSUQ = Post Study System Usability Questionnaire; SCIP = Screen for Cognitive Impairment in Psychiatry; YMRS = Young Mania Rating Scale; TMT-B = Trail Making Test–Part B; THINC-it = THINC-integrated tool.
a Results from Jensen et al. (2015) have been reported in two sections (i.e., COBRA and SCIP).
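
Table 1 reports sensitivity, specificity and Cronbach’s α for several instruments. As a rough illustration of how such figures are typically derived, the Python sketch below computes sensitivity and specificity for a screening cut-off and Cronbach’s α for a set of questionnaire items. The function names, scores and impairment labels are hypothetical and are not taken from any of the reviewed studies; the ≤ 70 cut-off simply mirrors the one proposed for the SCIP total score, and the α formula is the standard one based on item and total-score variances.

```python
from statistics import variance


def sensitivity_specificity(scores, impaired, cutoff):
    """Treat a total score <= cutoff as a positive screen and compare it
    with the reference classification (True = objectively impaired)."""
    pairs = list(zip(scores, impaired))
    tp = sum(1 for s, y in pairs if s <= cutoff and y)
    fn = sum(1 for s, y in pairs if s > cutoff and y)
    tn = sum(1 for s, y in pairs if s > cutoff and not y)
    fp = sum(1 for s, y in pairs if s <= cutoff and not y)
    return tp / (tp + fn), tn / (tn + fp)


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, each holding one
    value per respondent (same respondent order in every list)."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data, for illustration only.
scores = [62, 68, 70, 75, 81, 66, 90, 72]
impaired = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(scores, impaired, cutoff=70)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

item_scores = [[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]]
print(f"alpha={cronbach_alpha(item_scores):.2f}")
```

In practice the reference classification would come from a full neuropsychological battery, and the cut-off would be chosen from a ROC analysis rather than fixed in advance as it is here.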

administration settings as well as test-retest reliability. functions when using this battery.
Our research retrieved two studies that explored the psychometric The BACA is a paper-and-pencil battery specifically designed for
properties of more comprehensive neuropsychological batteries used to affective disorders. It comprises six tests of non-affective cognition that
measure objective cognition in BD. Specifically, Van Rheenen and Ros­ measure visuomotor abilities, working memory, learning and declara­
sell (2014) examined the clinical validity of the MCCB and two addi­ tive memory, attention, verbal fluency and problem-solving, and two
tional tests recommended by the ISBD (i.e., Colour–Word Stroop and tests specifically designed to measure the influence of emotionally-
TMT-B) (Yatham et al., 2010) while Keefe et al. (2014) explored the valenced stimuli on non-affective cognitive processes (Keefe et al.,
validity of the BACA. 2014). The English, Italian, and Chinese versions of the BACA own their
The MCCB includes multiple tests exploring processing speed, normative datasets and cut-offs and have been validated in BD samples
attention/vigilance, working memory, verbal learning, visual learning, (Keefe et al., 2014; Lee et al., 2018; Rossetti et al., 2019; Rossetti et al.,
reasoning and problem-solving and social cognition. The tests for the 2022). Overall, the battery demonstrated good sensitivity to the cogni­
final consensus battery were selected based on test-retest reliability, tive impairment of BD patients in both affective and non-affective do­
utility as a repeated measure, relationship to functional outcome, and mains and good test-retest reliability and comparability of the parallel
practicality and tolerability (Nuechterlein et al., 2008). Van Rheenen forms of specific subtests (Keefe et al., 2014). However, the relatively
and Rossell (2014) found good discriminant validity between BD pa­ long administration time (≥30 min) (Keefe et al., 2014) and the high
tients and HC for 3 out of the 7 MCCB cognitive domains (i.e., working cost may limit its application in clinical practice.
memory, visual learning, and processing speed), and global cognition.
Notably, when the domain analysis was re-run including the two exec­
3.1. Overview of risk of bias assessment
utive measures recommended by the ISBD, the authors found a trend-
level impairment for executive functioning in the patient group that
The risk of bias was inconsistent across the reviewed literature. All or
was primarily driven by performance on the TMT-B (Van Rheenen and
almost all studies clearly defined their objectives, study population,
Rossell, 2014). The study proved the suitability of the MCCB for BD
exposure measures and sample recruitment criteria; and reliably
patients, although it is advisable to include additional tests of executive
assessed the defined outcome variable, i.e., the psychometric properties

362
M.G. Rossetti et al. Journal of Affective Disorders 338 (2023) 358–364

of the cognitive tools. In contrast, only a few studies provided an Declaration of competing interest
explanation or justification for the chosen sample size (n = 1) or
consistently measured the effect size of their findings (n = 5). Similarly, None.
very few studies (15 %) controlled for the effect of confounding vari­
ables (e.g., medication status, education, BD type, BD state). An overall Acknowledgements
assessment across studies, based on an average of all study-wise quality
scores, indicated that the literature examining the psychometric prop­ None.
erties of tools commonly used for cognitive assessment in BD to date is of
‘fairly good’ (average score = 0.7). Refer to Table S1 for complete
References
details.
Bakkour, N., Samp, J., Akhras, K., El Hammi, E., Soussi, I., Zahra, F., Duru, G., Kooli, A.,
4. Discussion Toumi, M., 2014. Systematic review of appropriate cognitive assessment instruments
used in clinical trials of schizophrenia, major depressive disorder and bipolar
disorder. Psychiatry Res. 216, 291–302.
In this review we summarized and discussed the psychometric properties
of tools commonly used for cognitive assessment in BD. Overall, the
results show that both brief cognitive screeners (COBRA, SCIP, THINC-it,
ICAT) and comprehensive neuropsychological batteries (MCCB, BACA) have
moderate-to-high validity and sensitivity and may be appropriate to
detect deficits or to monitor changes over time in the cognitive
functioning of BD patients. However, the findings from this review
should be considered in light of several limitations. Specifically,
differences in the studies’ design, BD sample size and characteristics
(e.g., BD type, clinical state, pharmacotherapy, duration of illness,
symptom severity, number of mood episodes), and the statistical
approaches precluded a direct comparison between studies. Similarly, the
cognitive tools considered have heterogeneous structures and address
partially different cognitive domains. Accordingly, this review
highlights the need to conduct future studies that directly compare the
validity and reliability of the different cognitive tools by using
larger and better-characterized BD samples and longitudinal designs.
Furthermore, the results of this review are limited to the psychometric
properties of instruments that assess ‘cold’ cognition, such as memory,
attention and executive functions. Therefore, as more data become
available, further research should be conducted to examine the
psychometric properties of tools targeting affective and social
cognition, with the ultimate goal of better informing clinical practice.
To conclude, our review showed that the most commonly used tools to
evaluate cognition in BD seem sensitive enough to detect cognitive
changes in bipolar patients. However, the optimal tool remains to be
established. The applicability and clinical utility of the cognitive
tests may depend on multiple factors such as the purpose of the
evaluation, the target population, or the resources available. In
general, web-based instruments are expected to become the first-choice
tool for cognitive screening, as they can be applied at a large scale
and are more affordable than paper-and-pencil tools. Similarly, the BACA
may be suitable for a second-level assessment, as it measures both
affective and non-affective cognition and shows robust (although
preliminary) psychometric properties.

We suggest that all healthcare professionals working with BD patients
regularly consult the most up-to-date recommendations on the cognitive
tools available for the assessment of cognition in BD, published on the
ISBD website (https://www.isbd.org/Cognitive-Assessments).
Supplementary data to this article can be found online at
https://doi.org/10.1016/j.jad.2023.06.026.

CRediT authorship contribution statement

CP and MGR designed the study. MGR and FG conducted the literature
search and wrote the first version of the manuscript. All authors
contributed to revising the manuscript critically for final approval.

Funding

This study was partially supported by a grant from the Italian Ministry
of Health (GR-2016-02361283 to CP).


