
Psychiatry Research 262 (2018) 63–69


Reliability of two social cognition tests: The combined stories test and the social knowledge test

Élisabeth Thibaudeau (a,b), Caroline Cellard (a,b), Maxime Legendre (a,b), Karèle Villeneuve (a,b), Amélie M. Achim (a,c)

a CERVO Brain Research Center, 2601 Chemin de la Canardière, G1J 2G3 Québec, Canada
b École de Psychologie, Université Laval, Pavillon Félix-Antoine-Savard, 2325 Allée des Bibliothèques, G1V 0A6 Québec, Canada
c Département de Psychiatrie et Neurosciences, Université Laval, Pavillon Ferdinand-Vandry, 1050 Avenue de la Médecine, local 4873, G1V 0A6 Québec, Canada

ARTICLE INFO

Keywords: Psychometric properties; Test-retest reliability; Inter-rater reliability

ABSTRACT

Deficits in social cognition are common in psychiatric disorders. Validated social cognition measures with good psychometric properties are necessary to assess and target social cognitive deficits. Two recent social cognition tests, the Combined Stories Test (COST) and the Social Knowledge Test (SKT), respectively assess theory of mind and social knowledge. Previous studies have shown good psychometric properties for these tests, but the test-retest reliability has never been documented. The aim of this study was to evaluate the test-retest reliability and the inter-rater reliability of the COST and the SKT. The COST and the SKT were administered twice to a group of forty-two healthy adults, with a delay of approximately four weeks between the assessments. Excellent test-retest reliability was observed for the COST, and good test-retest reliability was observed for the SKT. There was no evidence of a practice effect. Furthermore, excellent inter-rater reliability was observed for both tests. This study shows good reliability of the COST and the SKT, which adds to the good validity previously reported for these two tests. These good psychometric properties thus support the COST and the SKT as adequate measures for the assessment of social cognition.

1. Introduction

Social cognition is defined as the mental processes underlying social interactions, including the abilities involved in perceiving and interpreting social information in order to guide social interactions (Green et al., 2015; Pinkham, 2014). Deficits in social cognition are common in several psychiatric or neurological disorders, including bipolar disorder (Bora et al., 2016; Samamé et al., 2015), autism (Chung et al., 2014), traumatic brain injury (Martín-Rodríguez and León-Carrión, 2010; Mcdonald, 2013) and schizophrenia (Achim et al., 2012; Savla et al., 2013). Social cognitive deficits are well-documented in schizophrenia and are associated with important difficulties in social and occupational functioning (Achim et al., 2013b; Addington et al., 2006; Bell et al., 2009; Fett et al., 2011; Lahera et al., 2012). Thus, there is an increased interest to develop and validate treatments targeting these difficulties (Pinkham et al., 2014; Schaafsma et al., 2015). The importance of social cognition was for instance recognized over a decade ago by the group of experts taking part in the Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) initiative (Nuechterlein et al., 2008). However, this group of experts was faced with the challenge that very few social cognition tests are standardized or have had their psychometric properties documented.

A subsequent major initiative in schizophrenia, the Social Cognition Psychometric Evaluation Study (SCOPE) (Pinkham et al., 2017, 2016, 2014), had the objectives (1) to achieve a consensus on the most important social cognitive domains in schizophrenia, (2) to identify the most promising social cognition measures available in the experimental psychology literature for each of these domains and (3) to test the psychometric properties of the most promising tasks and make recommendations for the choice of social cognition measures, notably for future clinical trials. This initiative led to a consensus on four domains of social cognition that were important for schizophrenia: emotion processing, attributional style/bias, social perception/knowledge and theory of mind (ToM) (Pinkham et al., 2014). ToM and social knowledge are two domains for which there is a significant lack of standardized measures. ToM can be defined as the ability to represent or infer the mental states of others (e.g. their intentions, emotions or beliefs) (Achim et al., 2013a; Pinkham et al., 2014). Social perception can be


Correspondence to: 2601, de la Canardière (F-4561-4), Québec, QC G1J 2G3, Canada.
E-mail addresses: elisabeth.thibaudeau.2@ulaval.ca (É. Thibaudeau), Caroline.Cellard@psy.ulaval.ca (C. Cellard), maxime.legendre.1@ulaval.ca (M. Legendre),
karele.villeneuve.1@ulaval.ca (K. Villeneuve), amelie.achim@fmed.ulaval.ca (A.M. Achim).

https://doi.org/10.1016/j.psychres.2018.01.026
Received 30 September 2017; Received in revised form 23 December 2017; Accepted 12 January 2018
Available online 12 January 2018
0165-1781/ © 2018 Published by Elsevier B.V.

defined as the ability to decode and interpret social cues in others, which includes social knowledge and social context processing (Pinkham et al., 2014).

The SCOPE initiative (Pinkham et al., 2014) targeted three promising ToM tasks for which the psychometric properties were assessed in people with schizophrenia and in healthy participants: the Reading the Mind in the Eyes Test (RMET) (Baron-Cohen et al., 2001), the Hinting task (Corcoran et al., 1995), and The Awareness of Social Inferences Test, Part III (TASIT) (McDonald et al., 2003). In the subsequent phases of SCOPE, improvements were made to the three ToM tasks and the psychometric properties of these improved versions were assessed (Pinkham et al., 2017). More specifically, for the RMET, definitions were added for each of the words provided as response choices. For the Hinting task, the scoring criteria were refined. For the TASIT, a counterbalanced administration of test forms across sessions was used and the response time was considered as an additional measure of interest. Despite these modifications, the RMET still showed limited internal consistency and was classified as “acceptable with reservations” by the authors of SCOPE (Pinkham et al., 2017). The Hinting task showed a practice effect, a moderate test-retest reliability and a limited internal consistency for both patients and healthy participants, and the task was considered acceptable (Pinkham et al., 2017). The TASIT showed moderate test-retest reliability both in healthy participants and in people with schizophrenia and was classified as acceptable with reservations (Pinkham et al., 2017).

For social perception, the most promising task initially identified through the SCOPE initiative was the Relationships Across Domains (RAD) (Sergi et al., 2009). However, despite previous reports of adequate psychometric properties for the RAD (Sergi et al., 2009), the results from SCOPE revealed limited test-retest reliability and internal consistency in healthy participants, as well as a practice effect in people with schizophrenia and a lower tolerability for both people with schizophrenia and healthy participants (Pinkham et al., 2014). Therefore, in the final phase of SCOPE, two new measures of social perception were evaluated: the Mini Profile of Nonverbal Sensitivity (MiniPONS) (Bänziger et al., 2011) and the Social Attribution Task – Multiple Choice (SAT-MC) (Bell et al., 2010). The MiniPONS showed limited test-retest reliability in healthy participants and limited internal consistency in both people with schizophrenia and healthy participants, and participants from both groups rated the MiniPONS as the least pleasant task (Pinkham et al., 2017). Hence, the MiniPONS was classified as “not recommended” (Pinkham et al., 2017). The SAT-MC showed limited test-retest reliability and limited internal consistency both for people with schizophrenia and healthy participants and was also classified as “not recommended” by the authors of SCOPE (Pinkham et al., 2017).

While the SCOPE initiative improved our knowledge regarding the psychometric properties of various social cognitive tasks, several limitations were observed in relation to the tasks available to assess ToM and social knowledge. Thus, further research is needed to identify social cognition tasks with good validity and reliability.

In the context of previous studies, our team developed and used two social cognition tests, the Combined Stories Test (COST) (Test des histoires combinées) and the Social Knowledge Test (SKT) (Test des Situations) (Achim et al., 2012, 2013b; Lavoie et al., 2014, 2016). The COST assesses the ability to infer different types of mental states through open questions that require taking the mental states of the characters into account (Achim et al., 2012). The COST was created by combining items adapted from four well-known ToM tasks, as well as adding new items. Two important advantages of the COST are that it includes a greater number of items than other story-based tasks and that it targets a wide range of mental states (e.g. beliefs, intentions, emotions).

The SKT assesses social knowledge about mental states, or more specifically about emotions typically experienced in given situations. Contrary to ToM tasks, there is no character performing any action or expressing anything in this test; participants are rather told about generic situations that can happen in daily life and are asked to determine how most people would feel in these situations.

In a previous paper with healthy participants and individuals with recent onset schizophrenia, we reported good psychometric properties for the COST, including a good convergent validity and internal consistency, as well as an excellent inter-rater reliability (Achim et al., 2012). Using the definition of ceiling effect used for the SCOPE study (i.e. a perfect score on the test), no ceiling effects were observed. Furthermore, additional studies showed normal distributions even in healthy participants (e.g. Achim et al., 2012; Gaudreau et al., 2015; Lavoie et al., 2014). We also recently documented that the COST is sensitive to changes in performance following a cognitive remediation therapy. Finally, a split-half analysis with the COST revealed an excellent estimated test-retest reliability (unpublished results). For the SKT, we reported an excellent inter-rater reliability and a moderate internal consistency (Achim et al., 2012).

Overall, the COST and the SKT have shown good psychometric properties and could represent informative tests to assess ToM and social knowledge. However, the test-retest reliability of these tests remains to be documented. Furthermore, in the initial evaluation of the inter-rater reliability (Achim et al., 2012), one of the two judges was the person who had built the scoring grid (i.e. AMA). Therefore, it would be useful to examine the inter-rater reliability of the two tests with two naïve judges.

The first objective of this study was to evaluate the reliability of the COST and the SKT in a group of healthy adults, targeting both inter-rater reliability and test-retest reliability. Based on pilot results (i.e. a split-half analysis using previous data), we expected an excellent test-retest reliability for the COST and a moderate to good test-retest reliability for the SKT. We also hypothesized that the excellent inter-rater reliability previously reported in Achim et al. (2012) would be replicated.

A second objective was to use the information about the reliability of each item included in our tests in order to improve our scoring grids, which we identified as a useful step that could further improve the reliability of the tests before documenting their reliability in psychiatric populations.

2. Methods

2.1. Participants

Forty-three participants were recruited through an email sent to students and employees at Université Laval. The inclusion criteria were: 1) age between 18 and 45 years (this range was chosen since we observed different patterns of response in individuals above 50 years old for the SKT in Lajoie et al. (2014)); 2) French as the first language or the language in which elementary school and high school were completed. Exclusion criteria were: 1) current use of a psychoactive medication; 2) a diagnosis of a current psychiatric disorder; 3) a first degree relative with psychosis or bipolar disorder; 4) a neurological disorder; 5) a physical disability that could interfere with the testing; 6) an estimated verbal intellectual quotient (IQ) less than 70; 7) a history of traumatic brain injury; 8) drug or alcohol abuse or dependence.

One participant was excluded from the study because he presented with an estimated IQ below 70 (estimated IQ of 65), leaving 42 participants. Sociodemographic information is presented in Table 1. Two of these participants could not be included in the test-retest reliability analyses because they did not complete the second session, but they were included in the analyses for the inter-rater reliability. This study was approved by the ethics committee of the Centre intégré universitaire de santé et de services sociaux de la Capitale-Nationale (CIUSSS-CN), and all participants provided written informed consent.


Table 1
Sociodemographic information (N = 42).

                                     Mean (SD)      Range
Sex (M/F)                            15/27          –
Age (years)                          25.1 (5.8)     20–43
Education (years)                    16.9 (2.7)     13–25
Estimated verbal IQ                  99.5 (12.4)    78–129
Delay between assessments (days)     27.6 (4.2)     20–35

2.2. Material

2.2.1. The Combined Stories Test (COST) (Achim et al., 2012)

The COST includes 30 stories adapted from the Hinting Task (Corcoran et al., 1995), the False Belief Task (Baron-Cohen, 1989; Frith and Corcoran, 1996), the Faux-pas task (Baron-Cohen et al., 1999) and the Strange Stories Test (Happé, 1994) as well as new items. Participants are to read each story out loud and are asked to answer ToM and control questions. They are encouraged to refer back to the written stories if needed, which is done to minimize the impact of memory load. The first story is a practice item, and feedback is provided only for that first item. The COST takes approximately 30 min to complete. Here, the responses were rated by two independent judges who received a short training to use the scoring grid that was validated in a previous study (Achim et al., 2012) and used in additional studies (Achim et al., 2013b; Gaudreau et al., 2015; Lavoie et al., 2014, 2016; Tousignant et al., 2016). The COST was initially designed and validated in French, and the French version was used for the current study. More details about the different types of questions, as well as examples, are presented in Table 2, and further information can be found in Achim et al. (2012).

2.2.2. The Social Knowledge Test (SKT) (Achim et al., 2012)

Participants are verbally presented with a series of short hypothetical situations inspired by those of Blair and Cipolotti (2000). For each situation, the participant is asked to state the emotion that would be felt by most people in that situation. The participant thus needs to disregard how he would feel and instead think about how most people would feel. This test is classified as a social knowledge task since there is no action or reaction by any character. This requirement to reflect about others without being presented with any specific behavior from others distinguishes social knowledge tasks from emotion recognition tasks that typically involve the recognition of other people's expressed emotion and from ToM tasks that systematically involve understanding the actions or reactions of the characters. In the SKT, no action or verbalisation is presented that a character would have done or said, and hence participants who complete the task have to rely on what they know about these types of situations in order to provide an answer about the most likely reaction expected for each situation.

For the current study, the answers were rated by two independent judges who received a short training to use the scoring grid validated in Achim et al. (2012) and then used in subsequent studies (Achim et al., 2013b, 2012; Gaudreau et al., 2015; Tousignant et al., 2016). The SKT takes less than five minutes to complete. The details of the test, as well as an example, are presented in Table 2, and further information can be found in Achim et al. (2012).

2.2.3. Mill-Hill, part B (Deltour, 2005)

Estimated verbal IQ was measured with the Mill-Hill part B. In this test, participants are to select, from a list of six words, the synonym of a target word. This test was administered only at the first session and was included to ensure that the participants presented with an estimated verbal IQ above 70.

Table 2
Details of the COST and the SKT.

Test | Type of question | Process | Fictitious example | Scoring
COST | QM | Second-order ToM that requires taking the character's mental states into account. | What does Mary really mean? | 26 questions for a maximum of 52 points. The answers to 18 questions are scored 0, 1 or 2 points and answers to 8 questions are scored 0 or 2 points.
COST | QR | Non-social reasoning items that require the inference of the cause of an event in the physical world, that is not linked to the actions or thoughts of a character. | Why did the painting fall off the wall? | 6 questions for a maximum of 12 points. The answers are scored 0, 1 or 2 points.
COST | QC | Ability to extract information that is available in the text to control for memory or attentional impairment that could impact ToM. | What was the color of the door in the story? | 29 questions for a maximum of 29 points. The answers are scored 0 or 1 point.
COST | Q1 | First-order ToM to evaluate the capacity to link a mental state to a behavior. | Where is Jane going to look for her doll? |
COST | QSFP | The QSFP are included so that the answer to a faux-pas question is not always yes. | Has anyone said something he should not have said? |
SKT | | Short situations are presented. The participant is to state the emotion that would be felt by most people in this situation. | Somebody who has a big argument with his friend. | 14 situations for a maximum of 14 points. The answers are scored 0 or 1 point.

Scoring entries for the Q1 and QSFP rows: 3 questions for a maximum of 6 points. The answers are scored 0 or 2 points. 2 questions for a maximum of 2 points. The answers are scored 0 or 1 point.

COST = Combined Stories Test; SKT = Social Knowledge Test; QM = Second-order ToM questions; QR = Non-social reasoning questions; QC = Control questions; Q1 = First-order ToM questions; QSFP = Without faux-pas questions.


2.3. Procedure

Participants completed two assessments separated by a period of three to five weeks. These assessments included the COST, the SKT, the Mill-Hill as well as additional measures that are not discussed in the current paper.

2.4. Statistical analysis

For all analyses, the significance threshold was set at p < 0.05. For the COST, our scores of interest were the QM and QR items (see Table 2). The other control questions were not targeted by the analyses since they are included to verify that the participant is able to extract relevant information from the stories (which is typically the case for healthy participants), which leads to almost no variability in the scores. For the COST-QM, COST-QR and SKT, distributions were first examined to identify potential floor and ceiling effects, and the normality of the distributions was evaluated with skewness and kurtosis statistics. Cohen's kappas were then used to evaluate the inter-rater reliability of our measures between the two independent judges (including data from both testing sessions). The percentage of agreement was also calculated for each individual item, and the scoring grid was clarified for the items showing a percentage of agreement below a Z score of −2. We then asked two naïve judges to re-score these items using the clarified grid, and we updated our analyses. The final ratings were used as the consensus score for the subsequent test-retest reliability analyses.

Test-retest reliability was evaluated with intra-class correlations between the two sessions. Paired-sample t-tests and Cohen's ds were calculated to examine potential practice effects. To ensure that the consensus score did not inflate the test-retest reliability, we additionally examined test-retest reliability using the score of only one of the two judges (before consensus, but after correction of the scoring grids). In addition, the percentage of consistency of each individual item between the first and the second session was examined. The scoring grid was clarified for items with a percentage below a Z score of −2, and we asked two naïve judges to score the answers to these items using the new grid. The same analyses were then repeated.

Finally, a correlation between the COST-QM and the SKT was conducted to determine the extent of the association between the two constructs.

3. Results

3.1. The COST

The COST-QM and the COST-QR showed normal distributions.

3.1.1. Inter-rater reliability

The Cohen's kappa for the COST-QM items with three possible scores (2, 1 or 0) revealed a strong inter-rater agreement (κ = 0.75, p < 0.001), while the COST-QM items with two possible scores (2 or 0) revealed an excellent inter-rater agreement (κ = 0.97, p < 0.001). For the COST-QR items, a strong inter-rater agreement was also observed (κ = 0.76, p < 0.001). The individual item analysis revealed that only one COST-QM item (item 11 – scored 0, 1 or 2 points) presented a weaker inter-rater agreement (78.8%). Following the clarification of the scoring grid, the agreement reached 90% for this item. The Cohen's kappa for the COST-QM items with three possible scores (2, 1 or 0) following this clarification was κ = 0.76, p < 0.001.

3.1.2. Test-retest reliability

For the COST-QM, the intra-class correlation between the two sessions revealed an excellent test-retest reliability (ρ = 0.84, p < 0.01), based on the classification proposed by Cicchetti (1994). The paired-sample t-test revealed no practice effect for the COST-QM (see Table 3). A percentage of consistency below two standard deviations was observed for three COST-QM items (see Table 4). Following the clarification of the scoring grid, a higher consistency was observed for these three items (see Table 4) and a higher overall intra-class correlation was observed (ρ = 0.86, p < 0.01). The test-retest reliability calculated using the ratings of only one judge (ρ = 0.85, p < 0.01) was very similar to the intra-class correlation observed after the consensus.

For the COST-QR, a poor, non-significant intra-class correlation was observed, and the paired-sample t-test revealed no practice effect (see Table 3). The percentage of consistency between the first and the second session for the QR items ranged from 74.5% to 94.9%, with no items below two standard deviations. The test-retest reliability calculated using the consensus score was in the low range (ρ = 0.15, p = 0.30), which suggests no inflation of the reliability compared to the low-medium intra-class correlation obtained with the ratings of only one judge (ρ = 0.30, p = 0.13).

3.2. The SKT

The distribution of the SKT scores was normal.

3.2.1. Inter-rater reliability

The inter-rater agreement for the SKT was almost perfect (κ = 0.99, p < 0.001). Eleven items reached a 100% agreement between the two judges, and agreement for the three other items ranged from 96.3% to 98.8%.

3.2.2. Test-retest reliability

A good test-retest reliability was observed for the SKT (ρ = 0.77, p < 0.01). Furthermore, there was no evidence of practice effect between the two sessions (see Table 3). A percentage of consistency below two standard deviations was observed for one item (see Table 4). Following the clarification of the scoring grid, a higher consistency for this item (see Table 4) and a higher overall intra-class correlation were observed (ρ = 0.78, p < 0.01). The test-retest reliability assessed using the ratings of only one of the two judges revealed a similarly high intra-class correlation (ρ = 0.76, p < 0.01).

3.3. Correlation between the COST-QM and the SKT

The correlation between the COST-QM and the SKT revealed a moderate and significant relationship (r = 0.36, p = 0.021).

4. Discussion

This study assessed the test-retest reliability and the inter-rater reliability of the COST and the SKT in healthy adults. Before the clarification of the scoring grid, good to excellent reliability was observed for both tests. The clarification of the scoring grid for some items enabled us to further improve these psychometric properties, which resulted in a good test-retest reliability, no practice effect and an excellent inter-rater reliability for the COST-QM (a measure of ToM) and the SKT (a measure of social knowledge). For the COST-QR (a control measure included in the COST), a strong inter-rater agreement and no practice effect were observed, but the non-significant intra-class correlation suggests that the test-retest reliability of this control measure would need to be improved. Combined with previous evidence of adequate validity for the COST and the SKT, the results of the current study nonetheless suggest that these two tests show good psychometric properties at least in healthy participants.

4.1. The COST to assess ToM

The good test-retest reliability of the COST-QM indicates that performance remains stable and that no practice effect is observed in the short-term. Given the important association between ToM and functioning and the difficulties in functioning in psychiatric populations


Table 3
Intra-class correlations and practice effects for the COST and the SKT following the clarification of the scoring grid.

                   T1                     T2                     ICC               Practice effect
Measure            Mean (SD)    Range     Mean (SD)    Range     ρ       p         t         p        d
COST – QM (/52)    47.4 (2.9)   39–51     47.6 (3.8)   37–52     0.86    < 0.01    −0.598    0.553    −0.06
COST – QR (/12)    11.1 (1.0)   9–12      11.3 (0.9)   7–12      0.15    0.30      −1.070    0.291    −0.21
SKT (/14)          11.3 (1.2)   9–14      11.2 (1.3)   9–14      0.78    < 0.01    0.598     0.553    0.08

COST = Combined Stories Test; QM = second-order theory of mind questions; QR = Non-social reasoning questions; SKT = Social Knowledge Test; T1 = Session 1; T2 = Session 2; ICC (ρ) = intra-class correlation; t = t-test; d = Cohen's d.
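
As an illustration of the quantities reported in Table 3, the following Python sketch computes an intra-class correlation, a paired-sample t statistic and Cohen's d from two sessions of scores. It is a minimal sketch under stated assumptions and not the analysis code used in this study: the ICC form (two-way, single-measure, consistency) is an assumption because the form that was used is not specified here, the function names are hypothetical, and the scores are made-up toy values rather than study data. Cohen's d is computed with the pooled SD of the two sessions, which is consistent with the rounded means, SDs and d values in Table 3.

```python
import numpy as np


def icc_consistency(t1, t2):
    """Two-way, single-measure, consistency ICC (assumed form) for two sessions."""
    x = np.column_stack([t1, t2]).astype(float)
    n, k = x.shape  # n participants, k = 2 sessions
    grand = x.mean()
    ss_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_sessions = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_error = ((x - grand) ** 2).sum() - ss_subjects - ss_sessions
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def paired_t(t1, t2):
    """Paired-sample t statistic for T1 minus T2 (df = n - 1)."""
    diff = np.asarray(t1, float) - np.asarray(t2, float)
    return diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size))


def cohens_d_pooled(t1, t2):
    """Cohen's d for T1 minus T2, using the pooled SD of the two sessions."""
    t1, t2 = np.asarray(t1, float), np.asarray(t2, float)
    pooled_sd = np.sqrt((t1.var(ddof=1) + t2.var(ddof=1)) / 2)
    return (t1.mean() - t2.mean()) / pooled_sd


# Toy scores for five hypothetical participants (not study data).
session1 = [10, 12, 11, 14, 9]
session2 = [11, 12, 10, 14, 10]
print("ICC:", round(icc_consistency(session1, session2), 3))
print("t:", round(paired_t(session1, session2), 3))
print("d:", round(cohens_d_pooled(session1, session2), 3))
```

A consistency ICC ignores a constant shift between sessions, which is why the practice effect is examined separately with the t-test and d.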

Table 4
COST-QM and SKT items for which a consistency below two standard deviations from the mean was initially observed between the two sessions.

Test and type of question    Items          Percentage of consistency for the item    Percentage of consistency with the clarified grid
COST – QM                    11. Ravioli    60.0%                                     80.0%
COST – QM                    20. Annoyed    64.0%                                     97.5%
COST – QM                    26. Vase       62.5%                                     70.0%
SKT                          11. Anger      75.0%                                     82.5%

COST = Combined Stories Test; QM = second-order theory of mind questions; SKT = Social Knowledge Test.
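
The item screening that produced Table 4 can be sketched as follows: compute, for each item, the percentage of participants who obtained the same score at both sessions, convert these percentages to Z scores across items, and flag items with a Z score below −2. This is a minimal sketch and not the procedure's actual code: the function name is hypothetical, the use of the sample SD (ddof=1) is an assumption, and the score matrices are toy values.

```python
import numpy as np


def flag_inconsistent_items(scores_t1, scores_t2, z_cut=-2.0):
    """Return per-item % consistency and indices of items with Z below z_cut.

    scores_t1, scores_t2: arrays of shape (participants, items).
    """
    s1, s2 = np.asarray(scores_t1), np.asarray(scores_t2)
    pct = (s1 == s2).mean(axis=0) * 100  # % of identical scores per item
    z = (pct - pct.mean()) / pct.std(ddof=1)  # sample SD is an assumption
    return pct, np.flatnonzero(z < z_cut)


# Toy data: 4 hypothetical participants, 10 items; item index 3 changes
# between sessions for half of the participants.
t1 = np.zeros((4, 10), dtype=int)
t2 = t1.copy()
t2[0, 3] = 1
t2[1, 3] = 1
pct, flagged = flag_inconsistent_items(t1, t2)
print(pct)
print(flagged)
```

With few items, a single deviant item cannot fall far below the mean in Z units, so the threshold is only informative when enough items are screened together.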

(Fett et al., 2011; Grove et al., 2016), it is important to have reliable ToM measures that are relatively stable across administrations. The inter-rater reliability was also evaluated and was excellent. These results confirmed the previous results of Achim et al. (2012) with healthy participants and showed that the scoring grid is well-constructed and exhaustive and allows naïve judges to use it correctly. These judges received a short training before using the grid, and this is representative of the type of training that researchers and clinicians who currently use the test have received. Thus, these results indicate that a brief training is sufficient for consistent scoring and proper use of the test.

The COST however may have some limitations. First, while the integration of non-social reasoning control questions is useful in order to determine if potential deficits extend to reasoning about non-social situations, the COST-QR showed a poor and non-significant test-retest reliability in the current study. A previous study of Achim et al. (2012) showed that COST-QR items significantly contribute to the COST-QM performance. Furthermore, non-social reasoning has been associated with ToM in multiple studies (Abdel-Hamid et al., 2009; Fernandez-Gonzalo et al., 2014; Pickup, 2008; Ziv et al., 2011). In the previous study by Achim et al. (2012), the performance of healthy participants on the COST-QR was similar to that in the current study, but the first-episode psychosis participants showed a lower performance, which suggests good discriminant validity between psychiatric and non-psychiatric populations. In the current sample, a cut-off score between a normal and an impaired performance can be set at 10, since a score below that cut-off represents a performance more than one standard deviation below the mean (i.e. approximately the 16th percentile). Using a cut-off score for the QR as a discriminant value between a psychiatric population and healthy people will however have to be formally tested. It could also be interesting to add more non-social reasoning questions or to increase the difficulty of these items to get a larger range of performance that could improve the test-retest reliability of the COST-QR. Another limitation of the COST is that the administration time varies between 20 and 30 min. This could prevent clinicians or researchers from using the test, given the limited time they often have with their patients. The creation of a shorter version of the test could thus be interesting. However, shorter tests with a limited number of items to assess ToM often present with ceiling effects and are not sensitive enough to detect subtler problems in ToM that could have an important impact on functioning.

The good test-retest reliability and inter-rater reliability of the COST-QM add to the good internal consistency and convergent validity documented by Achim et al. (2012). Taken together, these results reveal that the COST-QM is a valid and reliable measure of ToM with healthy participants. The results of the SCOPE study suggested some limitations in other ToM tasks such as limited internal consistency (RMET, Hinting task), limited test-retest reliability (Hinting task, TASIT) and practice effects (Hinting task) (Pinkham et al., 2017). Furthermore, few ToM tasks are translated or validated in French. ToM is an important treatment target in psychiatric populations, and clinical trials require the use of reliable measures to identify the efficacy of various interventions. The COST is therefore a suitable choice to measure ToM since it overcomes these psychometric limitations, which will however have to be confirmed in psychiatric populations. The COST is an interesting test for psychiatric populations such as people with bipolar disorder or schizophrenia, for which social cognition deficits are often observed simultaneously with deficits in various neurocognitive functions such as memory, attention or executive functions (Bora et al., 2011, 2009; Schaefer et al., 2013). The control questions included in the COST (i.e. COST-QR and COST-QC) can provide useful cues regarding the source of the difficulties observed during the ToM task. These questions can help to control for potential neurocognitive problems that could interfere with ToM performance such as difficulties with reasoning, problem solving, attention or memory. The COST is thus a comprehensive test that could be an interesting complement in a cognitive assessment to evaluate ToM while taking into account other cognitive abilities.

4.2. The SKT to assess social knowledge

The SKT revealed good test-retest reliability and no practice effect between the two sessions. As with the COST, the inter-rater reliability of the SKT was evaluated in order to confirm, with different judges, the results of the previous study of Achim et al. (2012). The inter-rater agreement was almost perfect, which indicates that the scoring grid is well-built and allows judges, with a short training, to correctly score the test.

The findings from the SKT show some limitations of the task, or at least some aspects that need to be considered. First, while the reliability of the SKT is good, a modest internal consistency was observed in the previous study of Achim et al. (2012). However, the composition of the test (i.e. the evaluation of distinct components regarding the knowledge of different mental states) and the dichotomous scoring could explain this limited internal consistency. Second, there is an effect of age that needs to be considered when interpreting the performance on the SKT.


A previous study showed that adults above 50 years old present with lower scores than younger adults (Lajoie et al., 2014). The validity and reliability of the SKT thus need to be evaluated in other age groups in the future.
In addition to the good psychometric properties documented in the current study, the SKT showed convergent validity with the COST-QM and was found to significantly contribute to the COST-QM performance in previous studies (Achim et al., 2012, 2013b). Furthermore, people with schizophrenia exhibit moderate deficits in social knowledge, according to the results of the meta-analysis of Savla et al. (2013). Taken together, this suggests the importance of assessing social knowledge in social cognition studies since deficits in this domain could impact ToM performance. In addition, most social knowledge tasks assess the knowledge about typical sequences in different social situations (Corrigan and Addis, 1995), while the SKT interestingly assesses the knowledge that people have about the link between typical situations and mental states. Given the difficulty that people with psychiatric disorders have in understanding other people's mental states, the SKT offers a novel perspective on the impact that social knowledge can have on ToM performance in these populations.

4.3. Limitations

Limitations of the tests were detailed in the previous sections. An additional aspect that needs to be considered in the interpretation of the results is the level of education of our participants. While it could have led to ceiling effects or practice effects given that high levels of education have been associated with the development of cognitive skills and knowledge necessary to solve verbal tasks (Ritchie et al., 2015), these problems were not encountered. It would nonetheless be useful to document these psychometric properties with samples with different sociodemographic characteristics, as well as in patients with different psychopathologies.

4.4. Conclusion

The results of the current study add to the good psychometric properties previously reported by Achim et al. (2012). The COST overcomes several psychometric limitations of the currently available ToM tasks, while the SKT provides a unique perspective on social knowledge regarding mental states. While good psychometric properties were previously documented for these tasks in people with first-episode psychosis, the test-retest reliability will also have to be assessed in schizophrenia or other clinical populations before drawing firm conclusions regarding the suitability of the tests for these populations. An ongoing study will enable us to document the test-retest reliability of these tests with individuals with schizophrenia, as well as that of the English version of the tests. A copy of the COST and the SKT can be obtained by contacting the corresponding author at the following email address: amelie.achim@fmed.ulaval.ca.

Acknowledgment

ET, AMA and CC were involved in the design of the study, the analysis, the interpretation of the data, the decision to submit the paper and the writing of the manuscript. ET, ML and KV were involved in the collection, the analysis, the interpretation of the data and the writing of the manuscript.

Conflict of interest

None

Role of the funding source

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) [grant number #435556-2013, 2013]. AMA is supported by a salary grant from Fonds de Recherche du Québec en Santé (FRQS) and ET is supported by a studentship also from FRQS. The funding sources were not involved in any step of this study, from the design to the writing of the manuscript.

References

Abdel-Hamid, M., Lehmkämper, C., Sonntag, C., Juckel, G., Daum, I., Brüne, M., 2009. Theory of mind in schizophrenia: the role of clinical symptomatology and neurocognition in understanding other people's thoughts and intentions. Psychiatry Res. 165, 19–26. http://dx.doi.org/10.1016/j.psychres.2007.10.021.
Achim, A.M., Ouellet, R., Roy, M.A., Jackson, P.L., 2012. Mentalizing in first-episode psychosis. Psychiatry Res. 196, 207–213. http://dx.doi.org/10.1016/j.psychres.2011.10.011.
Achim, A.M., Guitton, M., Jackson, P.L., Boutin, A., Monetta, L., 2013a. On what ground do we mentalize? Characteristics of current tasks and sources of information that contribute to mentalizing judgments. Psychol. Assess. 25, 117–126. http://dx.doi.org/10.1037/a0029137.
Achim, A.M., Ouellet, R., Lavoie, M.-A., Vallières, C., Jackson, P.L., Roy, M.-A., 2013b. Impact of social anxiety on social cognition and functioning in patients with recent-onset schizophrenia spectrum disorders. Schizophr. Res. 145, 75–81. http://dx.doi.org/10.1016/j.schres.2013.01.012.
Addington, J., Saeedi, H., Addington, D., 2006. Influence of social perception and social knowledge on cognitive and social functioning in early psychosis. Br. J. Psychiatry 189, 373–378. http://dx.doi.org/10.1192/bjp.bp.105.021022.
Bänziger, T., Scherer, K.R., Hall, J.A., Rosenthal, R., 2011. Introducing the MiniPONS: a short multichannel version of the profile of nonverbal sensitivity (PONS). J. Nonverbal Behav. 35, 189–204. http://dx.doi.org/10.1007/s10919-011-0108-3.
Baron-Cohen, S., 1989. The autistic child's theory of mind: a case of specific developmental delay. J. Child Psychol. Psychiatry 30, 285–297. http://dx.doi.org/10.1111/j.1469-7610.1989.tb00241.x.
Baron-Cohen, S., O’Riordan, M., Stone, V., Jones, R., Plaisted, K., 1999. Recognition of faux pas by normally developing children with Asperger syndrome or high-functioning autism. J. Autism Dev. Disord. 29, 407–418. http://dx.doi.org/10.1023/A:1023035012436.
Baron-Cohen, S., Wheelwright, S., Hill, J., Raste, Y., Plumb, I., 2001. The “Reading the Mind in the Eyes” Test revised version: a study with normal adults, and adults with Asperger syndrome or high-functioning autism. J. Child Psychol. Psychiatry 42, 241–251. http://dx.doi.org/10.1111/1469-7610.00715.
Bell, M., Tsang, H.W.H., Greig, T.C., Bryson, G.J., 2009. Neurocognition, social cognition, perceived social discomfort, and vocational outcomes in schizophrenia. Schizophr. Bull. 35, 738–747. http://dx.doi.org/10.1093/schbul/sbm169.
Bell, M.D., Fiszdon, J.M., Greig, T.C., Wexler, B.E., 2010. Social attribution test - multiple choice (SAT-MC) in schizophrenia: comparison with community sample and relationship to neurocognitive, social cognitive and symptom measures. Schizophr. Res. 122, 164–171. http://dx.doi.org/10.1016/j.schres.2010.03.024.
Blair, R.J.R., Cipolotti, L., 2000. Impaired social response reversal. A case of “acquired sociopathy”. Brain 123, 1122–1141. http://dx.doi.org/10.1093/brain/123.6.1122.
Bora, E., Bartholomeusz, C., Pantelis, C., 2016. Meta-analysis of Theory of Mind (ToM) impairment in bipolar disorder. Psychol. Med. 46, 253–264. http://dx.doi.org/10.1017/S0033291715001993.
Bora, E., Yucel, M., Pantelis, C., 2009. Cognitive endophenotypes of bipolar disorder: a meta-analysis of neuropsychological deficits in euthymic patients and their first-degree relatives. J. Affect. Disord. 113, 1–20. http://dx.doi.org/10.1016/j.jad.2008.06.009.
Bora, E., Yücel, M., Pantelis, C., Berk, M., 2011. Meta-analytic review of neurocognition in bipolar II disorder. Acta Psychiatr. Scand. 123, 165–174. http://dx.doi.org/10.1111/j.1600-0447.2010.01638.x.
Chung, Y.S., Barch, D., Strube, M., 2014. A meta-analysis of mentalizing impairments in adults with schizophrenia and autism spectrum disorder. Schizophr. Bull. 40, 602–616. http://dx.doi.org/10.1093/schbul/sbt048.
Cicchetti, D.V., 1994. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol. Assess. 6, 284–290. http://dx.doi.org/10.1037/1040-3590.6.4.284.
Corcoran, R., Mercer, G., Frith, C.D., 1995. Schizophrenia, symptomatology and social inference: investigating “theory of mind” in people with schizophrenia. Schizophr. Res. 17, 5–13. http://dx.doi.org/10.1016/0920-9964(95)00024-G.
Corrigan, P.W., Addis, I.B., 1995. The effects of cognitive complexity on a social sequencing task in schizophrenia. Schizophr. Res. 16, 137–144. http://dx.doi.org/10.1016/0920-9964(94)00072-G.
Deltour, J.J., 2005. Échelle de vocabulaire Mill Hill. Les Éditions du Centre de Psychologie Appliquée, Paris, France.
Fernandez-Gonzalo, S., Jodar, M., Pousa, E., Turon, M., Garcia, R., Rambla, Hernandez, Carla Palao, D., 2014. Selective effect of neurocognition on different theory of mind domains in first-episode psychosis. J. Nerv. Ment. Dis. 202, 576–582. http://dx.doi.org/10.1097/NMD.0000000000000164.
Fett, A.K.J., Viechtbauer, W., Dominguez, M., de, G., Penn, D.L., van Os, J., Krabbendam, L., 2011. The relationship between neurocognition and social cognition with functional outcomes in schizophrenia: a meta-analysis. Neurosci. Biobehav. Rev. 35, 573–588. http://dx.doi.org/10.1016/j.neubiorev.2010.07.001.
Frith, C.D., Corcoran, R., 1996. Exploring “theory of mind” in people with schizophrenia. Psychol. Med. 26, 521–530. http://dx.doi.org/10.1017/S0033291700035601.


Gaudreau, G., Monetta, L., Macoir, J., Poulin, S., Laforce, R.J., Hudon, C., 2015. Mental state inferences abilities contribution to verbal irony comprehension in older adults with mild cognitive impairment. Behav. Neurol. 2015. http://dx.doi.org/10.1155/2015/685613.
Green, M.F., Horan, W.P., Lee, J., 2015. Social cognition in schizophrenia. Nat. Rev. Neurosci. 16, 620–631. http://dx.doi.org/10.1038/nrn4005.
Grove, T.B., Tso, I.F., Chun, J., Mueller, S.A., Taylor, S.F., Ellingrod, V.L., McInnis, M.G., Deldin, P.J., 2016. Negative affect predicts social functioning across schizophrenia and bipolar disorder: findings from an integrated data analysis. Psychiatry Res. 243, 198–206. http://dx.doi.org/10.1016/j.psychres.2016.06.031.
Happé, F.G.E., 1994. An advanced test of theory of mind: understanding of story characters' thoughts and feelings by able autistic, mentally handicapped, and normal children and adults. J. Autism Dev. Disord. 24, 129–154. http://dx.doi.org/10.1007/BF02172093.
Lahera, G., Ruiz-Murugarren, S., Iglesias, P., Ruiz-Bennasar, C., Herreria, E., Montes, J.M., Fernandez-Liria, A., 2012. Social cognition and global functioning in bipolar disorder. J. Nerv. Ment. Dis. 200, 135–141. http://dx.doi.org/10.1097/NMD.0b013e3182438eae.
Lajoie, M.-P., Lavoie, M.-A., Vistoli, D., Achim, A.M., 2014. La tâche de cognition sociale Les Situations est-elle valide pour des adultes de différents groupes d’âge? In: 36e Congrès de La Société Québécoise Pour La Recherche En Psychologie. Montreal.
Lavoie, M.-A., Plana, I., Jackson, P.L., Godmaire-Duhaime, F., Bédard Lacroix, J., Achim, A.M., 2014. Performance in multiple domains of social cognition in parents of patients with schizophrenia. Psychiatry Res. 220, 118–124. http://dx.doi.org/10.1016/j.psychres.2014.07.055.
Lavoie, M.A., Vistoli, D., Sutliff, S., Jackson, P.L., Achim, A.M., 2016. Social representations and contextual adjustments as two distinct components of the Theory of Mind brain network: evidence from the REMICS task. Cortex 81, 176–191. http://dx.doi.org/10.1016/j.cortex.2016.04.017.
Martín-Rodríguez, J.F., León-Carrión, J., 2010. Theory of mind deficits in patients with acquired brain injury: a quantitative review. Neuropsychologia 48, 1181–1191. http://dx.doi.org/10.1016/j.neuropsychologia.2010.02.009.
McDonald, S., 2013. Impairments in social cognition following severe traumatic brain injury. J. Int. Neuropsychol. Soc. 19, 231–246. http://dx.doi.org/10.1017/S1355617712001506.
McDonald, S., Flanagan, S., Rollins, J., Kinch, J., 2003. TASIT: a new clinical tool for assessing social perception after traumatic brain injury. J. Head Trauma Rehabil. 18, 219–238. http://dx.doi.org/10.1097/00001199-200305000-00001.
Nuechterlein, K.H., Green, M.F., Kern, R.S., Baade, L.E., Barch, D.M., Cohen, J.D., Essock, S., Fenton, W.S., Frese, F.J., Gold, J.M., Goldberg, T., Heaton, R.K., Keefe, R.S.E., Kraemer, H., Mesholam-Gately, R., Seidman, L.J., Stover, E., Weinberger, D.R., Young, A.S., Zalcman, S., Marder, S.R., 2008. The MATRICS consensus cognitive battery, Part 1: test selection, reliability, and validity. Am. J. Psychiatry 165, 203–213. http://dx.doi.org/10.1176/appi.ajp.2007.07010042.
Pickup, G.J., 2008. Relationship between theory of mind and executive function in schizophrenia: a systematic review. Psychopathology 41, 206–213. http://dx.doi.org/10.1159/000125554.
Pinkham, A.E., 2014. Social cognition in schizophrenia. J. Clin. Psychiatry 75, 14–19. http://dx.doi.org/10.4088/JCP.13065su1.04.
Pinkham, A.E., Harvey, P.D., Penn, D.L., 2017. Social cognition psychometric evaluation: results of the final validation study. Schizophr. Bull. http://dx.doi.org/10.1093/schbul/sbx140.
Pinkham, A.E., Penn, D.L., Green, M.F., Buck, B., Healey, K., Harvey, P.D., 2014. The social cognition psychometric evaluation study: results of the expert survey and RAND Panel. Schizophr. Bull. 40, 813–823. http://dx.doi.org/10.1093/schbul/sbt081.
Pinkham, A.E., Penn, D.L., Green, M.F., Harvey, P.D., 2016. Social cognition psychometric evaluation: results of the initial psychometric study. Schizophr. Bull. 42, 494–504. http://dx.doi.org/10.1093/schbul/sbv056.
Ritchie, S.J., Bates, T.C., Deary, I.J., 2015. Is education associated with improvements in general cognitive ability, or in specific skills? Dev. Psychol. 51, 573–582. http://dx.doi.org/10.1037/a0038981.
Samamé, C., Martino, D.J., Strejilevich, S.A., 2015. An individual task meta-analysis of social cognition in euthymic bipolar disorders. J. Affect. Disord. 173, 146–153. http://dx.doi.org/10.1016/j.jad.2014.10.055.
Savla, G.N., Vella, L., Armstrong, C.C., Penn, D.L., Twamley, E.W., 2013. Deficits in domains of social cognition in schizophrenia: a meta-analysis of the empirical evidence. Schizophr. Bull. 39, 979–992. http://dx.doi.org/10.1093/schbul/sbs080.
Schaafsma, S.M., Pfaff, D.W., Spunt, R.P., Adolphs, R., 2015. Deconstructing and reconstructing theory of mind. Trends Cogn. Sci. 19, 65–72. http://dx.doi.org/10.1016/j.tics.2014.11.007.
Schaefer, J., Giangrande, E., Weinberger, D.R., Dickinson, D., 2013. The global cognitive impairment in schizophrenia: consistent over decades and around the world. Schizophr. Res. 150, 42–50. http://dx.doi.org/10.1016/j.schres.2013.07.009.
Sergi, M.J., Fiske, A.P., Horan, W.P., Kern, R.S., Kee, K.S., Subotnik, K.L., Nuechterlein, K.H., Green, M.F., 2009. Development of a measure of relationship perception in schizophrenia. Psychiatry Res. 166, 54–62. http://dx.doi.org/10.1016/j.psychres.2008.03.010.
Tousignant, B., Jackson, P.L., Massicotte, E., Beauchamp, M.H., Achim, A.M., Vera-Estay, E., Bedell, G., Sirois, K., 2016. Impact of traumatic brain injury on social cognition in adolescents and contribution of other higher order cognitive functions. Neuropsychol. Rehabil. 2011, 1–19. http://dx.doi.org/10.1080/09602011.2016.1158114.
Ziv, I., Leiser, D., Levine, J., 2011. Social cognition in schizophrenia: cognitive and affective factors. Cogn. Neuropsychiatry 16, 71–91. http://dx.doi.org/10.1080/13546805.2010.492693.
