Psychometrics of Mayer-Salovey-Caruso
Emotional Intelligence Test (MSCEIT) Scores
Michael T. Brannick, Monika M. Wahi, and Steven B. Goldin
University of South Florida
Summary. A sample of 183 medical students completed the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT V2.0). Scores on the test were examined for evidence of reliability and factorial validity. Although Cronbach's alpha for
the total scores was adequate (.79), many of the scales had low internal consistency
(scale alphas ranged from .34 to .77; median=.48). Previous factor analyses of the
MSCEIT are critiqued and the rationale for the current analysis is presented. Both
confirmatory and exploratory factor analyses of the MSCEIT item parcels are reported. Pictures and faces items formed separate factors rather than loading on
a Perception factor. Emotional Management appeared as a factor, but items from
Blends and Facilitation failed to load consistently on any factor, rendering factors
for Emotional Understanding and Emotional Facilitation problematic.
Emotional intelligence (EI) refers to individual differences in the capacity to perceive emotions, to use emotions in productive ways, and to
understand and regulate emotions (Mayer, Salovey, Caruso, & Sitarenios,
2003). Emotional intelligence, defined in this manner, is an intellectual
capacity that connects reasoning with feeling (the typical distinction between trait and ability EI is avoided; see O'Sullivan, 2007). Mayer, Salovey,
Caruso, and colleagues (e.g., Mayer & Salovey, 1997; Mayer, et al., 2003)
have developed questionnaires for evaluating EI; their most recent version
is called the MSCEIT V2.0 (Mayer, Salovey, & Caruso, 2002; Mayer, et al.,
2003). The purpose of this paper is to provide information regarding the
reliability and factor validity of the MSCEIT V2.0 test scores.
Measurement Efforts
Numerous empirical articles evaluating the reliability and validity of various measures of EI have appeared in the research literature.
To date, several reviews of the literature have also been published (Matthews, Zeidner, & Roberts, 2002; MacCann, Matthews, Zeidner, & Roberts,
2003; Zeidner, Matthews, & Roberts, 2004; Conte, 2005; Spector & Johnson,
2006). One might characterize the general tone of the reviews as being
cautiously optimistic, but noting that some claims regarding EI have little
empirical support (Zeidner, et al., 2004).
The MSCEIT (Mayer, et al., 2002) provides scores representing four
branches of emotional intelligence. The branches are labeled Perception, Facilitation, Understanding, and Management.
Address correspondence to Michael T. Brannick, Psychology Department, PCD 4118G, University of South Florida, Tampa, FL 33620-7200 or e-mail (mbrannick@usf.edu).
DOI 10.2466/03.04.PR0.109.4.327-337
ISSN 0033-2941
Items that share a common stimulus are grouped into parcels for scoring; there are 39 such parcels in the test, with the number of parcels ranging from
three to seven per subscale (see Table 4). The item parcels computed by the
test-scoring algorithm should form the basis of the factor analysis of the
MSCEIT because these are the items that are actually scored and form
the basis of the individual reports. If the parcels correspond well to the
theory outlined in the test manual, then the covariances for the 39 parcels
should fit well into a four-factor model corresponding to the four branches of emotional intelligence.
Although there have been several previous factor analyses of the
MSCEIT, none has been based on the item parcels actually used by the scoring algorithm. Here, both confirmatory and exploratory analyses of the MSCEIT are presented to better understand its
quality and structure. These results are compared to those of previous
studies and the paper concludes with suggestions for further research and
development of tests of emotional intelligence.
Method
Participants and Procedure
Participants were 183 first- and second-year medical students enrolled
in a medical school in the Southeastern U.S. (108 or 59% women, 75 or 41%
men). The mean age of the sample was 23.7 yr. (SD=2.7) and frequencies
for self-reported race were Asian n=28, Black n=7, Hispanic n=19, White
n=112, and Not Reported n=17. Participants were volunteers in one of
two groups. The first group was offered lunch as part of completing the
surveys. The second group was offered a feedback session in which they
were given their scores, shown a presentation about the general meaning
of the scores, and offered individual consultation about their own scores if
they so desired. All participants were assured that their scores would never be considered in making decisions about them and would not be seen
by any faculty members responsible for assigning course grades. Participants were informed that the purpose of the study was to examine the relations between EI scores obtained in the first two years of medical school
(basic science years) and measures of performance during the last 2 yr. of
medical school (clinical years). The local institutional review board approved the study prior to data collection.
Item and Scale Scores
The MSCEIT was administered through the web portal provided by
Multi-Health Systems, Inc. (MHS). Responses were scored by MHS and
returned as computer files containing raw and adjusted scores for items
and scales; the scoring program is proprietary, and tests cannot be scored
by users but must be scored by MHS. The MSCEIT item scores are based
upon consensus scoring and upon a specific normative group. Consensus scoring means that a normative group of people was asked to respond to each item. A given proportion of that group chose each alternative
to the item (e.g., for Item 1, 50% may have chosen Option A, 10% Option
B, and so forth). A test taker's score would be the proportion of the normative group choosing that option, so if, for example, a person chose A for
Item 1, they would receive a raw score of .50.
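The consensus-scoring rule can be sketched in a few lines of Python. This is an illustration only: the actual MHS scoring algorithm is proprietary, and the option proportions below are invented.

```python
# Consensus scoring sketch: a respondent's raw score on an item is the
# proportion of the normative group that chose the same response option.
# The proportions below are invented for illustration.
norm_proportions = {
    "item_1": {"A": 0.50, "B": 0.10, "C": 0.25, "D": 0.15},
}

def consensus_score(item, option):
    """Proportion of the normative group choosing this option."""
    return norm_proportions[item][option]

score = consensus_score("item_1", "A")  # .50, as in the example above
```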
The general reference group was used for scoring rather than the expert group because both scoring methods yield very similar results, and
general consensus scoring is recommended for most applications (Mayer, et al., 2002, p. 17). The test administrator can choose a reference group
for each individual, which results in adjusted scores for that individual
(for example, different adjusted scores are available for men and women).
Adjusted scores were provided to individuals for feedback, but unadjusted raw scores were used for all analyses in this paper.
Item parcels. Although there are 141 items to which each examinee
provides responses, many items refer to a common stimulus. For example,
a single picture stimulus may require several different (item) responses.
Separate items requiring responses to a common stimulus form parcels.
The MSCEIT is scored by first taking the average of the raw scores for
items within the parcel to form parcel scores. Then, scores for the item parcels are averaged to form raw score scales. Thus, the raw score for a scale
is the average of a number of item parcels, each of which is the average of
one or more items. MHS provided the information necessary to compute
the item parcel scores (D. Logan, personal communication, March 3, 2008).
Such parcel scores form the basis for the reliability estimates and factor
analyses reported in the results (see Bagozzi & Heatherton, 1994, for additional information about factoring item parcels).
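The two-step averaging just described can be sketched as follows. The parcel names and item scores are invented; only the averaging scheme comes from the text above.

```python
from statistics import mean

# Invented raw item scores (consensus proportions), grouped by parcel.
parcel_items = {
    "A1": [0.50, 0.40, 0.45],  # items sharing one face stimulus
    "A2": [0.60, 0.30],
}

# Step 1: average items within each parcel to form parcel scores.
parcel_scores = {name: mean(items) for name, items in parcel_items.items()}

# Step 2: average the parcel scores to form the raw scale score.
scale_score = mean(parcel_scores.values())
```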
Results
Scale Descriptions and Relations
Scale descriptive statistics are shown in Table 1. There were a total of 179 complete responses to the MSCEIT; only these participants are included in the results reported below. Descriptive statistics
for the four Branches are shown in Table 2.
Disattenuated correlations are shown above the main diagonal in
both Table 1 and Table 2. If the factor structure hypothesized to produce
the test scores is correct, one would expect correlations among entities belonging to the same factor to be larger than the other entries. For
example, in Table 1 the correlation for Scales A and E (faces and pictures,
disattenuated r=.52) should be higher than other correlations in the same
row and column. However, such a pattern is only found for the scales in
Branch 3. Tables 1 and 2 show moderate positive correlations among all
the variables, which is consistent with a general factor.
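Disattenuation divides an observed correlation by the square root of the product of the two reliabilities. A minimal sketch (the numerical values are illustrative, not taken from the tables):

```python
from math import sqrt

def disattenuate(r_xy, alpha_x, alpha_y):
    """Correct an observed correlation for unreliability in both scales."""
    return r_xy / sqrt(alpha_x * alpha_y)

# Illustrative: an observed r of .40 between scales with alphas of
# .78 and .55 disattenuates to about .61.
r_corrected = disattenuate(0.40, 0.78, 0.55)
```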
Table 1. [Scale descriptive statistics and correlations; the tabled values were garbled in the source and are omitted.]
Note. n=179. Labels in parentheses indicate assignment of scales to branches. Coefficient alpha reported on the main diagonal. Correlations below the diagonal greater than or equal to .12 in absolute value are significant at p<.05. Correlations above the main diagonal are disattenuated for reliability. % Max. = the scale mean divided by the maximum possible scale score. *Items refers to the number of item parcels used in computing the raw scores and alpha.
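Coefficient alpha, reported on the diagonals of Tables 1 and 2, can be computed from parcel-level scores as below. This is the standard textbook formula, not the MHS implementation, and the data are invented.

```python
from statistics import variance

def cronbach_alpha(parcels):
    """Coefficient alpha; parcels is a list of equal-length score lists,
    one list per item parcel."""
    k = len(parcels)
    totals = [sum(scores) for scores in zip(*parcels)]
    item_variance_sum = sum(variance(p) for p in parcels)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Two perfectly parallel parcels yield alpha = 1.0.
alpha = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
```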
Factor Analyses
Confirmatory factor analyses. Two sets of confirmatory factor analyses (CFA) were computed on the scores. In the first set of computations,
four different models were computed. First, a single, general factor was
estimated. Next, the eight scale scores were hypothesized to show nonzero coefficients on relevant branches (this is equivalent to the analyses by
Mayer, et al., 2003, and initial analyses by Palmer, et al., 2005). Four correlated factors were hypothesized, one for each branch. Relevant scales were
hypothesized to show nonzero coefficients on a single branch (scales and
corresponding branches are shown in Table 1). A deliberately misspecified
model was also tested to provide a comparison of fit. The purpose of the
deliberately misspecified model is to provide descriptive fit indices for a
Table 2
Branch Score Correlations and Descriptive Statistics
Branch                         B1     B2     B3     B4
B1 Perceiving Emotions        .78    .61    .53    .20
B2 Facilitating Thought       .40    .55    .50    .66
B3 Understanding Emotions     .33    .26    .49    .39
B4 Managing Emotions          .14    .38    .21    .60
M                             .53    .48    .56    .44
SD                            .09    .06    .05    .05
Items*                         10     10     11      8
Note. n=179. Coefficient alpha reported on the main diagonal. All reported correlations are significant at p<.05. Correlations above the main diagonal are disattenuated for reliability. *Items refers to the number of item clusters used in computing the raw scores and alpha.
model with a similar number and pattern of fixed and free parameters to
the model hypothesized to be most likely a priori. In this deliberately misspecified model, Scales A and B were hypothesized to correspond to the
first factor, C and D to the second factor, and so on. The final model was
described as a nested model by Gignac (2007). This model contains a general factor plus a scale factor for each of the hypothesized factors. However, in the nested analysis, all of the factors are defined to be uncorrelated
with each other, so that both a general factor and a series of specific factors may be estimated. The data were fit to all models using LISREL 8.3
(Jöreskog & Sörbom, 1993). For all CFA solutions except the single-factor
solution, there was an estimation problem (see the top half of Table 3).
A second, parallel set of four confirmatory factor analyses was computed on scores from the item parcels. Again, the first model contained a
single, general factor. The next two models contained four correlated factors (one for each branch). Items were hypothesized to correspond only to
their respective branches, so that noncorresponding elements of the factor
pattern matrix were fixed at zero. That is, Branch 1 contained item clusters
Table 3
Fit Statistics for Confirmatory Factor Analyses
Scales only (8 variables): One Factor, χ²(20)=54.22; Four Oblique, χ²(14)=18.24a; Four Oblique Plus, χ²(14)=45.56a; Nested, χ²(12)=29.20b.
Item parcels (39 variables): One Factor, χ²(702)=1,247.25; Four Oblique, χ²(696)=1,086.60; Four Oblique Plus, χ²(696)=1,056.18; Nested, χ²(663)=941.37.
[The remaining fit indices were garbled in the source and are omitted.]
Note. aEstimation problem; matrix PHI was not positive definite. bEstimation problem; TD not identified. n=179. Four Oblique Plus refers to the deliberately misspecified model. Nested refers to the Gignac (2007) parameterization. RMSEA is the root mean squared error of approximation; RMSR is the root mean squared residual; GFI is goodness-of-fit index.
from Scales A and E, Branch 2 contained clusters from Scales B and F, and
so on. A second, deliberately misspecified CFA was computed for comparison. In this CFA, items were deliberately assigned to the wrong factor,
so that Branch 1 contained items from Scales A and B, Branch 2 contained
items from Scales C and D, and so on. Finally, a nested factor analysis was
computed that allowed for both a general factor and specific factors corresponding to each of the four branches in the test manual.
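The zero pattern of the hypothesized parcel-level model can be sketched as a simple lookup. The scale letters and branch assignments follow the text above; the dictionary keys and function name are our own.

```python
# Hypothesized assignment of scales (and their parcels) to branches,
# as described in the text: Branch 1 = Scales A and E, Branch 2 = B and F,
# Branch 3 = C and G, Branch 4 = D and H.
branches = {
    "B1_Perceiving": ["A", "E"],     # Faces, Pictures
    "B2_Facilitating": ["B", "F"],   # Facilitation, Sensations
    "B3_Understanding": ["C", "G"],  # Changes, Blends
    "B4_Managing": ["D", "H"],       # Management, Relations
}

def loading_is_free(scale, branch):
    """In the CFA, a loading is estimated freely only on the scale's own
    branch; all other pattern-matrix entries are fixed at zero."""
    return scale in branches[branch]
```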
Resulting fit statistics are shown in the bottom half of Table 3. As can
be seen in Table 3, the single-factor model and both oblique solutions fit
rather poorly, and the deliberately misspecified model fit slightly better
than the hypothesized model on some fit indices, and slightly worse on
others. The nested solution showed marginally good fit to the data. However, several of the scales showed weak loadings in the factor pattern matrix for either the general factor, the specific factor, or both. Taken together,
the results suggest that there are discrepancies between the authors' intentions and the factor structure of the MSCEIT. Confirmatory factor analysis is useful for testing theoretical propositions, but it was not designed to
reveal the structure of the data. Therefore, exploratory factor analysis was
used to better understand the number and nature of the factors underlying the item parcels.
Exploratory factor analyses. Exploratory factor analysis (EFA) was
computed using principal axis factoring with squared multiple correlations as prior communality estimates and promax rotation using SAS
9.1 (SAS Institute, Inc., 1985). There were five eigenvalues greater than
1; a scree plot suggested three and five factors might be reasonable. Rotated factor pattern matrices were examined (standardized regression coefficients) for interpretability for solutions ranging from one to five factors. The four-factor solution, which was found to be most interpretable,
is shown in Table 4. The maximum correlation among factors was .36
(full results are available from the senior author). The four-factor solution
showed total variance (sum of eigenvalues) of 14.07 and final communality estimates totaling 9.27. The variance of each factor after rotation and
eliminating the other factors was 2.34, 2.07, 2.07, and 1.31 for Factors one
through four, respectively.
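The retention decision described above (five eigenvalues greater than 1, with a scree plot suggesting three to five factors) follows the Kaiser criterion, sketched here with invented eigenvalues:

```python
# Eigenvalues of a reduced correlation matrix; invented for illustration.
eigenvalues = [5.2, 2.1, 1.8, 1.4, 1.1, 0.9, 0.7, 0.5]

# Kaiser criterion: retain factors whose eigenvalues exceed 1.
n_retained = sum(1 for ev in eigenvalues if ev > 1)

# Variance accounted for by the retained factors, relative to the
# total variance (sum of all eigenvalues).
explained = sum(ev for ev in eigenvalues if ev > 1) / sum(eigenvalues)
```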
Discussion
MSCEIT scores for a sample of medical students were examined for
evidence of reliability and factorial validity. Both reliability and validity
analyses showed problems, thus questioning the interpretability of scores
on the MSCEIT. Findings are next considered in greater detail.
Score Reliability
Of the eight scale scores, only two (faces and pictures) showed acceptable internal consistency.
Table 4
Promax Rotated Factor Pattern (Standardized Regression Coefficients)
Item parcels by scale: A1-A4 (Faces), B1-B5 (Facilitation), C1-C7 (Changes), D1-D5 (Management), E1-E6 (Pictures), F1-F5 (Sensations), G1-G4 (Blends), H1-H3 (Relations); Factors I-IV.
[The loading values were garbled in the source and are omitted.]
Scores at the Branch and Scale levels tended to be unreliable, incompatible with the observed factor structure, or both.
One obvious remedy to some of the problems with the MSCEIT is to
lengthen the scales. Correlations among the scale types were low enough
that individual scales may have different relations with criteria of interest,
and the authors suggest that future validation research should examine
whether different scales show different correlations with criteria such as
emotional difficulties, quality of teamwork, records of sales achievement,
and so forth. The authors would expect that the emotions attached to pictures are more culturally laden than emotions attached to faces (at least for
some emotional expressions; Ekman, 1993), and thus it is not surprising to
find separate factors for faces and pictures.
Cross-cultural comparisons of scales by item type would be an interesting avenue for research. Other item types may be useful as well. For example, voices could be used to convey emotions. Methodological research
on consensus scoring and item parcels as they relate to factor analysis
might be fruitful for researchers interested in individual differences. We
chose item parcels based on the way the MSCEIT is scored; however, others might explore alternative parcels (we are indebted to an anonymous
reviewer for this suggestion).
References
Matthews, G., Zeidner, M., & Roberts, R. (2002) Emotional intelligence: science and myth. Cambridge, MA: The MIT Press.
Mayer, J. D., & Salovey, P. (1997) What is emotional intelligence? In P. Salovey & D. Sluyter (Eds.), Emotional development and emotional intelligence: educational implications. New York: Basic Books. Pp. 3-31.
Mayer, J. D., Salovey, P., & Caruso, D. R. (2002) Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) user's manual. North Tonawanda, NY: Multi-Health Systems.
Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2003) Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97-105.
Mulaik, S. A., & Millsap, R. E. (2000) Doing the four-step right. Structural Equation Modeling, 7, 36-73.
O'Sullivan, M. (2007) Trolling for trout, trawling for tuna: the methodological morass in measuring emotional intelligence. In G. Matthews, M. Zeidner, & R. D. Roberts (Eds.), The science of emotional intelligence. New York: Oxford Univer. Press. Pp. 258-287.
Palmer, B. R., Gignac, G., Manocha, R., & Stough, C. (2005) A psychometric evaluation of the Mayer-Salovey-Caruso Emotional Intelligence Test Version 2.0. Intelligence, 33, 285-305.
Roberts, R. D., Schulze, R., O'Brien, K., MacCann, C., Reid, J., & Maul, A. (2006) Exploring the validity of the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) with established emotion measures. Emotion, 6, 663-669.
Rossen, E., Kranzler, J. H., & Algina, J. (2008) Confirmatory factor analysis of the Mayer-Salovey-Caruso Emotional Intelligence Test V 2.0 (MSCEIT). Personality and Individual Differences, 44, 1258-1269.
SAS Institute, Inc. (1985) SAS user's guide: statistics, Version 5. Cary, NC: Author.
Spector, P. E., & Johnson, H. M. (2006) Improving the definition, measurement, and application of emotional intelligence. In K. R. Murphy (Ed.), A critique of emotional intelligence: what are the problems and how can they be fixed? Mahwah, NJ: Erlbaum. Pp. 325-344.
Zeidner, M., Matthews, G., & Roberts, R. D. (2004) Emotional intelligence in the workplace: a critical review. Applied Psychology: An International Review, 53, 371-399.
Accepted June 15, 2011.