
Intelligence 33 (2005) 285–305

A psychometric evaluation of the Mayer–Salovey–Caruso Emotional Intelligence Test Version 2.0


Benjamin R. Palmer a, Gilles Gignac b, Ramesh Manocha c, Con Stough a,*

a Centre for Neuropsychology, Swinburne University of Technology, P.O. Box 218, Hawthorn, Victoria 3122, Australia
b School of Psychology, Deakin University, Australia
c Barbara Gross Research Unit, Faculty of Medicine, University of New South Wales, Australia

Received 17 September 2003; received in revised form 8 June 2004; accepted 16 November 2004
Available online 12 February 2005

Abstract

There has been some debate recently over the scoring, reliability and factor structure of ability measures of emotional intelligence (EI). This study examined these three psychometric properties with the most recent ability test of EI, the Mayer–Salovey–Caruso Emotional Intelligence Test (MSCEIT V2.0; Mayer, Salovey, & Caruso, [Mayer, J. D., Salovey, P., & Caruso, D. R. (2000). Models of emotional intelligence. In R. J. Sternberg (Ed.), Handbook of intelligence (pp. 396–420). New York: Cambridge; Mayer, J. D., Salovey, P., & Caruso, D. R. (2000). The Mayer, Salovey, and Caruso emotional intelligence test: Technical manual. Toronto, ON: MHS]), with a sample (n=431) drawn from the general population. The reliability of the MSCEIT at the total scale, area and branch levels was found to be good, although the reliability of most of the subscales was relatively low. Consistent with previous findings, there was a high level of convergence between the alternative scoring methods (consensus and expert). However, contrary to Mayer et al.'s [Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2003). Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97–105.] contentions, there was only partial support for their four-factor model of EI. A model with a general first-order factor of EI and three first-order branch-level factors was determined to be the best fitting model. There was no support for the Experiential Area level factor, nor was there support for the Facilitating Branch level factor. These results were replicated closely using the Mayer et al. [Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2003). Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97–105.] data. The results are discussed in light of the close comparability of the two scoring methods.

* Corresponding author. Tel.: +61 3 9214 8167; fax: +61 3 9214 5230. E-mail address: cstough@swin.edu.au (C. Stough).
0160-2896/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.intell.2004.11.003


B.R. Palmer et al. / Intelligence 33 (2005) 285–305

Furthermore, the fundamental limitations of the MSCEIT V2.0 with respect to the inadequate number of subscales theorized to measure each branch-level factor are identified and discussed. © 2004 Elsevier Inc. All rights reserved.
Keywords: Emotional intelligence; Emotional competencies; Emotions; Factor structure; Reliability

1. Introduction

Very rarely do psychological constructs receive as widespread attention as the recently conceptualised construct of emotional intelligence (EI). EI has appeared on the cover of Time magazine (Gibbs, 1995), and is the topic of the most widely read social science book in the world (Goleman, 1995) and of many other popular books, magazine and newspaper articles (Mayer, Salovey, & Caruso, 2000). It has been argued to be a construct distinct from intellectual intelligence that may therefore add to knowledge of individual differences (Mayer et al., 2000) and possibly offer unique predictive validity in a wide variety of instances (Goleman, 1995).

Mayer and Salovey (1997) have conceptualised EI as a set of mental abilities concerned with emotions and the processing of emotional information. Accordingly, it has been argued that the most valid assessment of EI will be gained from ability-based scales that, like other tests of mental ability, involve items for which there are more and less correct answers and that assess individuals' capacity to reason with and about emotions (Mayer, Caruso, & Salovey, 2000). Over a series of studies Mayer et al. have designed and examined the reliability and validity of a number of ability-based measures of EI (Mayer, Caruso, & Salovey, 1999; Mayer, DiPaolo, & Salovey, 1990; Mayer & Geher, 1996). This work has culminated in their most recent ability-based test of EI, the Mayer–Salovey–Caruso Emotional Intelligence Test (MSCEIT; Mayer et al., 2000).

Independent psychometric evaluations of the MSCEIT are few in number, as it has only been available for a short period. However, there are conceptual, developmental and correlational criteria inherent within the theoretical framework of the ability model against which it can be evaluated.
Furthermore, there are research findings recently put forth by the authors (Mayer, Salovey, Caruso, & Sitarenios, 2003), and research findings with previous measures against which it can be compared (Ciarrochi, Chan, & Caputi, 2000; Mayer et al., 1999; Roberts, Zeidner, & Matthews, 2001).

1.1. Conceptual, developmental and correlational criteria for ability measures of EI

Mayer and Salovey's (1997) ability model of EI comprises four conceptually related abilities arranged hierarchically from the most basic to the more psychologically complex: (1) the ability to perceive emotions; (2) the ability to utilise emotion to facilitate reasoning; (3) the capacity to understand the meaning of emotions and the information they convey; and (4) the ability to effectively regulate and manage emotion. Within this hierarchical organisation, the abilities are proposed to develop sequentially, implying that they are a function of age and cognitive maturation (Mayer & Salovey, 1997). Consistent with this theoretical framework, measures of EI such as the MSCEIT are expected to: (a) show a positive manifold of correlations amongst the subscales designed to assess these four major areas; (b) show a consistent factor structure that comprises a general factor of EI and four correlated primary factors; and (c) show age-related differences that reflect the developmental perspective of the model (Mayer et al., 1999). As evidence that EI is an intelligence, ability measures are additionally expected to correlate positively with established measures of mental ability (as mental abilities typically do), such as those that index individuals' verbal intelligence (Mayer et al., 1999). Finally, EI has been theoretically related to several important life criteria that ability measures are expected to predict (Mayer, Salovey, & Caruso, 2000; Salovey & Mayer, 1990). These include variables such as psychological well-being, life satisfaction, empathy, the quality of interpersonal relationships, success in occupations that involve considerable reasoning with emotions (e.g., leadership, sales and psychotherapy) and scholastic and academic success.

Mayer et al. (Mayer et al., 1990, 1999, 2003; Mayer & Geher, 1996) and others (e.g., Ciarrochi et al., 2000; Roberts et al., 2001) have assessed the validity of ability measures of EI according to these conceptual, developmental and correlational criteria. Research with the predecessor measure to the MSCEIT, the Multi-factor Emotional Intelligence Scale (MEIS; Mayer et al., 1999), has provided preliminary evidence that EI meets some of the underlying conceptual, developmental and correlational criteria of Mayer and Salovey's (1997) ability model (Ciarrochi et al., 2000; Mayer et al., 1999; Roberts et al., 2001). This research has shown, for example, that the four abilities measured form a positively interrelated set, and that they correlate with other measures of established ability, e.g., verbal IQ (Mayer et al., 1999) and general intelligence (Roberts et al., 2001). The research by Mayer et al. (1999) also demonstrated age-related differences, with an adult criterion group scoring significantly higher on the MEIS than did an adolescent criterion group.
Research with the MEIS has also demonstrated that scores on the test are meaningfully correlated with theoretically related criteria such as life satisfaction, empathy and parental warmth (Ciarrochi et al., 2000; Mayer et al., 1999). Importantly, the study by Ciarrochi et al. (2000) demonstrated that scores on the MEIS were related to criterion measures (e.g., life satisfaction) even after controlling for IQ and personality.

Although these studies (Ciarrochi et al., 2000; Mayer et al., 1999; Roberts et al., 2001) provided promising evidence for the validity of the MEIS, the findings also revealed some psychometric problems with the test. Specifically, there were problems concerning the validity of the scoring methods (expert and consensus), the unacceptably low levels of internal consistency for some of the subscales, and the factor structure of the MEIS.

1.2. The Mayer–Salovey–Caruso Emotional Intelligence Test

The MSCEIT (Mayer et al., 2000) has been designed to improve upon the MEIS in these three areas (scoring, reliability and factor structure). Mayer et al. (2000) have developed a more well-founded expert scoring criterion for the MSCEIT, have attempted to improve the reliability of the test at the subscale level through item selection methods, and have sought to develop a test with a factor structure more consistent with the underlying theory of EI through a collection of new subscales. Whereas the expert scoring criterion for the MEIS was based on responses to the test provided by Mayer and Caruso, the expert scoring criterion for the MSCEIT is based on responses to the test items from 21 members of the International Society of Research in Emotion (ISRE). Recent analyses by Mayer et al. (2003) of the MSCEIT standardization data (n=2112) demonstrated a higher level of convergence between expert and consensus scoring methods (r=.908) than that found with the MEIS. Mayer et al. also reported that there was higher inter-rater reliability in identifying correct alternatives to the test items amongst the expert group than in a matched sample from the standardization group. Additionally, the standardization group as a whole obtained significantly higher scores on the Emotional Perception and Emotional Understanding subscales when scored with the expert scoring criterion than when scored with the consensus scoring criterion. Thus, Mayer et al. concluded that the expert scoring method may provide a more accurate criterion for identifying correct answers to the test items, particularly in the areas where the scientific study of emotion may have provided the expert group with greater institutionalised knowledge concerning emotions. These areas are emotional perception and emotional understanding, given that a great deal of emotion research has focused on coding emotional expressions (e.g., Ekman & Friesen, 1975; Scherer, Banse, & Wallbott, 2001) and delineating emotional understanding (e.g., Ortony, Clore, & Collins, 1988; cf. Mayer et al., 2003). Thus, the high level of convergence between the expert and consensus scoring methods found with the MSCEIT (Mayer et al., 2003) needs to be replicated. If the magnitude of correlation between the two methods is replicated, such findings may refocus, as stated by Mayer, Salovey, Caruso, and Sitarenios (2001), questions concerning the validity of the scoring protocols to "What does consensus mean?" and "Is this form of determining a correct answer much different than that used in cognitive intelligence tests?" (p. 236).

1.3. Reliability and factor structure of the MSCEIT

Like the MEIS, the MSCEIT has been designed to assess the four conceptually related abilities of Mayer and Salovey's (1997) ability model of EI. Scores on the MSCEIT represent three categories: (1) an Overall EI score reflecting a general level of EI; (2) two area scores, Experiencing EI, reflecting the ability to identify emotions and to assimilate emotions in thought, and Strategic EI, reflecting the ability to understand and manage emotions; and (3) four branch scores (each measured by two subtests) that assess the four primary abilities of Mayer and Salovey's model.
Reliability analyses of the MSCEIT with the standardization sample suggest that it has good internal consistency at the full-scale, area and branch level. Mayer et al. (2003) report split-half reliabilities ranging from r=.93 to r=.91 at the full-scale level, from r=.90 to r=.86 at the area level, and from r=.91 to r=.76 at the branch level (according to the consensus and expert scoring criteria). The reliabilities of the eight individual subscales were higher than those of the MEIS (ranging from α=.64 to α=.88); however, approximately half of the subscales have coefficient alphas below the α=.70 criterion (Mayer et al., 2003).

Factor analyses of the MSCEIT suggest that its factor structure better represents the underlying theory of Mayer and Salovey's (1997) ability model. Mayer et al. (2003) assessed whether one-, two- and four-factor (oblique) models of the MSCEIT provided a statistically significant fit with the standardization data via structural equation modeling. Mayer et al. report that the general factor model, the two-factor (Experiential and Strategic) model and the four primary factor model were all found to exhibit reasonably good model fit statistics, suggesting that each model provides a viable representation of the test's underlying factor structure. However, it has been demonstrated recently that the close-fit statistics reported by Mayer et al. are inaccurate. Specifically, the NFI, TLI and RMSEA values all overestimate the degree of fit for each model. Based on the re-analyses of Gignac (in press), only the four-factor model was associated with good fit. However, to effect an acceptable CFA four-factor solution, Mayer et al. specified that "...the two within-area latent variable covariances (i.e., between Perceiving and Facilitating, and between Understanding and Managing) were additionally constrained to be equal so as to reduce a high covariance between the Perceiving and Facilitating Branch scores" (p. 103). The consequences of specifying such a constraint will be explored in this investigation, based both on the current study's data and on the expert-scored correlation matrix reported in Mayer et al. (p. 102).¹

1.4. Summary

Published research findings with the MSCEIT suggest that its psychometric properties are considerably better than those of its predecessor the MEIS, particularly with respect to the scoring, reliability and factor structure of the test. The level of convergence between the consensus and expert scoring methods further demonstrates that more and less correct answers to the test items may exist, and findings with the expert criterion suggest that more and less correct answers to the test may exist with respect to a more objective criterion (particularly for Emotional Perception and Understanding). Moreover, the MSCEIT overall appears to have adequate levels of reliability, and a factor structure more consistent with the underlying theory (Mayer et al., 2003; the CFA errors identified in Gignac (submitted) notwithstanding). Nonetheless, these findings need to be replicated, particularly given that the MSCEIT represents an entirely new collection of tasks and items (Mayer et al., 2003).

1.5. Objectives of the present study

In their initial research study with the MSCEIT, Mayer et al. (2003) examined three of its fundamental psychometric properties: (1) the level of convergence between the consensus and expert scoring methods; (2) the reliability of the MSCEIT; and (3) its factor structure. The current study similarly examines the relationship between the consensus and expert scoring methods, and the reliability and factor structure of the MSCEIT, with an Australian general population sample. The current study also expands upon Mayer et al.'s original work by examining the relationship between consensus scores determined with Mayer et al.'s
standardization data and consensus scores determined with the present sample, and their respective relationships with expert-based scores. In addition, the current study examines: (a) differences in MSCEIT scores according to gender; and (b) the relationship between scores on the MSCEIT and age. The objective of these analyses was to determine the replicability of Mayer et al.'s findings, and to examine the extent to which the consensual norms determined with Mayer et al.'s standardization data provide a relevant scoring criterion for other Western societies, specifically the Australian population.

On the basis of Mayer et al.'s (2003) research findings it was hypothesised that: (1) there would be a strong relationship between consensus and expert-based scores on the MSCEIT; (2) the sample would obtain somewhat higher test scores on the Perceiving and Understanding branches when scored with the expert scoring method, reflecting the higher inter-rater reliability and superiority of the expert scoring criterion found by Mayer et al.; (3) the MSCEIT would exhibit high internal consistency reliability at the full-scale and branch level; (4) the factor analytic results obtained by Mayer et al. would be replicated; (5) females would obtain significantly higher test scores than males, as has been found with the MEIS (Ciarrochi et al., 2000; Mayer et al., 1999); and (6) there would be positive relationships between scores on the MSCEIT and age, supporting previous research with the MEIS demonstrating age-related differences (Mayer et al., 1999) and those reported in the MSCEIT technical manual (Mayer et al., 2000).

¹ The CFA analyses reported in Mayer et al. (2003) are based on the consensus-scored data, and, consequently, the re-analyses performed by Gignac (in press) were based on the consensus-scored portion of the correlation matrix (Mayer et al., 2003, p. 102). However, in this investigation, all of the CFA analyses were based on the data derived from the expert scoring system, because of our belief that the expert scoring system is a more valid method.

2. Method

2.1. Participants

The sample comprised 450 participants (297 females, 150 males, 4 unreported) ranging in age from 18 to 79 years, with a mean age of 37.39 years (S.D.=14.13). The participants were drawn via advertisements from the general population across the two most populated Australian states, Victoria and New South Wales. None of the participants in this study are represented in Mayer et al.'s (2000) MSCEIT standardization sample. The ethnic composition of the sample was diverse, comprising 62% (279) White Caucasian Australians, 17% (71) White Caucasian Emigrants, 8% (38) Asian/Pacific Islanders, and 13% (61) others/not reported. The levels of education amongst the sample were also diverse: 2.2% (10) reported having completed primary school education only; 22% (99) secondary school education only; 17% (76) a tertiary certificate/diploma; 24% (106) an undergraduate degree; and 19% (86) a postgraduate degree (16% (73) not reported). In summary, although there was a gender imbalance in the sample, it was relatively diverse with respect to age, ethnicity and levels of education.

3. Materials

3.1. The Mayer–Salovey–Caruso Emotional Intelligence Test (MSCEIT)

Participants completed the MSCEIT Research Version 1.1 (RV1.1), a 292-item test comprising 12 subscales designed to measure the four major abilities of Mayer and Salovey's (1997) ability model of EI. For the purposes of replication, however, the test publisher, Multi-Health Systems (MHS), scored the present data according to the MSCEIT Version 2 (V2) scoring algorithms. The MSCEIT V2 was derived from the MSCEIT RV1.1 by reducing the number of test items and subscales. More specific details concerning this reduction procedure can be found in the MSCEIT's technical manual (Mayer et al., 2000). Data that represent the MSCEIT V2 can be obtained from the MSCEIT RV1.1 (as done in the current study) because no new items were created for the MSCEIT V2; the MSCEIT V2 is thus essentially contained within the MSCEIT RV1.1. Moreover, although the MSCEIT V2 was designed to "...make the test-taking experience smoother for the test-taker" (Mayer et al., 2000, p. 64), and to increase the research and practical utility of the test by making it shorter, Mayer et al. report a high degree of correspondence between the two forms. For example, correlations between factor-based scores of the MSCEIT RV1.1 and V2 are reported to range from r=.96 for Understanding Emotions to r=.80 for Emotional Perception (Mayer et al., 2000). Thus, although the test used in the present study may have been somewhat more laborious for the participants to complete, data collected with the MSCEIT RV1.1 can be scored using the MSCEIT V2 scoring algorithms to produce representative data. MHS returned the database containing raw responses that represented MSCEIT V2 data (141 items), consensus-scored item and subscale scores, and expert-scored item and subscale scores.

3.2. The MSCEIT V2

The 141-item MSCEIT V2 comprises 8 subscales, 2 pertaining to each of the four branches of the ability model (Mayer & Salovey, 1997), that is, (1) Perceiving Emotions (Perceiving), (2) Using Emotions to Facilitate Thought (Facilitating), (3) Understanding Emotions (Understanding) and (4) Managing Emotions (Management). Each subscale comprises a number of so-called 'item parcels' that contain a number of individual items. For example, in the Faces subscale, participants view photographed facial expressions (each photograph representing an item parcel), and are asked to indicate the extent to which different emotions (which form the individual items for a parcel) are inherent in the facial expressions presented. In the Faces test five individual items combine to form an item parcel, as they each ask about a different emotion related to the same face. Some subscales contain free-standing items, in that they comprise only one response per stimulus. To reduce correlated measurement error and to ensure that results generalise across response methods, response formats are varied across the subscales (Mayer et al., 2003).

Perceiving Emotions is measured by the Faces and Pictures subscales. In the Faces test participants view four photographed faces and are asked to indicate, on a five-point rating scale, the degree to which five specific emotions are inherent in the stimulus. The Pictures test is similar except that different landscapes and abstract designs are presented as the stimuli and the response scale consists of cartoon faces depicting varying degrees of the specific emotions. Facilitating is measured by the Sensations and Facilitation subscales.
In the Sensations test participants are asked to imagine certain emotions and to indicate the extent to which they match different sensations (e.g., imagine feeling frustrated: how much is the feeling of frustration like the following sensations: hot, slow, green, etc.?). In the Facilitation test participants are asked to indicate the extent to which certain emotions would assist cognitive tasks or behaviours (e.g., the extent to which contentment, fear and happiness might be helpful to feel when negotiating with a salesperson to reduce the price of a product). Understanding Emotions is measured by the Blends and Changes subscales. The Blends test asks participants to identify emotions that combine to form more complex feelings (e.g., that sadness, guilt and regret combine to form (a) grief, (b) annoyance, (c) depression, etc.). In the Changes test participants are required to identify emotions that result from the intensification of certain feelings (e.g., a person felt more and more ashamed and began to feel worthless, then the person felt (a) overwhelmed, (b) depressed, (c) ashamed, etc.). Finally, Managing Emotions is measured by the Emotional Management and Emotional Relationships subscales. In the Emotional Management test participants are asked to indicate how effective certain actions might be in regulating certain moods and emotions (e.g., reducing anger, prolonging joy, keeping frustration at bay). Similarly, in the Emotional Relationships test participants are asked to indicate how effective the actions of a person might be in regulating or managing the emotions of another person. For more detailed information about each of the subscales of the MSCEIT, refer to the technical manual (Mayer et al., 2000).

3.3. Scoring

The MSCEIT yields: 8 subscale scores, determined by summing the weights for each item (as determined by either the consensus or expert scoring method discussed in the Introduction); 4 branch scores, determined by summing the two corresponding subscale scores that measure each branch; 2 area scores, Experiential EI (which represents individual differences in the lower-order abilities, perceiving emotions and facilitating emotions in thought, and is determined by summing their respective branch scores) and Strategic EI (which represents individual differences in the higher-order abilities, understanding and managing emotions); and an Overall score that represents individuals' general emotional intelligence, analogous to IQ. MHS scored the present data using the consensus weights determined from the MSCEIT's general standardization sample, which are referred to as the American Consensus Scores (n=2112; Mayer et al., 2003), and the expert weights determined from the sample of experts drawn from the ISRE, which are referred to as the Expert Scores. For the purpose of comparison we also determined consensus weights using the present sample via the procedure outlined by Mayer et al. (2000); these are referred to as the Australian Consensus Scores.

3.4. Procedure

Participants responding to advertisements about the study collected pencil-and-paper MSCEIT item booklets and scannable answer sheets and were briefed about the purpose of the study. The briefing sessions were conducted according to the instructions for remote administrations of the MSCEIT as outlined in the technical manual (Mayer et al., 2000), with the administrator emphasising that the MSCEIT was to be completed independently, without input from others, and in its entirety. Upon completion participants returned the item booklets and answer sheets, and received a small stipend for participating.

3.5. Statistical analyses

All non-CFA analyses were performed using SPSS 11.5. The CFA analyses were performed using AMOS 5.0 (Arbuckle, 2003). All CFA analyses were based on Maximum-Likelihood Estimation (MLE) and covariance matrices. The correlation matrix reported in Mayer et al. (2003, p. 102) was converted into a covariance matrix using the standard deviation information reported in Table 1 of Mayer et al. (2003, p. 102) and the 'MCONVERT' command in SPSS. In accordance with Hu and Bentler (1999), a combination approach was used to evaluate CFA model fit. Specifically, two absolute close-fit indices (SRMR and RMSEA) and two incremental close-fit indices (CFI and TLI) are reported. Absolute close-fit values <.06 and incremental close-fit values in the area of .95 or larger were considered satisfactory (Hu & Bentler, 1999). Differences in model fit between competing nested models were tested using the Chi-square difference test (Steiger, Shapiro, & Browne, 1985).
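
The correlation-to-covariance conversion described above amounts to scaling each correlation by the two corresponding standard deviations, cov_ij = r_ij × s_i × s_j. The following is a minimal Python sketch of that arithmetic, offered as an illustration rather than a reproduction of the SPSS step; the function name `corr_to_cov` and the two-variable values are hypothetical and are not taken from Mayer et al. (2003).

```python
def corr_to_cov(corr, sds):
    """Scale a correlation matrix into a covariance matrix.

    Each covariance is r_ij * s_i * s_j, so the diagonal holds the variances.
    """
    n = len(sds)
    if len(corr) != n or any(len(row) != n for row in corr):
        raise ValueError("corr must be an n x n matrix matching len(sds)")
    return [[corr[i][j] * sds[i] * sds[j] for j in range(n)] for i in range(n)]


# Hypothetical two-variable example (illustrative values only).
example_corr = [[1.0, 0.5], [0.5, 1.0]]
example_sds = [2.0, 4.0]
example_cov = corr_to_cov(example_corr, example_sds)
```

Working from covariances rather than correlations matters here because the maximum-likelihood CFA described in this section is stated to be based on covariance matrices.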

4. Results

Prior to conducting the analyses, a missing value analysis was performed to evaluate the validity of the participants' responses. Mayer et al. (2000) consider a participant's responses to be invalid if 10% or more of a given subscale's items are missing. The missing value analysis found the responses of 20 participants to be invalid as per this criterion, and these participants were omitted from the data. As such, results reported hereafter are based on the sample for which all data at the subscale level were complete.
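
The exclusion rule above can be expressed as a simple screening function. The sketch below assumes a hypothetical data layout (a dictionary mapping each subscale name to a list of item responses, with `None` marking a missing response); only the 10% threshold comes from the text, and the function names are illustrative.

```python
def invalid_subscales(responses, threshold=0.10):
    """Return the subscales in which the share of missing items is >= threshold.

    `responses` maps a subscale name to a list of item responses;
    None marks a missing response (an assumed coding, not the MHS format).
    """
    flagged = []
    for name, items in responses.items():
        missing = sum(1 for item in items if item is None)
        if items and missing / len(items) >= threshold:
            flagged.append(name)
    return flagged


def is_valid_protocol(responses, threshold=0.10):
    """A protocol is treated as valid only if no subscale is flagged."""
    return not invalid_subscales(responses, threshold)


# Hypothetical participant: 2 of 20 Faces items missing (10%), Blends complete.
participant = {"Faces": [3] * 18 + [None, None], "Blends": [2] * 12}
```

Under this rule the hypothetical participant above would be excluded, because exactly 10% of the Faces items are missing.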


4.1. Descriptive statistics

The reliabilities, means and standard deviations of the present sample according to the American and Australian Consensus criteria and the expert criterion are presented in Table 1. The means and standard deviations of the present sample were highly comparable across the three scoring methods and very similar to those reported by Mayer et al. (2003). Moreover, the present sample was found to obtain somewhat higher test scores according to the expert scoring method in the areas where Mayer et al. reported a higher degree of convergence amongst the expert group (i.e., Perceiving and Understanding Emotions). For example, the mean Understanding branch score of the present sample was significantly higher according to Expert Scores than according to the American Consensus Scores, t(430)=41.53, p<.001, Cohen's d=.91. This finding provides further evidence that the expert scoring criteria may prove superior to the general consensus in these areas. The correlation between the mean scores of the present sample and those reported by Mayer et al. (2003) was r=.748 and r=.893 for Australian Consensus and expert-based scores respectively, confirming that the profile of mean scores for both scoring criteria was highly comparable with the previously reported results, a finding consistent with other studies in the area (Roberts et al., 2001).
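
The comparison of American and Australian Consensus Scores rests on deriving item weights from two different normative samples. This paper does not spell out the weighting procedure, so the sketch below assumes the proportion-based approach commonly described for general consensus scoring: each response option is weighted by the proportion of the normative sample that chose it, and a score is the sum of the weights of the options a participant chose. All names and data are hypothetical.

```python
from collections import Counter


def consensus_weights(sample_responses):
    """Weight each option of each item by the proportion of the sample choosing it.

    `sample_responses` is a list of participants, each a list of chosen
    options (one per item). Assumed layout, for illustration only.
    """
    n_participants = len(sample_responses)
    n_items = len(sample_responses[0])
    weights = []
    for item in range(n_items):
        counts = Counter(person[item] for person in sample_responses)
        weights.append({option: count / n_participants
                        for option, count in counts.items()})
    return weights


def consensus_score(person, weights):
    """Sum the weights of the chosen options; options nobody chose score 0."""
    return sum(item_weights.get(choice, 0.0)
               for choice, item_weights in zip(person, weights))


# Hypothetical normative sample: 4 participants, 2 items, options coded 1-5.
norm_sample = [[1, 3], [1, 3], [1, 4], [2, 3]]
weights = consensus_weights(norm_sample)
```

Running `consensus_weights` on two different normative samples would give two sets of weights for the same responses, which is the kind of comparison made between the American and Australian Consensus Scores.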

Table 1
Means, standard deviations and reliabilities for the MSCEIT V2 according to American Consensus Scores, Expert Scores and Australian Consensus Scores

                      ACS (a)              AUSC (b)             ES (c)
MSCEIT subscale       M     S.D.  r (d)    M     S.D.  r (d)    M     S.D.  r (d)
Perceiving            .47   .08   .90      .47   .08   .90      .49   .11   .89
  Faces               .48   .09   .80      .50   .10   .81      .52   .15   .84
  Pictures            .44   .10   .86      .44   .09   .86      .46   .11   .85
Facilitating          .42   .05   .73      .43   .06   .80      .40   .05   .67
  Facilitation        .45   .08   .63      .46   .08   .64      .42   .06   .48
  Sensations          .38   .06   .49      .41   .07   .54      .39   .07   .48
Understanding         .52   .07   .71      .53   .07   .73      .62   .11   .69
  Changes             .54   .08   .63      .53   .08   .60      .63   .12   .60
  Blends              .49   .08   .50      .52   .10   .56      .60   .14   .54
Managing              .40   .06   .76      .41   .06   .74      .41   .07   .66
  Management          .39   .05   .59      .41   .06   .61      .39   .06   .48
  Relationships       .40   .09   .55      .42   .10   .53      .42   .11   .51
Experiencing EI       .44   .06   .91      .45   .05   .91      .45   .06   .90
Strategic EI          .46   .06   .78      .47   .06   .79      .51   .07   .76
Overall EI            .45   .05   .91      .46   .05   .91      .48   .06   .89

The means and standard deviations reported are unscaled as per Mayer et al. (2003).
(a) American Consensus Scores.
(b) Australian Consensus Scores.
(c) Expert Scores.
(d) Coefficient alpha reliabilities are reported at the subscale level due to item homogeneity, and split-half reliabilities (with Spearman–Brown correction) are reported at the Branch, Area and Overall test levels due to item heterogeneity, as per Mayer et al. (2003).
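
The two reliability estimates named in the table note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used by the authors or by MHS: the odd/even split of items is an assumed way of forming the two halves (the paper does not say how halves were formed), and all function names are hypothetical.

```python
from statistics import mean, pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of participants, each a list of item scores."""
    k = len(item_scores[0])
    item_vars = [pvariance([person[i] for person in item_scores]) for i in range(k)]
    total_var = pvariance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Correlate odd- and even-position item totals (assumed split), then correct."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson_r(odd, even))
```

The Spearman–Brown step is needed because the correlation between two half-tests underestimates the reliability of the full-length test; for example, a half-test correlation of .50 corresponds to a corrected reliability of about .67.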


4.2. Reliability

Table 1 also reports the internal consistency reliability of the MSCEIT at the full-scale (Overall EI), area, branch and subscale levels. Split-half reliability coefficients (with Spearman-Brown correction) were determined for the full-scale, area and branch scores because the items that combine to form these scores are heterogeneous, while coefficient alphas were determined for the 8 subscales because the items at the subscale level all share the same response format (as per Mayer et al., 2003). As shown in Table 1, the MSCEIT full-scale split-half reliabilities were high according to the American Consensus, Australian Consensus and Expert Scores. The split-half reliabilities across the area and branch scores were also good, although the reliability coefficients of the expert-based scores were somewhat lower than those of the consensus-based scores, particularly for the Facilitating and Managing Branch scores, which were below the criterion of .70. The reliabilities of the MSCEIT subscales varied, ranging from a high of α=.86 for the Pictures subscale to a low of α=.48 for the Facilitation, Sensations and Management subscales. In general, the reliabilities of the MSCEIT subscales were somewhat lower than those reported by Mayer et al. (2003). Collectively, the findings of the current study suggest that while the reliabilities of some of the MSCEIT subscales are unacceptably low, the MSCEIT is reliable at the full-scale, area and branch levels.

4.3. EI and age

Although there were no significant relationships between Overall MSCEIT scores and age, a number of small relationships were found between the MSCEIT subscales and age in the present sample. Across all three scoring criteria there was a small negative correlation between age and scores on the Faces subscale (ranging from r=-.12, p<.05 to r=-.101, p<.05, according to American Consensus and Expert Scores respectively). In contrast, there were small positive relationships between age and scores on the Facilitation subscale (ranging from r=.193, p<.001 to r=.16, p<.01 according to the Australian Consensus and Expert Scores), and the Management subscale (ranging from r=.13, p<.01 to r=.12, p<.05 according to the American Consensus and Expert Scores). These findings are consistent with those reported in the MSCEIT's technical manual (Mayer et al., 2000). While these correlations were all significant, the magnitude of the relationships suggests that there is very little increase in EI associated with age in the present adult population sample.

4.4. EI and gender

Table 2 presents descriptive statistics for the MSCEIT branch and Overall EI scores for female and male participants. Consistent with previous findings, females scored significantly higher than males on the MSCEIT according to all three scoring criteria. According to expert-based scores, female participants scored significantly higher than male participants by approximately half a standard deviation (e.g., for Overall EI, F(1,427)=44.80, p<.001, d=.65). Similarly, according to the Australian and American Consensus weights, female participants scored approximately two thirds of a standard deviation above male participants (F(1,427)=54.97, p<.001, d=.73; F(1,427)=51.23, p<.001, d=.69, respectively). These findings are consistent with those found with the MEIS (Ciarrochi et al., 2000; Mayer et al., 1999), and those reported in the MSCEIT technical manual (Mayer et al., 2000).
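The standardized gender difference (d) discussed above can be sketched as a pooled-standard-deviation Cohen's d. This is only an illustration: the source does not state which formula was used, and the inputs below are the rounded Overall EI expert-score means and standard deviations from Table 2, so the result should be expected to differ somewhat from the reported d=.65.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_variance)


# Overall EI, expert scores, as printed in Table 2 (rounded to two decimals):
# females M=.49, S.D.=.05, n=286; males M=.45, S.D.=.07, n=143.
d_overall_expert = cohens_d(0.49, 0.05, 286, 0.45, 0.07, 143)
print(round(d_overall_expert, 2))
```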

Table 2
Descriptive statistics for MSCEIT branch and overall EI scores by gender

MSCEIT               Expert scores     Am con (a)        Aus con (b)
                     M      S.D.       M      S.D.       M      S.D.
Perceiving EI
  Females (c)        .51    .10        .48    .072       .48    .07
  Males (d)          .47    .11        .45    .086       .45    .09
Facilitating EI
  Females (c)        .41    .04        .43    .047       .45    .05
  Males (d)          .39    .06        .39    .064       .41    .07
Understanding EI
  Females (c)        .63    .01        .53    .062       .54    .07
  Males (d)          .58    .13        .50    .086       .50    .09
Managing EI
  Females (c)        .42    .07        .41    .055       .43    .06
  Males (d)          .38    .07        .37    .065       .39    .07
Overall EI
  Females (c)        .49    .05        .46    .039       .47    .04
  Males (d)          .45    .07        .43    .056       .44    .06

a American Consensus Scores. b Australian Consensus Scores. c Females n=286. d Males n=143.

4.5. MSCEIT intercorrelations

Table 3 presents the intercorrelations amongst the Australian Consensus and expert-based scores on the 8 subscales and Overall MSCEIT score. Consistent with previous findings (Mayer et al., 2003) and the theory of EI that the MSCEIT has been designed to assess (Mayer & Salovey, 1997), there was a positive manifold of correlations amongst the
Table 3
Intercorrelations, means and standard deviations amongst American Consensus (above the diagonal) and expert-based (below the diagonal) MSCEIT subscale scores

                    1     2     3     4     5     6     7     8     Mean   S.D.
1. Faces           .94   .33   .14   .32   .24   .22   .22   .29    .48    .09
2. Pictures        .34   .98   .28   .24   .19   .15   .20   .16    .44    .10
3. Facilitation    .14   .26   .92   .26   .24   .17   .34   .21    .45    .08
4. Sensations      .29   .24   .26   .96   .35   .20   .28   .28    .38    .06
5. Changes         .18   .14   .25   .35   .96   .50   .26   .32    .54    .08
6. Blends          .17   .14   .19   .22   .51   .94   .21   .28    .49    .08
7. Management      .17   .18   .25   .27   .30   .27   .88   .34    .39    .05
8. Relationships   .24   .17   .21   .31   .31   .28   .32   .94    .40    .09
Mean               .52   .46   .42   .39   .63   .60   .39   .42
S.D.               .15   .11   .06   .07   .12   .14   .06   .11

All correlations in the table are statistically significant at the p<.01 level, and correlations above .15 are significant at the p<.001 level. The correlation between consensus and expert-based scores for each subscale is presented down the main diagonal of the table.
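One way to quantify how closely the two halves of Table 3 agree is to correlate the 28 off-diagonal entries above the diagonal with their mirror-image entries below it. The sketch below does this with the values as printed in Table 3. It is an illustration of the idea rather than the authors' exact procedure, and because the table entries are rounded to two decimals the result can only approximate the matrix correlation reported in the text (r=.93).

```python
from math import sqrt

# Table 3 as printed: entries above the diagonal are consensus-based,
# entries below the diagonal are expert-based; the diagonal is ignored here.
TABLE_3 = [
    [.94, .33, .14, .32, .24, .22, .22, .29],
    [.34, .98, .28, .24, .19, .15, .20, .16],
    [.14, .26, .92, .26, .24, .17, .34, .21],
    [.29, .24, .26, .96, .35, .20, .28, .28],
    [.18, .14, .25, .35, .96, .50, .26, .32],
    [.17, .14, .19, .22, .51, .94, .21, .28],
    [.17, .18, .25, .27, .30, .27, .88, .34],
    [.24, .17, .21, .31, .31, .28, .32, .94],
]


def off_diagonal_pairs(matrix):
    """Pair each above-diagonal entry (i, j) with its below-diagonal mirror (j, i)."""
    n = len(matrix)
    return [(matrix[i][j], matrix[j][i]) for i in range(n) for j in range(i + 1, n)]


def pearson_r(pairs):
    """Pearson correlation over a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


matrix_r = pearson_r(off_diagonal_pairs(TABLE_3))
print(round(matrix_r, 2))
```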


subscales according to the different scoring methods. In addition, each subscale correlated most highly with its sister subscale with which it combines (e.g., the Faces and Pictures subscales, which measure Perceiving emotions), with the exception of the subscales measuring Facilitating Emotions (which also exhibited low reliability). Very strong relationships were also found between scores determined by the different scoring methods. American and Australian Consensus determined Overall EI scores correlated r=.99, with correlations amongst the subscales ranging from r=.88 for the Management subscale to r=.98 for the Pictures subscale. This level of convergence suggests that the American Consensus weights determined from the MSCEIT standardization sample (Mayer et al., 2003) may be cross-culturally applicable, at least within other Western societies. Similarly, Australian consensus and expert-based Overall EI scores correlated r=.97, while correlations between Australian consensus and expert-based subscale scores ranged from r=.86 to r=.99, as shown in Table 3. Finally, American Consensus and expert-based Overall EI scores correlated r=.97, with correlations between subscale scores ranging from r=1.0 for the Sensations subscale to r=.88 for the Management subscale. The correlation between the two correlation matrices (i.e., the correlation matrix below the diagonal, Australian consensus, and the correlation matrix above the diagonal, expert, as shown in Table 3) was also very high (r=.93), suggesting a high degree of correspondence between the pattern of intercorrelations based on these two scoring criteria. Collectively, these findings suggest that the two scoring criteria for the MSCEIT are highly related, a finding consistent with those reported previously (Mayer et al., 2003). Indeed, the correlation between the American Consensus Scores correlation matrix of the present study and that reported by Mayer et al. (2003) was r=.60, and the correlation between the expert-based scores correlation matrix of the present study and that reported by Mayer et al. was r=.74. The replicability of the intercorrelations amongst the MSCEIT subscales across two quite different data sets (particularly those generated with the expert criterion) suggests that they are robust. This property was also demonstrated with the MEIS (Roberts et al., 2001).

4.6. Confirmatory factor analyses

As can be seen in Table 5, for the current study's sample, the general factor model (model 1) yielded a χ²(20)=90.22, as well as non-satisfactory levels of fit according to all of the close-fit indices (e.g., TLI=.815). However, all of the factor loadings on the general factor were positive and statistically significant, ranging in size from .37 to .64 (see Table 4). Similarly, the data from Mayer et al. (2003) yielded a χ²(20)=505.55 and a TLI of .828 for model 1 (see Table 5), which is below the demarcation criterion for acceptable fit (Hu & Bentler, 1999). However, as was the case for the data from the current study, the factor loadings on the general factor were all positive and statistically significant, ranging in size from .36 to .68. Thus, there was some indication that a general factor existed within the covariation of the MSCEIT subscales based on both samples.

For the data from the current study, the oblique two-factor model (model 2) yielded a χ²(19)=63.72, which was statistically significantly better fitting than the general factor model (Δχ²(1)=26.52, p<.001). However, according to the close-fit indices (see Table 5), model 2 was not well fitting (e.g., TLI=.876). The correlation between Area 1 and Area 2 was .72 (p<.001). For model 2, the data from the Mayer et al. (2003) study produced a χ²(19)=260.21, which was statistically significantly better fitting than the general factor model (Δχ²(1)=245.34, p<0.01). However, the close-fit indices indicated that model 2 was not a well fitting model (e.g., TLI=.910). The correlation between the two factors was .74 (p<.001).
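The nested-model comparisons above rest on the chi-square difference test: the difference between two nested models' χ² values is itself referred to a chi-square distribution with degrees of freedom equal to the difference in df. The following is a minimal sketch using only the Python standard library; `chi2_sf` is a helper written for this illustration (closed-form series for integer df), not a function taken from any SEM package.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1) / (1*3*...*(2r-1))
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * total


def chi2_difference_test(chi2_restricted, df_restricted, chi2_general, df_general):
    """Return (delta chi-square, delta df, p) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Current study: model 1 (general factor) versus model 2 (two oblique Area factors).
delta, ddf, p = chi2_difference_test(90.22, 20, 63.72, 19)
print(round(delta, 2), ddf, p < 0.001)
```

The difference computed from the rounded χ² values is about 26.5, which agrees with the reported Δχ²(1)=26.52 up to rounding.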

Table 4
MSCEIT 2.0 standardized parameter estimates (MLE) for one-, two- and four-factor (oblique) models (expert scoring)

Subscales: Faces, Pictures, Facilitation, Sensations, Changes, Blends, Management, Relationships

Current study
  Model 1 ('g') and Model 2 (Area 1, Area 2): .40 .37 .42 .55 .64 .56 .51 .53 .48 .46 .45 .61 .70 .62 .50 .51
  Model 3 (Br.1, Br.2, Br.3, Br.4): .59 .58 .44 .58 .80 .64
Mayer et al. (2003)
  Model 1 ('g') and Model 2 (Area 1, Area 2): .36 .48 .62 .44 .68 .66 .55 .66 .58 .66 .47 .59 .71 .50 .70 .69 .68 .67
  Model 3 (Br.1, Br.2, Br.3, Br.4): .54 .66 .66 1 .48 .77 .75

Factor correlations
  Area 1, Area 2: .74 .74
  Br.1 to Br.4, Mayer et al. (2003):
    Br.1   1.0
    Br.2   .94   1.0
    Br.3   .51   .76   1.0
    Br.4   .53   .80   .76   1.0
  Br.1 to Br.4, current study:
    Br.1   1.0
    Br.2   .78   1.0
    Br.3   .37   .70   1.0
    Br.4   .58   .90   .70   1.0

The standard errors associated with the unstandardized factor loadings (available upon request) ranged as follows: current study sample: model 1=.003-.008; model 2=.003-.008; model 3=.003-.010; Mayer et al. (2003) sample: model 1=.002-.004; model 2=.002-.004; model 3=.002-.005. All factor loadings were statistically significant (p<.001).
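The degrees of freedom attached to the χ² values for models 1 to 3 can be recovered by counting free parameters against the number of unique variances and covariances among the eight subscales. The sketch below assumes that each factor's scale is set by fixing its variance to 1, so that all loadings are free; the source does not state which scaling constraint was used, but fixing one loading per factor instead gives the same count.

```python
def cfa_df(n_indicators, n_free_parameters):
    """Model df = unique variances/covariances minus freely estimated parameters."""
    unique_moments = n_indicators * (n_indicators + 1) // 2
    return unique_moments - n_free_parameters


N_SUBSCALES = 8
loadings = 8            # one loading per subscale
residual_variances = 8  # one residual variance per subscale

# Free factor correlations: none (model 1), one (model 2), six (model 3).
df_model_1 = cfa_df(N_SUBSCALES, loadings + residual_variances + 0)
df_model_2 = cfa_df(N_SUBSCALES, loadings + residual_variances + 1)
df_model_3 = cfa_df(N_SUBSCALES, loadings + residual_variances + 6)
print(df_model_1, df_model_2, df_model_3)
```

Under the same assumption, an oblique three-factor model has three free factor correlations, giving the 17 degrees of freedom reported for the supplementary three-factor analyses.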

Model 3 (four oblique factors), based on the current study's data, yielded a χ²(14)=16.65, which was not statistically significant (p=.276), indicating that the model fit was excellent (e.g., TLI=.990). As can be seen at the bottom of Table 4, the correlations between the factors were all positive and large, ranging in size from .37 to .90. The correlation of .90 between Branch 2 and Branch 4 suggested the
Table 5
Exact and close-fit statistics/indices (MLE) for the MSCEIT models

Model                      χ²        (df)   CFI    TLI    RMSEA   SRMR
Current sample
0   Null model             558.86    (28)   .000   .000   .205    .230
1   General factor          90.22    (20)   .868   .815   .088    .057
2   2 Area factors          63.72    (19)   .916   .876   .072    .049
3   4 Branch factors        16.65    (14)   .970   .990   .020    .023
4a  Nested Area             14.01    (13)   .998   .996   .013    .021
5   Nested Branch           23.88    (16)   .985   .974   .033    .030
Mayer et al. (2003)
0   Null model            3972.55    (28)   .000   .000   .266    .304
1   General factor         505.55    (20)   .877   .828   .111    .062
2   2 Area factors         260.21    (19)   .939   .910   .080    .041
3b  4 Branch factors        38.31    (14)   .994   .988   .030    .018
4   Nested Area             14.18    (12)   .999   .999   .010    .009
5   Nested Branch          161.47    (16)   .963   .935   .068    .040

a The error variance for the Changes subscale was constrained to equal .0001 in accordance with Chen et al. (2001).
b This solution yielded a non-positive definite matrix; however, the fit values and parameter estimates were all within an acceptable range. Consequently, the results of the solution were reported for thoroughness, despite the fact that the model was considered unacceptable.


possibility that the two factors were measuring the same construct. Consequently, a supplementary analysis was performed, allowing the Facilitating subtests and the Managing subtests to load onto a single factor (thus, an oblique three-factor model), which resulted in a χ²(17)=21.58, p=.201, indicating that the model fit was excellent and not worse than that of the oblique four-factor model (Δχ²(3)=4.95, p=.175). Thus, the four-factor model was not considered an appropriate model. Stated alternatively, there was no statistical justification for a unique Branch 2 factor, independent of the Managing factor. All of the factor loadings and inter-factor correlations were positive and statistically significant (available upon request).

As can be seen in Table 5, model 3 (four oblique factors) for the Mayer et al. (2003) data produced a χ²(14)=38.31, p<.001, and was associated with close-fit index values that indicated very good fit (e.g., TLI=.988). However, as was the case with the current study's data set, one of the correlations between the factors (Branch 1 and 2) was very high (.94). Furthermore, the model was associated with a non-positive definite matrix, rendering the solution inadmissible. Consequently, a supplementary oblique three-factor model was tested, which allowed the four subtests from Branches 1 and 2 to load onto a single factor. The oblique three-factor model resulted in a χ²(17)=103.35, p<.05, CFI=.978, TLI=.964, RMSEA=.051 and SRMR=.031, which indicated a very good fit. Thus, as was the case with the current
[Figure 1: two path diagrams. Model 4: a general EI 'g' factor with nested Experiential and Strategic Area factors. Model 5: a general EI 'g' factor with nested Perceiving, Facilitating, Understanding and Managing Branch factors. Indicators in both models: Faces, Pictures, Sensations, Facilitation, Change, Blends, Emot. Mng., Emot. Relat.]

Fig. 1. General EI factor models of the MSCEIT V2.0, with nested Area level factors (model 4) and nested Branch level factors (model 5).


study's sample, there was evidence against the notion that the MSCEIT subscales measured four different factors. All factor loadings and inter-factor correlations were positive and statistically significant (available upon request).

Although the oblique three-factor models yielded acceptable model fit, as well as positive and statistically significant factor loadings for all subtests, it was the Facilitating Branch, in both cases, that was shown not to measure a unique factor. However, in the current study's sample, the Facilitating Branch was most strongly related to the Managing Branch, whereas with the Mayer et al. (2003) data, the Facilitating Branch was most strongly related to the Perceiving Branch. To test more directly and clearly the hypothesis that the Facilitating Branch did not measure any unique construct-related variance, independent of a general factor and the other Branches, supplementary models were tested based on nested factor modeling (Gustafsson & Balke, 1993; Mulaik & Quartetti, 1997). That is, a first-order general factor and two first-order Area level factors were specified in model 4 (see Fig. 1). Thus, in effect, this nested factor model (model 4) measured the factors from model 1 and model 2 simultaneously. The second nested factor model (model 5) consisted of a first-order general factor and four first-order Branch level factors (see Fig. 1). Thus, in effect, this nested factor model (model 5) measured the factors from model 1 and model 3 simultaneously. For the purpose of identification, in model 5, the respective factor loadings for each Branch level first-order factor were constrained to be equal, in accordance with Little, Lindenberger, and Nesselroade (1999, pp. 204-205).

The general factor model with two nested Area factors (model 4) produced a χ²(13)=14.01, p=.375, indicating that the model fit was excellent. However, as can be seen in Table 6, two of the factor loadings on the Strategic Area factor (Area 2) were not statistically significant. Specifically, the subscales from the Managing Branch (Emotional Management and Emotional Relationships) had factor loadings of .04 and .06 on the Area factor they were designed to load upon according to Mayer et al. (2003). Thus, although this model was marginally better fitting than the oblique four-factor model (TLI=.996 vs. .990), it could not be considered an appropriate model, because of the two non-significant factor loadings on the Area 2 factor (see Table 6).
Table 6
MSCEIT 2.0 standardized parameter estimates (MLE) for nested factor models based on current study and Mayer et al. (2003) correlation matrix (expert scoring)

                   Model 4                   Model 5
                   'g'    Area 1   Area 2    'g'    Br.1   Br.2   Br.3   Br.4
Current study
Faces              .35    .31                .40    .44
Pictures           .28    .78                .37    .44
Facilitation       .40    .18                .45           .00
Sensations         .54    .13                .59           .00
Changes            .61             .79       .56                  .51
Blends             .48             .27       .45                  .51
Management         .54             .04       .51                         .22
Relationships      .58             .06       .53                         .22
Mayer et al. (2003)
Faces              .26    .54                .38    .41
Pictures           .39    .48                .51    .41
Facilitation       .56    .34                .67           .04
Sensations         .37    .34                .47           .04
Changes            .71             .41       .62                  .45
Blends             .67             .25       .59                  .45
Management         .70             .15       .62                         .40
Relationships      .73             .25       .62                         .40

The standard errors associated with the unstandardized factor loadings (available upon request) ranged as follows: current study sample: model 4=.003-.029; model 5=.003-46977.21; Mayer et al. (2003): model 4=.002-.012; model 5=.002-.020. All standardized factor loadings in bold were not significant statistically (p>.05).


When this same nested model (model 4) was tested on the Mayer et al. (2003) correlation matrix, the results were very similar. Specifically, although the model fit was excellent based on the chi-square test (χ²(12)=14.18, p=.289), the Emotional Management and Emotional Relationships subscales did not yield theoretically acceptable values on their respective Area level factor (i.e., .15 and .25, respectively). Thus, this model was not considered acceptable.

For the current study's data, the nested factor model with a general factor and four Branch level factors (model 5) produced a χ²(16)=23.88, p=.092, and CFI=.985, TLI=.974, RMSEA=.033 and SRMR=.030, indicating excellent fit. However, the factor loadings for the Facilitation factor (Branch 2) were effectively .00 (S.E.=46977.21), and, consequently, this model could not be considered appropriate.
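To see what the nested factor model implies for the observed correlations, the sketch below reproduces model-implied correlations from the model 5 loadings for the current study as printed in Table 6. It assumes that the general factor and the nested Branch factors are mutually orthogonal, which is the usual nested-factor specification but is not spelled out for model 5 in the text; under that assumption the implied correlation between two subscales is the product of their 'g' loadings, plus the product of their Branch loadings when they share a Branch.

```python
# Model 5, current study (Table 6): loading on 'g', Branch membership, Branch loading.
MODEL_5 = {
    "Faces":         (0.40, "Br.1", 0.44),
    "Pictures":      (0.37, "Br.1", 0.44),
    "Facilitation":  (0.45, "Br.2", 0.00),
    "Sensations":    (0.59, "Br.2", 0.00),
    "Changes":       (0.56, "Br.3", 0.51),
    "Blends":        (0.45, "Br.3", 0.51),
    "Management":    (0.51, "Br.4", 0.22),
    "Relationships": (0.53, "Br.4", 0.22),
}


def implied_correlation(subscale_a, subscale_b, model=MODEL_5):
    """Model-implied correlation under orthogonal general and nested factors (an assumption)."""
    g_a, branch_a, load_a = model[subscale_a]
    g_b, branch_b, load_b = model[subscale_b]
    implied = g_a * g_b
    if branch_a == branch_b:
        implied += load_a * load_b
    return implied


for pair in [("Faces", "Pictures"), ("Facilitation", "Sensations"), ("Changes", "Blends")]:
    print(pair, round(implied_correlation(*pair), 3))
```

With a Branch 2 loading of .00, the implied Facilitation-Sensations correlation comes entirely from the general factor, which is the sense in which that Branch adds no unique variance in this model.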
[Figure 2: two path diagrams of model 6, each showing a general EI 'g' factor with nested Perceiving, Understanding and Managing factors and the indicators Facilitation, Sensations, Faces, Pictures, Changes, Blends, Emotion Manag. and Emotion Relation.
Model 6 (current study's data), path coefficients shown: .37 .46 .62 .41 .39 .51 .40 .46 .51 .50 .35 .48 .60 .42 .21
Model 6 (data from Mayer et al., 2003), path coefficients shown: .30 .49 .72 .42 .55 .53 .51 .52 .55 .52 .51 .41 .59 .58 .44]

Fig. 2. Final general EI factor and nested factor models of the MSCEIT V2.0.


A final supplementary model was tested (model 6), which was identical to model 5 with the exception of the removal of the Branch 2 factor. Furthermore, a covariance link between the Branch 3 and Branch 4 factors was added to the model to reflect Mayer et al.'s (2003) contention that they measure a similar Area level factor (i.e., Strategic). This modified model produced a χ²(16)=20.25, p=.209, indicating that the model fit was excellent. The close-fit index values were also very good (CFI=.992, TLI=.986, RMSEA=.024 and SRMR=.026). Furthermore, as can be seen in Fig. 2 (model 6: current study's data), all of the factor loadings on the general factor were positive and statistically significant, ranging in size from .39 to .62. The factor loadings for the nested factors were also positive and statistically significant. The standard errors associated with the unstandardized factor loadings ranged from .003 to .008. The correlation between the nested Understanding (Branch 3) and Managing (Branch 4) factors was .35, p<.001. An examination of the standardized residual covariance matrix did not identify any values greater than |2.0|, nor were there any modification index values greater than 4.0.

Model 5 based on the Mayer et al. (2003) covariance matrix produced very similar results. Specifically, the model was associated with a χ²(16)=161.47, p<.001, and CFI=.963, TLI=.935, RMSEA=.068 and SRMR=.040, indicating moderately satisfactory fit. However, as was the case with the current study's sample, the subscales from the Facilitating Branch (Branch 2) did not form a statistically significant factor, each exhibiting factor loadings of .04 (see Table 6). The standard errors associated with the unstandardized factor loadings ranged from .002 to .020. Thus, the same modified model that was tested above, with the removal of the Branch 2 factor and the addition of a covariance link between the nested Branch 3 (Understanding) and Branch 4 (Managing) factors, was tested. This model produced a χ²(16)=69.35, p<.001, and was associated with close-fit index values which indicated very good fit: CFI=.986, TLI=.976, RMSEA=.041 and SRMR=.025. As can be seen in Fig. 2 (model 6: Mayer et al. (2003)), all of the factor loadings for the general factor were positive and statistically significant, ranging in size from .42 to .72. Furthermore, the factor loadings for the nested factors were also positive and statistically significant (see Table 6). The standard errors associated with the unstandardized factor loadings ranged from .002 to .004. The correlation between the Understanding and Managing factors was .51, p<.001.

Thus, based on model fit and sensible parameter estimates, the general factor model with a nested orthogonal Perceiving (Branch 1) factor and oblique Understanding (Branch 3) and Managing (Branch 4) factors was determined to be the model that best characterized the covariance between the eight MSCEIT 2.0 subscales.
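The incremental close-fit indices quoted throughout Section 4.6 can be recomputed from a model's χ² and the null model's χ² in Table 5. The sketch below uses the standard definitions of the CFI and TLI, which are assumed here to match the software output reported in the source; RMSEA is left out because it also depends on the sample size used in estimation.

```python
def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index from model and null-model chi-square values."""
    noncentrality_model = max(chi2_model - df_model, 0.0)
    noncentrality_null = max(chi2_null - df_null, noncentrality_model)
    if noncentrality_null == 0.0:
        return 1.0
    return 1.0 - noncentrality_model / noncentrality_null


def tli(chi2_model, df_model, chi2_null, df_null):
    """Tucker-Lewis Index from model and null-model chi-square values."""
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


# Null models from Table 5: current sample and Mayer et al. (2003) data.
NULL_CURRENT = (558.86, 28)
NULL_MAYER = (3972.55, 28)

# Model 6 as reported in the text: chi-square(16)=20.25 and chi-square(16)=69.35.
print(round(cfi(20.25, 16, *NULL_CURRENT), 3), round(tli(20.25, 16, *NULL_CURRENT), 3))
print(round(cfi(69.35, 16, *NULL_MAYER), 3), round(tli(69.35, 16, *NULL_MAYER), 3))
```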

5. Discussion

The results of the current study associated with the consensus and expert scoring replicated those reported by Mayer et al. (2003), suggesting close comparability between the two methods. Furthermore, there does appear to be a positive manifold and a concomitant general factor of EI within the MSCEIT V2.0. However, the factor structure of the MSCEIT V2.0 does not appear to reflect the four-factor model postulated by Mayer and Salovey (1997) and ostensibly demonstrated empirically in Mayer et al. (2003).

The convergence (or lack thereof) between the consensus and expert scoring criteria was identified as one of the most central psychometric issues facing the predecessor to the MSCEIT, the MEIS (Roberts et al., 2001). In recognition of this, Mayer et al. sought to improve the validity of the expert scoring criterion for the MSCEIT by employing a more optimal group to serve as experts (Mayer et al., 2000).


In the present study, consensus and expert determined scores were highly correlated (r=.97), replicating the relationship between the two scoring methods previously reported by Mayer et al. (2003; i.e., r=.98). Moreover, there was a strong relationship between the pattern of intercorrelations based on the two scoring criteria (r=.93), further illustrating the relatively high degree of correspondence between them. The present sample also achieved higher expert-based test scores in comparison to consensus-based test scores in areas where the expert group has previously been found to demonstrate higher inter-rater reliability in identifying correct answers (Mayer et al., 2003; i.e., Perceiving and Understanding emotions). This result further substantiates the findings of Mayer et al. (2003) that the expert criterion is superior to the consensus criterion in terms of determining more and less correct test answers (at least in the areas where research has possibly established clear criteria for answers), and Mayer et al.'s conclusion that the expert criterion may be the criterion of choice for ability tests if such findings are further replicated. Finally, there was a strong relationship (r=.993) between scores determined using American Consensus Weights and Consensus Scores determined with the present sample (Australian Consensus Weights), suggesting that American Consensus Weights may be applicable as a scoring criterion for use in research and applied contexts within other Western societies.

The theory underlying the MSCEIT purports that for EI to be considered an 'intelligence', measures of the construct should meet three standard intelligence criteria: (1) the variables measured need to be operationalized as abilities; (2) the variables should show a positive manifold of correlations; and (3) there should be age-related differences. The findings of the current study support the first and second of these criteria. The variables measured have been operationalized as abilities (Roberts et al., 2001), and a positive manifold of correlations amongst the subscales was found according to all three scoring methods, consistent with previous findings (Mayer et al., 2003). The age-related criterion was not supported: while there were small significant relationships between scores on the MSCEIT and age, the magnitude of these relationships suggests that there was very little difference in EI associated with age in the adult range of 18 to 79. Thus, the non-effect observed in this investigation may be due to the absence of children and adolescents in the sample. Indeed, Mayer et al. (1999) have found that adult criterion groups tend to exhibit higher EI than adolescent criterion groups. However, it should be noted that possible relationships between scores from the MSCEIT and age are potentially confounded by the possibility that the MSCEIT may measure aspects of both fluid and crystallized intelligence, which have been argued to decrease and increase, respectively, within the adult lifespan tested in this investigation (e.g., Kaufman, Reynolds, & McLean, 1989). Thus, more research is required to decompose the nature of the relationship between EI and intellectual intelligence, for the purposes of developing firm hypotheses about the possible relationship between EI and age.

Consistent with previous research, females were found to outperform males in EI by approximately half a standard deviation (Ciarrochi et al., 2000; Mayer et al., 1999). Importantly, this finding was consistent across the two scoring methods. In their evaluation of the MEIS, Roberts et al. (2001) found that gender differences in EI varied as a function of the scoring criteria used. The consistency of gender differences across the scoring methods found in the present study further illustrates the level of convergence between the two scoring methods for the MSCEIT. Mayer et al. (1999) acknowledge that no studies bear on whether this difference is biological or environmental. Research on EI in general has yet to address gender differences in more detail, despite the potential incremental understanding about the nature of the construct that might result. A better understanding of the nature and causes of gender differences in EI might, for example, help delineate the underlying biology, heritability and nature of cultural and environmental influences on EI.


In general, the reliability of the MSCEIT was found to be consistent with that reported previously (Mayer et al., 2003), according to all scoring methods. Split-half reliabilities for the Overall scale, Area and Branch scores were high for consensus-determined scores and relatively good for expert-based scores. However, somewhat lower reliability coefficients (ranging from a high of α=.86 to a low of α=.48) were found across the subscale scores in the present study. It should be noted that in the present study, MSCEIT V2.0 data were extracted (by MHS) from participants' responses to MSCEIT V1.1, which is more than twice the size of V2.0 (i.e., 292 items versus 141). Although the items are identical, participants completed a longer version of the test; thus, more research on new data using the shorter MSCEIT V2.0 is needed in order to more fully substantiate the test's reliability. It should also be noted that the 'Relationships' subscale from the MSCEIT V2.0 consists of only nine items. Thus, it is perhaps not surprising that the 'Relationships' subscale has lower reliability (α=.51) than the 'Pictures' subscale (α=.85), which has 30 items. Thus, the findings of the current study suggest that some further revising of the MSCEIT, aimed at establishing acceptable internal consistency reliability across all the subscales, may be required. In this context, it may be necessary for the test creators to add several internally consistent items to the current subscales, particularly those subscales from the Using Emotions to Facilitate Thought, Understanding Emotions and Managing Emotions Branches.

The results of the confirmatory factor analyses determined that a general factor dominated the covariance between the eight MSCEIT 2.0 subscales. There was some evidence for the existence of a nested, Strategic, Area level factor, based on the positive correlation between the two Branch level (3 and 4) factors. In contrast, there was little evidence in favour of the existence of an Experiential, Area level factor. More specifically, the Facilitation and the Sensations subscales do not appear to have any construct variance beyond that which they share with the general EI factor.

It could be argued that the nested Branch level factors should not be considered factors at all, because they do not meet the minimum of three variables required to identify/define a factor (Gorsuch, 1997, p. 545), nor do they meet the minimum of four indicators required to test the hypothesis of a common factor properly (Mulaik & Millsap, 2000, p. 40). That is, a factor with two variables could (and perhaps should) be considered a bivariate correlation. In fact, the branch level nested factor model could have been modeled by adding residual covariance links between the residuals of each pair of branch level subscales. The disadvantage of the correlated residual approach is that the residuals cannot be intercorrelated with each other. Thus, in this study, two-indicator factors were modeled to test the hypothesis that the branch level factors would correlate with each other positively, suggesting the existence of Area level factors. In both samples, positive correlations were found between the Understanding and Managing Branch "factors", suggesting the possible existence of a Strategic Area level factor.

It is well known that models with 'two-indicator factors' tend to create problems with respect to the production of non-positive definite matrices (Bentler & Jamshidian, 1994; Wothke, 1993), which are technically inadmissible in SEM. Mayer et al. (2003) stated, "In the four factor solution only, the two within-area latent variable covariances (i.e., between Perceiving and Facilitating, and between Understanding and Managing) were additionally constrained to be equal so as to reduce a high covariance between Perceiving and Facilitating branch scores" (p. 103). Unfortunately, Mayer et al. did not report the factor correlations obtained in their study. However, based on the analyses performed in this study, the correlation between Branch 1 and Branch 2 was very large (.94), which may have had an impact with respect to the observation of a non-positive definite matrix. Thus, it may be conjectured that Mayer et al. added the constraints to their model to yield a positive definite matrix. Based on our analyses, the non-constrained model produced parameter estimates that were within a sensible range of values, and, thus, the model was interpreted without the constraints.

Ultimately, the prospect of measuring Branch level factors via the MSCEIT is seriously compromised by the lack of subscales designed to measure each factor. For the purpose of proper identification, a minimum of three subscales for each branch would be required to meet the 'three-indicator rule' (Bollen, 1989).² Thus, a total of 12 subscales (3 subscales × 4 first-order factors) should probably be considered a substantial improvement, for the purposes of testing via CFA the 'Four-Branch Model of Emotional Intelligence' postulated by Mayer and Salovey (1997) (see also Gignac, in press). Although Mayer et al. (2003) acknowledge the lack of reliability within the subscales and, consequently, recommend the use of the MSCEIT by calculating Total scale scores, Area level scores and Branch level scores, the results of this investigation suggest that only one of the Area level scores should be calculated and interpreted (i.e., Strategic), and only three of the four Branch level scores should be calculated and interpreted: Perceiving, Understanding and Managing (but not Facilitating).

6. Conclusion

The MSCEIT V2.0 may be viewed as an improvement over the MEIS, with respect to the time required to administer the test, as well as an increase in the convergence between the two scoring methods: consensus and expert. However, there remain some significant problems with respect to internal consistency at the subscale level. Furthermore, and perhaps relatedly, the Facilitating Branch level factor, as well as the Experiential Area level factor, does not appear to be measured by the current version of the MSCEIT. The addition of valid items to the current subscales, as well as the creation of more subscales in general, could be argued to be the necessary next step in the development of the MSCEIT, for the purpose of measuring the theoretical model of EI proposed by Mayer and Salovey (1997).

References
Arbuckle, J. L. (2003). AMOS 5.0. Chicago: SmallWaters.
Bentler, P. M., & Jamshidian, M. (1994). Gramian matrices in covariance structural models. Applied Psychological Measurement, 18(1), 79-94.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Chen, F., Bollen, K. A., Paxton, P., Curran, P. J., & Kirby, J. B. (2001). Improper solutions in structural equation modeling. Sociological Methods and Research, 29(4), 468-508.
Ciarrochi, J. V., Chan, A. Y. C., & Caputi, P. (2000). A critical evaluation of the emotional intelligence construct. Personality and Individual Differences, 28, 539-561.

² Readers will note that the four oblique factor model (model 3) proposed by Mayer et al. (2003) and tested in this study did not meet the "three-indicator rule". However, it did meet the "two-indicator rule" (Bollen, 1989, p. 244), and, thus, was identified for the purpose of SEM. Nonetheless, identification problems and non-positive definite matrix issues, as was observed in this investigation, and discussed more thoroughly in Gignac (in press), are much less likely to occur when each factor has a minimum of three variables (see Bollen, 1989, chapter seven, for a more detailed discussion).


Ekman, P., & Friesen, W. V. (1975). Unmasking the face: A guide to recognizing the emotions from facial cues. Englewood Cliffs, NJ: Prentice Hall.
Gibbs, N. (1995, October 2). The EQ factor. Time Magazine, 146(14), 60-68.
Gignac, G. E. (in press). Evaluating the MSCEIT V2.0 via CFA: Corrections to Mayer et al., 2003. Emotion.
Goleman, D. (1995). Emotional Intelligence. New York: Bantam Books.
Gorsuch, R. L. (1997). Exploratory factor analysis: Its role in item analysis. Journal of Personality Assessment, 68(3), 532-560.
Gustafsson, J.-E., & Balke, G. (1993). General and specific abilities as predictors of school achievement. Multivariate Behavioral Research, 28(4), 407-434.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55.
Kaufman, A. S., Reynolds, C. R., & McLean, J. E. (1989). Age and WAIS-R intelligence in a national sample of adults in 20- to 74-year age range: A cross-sectional analysis with educational level controlled. Intelligence, 13, 235-253.
Little, Lindenberger, & Nesselroade (1999). On selecting indicators for multivariate measurement and modeling with latent variables: When "good" indicators are bad and "bad" indicators are good. Psychological Methods, 4(2), 192-211.
Mayer, J., Caruso, D. R., & Salovey, P. (2000b). Selecting a measure of emotional intelligence: The case for ability scales. In R. Bar-On, & J. D. A. Parker (Eds.), The handbook of emotional intelligence (pp. 320-342). San Francisco: Jossey-Bass.
Mayer, J., & Salovey, P. (1997). What is emotional intelligence? In P. Salovey, & D. Sluyter (Eds.), Emotional development and emotional intelligence: Implications for educators (pp. 3-31). New York: Basicbooks.
Mayer, J. D., Caruso, D., & Salovey, P. (1999). Emotional intelligence meets traditional standards for an intelligence. Intelligence, 27, 267-298.
Mayer, J. D., DiPaolo, M., & Salovey, P. (1990). Perceiving affective content in ambiguous visual stimuli: A component of emotional intelligence. Journal of Personality Assessment, 54, 772-781.
Mayer, J. D., & Geher, G. (1996). Emotional intelligence and the identification of emotion. Intelligence, 22, 89-113.
Mayer, J. D., Salovey, P., & Caruso (2000c). Models of emotional intelligence. In R. J. Sternberg (Ed.), Handbook of intelligence (pp. 396-420). New York: Cambridge.
Mayer, J. D., Salovey, P., & Caruso, D. R. (2000a). The Mayer, Salovey, and Caruso Emotional Intelligence Test: Technical manual. Toronto, ON: MHS.
Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2001). Emotional intelligence as a standard intelligence. Emotion, 1, 232-242.
Mayer, J. D., Salovey, P., Caruso, D. R., & Sitarenios, G. (2003). Measuring emotional intelligence with the MSCEIT V2.0. Emotion, 3, 97-105.
Mulaik, S. A., & Millsap, R. E. (2000). Doing the four-step right. Structural Equation Modeling, 7(1), 36-73.
Mulaik, S. A., & Quartetti, D. A. (1997). First order or higher order general factor? Structural Equation Modeling, 4(3), 193-211.
Ortony, A., Clore, G. L., & Collins, A. M. (1988). The cognitive structure of emotions. Cambridge: Cambridge University Press.
Roberts, R. D., Zeidner, M., & Matthews, G. (2001). Does emotional intelligence meet traditional standards for an intelligence? Some new data and conclusions. Emotion, 1, 196-231.
Salovey, P., & Mayer, J. D. (1990). Emotional intelligence. Imagination, Cognition and Personality, 9, 185-211.
Scherer, K. R., Banse, R., & Wallbott, H. G. (2001). Emotion inferences from vocal expression correlate across languages and cultures. Journal of Cross-Cultural Psychology, 32, 76-92.
Steiger, Shapiro, & Browne, M. K. (1985). On the multivariate asymptotic distribution of sequential chi-square statistics. Psychometrika, 50(3), 253-264.
Worthke, W. (1993). Nonpositive definite matrices in structural equation modelling. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 256-293). Newbury Park, CA: Sage.