
HEALTH EDUCATION RESEARCH Vol. 21 (Supplement 1) 2006
Theory & Practice Pages i73–i84
Advance Access publication 3 October 2006

Introducing multidimensional item response modeling in health behavior and health education research

Diane D. Allen and Mark Wilson*

Graduate School of Education, University of California, Berkeley, CA 94720, USA
*Correspondence to: M. Wilson. E-mail: markw@calmail.berkeley.edu

Downloaded from http://her.oxfordjournals.org/ by guest on November 17, 2015

Abstract

When measuring participant-reported attitudes and outcomes in the behavioral sciences, there are many instances when the common measurement assumption of unidimensionality does not hold. In these cases, the application of a multidimensional measurement model is both technically appropriate and potentially advantageous in substance. In this paper, we illustrate the usefulness of a multidimensional approach to measurement using an empirical example taken from the Behavior Change Consortium. Data from the Treatment Self-Regulation Questionnaire have been analyzed to investigate whether self-regulation can be regarded as a single construct, or if it has multiple dimensions based on the type of regulation or motivation that participants say helps them consider an improvement in healthy behavior. Comparison with consecutive analyses shows the advantages of multidimensional measurement for interpreting participant-reported data.

The question of dimensionality

Health education and behavioral change programs directed toward smoking cessation, diet improvement or maintenance of adequate physical activity involve complex mediators of behavior change. To assess the effectiveness of such programs, researchers will need to consider that some of these mediators may be multidimensional (i.e. have more than one dimension). In creating scales or surveys to examine such mediators, researchers want to have sufficient items for adequate reliability (i.e. a high proportion of the variance considered true rather than attributable to error) but not so many that they overburden participants. Interpreting a single score across all items (i.e. conducting a composite analysis) may result in adequate reliability at the expense of loss of information about the separate dimensions. Interpreting separate scores on each of the dimensions (i.e. conducting a consecutive analysis) may improve information about each subscale, but at the expense of lower reliability for the subscales than for the composite scale. One solution to this dilemma, and the focus of this paper, is to conduct a simultaneous or ‘multidimensional’ analysis of the subscales.

One proposed mediator of behavior change is examined with the Treatment Self-Regulation Questionnaire (TSRQ) [1]. Figure 1 illustrates a way to understand the structure of this scale. Subdividing a construct like self-determination can potentially continue until each item represents a single dimension. However, the number of dimensions depends on both the substantive meaning of these subscales and the number of items needed to define them with sufficient reliability. One of the authors of the TSRQ confirmed for us the substantive subscales shown in Fig. 1 for this scale and the items associated with each subscale (G. C. Williams, personal communication, 26 August 2004).

© 2006 The Author(s). doi:10.1093/her/cyl086

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

[Figure 1: hierarchy diagram]
Self-Determination: Autonomous regulation of behavior
  Autonomous reasons for healthy behavior
    Identified reasons for healthy behavior (items)
    Integrated reasons for healthy behavior (items)
    Intrinsic reasons for healthy behavior (items)
  Amotivated reasons for healthy behavior (items)
  Controlled reasons for healthy behavior
    Introjected reasons for healthy behavior (items)
    Externally Controlled reasons for healthy behavior (items)

Fig. 1. Example of a multidimensional instrument: the TSRQ and its reasons for healthy behavior.

The number of items and the statistical model employed to analyze data resulting from a scale like the TSRQ can have an impact on the reliability estimate and the interpretation of its subscales. In this paper, we consider three methods to analyze the TSRQ: (i) A ‘composite’ analysis examines self-determination based on a single score incorporating all items across subscales for each participant; (ii) A ‘consecutive’ set of analyses examines each subscale alone as a separate subscale of reasons for healthy behavior and analyzes data from these subscales one at a time; and (iii) A ‘multidimensional’ analysis examines the different subscales simultaneously, incorporating the responses to all the items indicating reasons for healthy behavior on the whole scale and capitalizing on the correlations between subscales. The actual use of the TSRQ most closely follows the consecutive method with average scores on subscales reported separately. The composite method is added somewhat artificially here for illustrative purposes since it is commonly employed with many scales used in health education research and the behavioral sciences.

The purpose of this paper is to illustrate the usefulness of a multidimensional approach to measurement by comparing these three different analyses of the TSRQ [1]. We use data from the Behavior Change Consortium (BCC), which was a group of 15 projects designed to explore how people manage changes in health behaviors [2]. Although several of the BCC projects used the TSRQ, this paper targeted data from a project promoting diet improvement and smoking cessation. We first describe the TSRQ and the context of our data within the BCC project. The following section briefly introduces and describes a multidimensional version of the item response model described in other papers in this supplement [3, 4]. This is followed by a summary of results obtained using the multidimensional model, and comparisons with consecutive and composite analyses of the data arising from the TSRQ.

The TSRQ

Self-determination, the construct underlying the TSRQ, means one chooses behavior for one’s


own reasons rather than basing choices on other people’s expectations or desires. It is defined as deciding on one’s actions independent of others. People who are self-determined feel free to do what is interesting, personally important and vitalizing (http://www.scp.rochester.edu/SDT/). Evidence for Self-Determination Theory (SDT) as applied to choices for healthy living would include indications that greater self-determination or ‘self-regulation’ is associated with an increase in and maintenance of healthy behaviors [5]. Our data set, however, only included pre-intervention data.

The TSRQ assesses self-determination by asking how people regulate their behaviors regarding health, whether the behavior in question is diet improvement or smoking cessation [1]. Self-regulation is thought to be a continuum anchored by ‘autonomous’ or self-determined reasons for behavior at one end and ‘controlled’ reasons for behavior at the other [5]. Autonomous behavior ‘has an internal perceived locus of causality and is experienced as chosen and volitional’ [6, p. 768]; autonomous items are divided into identified (ID), integrated (INT) and intrinsically motivated (INTR) reasons for healthy behavior. Controlled behavior has an external causality, such that individuals feel pressured by demands and contingencies; controlled items are divided into introjected (IJ) and externally controlled (EC) reasons for healthy behavior. Amotivated (AMOT) behavior lacks an identified motive, e.g. ‘I really don’t think about it’. Thus, out of a single construct of self-determination, six distinct item sets evolved (see Fig. 1) that might then function as separate dimensions. Figure 2 depicts the items and their associated dimensions within the TSRQ.

The data from the TSRQ for this paper came from the University of Rochester BCC project. This project had two foci: (i) testing a SDT model of maintained smoking cessation and diet improvement and (ii) testing an intervention against a ‘usual’ treatment [5]. In the intervention, a counselor contacted participants four to six times to educate and promote autonomous motivation for change, and support for change if attempted. Smoking adults were recruited through physicians’ offices and insurers, and through announcements and advertisements. Once recruited, participants were stratified by amount of low-density lipoprotein cholesterol in their lipid profiles. For those with higher cholesterol, diet counseling was incorporated along with counseling about smoking

The reason I would (eat a healthy diet/stop smoking) is:

1. ID Because I feel that I want to take responsibility for my own health.


2. IJ Because I would feel guilty or ashamed of myself if I did not eat a healthy
diet/stop smoking.
3. ID Because I personally believe it is the best thing for my health.
4. EC Because others would be upset with me if I did not.
5. AMOT I really don’t think about it.
6. INTR Because I have carefully thought about it and believe it is very important for
many aspects of my life.
7. IJ Because I would feel bad about myself if I did not eat a healthy diet/stop
smoking.
8. INT Because it is an important choice I really want to make.
9. EC Because I feel pressure from others to do so.
10. AMOT Because it is easier to do what I am told than think about it.
11. INT Because it is consistent with my life goals.
12. EC Because I want others to approve of me.
13. ID Because it is very important for being as healthy as possible.
14. EC Because I want others to see I can do it.
15. AMOT I really don’t know why.

Choices offered:
1 (not at all)   2   3   4 (somewhat true)   5   6   7 (very true)
Fig. 2. The TSRQ items and the dimensions to which the authors attribute them.
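
As a minimal sketch of how the item-to-dimension assignments in Fig. 2 translate into raw subscale scores, the Python below sums responses by dimension across the diet and smoking versions of the 15 items. It assumes responses arrive as integers from 1 to 7 and shifts them to 0 to 6 before summing, following the recoding reported for these data. The names `ITEM_DIMENSION`, `ITEMS_PER_DIMENSION` and `raw_scores` are illustrative only; they are not taken from the TSRQ materials or from ConQuest.

```python
from collections import Counter

# Item number -> dimension label, transcribed from Fig. 2.
ITEM_DIMENSION = {
    1: "ID", 2: "IJ", 3: "ID", 4: "EC", 5: "AMOT",
    6: "INTR", 7: "IJ", 8: "INT", 9: "EC", 10: "AMOT",
    11: "INT", 12: "EC", 13: "ID", 14: "EC", 15: "AMOT",
}


def raw_scores(diet, smoking):
    """Sum recoded responses (0-6) per dimension over both 15-item contexts.

    `diet` and `smoking` are hypothetical dicts mapping item number (1-15)
    to a response on the original 1-7 scale.
    """
    totals = Counter()
    for responses in (diet, smoking):
        for item, response in responses.items():
            if not 1 <= response <= 7:
                raise ValueError(f"response out of range for item {item}")
            totals[ITEM_DIMENSION[item]] += response - 1  # recode 1-7 -> 0-6
    return dict(totals)


# Number of items per dimension across the two contexts (30 items in total).
ITEMS_PER_DIMENSION = {
    dim: 2 * count for dim, count in Counter(ITEM_DIMENSION.values()).items()
}
```

With every response set to 7, `raw_scores` gives 12 for INTR and 24 for IJ, and the six totals sum to 180, which matches the raw-score ranges quoted for the composite and for these two dimensions.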


cessation. Participants completed the TSRQ for both diet improvement and smoking cessation; only baseline data were analyzed for this paper (n = 931).

The participants in this project completed 30 items [5]: 15 each related to diet improvement and smoking cessation (Fig. 2). The six dimensions were represented with different numbers of items: ID had six, INT had four, INTR had two, IJ had four, EC had eight and AMOT had six. Each item had seven response categories with the following designations: 1 = not at all, 4 = somewhat true, 7 = very true. Response categories were recoded 0–6 for ease of analysis. The characteristics of the participants who completed the TSRQ for diet improvement and smoking cessation are shown in Table I.

Table I. Characteristics of the participants in this data set (n = 931)

Participant characteristics                                  n    %a
Gender                                                     928
  Male                                                     333    36
  Female                                                   595    64
Race                                                       928
  White, not Hispanic                                      765    82
  Black, not Hispanic                                      121    13
  Hispanic                                                  20     2
  Other                                                     22     2
Age                                                        930
  18–37                                                    235    25
  38–45                                                    228    25
  46–52                                                    223    24
  53–82                                                    244    26
Marital status                                             929
  Not married                                              229    25
  Presently married                                        355    38
  Living with partner                                       87     9
  Divorced                                                 177    19
  Separated                                                 50     5
  Widowed                                                   31     3
Education level                                            781
  Less than high school diploma                             68     9
  High school diploma                                      228    29
  Some college, trade, vocational or technical school      294    38
  Four years college/graduated                              88    11
  Post-graduate work                                       103    13
Income (per year)                                          901
  $0–39 999                                                524    58
  $40 000–79 999                                           290    32
  $80 000+                                                  87    10

a Percent values may not add to 100 for some characteristics because of rounding.

A multidimensional item response model

The potential usefulness of multidimensional item response models has been recognized for many years in the psychological and educational literature [7–11]. Development of these models is well-covered elsewhere [12, 13], including exploration of the consequences of looking at multidimensional data as if it is a single dimension [14]. Despite this development in the methodological literature, the application of multidimensional item response models in the behavioral sciences has been limited. This has probably been due (i) to the statistical problems that have been involved in fitting such models and (ii) to the difficulty associated with interpreting multidimensional item response models. The statistical problems of multidimensional item response models have been addressed with the development of the multidimensional random coefficients multinomial logit (MRCML) model [12]. The MRCML model is an extension of the Rasch family of item response models, and, in particular, is a direct extension of the model described elsewhere in this supplement by Wilson et al. [3]. It was developed to have sufficient flexibility to represent a wide range of Rasch family models, including those that apply to scales having either yes/no or Likert-type responses. The MRCML models can also apply to scales having multiple items arising from the same stem or more complex situations involving differential item functioning (e.g. see Baranowski et al., this volume [15]), raters and other measurement features. The MRCML is built up from a basic conceptual building block. To illustrate this building block, assume that an Item i with ordered categories of response (e.g. strongly disagree, disagree, agree, strongly agree) indexed by k helps to measure a single unique dimension d (d = 1, ..., D). (If the item is intended to measure only one dimension, the model is labeled multidimensional ‘between items’. If the item is intended to measure two or more dimensions, the model is labeled multidimensional ‘within items’


[12]. The latter model is beyond the scope of this paper.) The relationship of this item and response to a participant’s level of attitude on dimension $d$ is depicted in Equation 1. Here, the probability that a participant’s response is in category $k$ of Item $i$ ($P_{ik}$) rather than category $k-1$ ($P_{i(k-1)}$) is related to the level of that participant’s attitude on that dimension ($\theta_d$) and the relative difficulty of category $k$ ($\delta_{ik}$) to endorse with that level of agreement:

$$\log(P_{ik}/P_{i(k-1)}) = \theta_d - \delta_{ik}. \qquad (1)$$

Moreover, each participant has several $\theta_d$ values, one for each dimension of attitude measured by the scale: $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_D)$, where the dimensions are allowed to be non-orthogonal (i.e. correlated). For example, in the TSRQ, the dimensions as depicted in Fig. 1 are subcomponents of participant self-determination, allowing participants to have different levels of endorsement for ‘autonomous’, ‘amotivated’ or ‘controlled’ reasons for healthy behavior. Besides providing a way to model the several dimensions of the TSRQ, the MRCML models allow the researcher to determine whether participants use the response categories consistently across all items within a dimension (as in a ‘rating scale model’) or differently across different items (as in a ‘partial credit model’) [16].

The data for creating the MRCML model building blocks consist of the responses all the participants make to all the items in the scale. The researcher specifies which MRCML model to try, and then a complex estimation procedure (now greatly simplified by appropriate software) generates the numerical values for the items and participants on a log odds or ‘logit’ scale. These estimated values maximize the fit to the specified model. The estimated values include the item parameters and the population means, variances and covariances of the $\theta_D$ parameters as related to the scale. For this paper, we fit a partial credit MRCML model to the data, using marginal maximum likelihood (MML) estimation as implemented in the ConQuest software [17]. (The partial credit model fit better than the rating scale model using a likelihood ratio test, $\chi^2$ = 521.4, df = 145, p < 0.0001, so it was used throughout the analyses.) In this technique, log response probabilities are summed up over items and participants into a likelihood function. Maximum likelihood estimates and asymptotic standard errors are found iteratively using the expectation-maximization (EM) algorithm, and person estimates are calculated using an Expected A Posteriori (EAP) approach [18, 19]. A detailed discussion of parameter estimation in the MRCML models can be found in Adams et al. [12].

Modeling self-determination

We chose to illustrate the multidimensional analysis using the six dimensions of the TSRQ, as described above. It would also be possible to group the autonomous, controlled and amotivational items as three dimensions as is usually done in scoring the TSRQ, or the diet improvement and smoking cessation items as two separate dimensions, but we chose to examine these, as there were interesting patterns among the six dimensions. To enhance our illustration, we compare three approaches with the analysis of these data: the composite, consecutive and multidimensional approaches.

Composite approach

In the composite approach, the sum of scores received on the 30 items, ranging between 0 and 180, could be treated as the indicator for a single estimate of participants’ perception of self-determination. The probability of a response in one of the seven categories, as opposed to the previous category, for each of the 30 TSRQ items is given by:

$$\log(P_{ik}/P_{i(k-1)}) = \theta - \delta_{ik}, \qquad (2)$$

where $\theta$ and $\delta$ are as defined above.

Analyzing the data compositely with the software package ConQuest [17] produces estimates for a total of 181 parameters: 180 item and step difficulties (one average item difficulty and five estimated steps from one response category to the next for each item; with seven categories, items are constrained to have only six identifiable parameters) and one population variance. In the analysis, we constrained the population mean to zero so that all


item parameters could be estimated while ensuring parameter identification.

The composite approach has the advantage of parsimony in modeling the participant perspective, summarizing self-determination with a single number and its associated standard error. Furthermore, the reliability of participant self-determination estimates is quite high at 0.85 (calculated using the MML formulation [20]). Yet a clear disadvantage is that the differential information about self-determination relative to the six dimensions of reasons for healthy behavior is not represented in these results.

Consecutive approach

In the consecutive approach, the sum of scores on items associated with the six dimensions of reasons for healthy behavior are treated as six different indicators and modeled separately in different analyses. Separate models are used for each dimension using ConQuest. The difference between the composite and consecutive approaches is depicted conceptually in Fig. 3. In the consecutive approach, 186 parameters would be estimated, such that each dimension had one variance, one parameter for each item and one parameter for each estimated step. For example, for the ID dimension, we had one variance, six item parameters and 30 step parameters (six items × five steps). For INT, we had one variance, four item parameters and 20 step parameters. Likewise for INTR, IJ, EC and AMOT, there was one variance each, one parameter for each item (2 + 8 + 4 + 6 = 20) and one parameter for each step (20 × 5). As in the composite approach, the population means were all constrained to be zero.

The consecutive approach has the advantage of producing self-determination estimates and standard errors for all six dimensions of reasons for healthy behavior. Yet the consecutive approach ignores the possibility that self-determination across the six variables might be interrelated. Because the number of items defining each dimension is necessarily smaller, the standard errors of the consecutive estimates are larger than those resulting from a composite analysis. Reliabilities from the consecutive analyses range from 0.50 to 0.83 (see Table II), which are all smaller than the composite reliability of 0.85.

[Figure 3: two path diagrams]
Composite approach: SD = single estimate of latent self-determination. Thirty items (X1 ... X30) contribute to the SD. Xi = individual items.
Consecutive approach: six independent estimates of latent self-determination, for the six dimensions, respectively: ID (X1 ... X6), INT (X1 ... X4), INTR (X1 ... X2), EC (X1 ... X8), IJ (X1 ... X4) and AMOT (X1 ... X6). Xi = individual items within a dimension. A different number of items contributes to the different values, depending on the number of items in each dimension.

Fig. 3. Modeling self-determination: composite and consecutive approaches (after [14]).
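
The parameter counts quoted for the composite and consecutive analyses can be tallied with a short sketch. The Python below assumes, as the text states, one average difficulty plus five step parameters per seven-category item and one population variance per fitted dimension, with means fixed at zero. The function names are illustrative and do not correspond to anything in ConQuest.

```python
# Items per dimension across the 30 TSRQ items, as reported in the text.
ITEMS = {"ID": 6, "INT": 4, "INTR": 2, "IJ": 4, "EC": 8, "AMOT": 6}
CATEGORIES = 7                     # response categories per item
STEPS_PER_ITEM = CATEGORIES - 2    # five steps beside the average item difficulty


def item_and_step_count(n_items):
    """One average difficulty plus five step parameters for each item."""
    return n_items * (1 + STEPS_PER_ITEM)


def composite_parameters(items=ITEMS):
    """All items on a single dimension, plus one population variance."""
    return item_and_step_count(sum(items.values())) + 1


def consecutive_parameters(items=ITEMS):
    """Each dimension fitted alone: its items, its steps and its own variance."""
    return {dim: item_and_step_count(n) + 1 for dim, n in items.items()}


print(composite_parameters())                      # 181
per_dimension = consecutive_parameters()
print(per_dimension["ID"], per_dimension["INT"])   # 37 25
print(sum(per_dimension.values()))                 # 186
```

The ID figure of 37 is the one variance, six item parameters and 30 step parameters given as the worked example for that dimension.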


Multidimensional approach

The multidimensional approach can be viewed as a compromise between the composite and consecutive approaches, one that incorporates the best of both approaches. The scores on each dimension provide distinct information about each participant, yet by incorporating the correlation between the dimensions directly into the model, the reliability for each of the six self-determination dimensions is closer to the composite reliability. Figure 4 illustrates the multidimensional approach. Note how there is a direct influence of the latent attitude for each dimension on the items assigned to that dimension depicted through the straight arrows, but that there is also an influence from all the other dimensions through the curved lines (which represent associations rather than causes). This approach can be modeled using the MRCML model to estimate latent attitudes across the six self-determination dimensions simultaneously. The multidimensional approach uses the formulation from Equation 1, repeated here for convenience (compare with Equation 2):

$$\log(P_{ik}/P_{i(k-1)}) = \theta_d - \delta_{ik}. \qquad (3)$$

Note that the major difference between Equations 2 and 3 is that here the person estimate, $\theta_d$, is now also subscripted by the dimension, $d$. The multiple $d$ values allow the researcher to model participants’ level of attitude on each of the separate dimensions represented in the scale.

There are now six population means for this analysis (constrained to zero for identification), six variance estimates and 15 covariance estimates. Altogether, a total of 201 parameters are estimated using ConQuest.

Table II. Comparing reliabilities of three analyses of the self-determination data

Dimension                          Consecutive   Multidimensional   Composite
                                   approach      approach           approach
ID                                 0.77          0.83               NA
INT                                0.75          0.84               NA
INTR                               0.50          0.84               NA
EC                                 0.83          0.88               NA
IJ                                 0.78          0.85               NA
AMOT                               0.60          0.70               NA
Self-determination (whole scale)   NA            NA                 0.85

NA = not applicable.

Because the multidimensional approach is hierarchically related to the composite approach (i.e. the models are nested), the model fit can be

[Figure 4: path diagram]
Multidimensional approach: six correlated estimates of latent self-determination, for the six dimensions, respectively: ID (X1 ... X6), INT (X1 ... X4), INTR (X1 ... X2), EC (X1 ... X8), IJ (X1 ... X4) and AMOT (X1 ... X6). Xi = individual items within a dimension.

Fig. 4. Modeling self-determination: multidimensional approach, showing different numbers of items attributable to each dimension.
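
To make the adjacent-category formulation of Equations 1 and 3 concrete, the sketch below turns one person location and a set of step difficulties into probabilities for the seven response categories. It is a minimal illustration of the partial credit form, not the ConQuest implementation; the numerical values of `theta` and `deltas` are hypothetical and chosen only for the example.

```python
import math


def category_probabilities(theta_d, deltas):
    """Return P(k) for k = 0..K given adjacent-category logits theta_d - delta_k.

    `deltas` holds the K step difficulties delta_i1..delta_iK for one item.
    """
    # Cumulative sums of (theta_d - delta_k); category 0 has an empty sum of 0.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta_d - delta))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: one person location and six step difficulties (logits).
theta = 0.5
deltas = [-1.5, -0.8, -0.2, 0.4, 1.0, 1.7]
probs = category_probabilities(theta, deltas)

for k, delta in enumerate(deltas, start=1):
    # Each adjacent log-odds reproduces theta_d - delta_ik.
    assert math.isclose(math.log(probs[k] / probs[k - 1]), theta - delta)
```

In the multidimensional case the same function would be called with the $\theta_d$ of whichever dimension the item belongs to, which is what the subscript in Equation 3 expresses.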


compared using the change in the deviance ($G^2$) value (the change in deviance indicates ‘relative’ model fit). The difference in deviance between the two nested models will approximate a $\chi^2$ distribution with the number of degrees of freedom equal to the difference in the number of parameters estimated by the two models. Using the data from Table III, the difference in deviance between the two models is 6452.287, and the difference between the number of parameters is 20. This indicates that the multidimensional model fits the data significantly better than the composite model. On the basis of a comparison of Akaike’s Information Criterion (AIC) [21], a method of comparing non-nested models analyzing the same data, the multidimensional model fits the data better than both the composite and the consecutive model (i.e. the AIC value is lower for the multidimensional approach). These indicators of statistical significance lead one to expect that there will also be differences between the models in terms of effect size (i.e. at the interpretational level). We illustrate several of these below with respect to (i) reliability, (ii) estimated correlations among dimensions and (iii) self-determination estimates under the different models.

Interpreting multidimensionality

Earlier the claim was made that an advantage of the multidimensional approach is an improvement in reliability relative to the consecutive approach. Table II supports this claim by showing the reliability for each of the six self-determination dimensions when using the two approaches. Under the multidimensional approach, the reliability for each self-determination dimension comes closer to the composite reliability of 0.85 (we use this reliability as the standard because it incorporates all the items possible in this scale). The most notable difference is for the dimension labeled INTR, which only had two items of its own (one from each of the contexts). Its reliability is considerably enhanced (error due to randomness of responses to items is diminished) by correlational information from responses to the other items in the multidimensional analysis. Components that had a larger number of items, such as EC with eight, had a smaller increase in reliability. Wang et al. [13] provide additional examples of reliability enhancement through multidimensional analysis. Note that these increases are conditional on using all the information available in the scale: if, say, the INTR results were to be recalculated using just the two INTR items, then the reliability would fall back to a level similar to that for the consecutive approach.

The multidimensional and consecutive approaches can also be compared with respect to estimated correlations between the self-determination dimensions (Table IV). ConQuest calculates both the correlation and covariance between the dimensions automatically in a multidimensional analysis. In this case, the correlation between the six dimensions ranged from −0.65 (between INTR and AMOT) to 0.05 (between EC and ID) to 0.98 (between INT and INTR). The highest positive correlations are between dimensions within the autonomous set, as might be expected by the
Table III. Comparing three analyses of the self-determination data

Approach                     No. of parameters   G2          AIC
Composite                    181                 73000.408   73362.408
Consecutive (six combined)   186                 69107.112   69479.112
Multidimensional             201                 66548.121   66950.121

Multidimensional compared with composite approach: likelihood ratio test (G2): χ2 = 6452.287, df = 20, p < 0.00001.
Multidimensional compared with consecutive approach (AIC): 66950.121 < 69479.112.

Table IV. Correlations between self-determination dimensions

        ID      INT     INTR    EC      IJ      AMOT
ID      1.0     0.75    0.75    0.05    0.22    0.37
INT     0.97    1.0     0.76    0.10    0.24    0.41
INTR    0.97    0.98    1.0     0.11    0.25    0.42
EC      0.05    0.14    0.13    1.0     0.63    0.13
IJ      0.27    0.32    0.33    0.82    1.0     0.01
AMOT    −0.58   −0.62   −0.65   0.19    0.00    1.0

Multidimensional correlations are shown below the diagonal; consecutive correlations above the diagonal.
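
The comparisons summarized under Table III can be reproduced from the tabulated deviances and parameter counts. The sketch below assumes the usual definition of AIC as deviance plus twice the number of parameters, which matches the tabulated AIC values, and uses a closed-form upper-tail chi-square probability that is valid only for even degrees of freedom. The helper names are illustrative.

```python
import math

# (number of parameters, deviance G2) transcribed from Table III.
MODELS = {
    "composite": (181, 73000.408),
    "consecutive": (186, 69107.112),
    "multidimensional": (201, 66548.121),
}


def aic(n_params, deviance):
    """Akaike's Information Criterion: deviance plus twice the parameter count."""
    return deviance + 2 * n_params


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


k_comp, g2_comp = MODELS["composite"]
k_multi, g2_multi = MODELS["multidimensional"]
lr_statistic = g2_comp - g2_multi   # change in deviance between nested models
lr_df = k_multi - k_comp            # difference in parameter counts
p_value = chi2_sf_even_df(lr_statistic, lr_df)

aic_values = {name: aic(k, g2) for name, (k, g2) in MODELS.items()}
best_by_aic = min(aic_values, key=aic_values.get)
```

With these inputs the deviance difference is 6452.287 on 20 degrees of freedom and the multidimensional model has the lowest AIC. The p-value underflows to zero in double precision, which is consistent with the reported p < 0.00001.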


relationship depicted in Fig. 1. Negative correlations are noted between the amotivated dimension and the three autonomous dimensions, a surprise, since the controlled dimensions, not the amotivated dimension, were hypothesized to lie at the opposite end of the scale from the autonomous dimensions. The controlled and autonomous sets of dimensions are relatively uncorrelated. So a person who chooses mostly 7s (‘very true’) on items associated with autonomous dimensions will generally choose lower numbers (not true) on items indicating amotivation, but will have no identifiable pattern of response on items associated with controlled dimensions. Whether or not these correlation statistics lead to revision of ideas about self-determination, they clearly indicate that modeling the data with a single composite dimension will not do justice to the interesting complexity of the scale, and hence, these results should be considered when designing mediator studies. When calculated using the consecutive approach, the correlations between the case estimates for all six dimensions are smaller (closer to zero) than when calculated using the multidimensional approach. Unlike the variance–covariance matrix produced by ConQuest using the multidimensional model, the consecutive approach gives correlations that are attenuated due to measurement error.

Using only the multidimensional estimates of participant attitudes about their self-determination, a comparison of two of the dimensions is plotted in Fig. 5. Both sets of $\theta$ values were standardized by creating z values [(case estimate − mean of case estimates)/standard deviation of case estimates] so that they could be compared directly. Note that the data are not continuous: each possible integer raw score value (0–12 for INTR and 0–24 for IJ) has a corresponding logit value. Positive values indicate that these participants chose higher response categories indicating the item statements were ‘more true’ for them. Participants’ perception of self-determination along the autonomous dimension INTR and the controlled dimension IJ shows the low correlation, 0.33, between these two dimensions. Note that participants with a raw score of 12 out of 12 on INTR could have any of the whole range of raw scores from 0 to 24 on IJ. Which participants thus have greater self-determination? Multidimensional analysis has allowed us a closer look at the data, with two possible interpretive consequences. First, the items in each of these dimensions may require additional examination to see if they indeed represent autonomous and controlled reasons for healthy behavior as the authors of the TSRQ projected [6]; if not, modification of the items might result. Second, the idea that autonomous and controlled reasons fit on opposite ends of a continuum in a construct for self-determination [5] may not hold for all these dimensions; refinement of the construct might result. In any case, these empirical data allow a closer look at the interpretation possible for the common practice of averaging over autonomous and controlled items of the TSRQ and finding the difference between them.

[Figure 5: scatter plot of IJ case estimates (vertical axis, about −2 to 3.5) against INTR case estimates (horizontal axis, about −3.5 to 1.5)]
Fig. 5. Standardized estimates of participants’ perception of self-determination on two dimensions.

Another way to compare the self-determination estimates across dimensions, $d$, is to use a sum of squares indicator, DI, for each participant, $p$ [14].

$$DI_p = \sum_{d=1}^{6} (\theta - \theta_d)^2$$

If we (somewhat arbitrarily) set the threshold for a discrepant case at $DI_p = 1.0$, there are 831


participants (89%) in this sample for whom dimensional estimates would reveal differing stories about participant perception of self-determination. Analyzing examples of discrepant cases indicates that there might be important diagnostic or interpretive ramifications of ignoring multidimensionality when assessing participant perception. For most participants in our sample, the dimensional scores look very different from a composite score. Consider, for example, two of the cases in our data, each having a raw score of 94 (out of 180 possible). Table V shows the differences in the raw scores across the dimensions that are obscured by reporting only the composite score. Participant number 522’s responses showed more autonomous types of self-determination, compared with number 598’s more controlled types of self-determination. The fact that the TSRQ is not generally reported as a single composite score is consistent with the problem noted here of comparing two participants with the same score; other scales may have similarly important interpretational differences between the composite and multidimensional results. The importance for the TSRQ is that, according to SDT, those who show more autonomous types of self-determination should have a better chance of changing behavior than anyone else [5, 6]. Reliance only on some composite estimate may falsely identify participants that will successfully change behavior. Comparing the multidimensional estimates with evidence of behavior change may support the theory and can help guide appropriate intervention.

Note that in the multidimensional analysis, participant 598’s INTR dimension is strongly influenced by the other dimensions, which should be expected because it has the fewest items. The ID dimension is also somewhat influenced by other dimensions in the multidimensional analysis of both cases.

Discussion

In much of the behavioral sciences, there is a desire either to (i) disaggregate the scores from a scale into subscales and report them as separate dimensions of participant attitude or perception or (ii) aggregate the scores of subscales and report this as a single dimension of perception or outcome. Both these scenarios depart from the measurement ideal of single-dimension scales. The MRCML models are useful tools when the objective is to measure more than one latent construct. In this paper, we analyzed the TSRQ as a single dimension to illustrate the composite approach, as separate and isolated dimensions to illustrate the consecutive approach and as a scale with multiple correlated dimensions to illustrate the multidimensional approach using a MRCML model. Our example taken from the BCC data on the TSRQ highlights some important differences between the three approaches. The essential difference is that the consecutive approach is simply a single-dimensional model repeated a number of times using subsets of the full range of items on a given scale. Because there are fewer items, just those items included that define each latent domain, the standard error of measurement

Table V. Comparing ability estimates in logits (standard error) of two participants with the same self-determination score of 94/180

                            θ_ID          θ_INT         θ_INTR        θ_EC          θ_IJ          θ_AMOT        θ_SD-composite
Participant 522
Raw score (% of total)      19            12            7             4             5             5             52
Consecutive estimate        0.27 (0.58)   0.21 (0.51)   0.35 (0.81)   0.00 (0.26)   0.24 (0.29)   0.38 (0.22)   0.14 (0.11)
Multidimensional estimate   0.44 (0.57)   0.12 (0.52)   0.30 (0.97)   0.03 (0.26)   0.22 (0.30)   0.38 (0.22)
Participant 598
Raw score (% of total)      18            7             4             9             9             4             52
Consecutive estimate        0.54 (0.49)   0.91 (0.30)   0.71 (0.38)   0.45 (0.21)   0.79 (0.31)   0.33 (0.22)   0.14 (0.11)
Multidimensional estimate   0.71 (0.49)   1.02 (0.30)   1.10 (0.41)   0.43 (0.22)   0.82 (0.32)   0.34 (0.22)
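
As a rough illustration of the z-value standardization and the sum of squares indicator DI described in the text, the following Python sketch standardizes each dimension across participants and then computes DI for each participant. It is only a sketch under stated assumptions: the θ values are hypothetical (they are not the Table V or BCC data), the reference point for each participant is taken to be the mean of that participant's own standardized estimates, the population standard deviation is used, and a case is flagged when DI is at or above the threshold of 1.0. The function names are ours and do not come from any published software.

```python
from statistics import mean, pstdev

DIMENSIONS = ["ID", "INT", "INTR", "EC", "IJ", "AMOT"]
THRESHOLD = 1.0  # the (somewhat arbitrary) cut-off mentioned in the text


def standardize(column):
    """Turn one dimension's case estimates into z values.

    Assumes the population SD; the column must not be constant.
    """
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]


def discrepancy_indicator(z_row):
    """Sum of squared deviations of one participant's z values
    from that participant's own mean across dimensions (assumed)."""
    centre = mean(z_row)
    return sum((centre - z) ** 2 for z in z_row)


def flag_discrepant(theta_rows, threshold=THRESHOLD):
    """Return (DI, is_discrepant) for every participant.

    theta_rows: one list of logit estimates per participant,
    ordered by dimension.
    """
    # Standardize within each dimension (column), then regroup by participant.
    z_columns = [standardize(list(col)) for col in zip(*theta_rows)]
    z_rows = [list(row) for row in zip(*z_columns)]
    results = []
    for row in z_rows:
        di = discrepancy_indicator(row)
        results.append((di, di >= threshold))
    return results


# Hypothetical estimates for four participants on six dimensions.
HYPOTHETICAL_THETAS = [
    [0.4, 0.1, 0.3, -0.5, -0.6, -0.4],
    [0.2, 0.2, 0.1, 0.2, 0.1, 0.2],
    [-0.7, -1.0, -1.1, 0.4, 0.8, 0.3],
    [0.1, 0.0, 0.2, 0.0, -0.1, 0.1],
]

if __name__ == "__main__":
    for pid, (di, flagged) in enumerate(flag_discrepant(HYPOTHETICAL_THETAS), 1):
        print(f"participant {pid}: DI = {di:.2f}, discrepant = {flagged}")
```

A participant whose standardized estimates are similar on every dimension gets a DI near zero, whereas a participant who is high on some dimensions and low on others gets a large DI even when the composite score is unremarkable.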

for person estimates increases and the reliability of estimates does not attain that of the full composite model. In other words, the multidimensional approach results in less error than the consecutive approach. In addition, the lower deviance scores with the multidimensional approach indicate less error than the composite approach when comparing model fit. Thus, we present the multidimensional approach as an improvement over both the composite and consecutive approaches. The approach provides distinct estimates for multiple latent constructs, yet by modeling the dimensions as interrelated, the reliability of the estimates comes closer to that found under the much less informative composite analysis.

Structural Equation Modeling (SEM) is another method that allows for multidimensional analysis of scales. Apart from the aim to be consistent with the approach used in earlier articles in this special issue [3, 4], item response modeling differs from SEM in that (i) it models the actual response data rather than the covariance matrix among the variables, and hence, (ii) it allows a finer grain of interpretation and fit analysis. Among item response models, the MRCML model was chosen because, as a direct extension of the same models used to analyze a scale via the composite and consecutive approaches, the MRCML provides a consistency to the argument for similarities and differences between the three approaches.

Beyond the statistical rationale for a multidimensional approach, we believe that there are important interpretational differences as well, particularly for the behavioral sciences. Treating participant-reported data that are multidimensional in nature as if they represent a single dimension may misrepresent participants’ perceptions about their self-determination, for example. This is less of a danger when the dimensions in question have a moderately high positive correlation, as in the case of the autonomous motivation dimensions in our example; one dimension could represent all three. Nonetheless, we show that participants can have their perceptions or outcomes misrepresented, particularly if the scale includes dimensions that have small or negative correlations.

The data example used in this paper is a fairly simple approach to modeling both unidimensional and multidimensional self-determination. We ignore what is a possible violation of local item independence in the ‘bundling’ of items within a common prompt (i.e. because we ask the same exact questions for both diet and smoking). This analysis is not an attempt to draw inferences in any absolute sense about participant-reported self-determination or the use of the TSRQ. Rather, it is meant as an illustration of the ‘relative’ merits of a multidimensional approach compared with a composite or consecutive approach in the context of the behavioral sciences.

Acknowledgements

We would like to thank Louise C. Mâsse, formerly of the National Cancer Institute (NCI), now from the University of British Columbia, for organizing the project on which this work is based and for providing crucial guidance throughout the writing of the paper. We thank Geoff Williams for directing us in interpreting the scale and its intent. Support for this project was provided by NCI (Contract No. 263-MQ-31958). We thank the BCC for providing the data that were used in the analyses. Any errors or omissions are, of course, solely the responsibility of the authors.

References

1. Williams GC, Grow VM, Freedman ZR et al. Motivational predictors of weight loss and weight-loss maintenance. J Pers Soc Psychol 1996; 70: 115–26.
2. Ory MG, Jordan PJ, Bazzarre T. The Behavior Change Consortium: setting the stage for a new century of health behavior-change research. Health Educ Res 2002; 17: 500–11.
3. Wilson M, Allen DD, Li JC. Improving measurement in health education and health behavior research using item response modeling: introducing item response modeling. Health Educ Res 2006; 21(Suppl 1): i4–i18.
4. Wilson M, Allen DD, Li JC. Improving measurement in health education and health behavior research using item response modeling: comparison with the classical test theory approach. Health Educ Res 2006; 21(Suppl 1): i19–i32.
5. Williams GC, Minicucci DS, Kouides RW et al. Self-determination, smoking, diet and health. Health Educ Res 2002; 17: 512–21.
6. Williams GC, Deci EL. Internalization of biopsychosocial values by medical students: a test of self-determination theory. J Pers Soc Psychol 1996; 70: 767–79.
7. Reckase MD. The difficulty of test items that measure more than one ability. Appl Psychol Meas 1985; 9: 401–12.
8. Reckase MD, McKinley RL. The discriminating power of items that measure more than one dimension. Appl Psychol Meas 1991; 15: 361–73.
9. Ackerman T. A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. J Educ Meas 1992; 29: 67–91.
10. Ackerman T. Using multidimensional item response theory to understand what items and tests are measuring. Appl Meas Educ 1994; 7: 255–78.
11. Embretson SE. A multidimensional latent trait model for measuring learning and change. Psychometrika 1991; 56: 495–515.
12. Adams RJ, Wilson M, Wang W. The multidimensional random coefficients multinomial logit model. Appl Psychol Meas 1997; 21: 1–23.
13. Wang W-c, Wilson M, Adams RJ. Rasch models for multidimensionality between items and within items. In: Wilson M, Engelhard G (eds). Objective Measurement: Theory into Practice, vol. IV. Norwood, NJ: Ablex, 1997, 139–55.
14. Briggs DC, Wilson M. An introduction to multidimensional measurement using Rasch models. J Appl Meas 2003; 4: 87–100.
15. Baranowski T, Allen DD, Mâsse L et al. Does participation in an intervention affect responses on self-report questionnaires? Health Educ Res 2006; 21(Suppl 1): i98–i109.
16. Wright BD, Masters GN. Rating Scale Analysis. Chicago, IL: MESA Press, 1982.
17. Wu ML, Adams RJ, Wilson MR. ACER ConQuest: Generalised Item Response Modelling Software [computer program]. Hawthorn, Australia: ACER (Australian Council for Educational Research) Press, 1998.
18. Dempster AP, Laird NM, Rubin DB. Maximum likelihood estimation from incomplete data via the EM algorithm. J R Stat Soc 1977; 39: 1–38.
19. Bock RD, Aitkin M. Marginal maximum likelihood estimation of item parameters: an application of the EM algorithm. Psychometrika 1981; 46: 443–59.
20. Mislevy RJ, Beaton AE, Kaplan B et al. Estimating population characteristics from sparse matrix samples of item responses. J Educ Meas 1992; 29: 133–61.
21. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 1974; 19: 716–23.

Received on October 20, 2005; accepted on July 19, 2006