
IRT Analysis of the CRPBI

Running head: IRT ANALYSIS OF THE CRPBI

Evaluation of a Measure of Parent Behavior: An Item Response Theory Approach to

Dimensionality and Informant Agreement

by

Karen Sova

Mentor: Judy Garber, Ph.D.

Thesis completed in partial fulfillment of the requirements of the Honors Program in

Psychological Sciences in Peabody College, Vanderbilt University

April 2016

Nashville, TN

Abstract

The Children's Report of Parent Behavior Inventory (CRPBI) is a widely used three-dimensional

psychometric measure that assesses parenting behaviors as reported by children and parents. Parent-

child dyads tend to report discrepant information about parenting, which has been of great

interest to clinical and developmental psychologists, as well as psychometricians. Researchers

agree that construct disagreement within families can be meaningful, though measurement error

and interpretative limitations can obscure this information. Using item response theory (IRT),

this paper produced a shortened two-dimensional version of the CRPBI for a study of parent-

child disagreement, which then was analyzed for dimensionality and differential item functioning

between children and parents in a sample of sixth graders and their mothers. Methods included

exploratory factor analysis, principal components analysis, unidimensional and multidimensional

item response modeling, and differential item functioning analysis. The results suggest that the CRPBI is not

group invariant across mothers and children and that its dimensionality might vary

across different populations.



Introduction

The overall aim of the current study was to evaluate a commonly used parenting measure,

the Children's Report of Parent Behavior Inventory (CRPBI), using item response theory (IRT),

which is a model-based framework for the evaluation of psychometric measures. In particular,

two questions were addressed: (a) what is the dimensionality of the CRPBI, and (b) what is the

comparability of parallel child and mother responses on the CRPBI?

Measures of Parenting

Parenting behaviors have been found to be associated with psychopathology in both

parents and children. Therefore, it is important to have measures of parenting that are

psychometrically sound. Many different parenting measures exist and have been used in studies

of the relation between parenting and psychopathology in parents and children. In a meta-

analysis of studies of the association between parenting and anxiety and depression in youth,

Gerlsma et al. (1990) identified fourteen “factor analytically derived instruments measuring

retrospective perceptions of parenting.” Wilson and Durbin (2010) conducted a meta-analysis of

studies exploring the relation of depression to fathers’ parenting behaviors and identified fifteen

distinct self-report measures of parenting. In another study of parenting in relation to

internalizing symptoms in adolescents, Yap et al. (2014) identified nineteen measures of

parenting, of which the Children’s Report of Parent Behavior Inventory (CRPBI; Schaefer, 1965)

was one of the most frequently used. Hurley et al. (2014) conducted the only study, to date, that

focused specifically on the psychometric properties of parenting measures, and found 164

measures of parenting, of which only 25 had some psychometric information.

Three studies have applied IRT to measures of parenting. In a sample of parents of

children with autism spectrum disorders, Zaidman-Zait et al. (2010) used IRT to evaluate the

Parenting Stress Index-Short Form, which measures the quality of parenting under conditions of

stress. Lorber et al. (2014) evaluated the Parenting Scale, specifically the Over-reactivity and

Laxness subscales, in a sample of parents of 3- to 7-year-old children. Chen et al. (2015)

examined the performance of the Vegetable parenting practices scale in a sample of parents of

children ages three to five years old. Thus, parenting measures have only been studied with IRT

narrowly, both in terms of the scope of parenting behaviors and the age of the children.

Moreover, children’s responses to parenting items have not been evaluated alongside parents’

responses, even though IRT provides a well-developed framework for such comparisons.

Children’s Report of Parent Behavior Inventory

The first version of the CRPBI had 26 subscales with 10 items each (Schaefer, 1965a).

Using factor analysis, Schaefer (1965b) identified three orthogonal factors in the CRPBI:

acceptance vs. rejection, psychological autonomy vs. psychological control, and firm vs. lax

control (from here forward, psychological autonomy vs. psychological control is referred to as

psychological control and firm vs. lax control is referred to as firm control). Schludermann and

Schludermann (1970) replicated Schaefer’s factor analysis with a sample of college students and

shortened the measure to 108 items. Margolies and Weintraub (1977) evaluated a 56-item

version of the CRPBI, which had one- and five-week test-retest reliabilities (r = .66 to .93, p

< .001) and reproduced the same factor structure of previous, longer versions of the CRPBI. In a

meta-analysis evaluating the psychometric properties of and congruence between parents’ and

children’s reports of parenting using the CRPBI, Korelitz and Garber (2016) found that 1020

studies have administered the CRPBI to a parent, child, or both. The CRPBI has been variously

adjusted to range from 30 to 260 items with typically 2 to 5 point Likert-type anchors, resulting

in inconsistent formats and hence difficult-to-generalize reports of reliability and validity. No

IRT analysis of the CRPBI has been conducted to date.

Dimensionality of the CRPBI

The dimensionality of the CRPBI has been debated and modified in a few studies (Cross,

1969; Kawash & Clewes, 1988; Raskin et al., 1971; Schmidt, 1969). Both Schaefer (1965a) and

Schludermann and Schludermann (1970) proposed that the three original dimensions were

orthogonal, which justifies the separate scoring of subscales. In a sample of undergraduates,

Cross (1969) confirmed the three dimensions.¹ In a sample of fifth, sixth, ninth, and tenth

graders, Kawash and Clewes (1988) found that a five-factor solution might be more appropriate

for the CRPBI; they used a non-orthogonal rotation to allow for suspected

obliquity among dimensions, which orthogonal rotations can suppress. They

suggested expanding both "controlling" dimensions to two factors each. In a sample of adults

hospitalized for depression and normal adults, Raskin et al. (1971) replicated the three-factor

structure.

In a sample of fifth and sixth graders, Armentrout (1970) reported intercorrelations

among the three dimensions, showing that acceptance vs. rejection correlated with psychological

control and that psychological control correlated with firm control (but acceptance vs. rejection and firm

vs. lax control did not), correlations that should not exist if the dimensions are indeed

orthogonal. Armentrout (1970) argued that younger children cannot differentiate between

different types of parenting as well as older children and hence the CRPBI might require

different subscales and scoring based on the age of respondents. Specifically, he suggested a

two-dimensional ("Love-Hostility" and "Control-Autonomy") measurement model for children

younger than seventh grade. Armentrout and Burger (1972) administered the CRPBI to fourth
¹ "Dimensions," "factor," and "component" will be used interchangeably in this paper.

through eighth graders and replicated the three-factor structure. They also showed that between

fourth and eighth grade, scores on the acceptance vs. rejection and psychological control dimensions

generally decreased, whereas firm control decreased from grade four to six and then

increased from grade six to eight, suggesting that the CRPBI might require a different

dimensional and scoring structure for different aged children. In all studies, firm control was the

tertiary factor and the one that received the least substantive interest.

Parent and Child Agreement

In the current study, we also explored the extent of parent-child agreement in terms of

scores and item characteristics. Non-agreement between parents and offspring on parallel

measures of child or parent behaviors or attitudes has resulted in considerable debate among

developmental psychologists and psychometricians, particularly regarding whether these

discrepancies constitute measurement error or reflect substantive differences (Achenbach, 2006,

2011; De Los Reyes, 2011; Guion, Mrug, & Windle, 2009). Historically, discrepancies were

most often interpreted as measurement error. Over the past decade, however, studies have shown

that parent and child psychopathology can be related to between-informant discrepancies and

thus, multiple informants can provide meaningfully different perspectives (Berg-Nielsen et al.,

2003; Carlton-Ford et al., 1991; Edelbrock et al., 1985; Jessop, 1982). De Los Reyes and

colleagues (2010) suggested that parent-adolescent discrepancies on reports of parental

monitoring might prospectively predict child delinquency. Moreover, parenting is a significant

and modifiable predictor of psychopathology in youth (Kouros & Garber, 2014; McLeod et al.,

2007); therefore, it is critical for researchers to rigorously evaluate measures of children’s and

parents’ perceptions about parenting. Schwarz et al. (1985) compared the CRPBI scores of first-

year undergraduates with their siblings, mothers, and fathers' ratings of their parenting behaviors.

They found moderate to high internal consistency within informants, but low correlations

between raters. For example, children perceived their mothers' and fathers' parenting to be more

alike than mothers and fathers rated each other.

Item Response Theory

Item response theory (IRT) “is a model-based measurement in which trait level estimates

depend on both participants’ responses and the properties of the items” (Embretson & Reise,

2000, p. 13). IRT uses probabilistic equations to quantify item characteristics, such as difficulty,

discrimination, or guessing based on responses from samples. These item characteristics are also

called parameters. In IRT, respondents are typically referred to as “persons.” IRT models can

also estimate parameters for each person, which can be interpreted as that person’s “trait level.”

In this context, “trait” refers to a latent construct that can be measured psychometrically. This

does not limit IRT to “pen and paper” questionnaires. IRT also could be applied to data from

coded behavioral events or computerized tasks, if the data structure is appropriate.
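To make this concrete, a standard example of such a probabilistic equation is the two-parameter logistic (2PL) item response function for a dichotomous item, written here in generic textbook notation rather than as a formula taken from the present analyses:

```latex
P(X_{ij} = 1 \mid \theta_j) =
  \frac{\exp\left[\alpha_i(\theta_j - \beta_i)\right]}
       {1 + \exp\left[\alpha_i(\theta_j - \beta_i)\right]}
```

where θj is person j's trait level, αi is the discrimination of item i, and βi is its difficulty.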

IRT models can range from simple, well-understood models to highly complex models,

some of which are still in development, in order to accommodate a variety of types of responses,

data structures, and research questions. IRT models can be categorized in several ways, such as

one-, two-, three- or more parameters; nonparametric or parametric; unidimensional or multi-

dimensional; and exploratory (i.e., descriptive) or explanatory (i.e., confirmatory).

Nonparametric latent modeling procedures rely on no or few assumptions about the population

distribution, whereas parametric models are based on assumptions about the population

distribution from which a sample is drawn, commonly the normal distribution.

Unidimensional IRT models assume that only one latent construct is sufficient to explain

the variance in responses between and within persons. Multidimensional (MIRT) models are

used when a construct has two or more dimensions; that is, that more than one dimension is

necessary to explain why a person would endorse a specific answer. MIRT models can be

defined such that each item contributes to only one dimension (between item, also referred to as

“explanatory”) or to multiple dimensions where the delineation between subscales is unknown or

unclear (within item, also referred to as “exploratory”).

One special case in the family of MIRT models is the bi-factor model, in which one overarching

trait exists, but the results benefit from modeling additional sub-traits to explain responses. To

understand the difference between exploratory and explanatory models in the context of MIRT

models, an analogy can be drawn between IRT and factor analysis. Like exploratory factor

analysis (EFA), exploratory IRT models aim to describe and tease out the dimensionality

underlying a set of data, whereas explanatory models aim to test the hypothesized number of

factors or dimensions and to specify each item to a single dimension. Lastly, persons and items

can be modeled as multi-group or multi-level. Multi-group models define subsets of persons or

items based on a grouping variable, whereas multi-level models accommodate hierarchical data

structures, such as classrooms within schools within counties.

A useful method in IRT is differential item functioning (DIF), which measures bias in

item difficulty between either manifest or latent groups of persons. DIF can take into account

grouping variables and shows whether item difficulty is independent of membership in a given

group. DIF does not detect group differences; rather DIF specifically measures bias in difficulty

across groups that are matched on their trait levels.

This study used three types of models: (1) three separate unidimensional two-parameter

logistic graded response models (Figures 4 and 5) (Samejima, 1969), (2) an exploratory two-

dimensional Rasch model (Figure 6) (Briggs & Wilson, 2003), and (3) a two-dimensional,

two-group confirmatory Rasch model to determine if responses to the CRPBI should be

interpreted as a total, as subscales, or as multi-dimensional (Figure 7) (Suh and Cho, 2014).

Advantages of IRT

A major advantage of IRT is group invariance, which allows for measures to be

compared between nonequivalent samples after a linear transformation. Baker (2000) showed

that “one of the interesting features of item response theory is that the item parameters are not

dependent upon the ability level of the examinees responding to the item,” (p. 51) meaning that

the item parameters are "group invariant." Nevertheless, Baker (2000) warned that "the obtained

numerical values will be subject to variation due to sample size, how well-structured the data are,

and the goodness-of-fit of the curve to the data” (p. 54). Moreover, for item estimates to be group

invariant, items have to measure the same construct between samples.

Embretson (1996) proposed several additional ways in which IRT surpasses classical test

theory (CTT): (1) In CTT, items share the same standard error of measurement, which cannot be

generalized to the population. In IRT, items have unique standard errors that generalize to the

population. (2) CTT proposes that longer tests are more reliable, whereas in IRT shorter tests can

be more reliable. (3) In CTT, to compare respondents on two measures, the measures must be

strictly parallel; in IRT differences between comparable tests can be modeled and thus readily

compared even if differences exist between the two tests. (4) In CTT, a person’s relative trait

level is achieved against a score distribution; in IRT, a person’s relative trait level is measured

against an item: persons and items can be located along the same latent trait continuum.

IRT and Social Sciences Research

In the past few decades, IRT experts have made a clear case for the use of IRT over CTT

in the social sciences (Embretson, 1996). Reise and Waller (2009) noted that personality and

psychopathology measurement lagged in its use of IRT behind other fields, such as scholastic

aptitude, professional licensure, or health outcomes measurement, and concluded that the use of

IRT was rather the “exception and not the rule” in personality and psychopathology

measurement, although it has been increasing exponentially in the past few years. The current

study was a step toward expanding the use of IRT in the study of human development and

the social sciences.

Study Aims

The primary aim of the current study was to fit an IRT model to the CRPBI scores in

order to (1) obtain a more thorough evaluation of the psychometric properties of the measure, (2)

better understand its dimensionality by choosing the best fitting model (unidimensional, bi-

factor, or MIRT), and (3) explore the comparability of children’s and mothers’ responses. The

ultimate aim of this analysis is to construct a revised and psychometrically improved version of

the CRPBI to be used in future studies that aim to isolate measurement error from discrepant

perceptions about parenting as reported by children and mothers.

Method

Participants

Participants were 240 mother-child dyads; 77.1% of mothers had histories of depressive

disorders (e.g., Major Depressive Disorder; Dysthymia); the remaining mothers were lifetime

free of psychopathology. All children were enrolled in 6th grade (M = 11.86 years old, SD = 0.56,

54% female). The child sample was 81.5% European American, 14.8% African American, and

the remaining 3.7% were Hispanic, Native American, or reported “Other.” Families were

predominantly working class (e.g., nurse’s aide, sales clerk) to middle class (e.g., store manager,

teacher) with a mean socioeconomic status (Hollingshead, 1975) of 41.67 (SD = 13.29).

Procedure

Over 3,500 families of children in 5th grade from metropolitan public schools were

invited to participate in a study about parents and children. Of the 1,495 families interested in

participating, the 587 mothers who had endorsed either a history of depression, use of

antidepressants, or no history of psychopathology were interviewed further by telephone. Of

these 587 screened, 238 families were excluded because the mothers did not indicate sufficient

symptoms to meet criteria for a depressive disorder (38%), had other psychiatric disorders that

did not also include a depressive disorder (19%), were no longer interested (21%), they or the

target child had a serious medical condition (14%), the target child was in a different grade (6%),

or the family had moved out of the area (2%). Of the 349 mothers interviewed, 109 were

excluded because the mother indicated a history of a psychiatric diagnosis that did not also

include a mood disorder or the mother or child had a serious medical condition. The final sample

of 240 families consisted of 185 mothers who had histories of depressive disorders (e.g., Major

Depressive Disorder; Dysthymia) and 55 mothers who were lifetime free of psychopathology.

All study procedures were approved by the institutional review board for the protection of human

subjects. Mothers provided informed consent, and children signed an assent form. Mothers and

youth were compensated for their time.

Measure

The Children's Report of Parent Behavior Inventory (CRPBI) contains 108 items about

parents’ child rearing behaviors and includes 18 subscales representing three dimensions:

acceptance/rejection—the extent to which the parent expresses care and affection for the child;

autonomy/psychological control—the extent to which the parent controls the child through

indirect psychological methods such as inducing guilt, instilling anxiety, and/or withdrawing

love; and firm/lax control—the extent to which the parent consistently enforces compliance by

making rules or threatening punishment. See Appendix A for a complete list of the items. Youth

rated the degree of similarity between each behavior described and their mother, using a 3-point scale (0

= like, 1 = somewhat like, or 2 = not like). Mothers used this 3-point scale to rate how similar

they were to the behaviors described.

The current study focused on two of the three dimensions – acceptance vs. rejection and

psychological autonomy vs. psychological control for the following reasons: (1) These items are

of particular theoretical interest to many researchers, (2) firm control has been the least

pronounced factor and thus the most dispensable given the aim to shorten the CRPBI, and (3) the sample

size was too small to reasonably estimate the number of parameters required for 108 items.

Design

To ensure a systematic analysis, steps were outlined beforehand. Nevertheless, these

steps were mainly guiding, as certain revisions and reconsiderations were necessary along the

way. Steps are outlined and labeled in the results section and summarized in Figure 1.

Software

Data were organized and stored with Microsoft Excel. SPSS Statistics (v. 20) was used to

conduct descriptive analyses, correlational analyses, t-tests, and EFA. Models were estimated in

Mplus (Muthén & Muthén, 2012, v. 7.31). DIF analyses were run in SAS 9.4. Software code was

provided in the Vanderbilt University course “PSY 8881: Item Response Theory II,” taught by

Sun-Joo Cho, Ph.D.

Results

Step 1

Inspect raw data. The raw data showed that four items were accidentally omitted during

data collection (items 64 through 67), apparently due to a photocopying error. Of the 240 dyads,

twelve were found to have responses for either the mother or child, but not both; these cases

were deleted for better comparability between the groups, leaving 228 dyads to be included in

the analyses presented here. The child and mother data had six and thirteen missing data points,

respectively, seemingly at random, likely due to inattention.

Descriptive analyses. Table 1 shows response frequency by each item; that is, the number

of persons that endorsed each anchor category by item. Table 2 shows mean total scores and

mean scores by factor. (The acceptance vs. rejection factor was broken down by acceptance and

rejection to avoid reverse coding given the lack of confirmation about their orthogonality). Mean

total scores were 1.92 and 1.80 for children and mothers, respectively. Both children and mothers

scored highest on the acceptance factor (child = 2.57, mother = 2.20), followed by psychological

control (child = 1.80, mother = 1.85). Children scored second to lowest on the firm control factor

(1.74) and lowest on the rejection factor (1.35), whereas mothers scored second to lowest on the

rejection factor (1.62) and lowest on the firm control factor (1.48).

Table 3 shows inter-correlations between total and factor scores between and within

groups. Correlations between children and mothers show agreement; correlations within groups

are relevant to dimensionality. In total, child and mother responses were weakly but significantly

correlated, r = .23, p < .01. Between children and mothers, the acceptance, rejection, firm

control, and psychological control factors were also weakly but significantly correlated, r

= .20, .29, .21, and .29, all p < .01.

Within the child factors, all were significantly correlated with each other; acceptance

correlated negatively with the rest. Acceptance was strongly, negatively correlated with rejection

(r = -.69, p < .01) and weakly, negatively correlated with firm control and psychological control

(r = -.14 and -.15, respectively, both p < .05). Rejection was moderately correlated with firm

control and psychological control (r = .52 and .54, respectively, both p < .01). Firm control and

psychological control were also strongly correlated, r = .63, p < .01. Within the mother factors,

all were significantly correlated with each other, except acceptance and firm control, r = .04, ns.

Acceptance was weakly, negatively correlated with rejection (r = -.14, p < .05) and weakly

positively correlated with psychological control, r = .20, ns. Firm control was moderately

correlated with rejection and psychological control, r = .59 and .66, respectively, both p < .01.

Psychological control and rejection were moderately correlated, r = .56, p < .01. The different

correlational patterns between factors within children and mothers suggest that their

dimensional structure might be different. These patterns replicate Armentrout's (1970) findings

that the dimensions are oblique rather than orthogonal.

EFA with 104 items. To evaluate the inter-rater structural reliability of the CRPBI, EFA

results from the present data were compared with Schludermann and Schludermann’s (1970)

analysis.² As in their analysis, we used the principal axis factoring method and orthogonal

Varimax rotation to analyze the present sample. Missing data were excluded listwise.

Table 4 shows EFA results as percentages of variance accounted for by factors. The

“Parallel Analysis” column presents “percentage of variance accounted for” from simulated

eigenvalues from randomly generated correlation matrices. The online software from which the

percentages were generated was developed for the purpose of comparing more precise values

than the “Eigenvalue > 1” rule of thumb against real datasets (Patil et al., 2007). Eigenvalues, or
² Schludermann and Schludermann (1970) presented data from two undergraduate samples,
broken down by gender and by whether students reported on their father's or mother's parenting
behaviors, resulting in eight distinct EFA results. Only EFA results from reports on mothers are
included here. From the four groups reporting on mothers, a mean of the proportion of variance
accounted for by each factor was calculated, weighted by gender sample size.

percentages of variance, that exceed the randomly generated ones can be seen as indicating

meaningful factors. The three main factors in Schludermann and Schludermann's (1970) data set

accounted for 67.83% of the variance, but for only 45.22% in the present child

dataset and 39.99% in the mother dataset. This difference is concerning in terms of applying

Schludermann and Schludermann’s (1970) factors or subscales to the present data. The firm

control factor items were removed following this step.
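For readers who want to reproduce this kind of comparison, the sketch below implements Horn's parallel analysis using NumPy only; the number of simulated datasets and the random-data generation scheme are illustrative assumptions, not the settings of the online calculator (Patil et al., 2007) used in this study.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Compare observed correlation-matrix eigenvalues against eigenvalues
    from random normal data of the same size (Horn's parallel analysis)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        sim = rng.standard_normal((n, p))
        random_eigs[s] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    threshold = random_eigs.mean(axis=0)
    # Retain factors only while the observed eigenvalue exceeds the mean
    # eigenvalue obtained from random data of the same dimensions.
    exceeds = observed > threshold
    n_retained = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, threshold, n_retained

# Illustration with fabricated 0-2 responses (228 persons x 74 items):
fake = np.random.default_rng(1).integers(0, 3, size=(228, 74)).astype(float)
obs, thresh, k = parallel_analysis(fake)
print(k, obs[:3].round(2), thresh[:3].round(2))
```

Applied to the real child and mother datasets, this comparison corresponds to the "Parallel Analysis" column of Table 4.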

Step 2: Principal Components Analysis to explore the dimensionality of the 74-item CRPBI

The purpose of this principal components analysis (PCA) in this step was to decide which

type of item response model to use for the 74-item CRPBI dataset. PCA was chosen over EFA

because it is theoretically unclear at this point whether the items of the CRPBI vary together due to a latent

variable or whether responses to the items vary together based on an emergent variable. Moreover,

not all items were significantly correlated, which would violate an assumption of EFA, but not

PCA.

The principal components method, using Varimax rotation and suppressing coefficients

less than .30, was used (in this section, "factors" or "dimensions" are referred to as

“components” to be consistent with the method used). In the child dataset, the Kaiser-Meyer-

Olkin Measure of Sampling Adequacy was .88 and hence sampling was adequate (i.e. > .60);

Bartlett's Test of Sphericity suggested that factor analysis was appropriate for these data, χ2

(2701) = 8147.77, p < .01. In the mother data, the Kaiser-Meyer-Olkin Measure of Sampling

Adequacy was .79 and hence sampling was adequate (i.e. > .60); the Bartlett's Test of Sphericity

suggested that factor analysis was appropriate for these data, χ2 (2701) = 6242.53, p < .01.
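As a rough illustration of where such a statistic comes from, Bartlett's test of sphericity can be computed directly from the item correlation matrix using the standard formula sketched below with NumPy and SciPy; this is not the SPSS routine used here, and note that the degrees of freedom, 2701, correspond to p(p - 1)/2 with p = 74 items.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(data):
    """Bartlett's test of sphericity: H0 is that the item correlation matrix
    is an identity matrix (i.e., the items are uncorrelated)."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)            # log determinant of R
    statistic = -((n - 1) - (2 * p + 5) / 6.0) * log_det
    df = p * (p - 1) / 2                            # 74 items -> 2701 df
    return statistic, df, chi2.sf(statistic, df)
```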

Table 5 shows the eigenvalues, percentage of variance accounted for, and cumulative

percentage for components one through ten by groups, and compared to randomly generated

results specified to this analysis. The child dataset eigenvalues do not meaningfully exceed the

randomly generated eigenvalues past the second component; thus the child dataset was

interpreted as two-dimensional. In the mother dataset, the eigenvalues for components one

through five exceeded the randomly generated results, though in components two to three only

by .03-.04. This difference was not sufficient to adopt a five-component structure over two with

the mother dataset, although it does raise questions about either differential dimensionality

and/or item functioning between groups. The scree plots presented in Figures 2 and 3 confirm the

adoption of a two factor structure. Hence, after removing the “firm control” items, the three

CRPBI dimensions were reduced to two, with the first and second components accounting for

slightly less variance after the reduction, suggesting that some firm control items contributed to

them somewhat.

Table 6 shows the rotated component matrix for the CRPBI reduced to 74 items, with

coefficients less than .30 not shown. Component 1 has positive and negative loadings and

component 2 has positive loadings only. Upon inspection, the positively loaded items in

component 1 more or less correspond to items posited as acceptance items by Schludermann and

Schludermann (1970); negative loadings on component 1 correspond more or less to rejection

items. Items loading onto component 2 more or less correspond to psychological control items.

Table 7 summarizes the number of items loaded onto rotated component 1, component 2, both,

or neither. In both groups, most items loaded onto only one component. However, in the child

dataset, 22% of the items loaded onto both components, and 5% loaded onto neither. In the

mother dataset, 11% loaded onto both components and 15% loaded onto neither.

Because we ultimately aimed to compare child and mother results, we needed to choose a

strategy to treat the discrepancies in item loadings on rotated components between groups. We

could delete all items that loaded differently between groups, resulting in considerable item loss,

or we could model group differences with a multi-group item response model, if dimensionality

of the groups was comparable enough. Additionally, a number of items loaded onto both

components, suggesting dimensional overlap between the two components. Hence, three initial

modeling approaches were chosen for comparison in determining the dimensionality of the

shortened CRPBI: (1) a unidimensional model (as a control), (2) two separate unidimensional

models, with one for each component, and (3) a two-dimensional model with all items

contributing to both components (within-item or exploratory design). Items that loaded onto no

component (in either the child or mother dataset) were deleted because they did not contribute to

either of the two underlying dimensions; if an item loaded onto a component in one group only, this

suggested too much differential item functioning, and hence such items were not useful to this analysis,

which aims to construct a measure that is comparable between children and mothers. Of note, the

component loading discrepancies do not yet reflect IRT treatment and might be minimized in the

IRT analysis. From here forward, "component 1" and the acceptance/rejection factor will be referred

to interchangeably, as will "component 2" and the psychological control factor.

To proceed, items not loading onto a component in either group were deleted from

further analyses, thirteen in total (items 4, 8, 20, 22, 27, 29, 34, 38, 48, 74, 90, 97, and 104),

resulting in a further reduced 61-item version of the CRPBI. Though some light has been shed on

the dimensionality of the reduced CRPBI, it is still unclear how the measure should be scored,

hence comparisons were made between several models. A two-parameter unidimensional model

was fit for comparison to ensure that modeling dimensionality in this dataset was statistically

worthwhile. One concern about doing this was the number of items negatively loaded onto

component 1: in unidimensional item response models, all items should be measuring one

underlying, continuous trait or dimension. The fact that items negatively loaded onto component

1 suggests that they should be reverse coded in order to be used in a unidimensional model;

however, they cannot be readily reverse coded because fourteen of these items also loaded

positively onto component 2. Thus, these items are expected to have negative discrimination

parameter estimates in the unidimensional model, which is acceptable at this stage of the analysis

given that reverse coding does not change the value (that is, magnitude) of the parameter

estimates or person scores, only their sign.
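A small numeric check of this point (the trait and item values below are simulated solely for illustration): reverse coding a 0-2 item flips the sign, but not the magnitude, of its association with the underlying trait.

```python
import numpy as np

rng = np.random.default_rng(0)
trait = rng.standard_normal(228)
# A hypothetical rejection-type item: higher trait -> lower raw response.
item = np.clip(np.round(1 - trait + rng.normal(0, 0.5, size=228)), 0, 2)
reversed_item = 2 - item                     # reverse code on a 0-2 scale

print(round(np.corrcoef(trait, item)[0, 1], 3))           # negative
print(round(np.corrcoef(trait, reversed_item)[0, 1], 3))  # same magnitude, positive
```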

We hypothesized that an exploratory MIRT would be most appropriate for these data

because it allows for each item to load on more than one dimension. The additional complexity

of a MIRT model is only useful insofar as the acceptance/rejection and psychological control

dimensions are better predicted when modeled together rather than separately, as is suggested by

the items with meaningful loadings on both components. To determine if the two components are

better understood as two separate subscales, a two-parameter unidimensional model was run for

each component alone, forcing each of the 61 items into a single component (same for both

datasets, based on child data). Where component loadings differed between groups or loaded

onto both components, the item was assigned according to the group with the larger coefficient,

which occurred with 18 items. The comparisons in Step 3 established if the remaining 61 items

should be considered to be one broad construct, two distinct constructs, or a single two-

dimensional construct. See Figures 4 through 7 for path diagrams of the hypothesized

dimensionality of the CRPBI in this study.

Step 3. Interpret and compare parameters and model fit/separation reliability between

the unidimensional and MIRT models



This step used two IRT models: the two-parameter unidimensional graded response

model (2PL) (Cohen et al., 1993; Samejima, 1969) and the exploratory multidimensional Rasch

model for two dimensions (MIRT) (Briggs & Wilson, 2003). Models were estimated using

Mplus (see Appendix B for software input).

The 2PL graded response model was chosen because it is used to analyze items with

ordinal responses or rating scales, such as a Likert-scale. The GRM is particularly appropriate for

the CRPBI because it uses the cumulative category response function (CCRF) to estimate

parameters, which is intended to model responses that are cumulative in nature, as are the

ordered 3-category ratings of parenting behavior used here (Samejima, 1969). The 2PL model estimates two

item parameters, discrimination (αi) and difficulty (βi). In the 2PL model, the discrimination

parameter can be understood as an indicator of how well an item differentiates between persons

based on their trait level; in other words, similar to EFA, if the trait were a factor, how well that

item “loads” onto the trait. If an item has a relatively high, positive αi, then this item

discriminates well between low and high trait level persons. If an item has αi close to zero,

respondents are endorsing the item without relation to their trait level. If an item has negative αi, then

the item is functioning the opposite of expected (i.e. endorsing “3” when “1” is expected based

on their other responses). For the difficulty parameter, high values indicate higher levels of the

latent trait necessary to endorse a given response and lower values indicate lower levels of the

latent trait necessary to endorse a given response, which follows for MIRT as well.
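To make the cumulative logic of the graded response model concrete, the sketch below computes category probabilities for a single 3-category item under a 2PL GRM; the discrimination and threshold values are invented for illustration and are not estimates from this study.

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for Samejima's graded response model.
    theta: person trait level; a: item discrimination; thresholds: ordered
    difficulty values, one fewer than the number of response categories."""
    b = np.asarray(thresholds, dtype=float)
    cumulative = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # P(X >= k), k = 1..K-1
    upper = np.concatenate(([1.0], cumulative))           # P(X >= 0) = 1
    lower = np.concatenate((cumulative, [0.0]))           # P(X >= K) = 0
    return upper - lower                                  # P(X = k) by differencing

# Invented parameters for a 3-category item (a = 1.2, thresholds = -0.5 and 0.8):
for theta in (-1.0, 0.0, 1.0):
    print(theta, grm_category_probs(theta, a=1.2, thresholds=[-0.5, 0.8]).round(3))
```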

The MIRT model used here was chosen because it allows for all items to load freely onto

one or more of the dimensions (within item design), which is why it is termed “exploratory.”

This MIRT model estimated two latent trait item parameters θjAR and θjPC (one for each

dimension) and one item location parameter βi. In MIRT, the latent trait item parameters (θj) are

almost perfectly analogous to EFA loadings for each dimension, in other words, how well that

item discriminates between persons within a given dimension. The MIRT latent trait item

parameters are interpreted the same way as the 2PL item discrimination parameters, as is the

difficulty parameter. Thus, the item discrimination parameters in the 2PL model and the latent

trait item estimates in the MIRT models can be compared to determine differences in item

functioning and dimensionality between models and groups. For the purpose of clarity, the latent

trait item parameters from the MIRT model also are referred to as “discrimination.” The standard

error of item parameter estimates shows within what range a parameter estimate will reliably

predict a person’s response, and the significance indicates how statistically meaningful a

parameter estimate is. For additional comparisons, IRT person scores were calculated from all

models, which were mean scores weighted by item functioning, as estimated by the item

parameters.
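For reference, a common compensatory form of a two-dimensional item response function, written here for a dichotomous response to keep the notation simple (the Rasch-family model of Briggs and Wilson (2003) additionally constrains the slopes), is:

```latex
P(X_{ij} = 1 \mid \theta_{jAR}, \theta_{jPC}) =
  \frac{\exp\left(a_{iAR}\,\theta_{jAR} + a_{iPC}\,\theta_{jPC} - \beta_i\right)}
       {1 + \exp\left(a_{iAR}\,\theta_{jAR} + a_{iPC}\,\theta_{jPC} - \beta_i\right)}
```

so each item has one location parameter βi and a slope on each dimension, paralleling the parameters described above.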

Model fit. Model fit was not comparable between 2PL and MIRT models because they do

not use the same indices. The two separate 2PL models combined showed better fit results on all

indices than the 61-item 2PL model, although these indices (log likelihood, AIC, and BIC) are

only comparative and thus are not reported here; they are available in Appendix C. For the MIRT

models, we failed to reject the hypothesis of close fit for the child and mother datasets, both

RMSEA = .02, p = 1.00.

Item parameter estimates. Tables 9 and 10 show the discrimination parameter estimate

results for each model for the child and mother data sets, respectively. About half of the

discrimination parameters in the 2PL model in both data sets were negative. Negative

discrimination was not significantly correlated with items being negatively worded or “double-

barreled” (using indicator variables) (Gehlbach, 2015), suggesting multidimensionality, as



confirmed by the EFA, or that reverse coding was needed. In a unidimensional model applied to

multidimensional data (thus violating an assumption), the items related to the dominant factor

will have positive discrimination parameter estimates and the secondary dimension item

discrimination parameter estimates will be negative. This selective dominance is apparent when

comparing the parameter estimates of the child and mother 2PL models, in which the signs of the

estimates are essentially flipped, suggesting, again, differential structure between the two. In the

2PL model, four item parameter estimates were not significant, i.e. not meaningfully contributing

to the model. In unidimensional models with underlying multidimensionality, the dominant

factor will suppress the significance of items unrelated to that trait, thus these items are probably

psychological control items.

In both the child and mother data, the combined 2PL acceptance/rejection and

psychological control models had ten items with negative discrimination (all acceptance/

rejection model items) and none in the psychological control model, suggesting that those items

should be reverse coded if this model is adopted; all item parameter estimates were significant.

Of note, when compared with the MIRT model, a few items were not assigned to the best model

or were ambiguous, although this did not seem to impact the model overall or at the item level.

In the child dataset MIRT model, all item parameter estimates that were significant and

also negative were more or less the same ones as those that were significant and negative in the

combined 2PL model. Because some of the negatively discriminating items in the acceptance/

rejection (AR) dimension also contributed to the psychological control (PC) dimension, caution

should be exercised when selecting these items for reverse coding during manual scoring, if

applicable. Specifically, only items with negative parameter estimates that are significantly

greater in the AR dimension than in the PC dimension should be interpreted as strictly “rejection

items” and be reverse coded. In this model, the following items could be reverse coded to

eliminate negative discrimination: 2, 7, 13, 23, 49, 70, 82, 85, 94, and 100, although this likely

would lead to negative discrimination in the PC dimension.

Item Separation Coefficient and Separation Index. In IRT, the item separation coefficient

is the ratio of observed or “true” standard deviation to the estimated or model standard error,

which can be understood as a "signal-to-noise ratio," or how well the observed variance is being

explained by the modeled variance. Both were calculated from the discrimination parameter

estimates and their standard errors. The item separation index is a measure of reliability of the

model (i.e., not the data), which is calculated from the separation coefficient, and indicates the

reliability of the item parameter estimates. High item separation reliability is likely to occur with

a range of item difficulties and large sample size. Table 10 shows the item separation coefficient

and separation index for each model. The AR and PC combined 2PL model showed the worst

ratio of observed to estimated variance (child = .84, mother = .68) and worst parameter estimate

reliability (child = .41, mother = .31), whereas MIRT showed the best ratio of observed to

estimated variance (child = .98, mother = .98) and best parameter estimate reliability (child

= .49, mother = .49). In both datasets, the MIRT standard error accounted for 98% of the

observed standard deviation. The considerably lower indices from the mother dataset are likely

due to item-dimension matching based on child data in the combined 2PL model, again showing

differential structure between groups. Interestingly, the 2PL model had a better ratio of observed

to estimated variance and parameter estimate reliability than the combined AR and PC 2PL

models, suggesting that two separate dimensions (i.e., subscales) represent the construct less well

than one single construct, making the case against subscales. Based on these results, the MIRT

model is recommended for these items.
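One standard way to compute these quantities, consistent with the reliability values reported above (e.g., a separation coefficient of .98 converts to a reliability of .49), is sketched below; the exact computation used in this study may differ in its details.

```python
import numpy as np

def separation_stats(estimates, standard_errors):
    """Separation coefficient G and separation (reliability) index for a set of
    parameter estimates: G is the error-adjusted ("true") SD of the estimates
    divided by their root mean square error; reliability = G^2 / (1 + G^2)."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    error_var = np.mean(se ** 2)
    true_var = max(est.var(ddof=1) - error_var, 0.0)
    g = np.sqrt(true_var / error_var)
    return g, g ** 2 / (1.0 + g ** 2)

# The conversion applied to the coefficients reported above:
for g in (0.84, 0.68, 0.98):
    print(g, round(g ** 2 / (1 + g ** 2), 2))  # close to .41, .31, and .49 (up to rounding)
```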



Correlations of item discrimination parameter estimates between models and groups are

shown in Table 11. All correlations of primary factor parameter estimates between models were

significant, though mostly moderately sized. These results suggest that the models function

similarly but the correlations are far enough from perfect to justify adding precision with the

more complex MIRT. Table 12 shows the correlations between person scores between the 2PL

and MIRT model. The AR and PC combined 2PL model was not included in Table 12 because

person scores from these cannot be readily compared with the 2PL and MIRT model because

each measures a different construct, nor can they be easily combined into one set of person

scores. As in Table 11 showing item parameter estimate comparisons, all person scores were

significantly correlated but to varying degrees between models and groups, again suggesting that

the scores are not so similar as to make the more complex model unnecessary. Therefore, the

MIRT model was chosen to compare the CRPBI dimensional structure and item parameters

between children and mothers.

Step 4. Differences between the child and mother MIRT models: GDIF

"Measurement invariance occurs when the parameters of a measurement model are

statistically equivalent across two or more groups" (Bowen, 2014; p. 1). Measurement non-

invariance at the item level is typically called DIF and measurement non-invariance at the

instrument level can be called global DIF (GDIF) (Suh & Cho, 2014). To validly compare

measurement data between two distinct groups, measurement invariance should be established,

as in our case of comparing children and mothers; otherwise the comparison will be

inherently limited.

First, we observed a general pattern of similarity in parameters between groups (steps 2

and 3). Next, we tested for group invariance between groups by using the Chi-Square difference

test of model fit between three models, (1) a configural invariance model, (2) a weak invariance

model, and (3) a strong invariance model, using an Mplus function specifically designed for

testing group invariance (Bowen, 2014). In the configural invariance model, all item parameters

are estimated freely with the same factor structure across groups. In the weak invariance model,

(i.e., the metric model), the discrimination parameters are held constant between groups. In the

strong invariance model, also called the scalar model, discrimination and difficulty parameters

are kept constant between groups. Thus, these models increase in the amount of group invariance

required from the data to fit the model. Equivalent model fit across models suggests group

invariance. If there is a significant difference in fit between the configural and weak model, then

the discrimination parameters are not group invariant (i.e., nonuniform DIF in items). If there is a

significant difference in fit between the weak and strong model, then both discrimination and

difficulty parameters are not group invariant (i.e., uniform DIF in items). In other words, if the

data fit the strongly invariant model well, then the instrument is group invariant. The Chi-square

test of difference assumes a null hypothesis of no difference, thus a significant Chi-square test of

difference results between two of these models shows non-invariance.

To enable this test, a confirmatory multidimensional model must be used (or else the

models will not converge; see Figure 7 for model path diagram); that is, each item contributes to

only one factor. Items were assigned to either factor based on the MIRT results. Thirty-five items

were assigned to the AR factor (items 1, 2, 7, 10, 13, 14, 15, 17, 23, 31, 35, 36, 40, 43, 50, 51,

53, 57, 59, 60, 63, 69, 71, 76, 82, 83, 84, 85, 86, 87, 93, 96, 99, 100 and 106) and twenty-six

items were assigned to the PC factor (items 11, 18, 25, 28, 33, 39, 41, 44, 45, 52, 54, 56, 61, 70,

73, 78, 79, 81, 88, 89, 91, 92, 94, 103 and 107). The separation reliability of the discrimination

parameter estimates from the observed data was .45 and .44 in the child and mother data,

respectively, thus slightly worse than with the MIRT.

Table 12 shows the results of the Chi-square test of difference in model fit between the

three models. The configural model vs. weak invariance model comparison indicates that

constraining discrimination parameters significantly worsened fit, χ2 (58) = 275.70, p <.001. The

weak invariance model vs. the strong invariance model showed that constraining the difficulty

parameter estimates across groups also worsened fit, χ2 (118) = 1194.09, p <.001. With this

sample, the CRPBI shows uniform DIF, that is, differential functioning of both discrimination

and difficulty parameter estimates across groups.
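The p-values for these difference tests can be verified from the chi-square distribution; a minimal check with SciPy follows. Note that when a robust weighted least squares estimator is used, Mplus obtains the difference test through its DIFFTEST procedure rather than a raw subtraction, so this computation is conceptual only.

```python
from scipy.stats import chi2

# Chi-square differences reported in Step 4:
#   configural vs. weak invariance (discrimination constrained), and
#   weak vs. strong invariance (difficulty also constrained).
for label, delta_chisq, delta_df in (("configural vs. weak", 275.70, 58),
                                     ("weak vs. strong", 1194.09, 118)):
    p = chi2.sf(delta_chisq, delta_df)
    print(f"{label}: chi2({delta_df}) = {delta_chisq:.2f}, p = {p:.2e}")
```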

Step 5. Propose a measurement model for the CPRBI

Based on these results, we should not interpret the CRPBI using a multi-group model or

compare data between children and mothers without evaluating group invariance on the

instrument and item level. Once identified, DIF items should be deleted or accounted for in the

model. CRPBI data should be measured using a multidimensional model, which was the most

reliable model (see Figure 6 for model path diagram). Given the obliqueness of the factors and

hence the loading of certain items onto both factors, the CRPBI should be measured in a way that

allows items to load onto all factors to maximize information extracted from the data, though this

will only be possible in a one-group model, or in a multigroup model with constrained parameters or few

items because it typically requires more parameters to be estimated than the data allow. To make

comparisons between multiple informants, researchers should carefully specify a multigroup,

multidimensional model with group invariance.

Discussion

The current study examined the dimensionality of the CRPBI as completed by mothers

and their children. Analyses revealed that the multidimensional models were more appropriate

than the total and separate unidimensional models (analogous to using subscales) in terms of

model reliability and explaining shared variance between items that load onto different factors,

thereby arguing against discrete subscales.

We also found that the AR and PC dimensions were not orthogonal, as had been

suggested by Schludermann and Schludermann (1970). The latent traits underlying the PC

dimension and the AR dimension seemed to vary together somewhat, as shown by shared item

loadings. Interestingly, not reverse coding rejection items showed which PC items varied with

either acceptance or rejection items; this information is substantively useful when hypothesizing

which parenting behaviors might be beneficial or detrimental, because the AR factor is easily

understandable in terms of what positive (i.e. developmentally beneficial) or negative (i.e.

developmentally detrimental) parenting behaviors might be. For example, item 45 ("My mother

keeps a careful check on me to make sure I have the right kind of friends") seems ambiguous in

terms of whether it is beneficial to children or not, yet was posited by Schludermann and

Schludermann (1970) as a PC item on the "intrusiveness" subscale. In our data, item 45 loaded

more or less equally and positively onto both the PC and AR factors, indicating that it was

associated with acceptance. Thus, the presence of negative discrimination was actually useful in

interpreting how respondents perceived individual parenting behaviors, given that we could

differentiate acceptance and rejection items from each other.

Surprisingly, the PC factor discrimination estimates had no negative values; it had been

proposed to range from autonomous to controlling parenting behaviors, suggesting that some

reverse coding would have been needed, if this scale was bipolar. One reason that psychological

control might be more associated with acceptance than rejection in the current sample is that the

child participants were relatively young (6th grade) as compared to samples in other studies, such

as undergraduates. For older adolescents, psychologically controlling parenting behaviors might

be interpreted as more manipulative or intrusive, whereas with young adolescents they might be

considered to be appropriate.

In addition, differences between the current findings and those of the Schludermanns

indicate that the dimensions they proposed did not readily fit the current dataset. One aim of the

current study was to reduce the 108 CRPBI to a more manageable number of items to be used in

subsequent studies. Because the firm control items were not of substantive interest here, the thirty

firm control items identified in Schludermann and Schludermann's (1970) analysis were removed

from the dataset (while acknowledging that it is not ideal to "throw out" items), which

is why Step 1 reduced the CRPBI to 74 items.

A limitation of this study is the small sample size, in particular when estimating large

numbers of parameters, as in the multigroup models. Though these results provide reasonable

guidelines and directions for future study, the numerical values from these analyses, such as DIF

weights, cannot be easily generalized to other samples, even if matched on age.

Researchers should evaluate the overall and item level factorial structure of their CRPBI

data before adopting external subscales, especially if their age range does not match previous

analyses. Researchers also should hypothesize what kinds of groupings in their sample might

cause DIF and evaluate items for it. If DIF is present, researchers should include it in their

measurement model or delete DIF items. When comparing multiple raters using the CRPBI,

researchers might want to adopt a multigroup model to evaluate for GDIF between the groups, in

order to strengthen the validity of their inter-rater comparisons and to minimize measurement

error caused by group non-invariance.



References

Achenbach, T. M. (2006). As others see us: Clinical and research implications of cross-informant
correlations for psychopathology. Current Directions in Psychological Science, 15(2),
94–98.

Achenbach, T. M. (2011). Commentary: Definitely more than measurement error: But how
should we understand and deal with informant discrepancies? Journal of Clinical Child &
Adolescent Psychology, 40(1), 80–86. http://doi.org/10.1080/15374416.2011.533416

Adams, R. J., & Khoo, S. T. (1996). Quest. Melbourne, Australia: Australian Council for
Educational Research.

Armentrout, J. A. (1970). Relationships among preadolescents' reports of their parents' child-
rearing behaviors. Psychological Reports, 27(3), 695–700.

Armentrout, J. A., & Burger, G. K. (1972). Children’s reports of parental child-rearing behavior
at five grade levels. Developmental Psychology, 7(1), 44–48.
http://doi.org/10.1037/h0032701

Baker, F. B. (2001). The basics of item response theory. Retrieved from http://ericae.net/irt/baker

Berg-Nielsen, T. S., Vika, A., & Dahl, A. A. (2003). When adolescents disagree with their
mothers: CBCL-YSR discrepancies related to maternal depression and adolescent self-
esteem. Child: Care, Health and Development, 29(3), 207–213.
http://doi.org/10.1046/j.1365-2214.2003.00332.x

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in
two or more nominal categories. Psychometrika, 37(1), 29-51.

Bowen, N.K. (2014). Testing for differences in measurement (CFA) models using Mplus’s
invariance shortcut code (WLSMV). Unpublished manuscript.

Briggs, D. C., & Wilson, M. (2003). An introduction to multidimensional measurement using
Rasch models. Journal of Applied Measurement, 4(1), 87-100.

Carlton-Ford, S. L., Paikoff, R. L., & Brooks-Gunn, J. (1991). Methodological issues in the
study of divergent views of the family. New Directions for Child and Adolescent
Development, 1991(51), 87–102. http://doi.org/10.1002/cd.23219915107

Chen, T. A., O’Connor, T. M., Hughes, S. O., Beltran, A., Baranowski, J., Diep, C., &
Baranowski, T. (2015). Vegetable parenting practices scale. Item response modeling
analyses. Appetite, 91, 190–199. http://doi.org/10.1016/j.appet.2015.04.048

Cohen, A. S., Kim, S. H., & Baker, F. B. (1993). Detection of differential item functioning in the
graded response model. Applied psychological measurement, 17(4), 335-350.

Cross, H. J. (1969). College students’ memories of their parents: A factor analysis of the CRPBI.
Journal of Consulting and Clinical Psychology, 33(3), 275–278.
http://doi.org/10.1037/h0027589

De Los Reyes, A. (2011). Introduction to the special section: More than measurement error:
Discovering meaning behind informant discrepancies in clinical assessments of children
and adolescents. Journal of Clinical Child & Adolescent Psychology, 40(1), 1–9.
http://doi.org/10.1080/15374416.2011.533405

De Los Reyes, A., Goodman, K. L., Kliewer, W., & Reid-Quinones, K. (2010). The longitudinal
consistency of mother-child reporting discrepancies of parental monitoring and their
ability to predict child delinquent behaviors two years later. Journal of Youth and
Adolescence, 39(12), 1417–1430.

Edelbrock, C., Costello, A. J., Dulcan, M. K., Kalas, R., & Conover, N. C. (1985). Age
differences in the reliability of the psychiatric interview of the child. Child Development,
56(1), 265. http://doi.org/10.2307/1130193

Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8(4), 341–
349. http://doi.org/10.1037/1040-3590.8.4.341

Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, N.J.:
Routledge.

Gehlbach, H. (2015). Seven survey sins. The Journal of Early Adolescence.
http://doi.org/10.1177/0272431615578276

Gerlsma, C., Emmelkamp, P. M. G., & Arrindell, W. A. (1990). Anxiety, depression, and
perception of early parenting: a meta-analysis. Clinical Psychology Review, 10(3), 251–
277. http://doi.org/10.1016/0272-7358(90)90062-F

Guion, K., Mrug, S., & Windle, M. (2009). Predictive value of informant discrepancies in reports
of parenting: relations to early adolescents’ adjustment. Journal of Abnormal Child
Psychology, 37(1), 17–30. http://doi.org/10.1007/s10802-008-9253-5

Hazzard, A., Christensen, A., & Margolin, G. (1983). Children’s perceptions of parental
behaviors. Journal of Abnormal Child Psychology, 11(1), 49–59.
http://doi.org/10.1007/BF00912177

Hollingshead, A. B. (1975). Four factor index of social status. Unpublished manuscript, Yale
University, New Haven, CT.

Jessop, D. J. (1982). Topic variation in levels of agreement between parents and adolescents. The
Public Opinion Quarterly, 46(4), 538–559.

Kawash, G. F., & Clewes, J. L. (1988). A factor analysis of a short form of the CRPBI: Are
children’s perceptions of control and discipline multidimensional? Journal of
Psychology, 122(1), 57.

Kouros, C. D., & Garber, J. (2014). Trajectories of individual depressive symptoms in adolescents: Gender and family relationships as predictors. Developmental Psychology, 50(12), 2633–2643. http://doi.org/10.1037/a0038190

Lorber, M. F., Xu, S., Slep, A. M. S., Bulling, L., & O’Leary, S. G. (2014). A new look at the
psychometrics of the Parenting Scale through the lens of Item Response Theory. Journal
of Clinical Child & Adolescent Psychology, 43(4), 613–626.
http://doi.org/10.1080/15374416.2014.900717

Margolies, P. J., & Weintraub, S. (1977). The revised 56-item CRPBI as a research instrument:
Reliability and factor structure. Journal of Clinical Psychology, 33(2), 472–476.
http://doi.org/10.1002/1097-4679(197704)33:2<472::AID-JCLP2270330230>3.0.CO;2-S

McLeod, B. D., Wood, J. J., & Weisz, J. R. (2007). Examining the association between parenting
and childhood anxiety: A meta-analysis. Clinical Psychology Review, 27(2), 155-172.

Muthén, L. K., & Muthén, B. O. (1998-2011). Mplus User's Guide. Sixth Edition. Los Angeles,
CA: Muthén & Muthén.

Raskin, A., Boothe, H. H., Reatig, N. A., Schulterbrandt, J. G., & Odle, D. (1971). Factor
analyses of normal and depressed patients’ memories of parental behavior. Psychological
Reports, 29(3), 871–879. http://doi.org/10.2466/pr0.1971.29.3.871

Reise, S. P., & Waller, N. G. (2009). Item Response Theory and clinical measurement. Annual
Review of Clinical Psychology, 5(1), 27–48.
http://doi.org/10.1146/annurev.clinpsy.032408.153553

Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores
(Psychometric Monograph No. 17). Richmond, VA: Psychometric Society. Retrieved
from http://www.psychometrika.org/journal/online/MN17.pdf

Schaefer, E. S. (1965a). A configurational analysis of children’s reports of parent behavior. Journal of Consulting Psychology, 29(6), 552–557. http://doi.org/10.1037/h0022702

Schaefer, E. S. (1965b). Children’s reports of parental behavior: An inventory. Child Development, 36(2), 413–424. http://doi.org/10.2307/1126465

Schmidt, P. C. (1969). The revised Child’s Report of Parent Behavior Inventory: A factor
analytic study and an investigation of the relationship of the CRPBI to the ability to
abstract in sixth graders. ProQuest Information & Learning, US.

Schludermann, E., & Schludermann, S. (1970). Replicability of factors in Children’s Report of Parent Behavior (CRPBI). The Journal of Psychology, 76(2), 239–249. http://doi.org/10.1080/00223980.1970.9916845

Schwarz, J. C., Barton-Henry, M. L., & Pruzinsky, T. (1985). Assessing child-rearing behaviors:
A comparison of ratings made by mother, father, child, and sibling on the CRPBI. Child
Development, 56(2), 462–479. http://doi.org/10.2307/1129734

Suchman, N. E., Rounsaville, B., DeCoste, C., & Luthar, S. (2007). Parental control, parental warmth, and psychosocial adjustment in a sample of substance-abusing mothers and their school-aged and adolescent children. Journal of Substance Abuse Treatment, 32(1), 1–10. http://doi.org/10.1016/j.jsat.2006.07.002

Suh, Y., & Cho, S. J. (2014). Chi-square difference tests for detecting differential functioning in a multidimensional IRT model: A Monte Carlo study. Applied Psychological Measurement, 38(5), 359–375.

Patil, V. H., Singh, S. N., Mishra, S., & Donavan, D. T. (2007). Parallel analysis engine to aid in determining number of factors to retain [Computer software]. Available from http://smishra.faculty.ku.edu/parallelengine.htm. Utility developed as part of Patil, V. H., Singh, S. N., Mishra, S., & Donovan, T. (2008). Efficient theory development and factor retention criteria: A case for abandoning the ‘eigenvalue greater than one’ criterion. Journal of Business Research, 61(2), 162–170.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum.

Wilson, S., & Durbin, C. E. (2010). Effects of paternal depression on fathers’ parenting
behaviors: A meta-analytic review. Clinical Psychology Review, 30(2), 167–180.
http://doi.org/10.1016/j.cpr.2009.10.007

Yap, M. B. H., Pilkington, P. D., Ryan, S. M., & Jorm, A. F. (2014). Parental factors associated
with depression and anxiety in young people: A systematic review and meta-analysis.
Journal of Affective Disorders, 156, 8–23. http://doi.org/10.1016/j.jad.2013.11.007

Zaidman-Zait, A., Mirenda, P., Zumbo, B. D., Wellington, S., Dua, V., & Kalynchuk, K. (2010).
An item response theory analysis of the Parenting Stress Index-Short Form with parents
of children with autism spectrum disorders: Parenting Stress Index item analysis. Journal
of Child Psychology and Psychiatry, 51(11), 1269–1277. http://doi.org/10.1111/j.1469-
7610.2010.02266.x

Table 1
Category Response Frequency (Proportion) by Item for Raw CRPBI Data
              Child                                                      Mother
Item    "Not True       "Somewhat True     "True            "Not True       "Somewhat True     "True
        about Mom"      about Mom"         about Mom"       about Me"       about Me"          about Me"
1 14 (.06) 69 (.28) 160 (.66) 180 (.79) 44 (.19) 5 (.02)
2 19 (.08) 59 (.24) 165 (.68) 20 (.09) 120 (.52) 89 (.39)
3 12 (.05) 103 (.42) 128 (.53) 151 (.66) 76 (.33) 3 (.01)
4 13 (.05) 69 (.28) 161 (.66) 170 (.74) 55 (.24) 5 (.02)
5 130 (.54) 82 (.34) 31 (.13) 17 (.07) 91 (.40) 122 (.53)
6 41 (.17) 109 (.45) 92 (.38) 53 (.23) 150 (.65) 27 (.12)
7 14 (.06) 40 (.17) 189 (.78) 5 (.02) 30 (.13) 194 (.85)
8 149 (.61) 73 (.30) 21 (.09) 10 (.04) 32 (.14) 188 (.82)
9 114 (.47) 93 (.38) 36 (.15) 18 (.08) 118 (.51) 94 (.41)
10 79 (.33) 97 (.40) 67 (.28) 37 (.16) 135 (.59) 58 (.25)
11 60 (.25) 95 (.39) 88 (.36) 39 (.17) 122 (.53) 69 (.30)
12 134 (.55) 77 (.32) 32 (.13) 7 (.03) 28 (.12) 195 (.85)
13 16 (.07) 43 (.18) 184 (.76) 2 (.01) 49 (.21) 179 (.78)
14 29 (.12) 83 (.34) 131 (.54) 110 (.48) 104 (.45) 16 (.07)
15 23 (.10) 124 (.51) 96 (.40) 84 (.37) 121 (.53) 25 (.11)
16 62 (.26) 113 (.47) 68 (.28) 20 (.09) 90 (.39) 120 (.52)
17 9 (.04) 37 (.15) 197 (.81) 202 (.88) 25 (.11) 3 (.01)
18 26 (.11) 93 (.38) 124 (.51) 106 (.46) 89 (.39) 35 (.15)
19 122 (.50) 90 (.37) 31 (.13) 4 (.02) 66 (.29) 160 (.70)
20 35 (.14) 108 (.44) 100 (.41) 145 (.63) 72 (.31) 13 (.06)
21 103 (.42) 103 (.42) 37 (.15) 13 (.06) 113 (.49) 104 (.45)
22 108 (.45) 103 (.43) 31 (.13) 17 (.07) 63 (.27) 150 (.65)
23 21 (.09) 62 (.26) 160 (.66) 3 (.01) 50 (.22) 177 (.77)
24 52 (.22) 134 (.55) 56 (.23) 42 (.18) 121 (.53) 67 (.29)
25 119 (.49) 80 (.33) 44 (.18) 17 (.08) 82 (.36) 129 (.57)
26 180 (.74) 44 (.18) 18 (.07) 2 (.01) 18 (.08) 210 (.91)
27 34 (.14) 117 (.48) 92 (.38) 90 (.39) 126 (.55) 14 (.06)
28 122 (.50) 89 (.37) 32 (.13) 9 (.04) 55 (.24) 166 (.72)
29 8 (.03) 23 (.10) 212 (.87) 3 (.01) 4 (.02) 223 (.97)
30 182 (.75) 45 (.19) 16 (.07) 3 (.01) 4 (.02) 223 (.97)
31 13 (.05) 49 (.20) 181 (.75) 158 (.69) 69 (.30) 3 (.01)
32 36 (.15) 110 (.45) 97 (.40) 43 (.19) 92 (.40) 95 (.41)
33 76 (.31) 104 (.43) 63 (.26) 32 (.14) 96 (.42) 102 (.44)
34 43 (.18) 120 (.49) 80 (.33) 25 (.11) 78 (.34) 127 (.55)
35 41 (.17) 102 (.42) 100 (.41) 152 (.66) 75 (.33) 3 (.01)
36 27 (.11) 55 (.23) 161 (.66) 114 (.50) 92 (.40) 24 (.10)
37 92 (.38) 125 (.51) 26 (.11) 7 (.03) 53 (.23) 170 (.74)
38 5 (.02) 37 (.15) 201 (.83) 215 (.94) 11 (.05) 3 (.01)
39 86 (.36) 108 (.45) 48 (.20) 26 (.11) 117 (.51) 87 (.38)
40 18 (.07) 107 (.44) 118 (.49) 131 (.57) 95 (.41) 4 (.02)
41 119 (.49) 84 (.35) 40 (.17) 7 (.03) 37 (.16) 186 (.81)
42 66 (.27) 126 (.52) 51 (.21) 11 (.05) 96 (.42) 123 (.54)
43 15 (.06) 59 (.24) 169 (.70) 166 (.72) 55 (.24) 9 (.04)
44 22 (.09) 91 (.37) 130 (.54) 36 (.16) 137 (.60) 57 (.25)
45 41 (.17) 104 (.43) 98 (.40) 81 (.35) 130 (.57) 19 (.08)
46 68 (.28) 125 (.51) 50 (.21) 17 (.07) 99 (.43) 114 (.50)
47 127 (.52) 91 (.37) 25 (.10) 3 (.01) 43 (.19) 184 (.80)
48 3 (.01) 21 (.09) 219 (.90) 2 (.01) 8 (.04) 220 (.96)
49 171 (.70) 61 (.25) 11 (.05) 69 (.60) 36 (.31) 10 (.09)
50 16 (.07) 53 (.22) 174 (.72) 138 (.60) 81 (.35) 11 (.05)
51 23 (.10) 109 (.45) 111 (.46) 124 (.54) 96 (.42) 10 (.04)
52 27 (.11) 60 (.25) 156 (.64) 8 (.04) 38 (.17) 184 (.80)
53 17 (.07) 77 (.32) 149 (.61) 139 (.60) 88 (.38) 3 (.01)
54 121 (.50) 92 (.38) 30 (.12) 6 (.03) 35 (.15) 189 (.82)
55 147 (.61) 74 (.31) 22 (.09) 9 (.04) 77 (.34) 144 (.63)
56 120 (.49) 83 (.34) 40 (.17) 8 (.04) 49 (.21) 173 (.75)
57 16 (.07) 45 (.19) 182 (.75) 125 (.55) 86 (.38) 18 (.08)
58 206 (.85) 29 (.12) 8 (.03) 1 (.00) 3 (.01) 226 (.98)
59 12 (.05) 48 (.20) 183 (.75) 193 (.84) 34 (.15) 3 (.01)
60 25 (.10) 92 (.38) 126 (.52) 101 (.44) 111 (.49) 17 (.07)
61 24 (.10) 99 (.41) 120 (.49) 10 (.04) 109 (.47) 111 (.48)
62 126 (.52) 83 (.34) 34 (.14) 4 (.02) 30 (.13) 196 (.85)
63 22 (.09) 73 (.30) 148 (.61) 197 (.86) 29 (.13) 3 (.01)
68 124 (.51) 89 (.37) 30 (.12) 6 (.03) 23 (.10) 201 (.87)
69 12 (.05) 67 (.28) 164 (.68) 31 (.14) 127 (.55) 72 (.31)
70 38 (.16) 87 (.36) 118 (.49) 157 (.68) 70 (.30) 3 (.01)
71 12 (.05) 40 (.17) 191 (.79) 6 (.03) 39 (.17) 185 (.80)
72 54 (.22) 110 (.45) 79 (.33) 5 (.02) 41 (.18) 184 (.80)
73 142 (.58) 67 (.28) 34 (.14) 167 (.73) 62 (.27) 1 (.00)
74 102 (.42) 96 (.40) 45 (.19) 23 (.10) 90 (.39) 117 (.51)
75 151 (.62) 80 (.33) 12 (.05) 1 (.00) 22 (.10) 206 (.90)
76 12 (.05) 54 (.22) 177 (.73) 27 (.12) 123 (.54) 80 (.35)
77 18 (.07) 89 (.37) 136 (.56) 34 (.15) 95 (.42) 100 (.44)
78 46 (.19) 90 (.37) 107 (.44) 29 (.13) 146 (.64) 55 (.24)
79 145 (.60) 68 (.28) 29 (.12) 5 (.02) 23 (.10) 202 (.88)
80 131 (.54) 89 (.37) 23 (.10) 168 (.73) 60 (.26) 2 (.01)
81 85 (.35) 106 (.44) 52 (.21) 105 (.46) 114 (.50) 11 (.05)
82 18 (.07) 55 (.23) 170 (.70) 20 (.09) 76 (.33) 134 (.58)
83 12 (.05) 58 (.24) 173 (.71) 8 (.04) 52 (.23) 170 (.74)
84 28 (.12) 118 (.49) 97 (.40) 3 (.01) 39 (.17) 188 (.82)
85 29 (.12) 66 (.27) 148 (.61) 21 (.09) 72 (.31) 137 (.60)
86 14 (.06) 58 (.24) 171 (.70) 2 (.01) 28 (.12) 200 (.87)
87 18 (.07) 105 (.43) 120 (.49) 177 (.77) 51 (.22) 2 (.01)
88 173 (.71) 50 (.21) 20 (.08) 43 (.19) 102 (.44) 85 (.37)
89 128 (.53) 77 (.32) 38 (.16) 4 (.02) 45 (.20) 181 (.79)
90 7 (.03) 15 (.06) 221 (.91) 209 (.91) 20 (.09) 1 (.00)
91 134 (.55) 65 (.27) 44 (.18) 141 (.61) 86 (.37) 3 (.01)
92 173 (.71) 50 (.21) 20 (.08) 2 (.01) 15 (.07) 213 (.93)
93 18 (.07) 69 (.28) 156 (.64) 10 (.04) 64 (.28) 156 (.68)
94 12 (.05) 35 (.14) 196 (.81) 222 (.97) 7 (.03) 1 (.00)
95 160 (.66) 66 (.27) 17 (.07) 9 (.04) 34 (.15) 187 (.81)
96 7 (.03) 58 (.24) 178 (.73) 5 (.02) 45 (.20) 180 (.78)
97 36 (.15) 103 (.42) 104 (.43) 131 (.57) 92 (.40) 7 (.03)
98 30 (.12) 125 (.51) 88 (.36) 5 (.02) 27 (.12) 198 (.86)
99 8 (.03) 82 (.34) 153 (.63) 4 (.02) 11 (.05) 215 (.94)
100 131 (.54) 89 (.37) 23 (.10) 208 (.90) 20 (.09) 2 (.01)
101 103 (.42) 103 (.42) 37 (.15) 6 (.03) 28 (.12) 196 (.85)
102 113 (.47) 88 (.36) 42 (.17) 47 (.20) 105 (.46) 78 (.34)
103 110 (.45) 99 (.41) 34 (.14) 202 (.88) 27 (.12) 1 (.00)
104 8 (.03) 17 (.07) 218 (.90) 11 (.05) 71 (.31) 148 (.64)
105 183 (.75) 48 (.20) 12 (.05) 9 (.04) 77 (.34) 143 (.62)
106 17 (.07) 57 (.24) 169 (.70) 16 (.07) 117 (.51) 96 (.42)
107 139 (.57) 67 (.28) 37 (.15) 8 (.04) 46 (.20) 176 (.77)
108 105 (.43) 112 (.46) 26 (.11) 10 (.04) 15 (.07) 205 (.89)
Note. These counts reflect all respondents, including those with missing data within dyads.

Table 2
Mean Total and Factor Scores by Group

Factor
Group Total Acceptance Rejection Firm control Psych. Control
Child
M (SD) 1.92 (.17) 2.57 (.36) 1.35 (.37) 1.74 (.21) 1.80 (.31)
Mother
M (SD) 1.80 (.12) 2.20 (.16) 1.62 (.19) 1.48 (.15) 1.85 (.20)

Note. Factors as reported by Schludermann and Schludermann (1970).



Table 3
Correlations between Total and Factor Raw Scores for Children and Mothers
                          Child                                      Mother
                 Tot     Acc     Rej     FC      PC          Tot     Acc     Rej     FC
Child
1. Total
2. Acceptance .20**
3. Rejection .42** -.69**
4. Firm control .79** -.14* .52**
5. Psych. Control .85** -.15* .54** .63**
Mother
6. Total .23** -.08 .21** .21** .25**
7. Acceptance -.04 .20** -.22** -.10 -.08 .42**
8. Rejection .26** -.13* .29** .28** .25** .66** -.14*
9. Firm control .18** -.22** .32** .21** .22** .81** .04 .59**
10. Psych. Control .25** -.08 .21** .22** .28** .90** .20** .56** .66**
Note. * p < .05. ** p < .01 

Table 4
EFA Comparison between Groups before Rotation: All Items
        S&S                    Child                  Mother                 Parallel Analysis
        % of s²    Cum. %      % of s²    Cum. %      % of s²    Cum. %      % of s²    Cum. %
F1 36.73 36.73 18.85 18.85 13.41 13.41 2.26 2.26
F2 18.11 54.84 8.3 27.15 7.4 20.81 2.11 4.37
F3 12.99 67.83 4.1 31.24 4.02 24.83 2.03 6.4
F4 - - 2.42 33.67 2.52 27.35 1.95 8.34
F5 - - 2.21 35.87 2.33 29.67 1.87 10.21
F6 - - 2.05 37.93 2.23 31.91 1.81 12.02
F7 - - 1.99 39.92 2.13 34.04 1.75 13.77
F8 - - 1.84 41.76 2.06 36.1 1.7 15.47
F9 - - 1.77 43.53 2.01 38.11 1.64 17.11
F10 - - 1.69 45.22 1.89 39.99 1.59 18.7
F11 - - … … … … ... …
Note. % of s² = percentage of variance accounted for by factor; Cum. % = cumulative percentage; S&S = results from Schludermann and Schludermann (1970). Dashes indicate results not provided.

Table 5
EFA Comparison between Groups before Rotation: Reduced (74) Items
                    Child                           Mother                          Parallel Analysis
Component    λ       % of s²   Cum. %       λ       % of s²   Cum. %       λ       % of s²   Cum. %
1            17.79   24.04     24.04        12.10   16.36     16.36        2.37    2.48      2.48
2            6.80    9.20      33.24        6.10    8.25      24.61        2.25    2.35      4.82
3            2.12    2.87      36.11        2.19    2.96      27.57        2.16    2.23      7.06
4            1.97    2.66      38.77        2.12    2.86      30.43        2.09    2.17      9.22
5            1.74    2.35      41.11        2.07    2.79      33.22        2.03    2.09      11.31
6            1.63    2.20      43.31        1.92    2.59      35.82        1.96    2.01      13.33
7            1.59    2.15      45.47        1.78    2.40      38.22        1.91    1.96      15.29
8            1.55    2.10      47.57        1.75    2.36      40.58        1.86    1.91      17.19
9            1.45    1.96      49.52        1.61    2.18      42.76        1.81    1.85      19.05
10           1.37    1.85      51.38        1.54    2.09      44.85        1.76    1.80      20.85
Note. λ = eigenvalue; % of s² = percentage of variance accounted for by component; Cum. % = cumulative percentage.

Table 6
Rotated Component Matrix: Reduced (74) Items
           Child              Mother
Item       1        2         1        2
1 .64 -- .40 --
2 -.46 -- -.35 --
7 -.57 -- -.42 --
10 -- .39 -- .45
11 -- .49 -- .57
13 -.51 -- -.70 --
14 .55 -- .59 --
15 .42 -- .53 --
17 .71 -- .53 --
18 .50 -- .32 .41
23 -.45 .44 -.41 --
25 -- .35 -- .47
28 -- .46 -- .34
31 .66 -- .72 --
33 .36 .34 -- .60
35 .55 -- .39 --
36 .43 -- -- .54
39 -- .50 -.36 .43
40 .57 -- .37 --
41 -- .42 -- .46
43 .61 -- .52 --
44 -.38 .60 -- .44
45 .33 .41 -- .45
49 -.48 .40 -.34 --
50 .59 -- .35 --
51 .51 -- .50 .37
52 -.44 .56 -.34 .44
53 .59 -- .48 --
54 -- .62 -- .51
56 -- .52 -- .52
57 -.56 -- .58 --
59 .73 -- .58 --
60 .68 -- .50 --
61 -.31 .48 -.34 .34
63 .60 -- .44 --
69 .65 -- .46 --
70 -.33 .51 -.37 .39
71 -.59 .39 -.59 --
73 -- .53 -- .51
76 .75 -- .53 --
78 -- .41 -- .50
79 -- .55 -- .57
81 -- .51 -- .48
82 -.45 .52 -.51 --
83 .62 -- .52 --
84 .51 -- -- .42
85 -.38 .44 -.51 --
86 .66 -- .59 --
87 .41 -- .31 --
88 -- .53 -- .34
89 -- .48 -- .42
91 -- .61 -- .47
92 -.39 .61 -- .44
93 .67 -- .46 --
94 -.31 .42 -.37 --
96 .66 -- .42 --
99 .59 -- .39 --
100 -.35 .46 -.52 .32
103 -- .40 -- .44
106 .63 -- .60 --
107 -- .61 -- .60
Note. Coefficients for items loading on both factors in bold. Dashes (--) indicate coefficients
< .30

Table 7
Number of Items Loaded by Component and Group (74 items)
                  Child              Mother
                  f       %          f       %
Component 1       35      47         32      43
Component 2       19      26         23      31
Both              16      22          8      11
Neither            4       5         11      15
Total             74     100         74     100
Note. Each item only counted once.

Table 8
Child Dataset: Discrimination Parameter Estimates for the 2PL, Combined 2PL, and MIRT models with 61 Items
                2PL                       AR + PC 2PL              MIRT AR Factor         MIRT PC Factor
Item    αi (SE)        p      F     αi (SE)        p        θjAR (SE)       p       θjPC (SE)       p
1      1.99 (.31)   .00   AR    2.08 (.31)   .00     .71 (.06)   .00    -.08 (.08)   .31
2     -1.32 (.25)   .00   AR   -1.22 (.23)   .00    -.50 (.08)   .00     .26 (.08)   .00
7     -1.85 (.33)   .00   AR   -1.77 (.31)   .00    -.62 (.08)   .00     .22 (.09)   .01
10     -.10 (.13)   .45   PC     .52 (.15)   .00     .22 (.09)   .01     .44 (.07)   .00
11     -.43 (.16)   .01   PC    1.08 (.18)   .00     .11 (.11)   .32     .56 (.07)   .00
13    -1.71 (.27)   .00   AR   -1.58 (.26)   .00    -.54 (.09)   .00     .30 (.09)   .00
14     1.65 (.26)   .00   AR    1.59 (.27)   .00     .60 (.07)   .00    -.23 (.07)   .00
15      .88 (.17)   .00   AR     .99 (.19)   .00     .50 (.07)   .00     .00 (.07)   .96
17     2.46 (.37)   .00   AR    2.90 (.43)   .00     .87 (.04)   .00     .04 (.08)   .58
18      .74 (.20)   .00   AR     .99 (.19)   .00     .67 (.07)   .00     .38 (.07)   .00
23    -1.57 (.23)   .00   AR   -1.30 (.21)   .00    -.43 (.10)   .00     .46 (.08)   .00
25     -.26 (.14)   .07   PC     .64 (.17)   .00     .12 (.09)   .16     .41 (.07)   .00
28     -.47 (.14)   .00   PC     .89 (.17)   .00     .06 (.10)   .50     .52 (.07)   .00
31     2.74 (.43)   .00   AR    2.71 (.45)   .00     .75 (.07)   .00    -.18 (.09)   .05
33      .34 (.17)   .04   PC     .32 (.14)   .03     .50 (.08)   .00     .42 (.07)   .00
35     1.19 (.20)   .00   AR    1.27 (.20)   .00     .63 (.06)   .00     .01 (.07)   .92
36      .57 (.19)   .00   PC     .08 (.17)   .64     .56 (.08)   .00     .27 (.08)   .00
39     -.43 (.15)   .00   PC    1.01 (.18)   .00     .16 (.10)   .11     .59 (.07)   .00
40     1.66 (.25)   .00   AR    1.81 (.25)   .00     .68 (.06)   .00    -.07 (.08)   .38
41     -.49 (.15)   .00   PC     .88 (.17)   .00     .05 (.10)   .64     .48 (.07)   .00
43     1.49 (.27)   .00   AR    1.76 (.29)   .00     .74 (.06)   .00     .08 (.07)   .27
44    -1.64 (.25)   .00   PC    2.11 (.32)   .00    -.32 (.12)   .01     .64 (.07)   .00
45      .18 (.15)   .23   PC     .50 (.16)   .00     .48 (.08)   .00     .51 (.07)   .00
49    -1.73 (.28)   .00   AR   -1.45 (.26)   .00    -.47 (.09)   .00     .42 (.08)   .00
50     1.70 (.25)   .00   AR    1.85 (.27)   .00     .72 (.06)   .00     .00 (.07)   .97
51      .79 (.18)   .00   AR     .97 (.17)   .00     .64 (.06)   .00     .24 (.07)   .00
52    -1.74 (.25)   .00   PC    2.24 (.36)   .00    -.38 (.12)   .00     .61 (.07)   .00
53     1.71 (.24)   .00   AR    1.78 (.23)   .00     .69 (.05)   .00    -.08 (.08)   .34
54     -.98 (.18)   .00   PC    1.97 (.27)   .00    -.07 (.12)   .56     .71 (.06)   .00
56     -.90 (.18)   .00   PC    1.41 (.23)   .00    -.16 (.11)   .14     .54 (.07)   .00
57    -1.92 (.29)   .00   AR   -1.74 (.29)   .00    -.60 (.09)   .00     .29 (.09)   .00
59     2.59 (.42)   .00   AR    3.03 (.47)   .00     .87 (.04)   .00    -.01 (.06)   .89
60     1.32 (.26)   .00   AR    1.71 (.26)   .00     .80 (.04)   .00     .18 (.07)   .01
61    -1.18 (.20)   .00   PC    1.45 (.24)   .00    -.28 (.10)   .01     .49 (.08)   .00
63     1.38 (.24)   .00   AR    1.59 (.24)   .00     .72 (.05)   .00     .08 (.08)   .27
69     1.92 (.29)   .00   AR    2.16 (.29)   .00     .80 (.04)   .00     .07 (.06)   .26
70    -1.20 (.19)   .00   AR   -2.27 (.40)   .00    -.28 (.11)   .01     .53 (.07)   .00
71    -2.76 (.49)   .00   PC    1.40 (.23)   .00    -.61 (.10)   .00     .43 (.09)   .00
73     -.43 (.16)   .01   PC    1.21 (.21)   .00     .18 (.10)   .07     .65 (.06)   .00
76     2.82 (.45)   .00   AR    3.32 (.53)   .00     .88 (.04)   .00     .00 (.05)   .97
78     -.03 (.16)   .87   PC     .74 (.18)   .00     .31 (.09)   .00     .50 (.07)   .00
79    -1.10 (.22)   .00   PC    1.73 (.30)   .00    -.18 (.11)   .12     .64 (.08)   .00
81     -.32 (.15)   .04   PC     .92 (.18)   .00     .13 (.10)   .17     .57 (.06)   .00
82    -2.01 (.33)   .00   AR   -1.55 (.25)   .00    -.43 (.12)   .00     .58 (.08)   .00
83     1.82 (.28)   .00   AR    1.96 (.27)   .00     .71 (.05)   .00    -.07 (.08)   .41
84      .86 (.18)   .00   AR    1.03 (.18)   .00     .61 (.05)   .00     .16 (.08)   .04
85    -1.43 (.23)   .00   AR   -1.18 (.20)   .00    -.37 (.10)   .00     .45 (.08)   .00
86     2.09 (.30)   .00   AR    2.19 (.34)   .00     .76 (.06)   .00    -.06 (.08)   .44
87      .69 (.17)   .00   AR     .85 (.18)   .00     .50 (.07)   .00     .14 (.08)   .08
88     -.88 (.20)   .00   PC    1.42 (.24)   .00    -.04 (.11)   .70     .63 (.07)   .00
89     -.69 (.17)   .00   PC    1.15 (.21)   .00    -.05 (.10)   .60     .55 (.07)   .00
91     -.76 (.16)   .00   PC    1.48 (.23)   .00     .01 (.09)   .87     .68 (.06)   .00
92    -1.89 (.30)   .00   PC    2.64 (.49)   .00    -.33 (.12)   .01     .69 (.08)   .00
93     2.12 (.33)   .00   AR    2.39 (.35)   .00     .77 (.05)   .00    -.09 (.07)   .18
94    -1.50 (.31)   .00   AR   -1.18 (.27)   .00    -.32 (.12)   .01     .52 (.09)   .00
96     2.52 (.36)   .00   AR    2.65 (.40)   .00     .77 (.05)   .00    -.13 (.08)   .13
99     1.29 (.26)   .00   AR    1.60 (.26)   .00     .77 (.05)   .00     .25 (.07)   .00
100   -1.22 (.21)   .00   AR   -1.00 (.18)   .00    -.34 (.10)   .00     .49 (.08)   .00
103    -.06 (.15)   .71   PC     .64 (.15)   .00     .35 (.09)   .00     .51 (.06)   .00
106    2.03 (.32)   .00   AR    2.11 (.32)   .00     .73 (.06)   .00    -.11 (.09)   .21
107   -1.20 (.22)   .00   PC    1.94 (.27)   .00    -.16 (.12)   .19     .68 (.07)   .00
Note. 2PL = two-parameter graded response model with 61 items; AR + PC 2PL = two separate
two-parameter graded response models combined, as indicated by AR = “acceptance/rejection”
model (36 items) and PC = “psychological control” model (25 items). MIRT = exploratory
multidimensional Rasch model for two factors. AR/PC Factor = EFA factors/dimensions. αi =
item discrimination parameter estimate. θjAR, θjPC = item latent trait parameter estimates for the
“acceptance/rejection” dimension and the “psychological control” dimension, respectively.
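As a reading aid for the αi values in Tables 8 and 9, the boundary (cumulative) response function of a two-parameter graded response model can be written, following Samejima (1969), as

P^{*}_{ik}(\theta_j) = \frac{\exp[\alpha_i(\theta_j - \beta_{ik})]}{1 + \exp[\alpha_i(\theta_j - \beta_{ik})]}, \qquad P_{ik}(\theta_j) = P^{*}_{ik}(\theta_j) - P^{*}_{i,k+1}(\theta_j),

where P*_{ik} is the probability that respondent j endorses item i in category k or above, α_i is the tabled discrimination parameter, and β_{ik} are category threshold parameters. This is a standard formulation offered only for orientation; the threshold estimates themselves are not reported in these tables.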

Table 9
Mother Dataset: Discrimination Parameter Estimates for the 2PL, Combined 2PL, and MIRT models with 61 Items
                2PL                       AR + PC 2PL              MIRT AR Factor         MIRT PC Factor
Item    αi (SE)        p      F     αi (SE)        p        θjAR (SE)       p       θjPC (SE)       p
1     -1.20 (.25)   .00   AR    1.24 (.26)   .00     .53 (.08)   .00    -.07 (.10)   .47
2       .94 (.19)   .00   AR    -.89 (.20)   .00    -.40 (.08)   .00     .18 (.08)   .03
7      1.58 (.28)   .00   AR   -1.61 (.30)   .00    -.59 (.08)   .00     .13 (.10)   .23
10      .39 (.17)   .02   PC     .85 (.16)   .00     .04 (.09)   .65     .50 (.07)   .00
11     1.04 (.22)   .00   PC    1.65 (.26)   .00    -.21 (.10)   .04     .60 (.07)   .00
13     2.20 (.41)   .00   AR   -2.91 (.48)   .00    -.87 (.04)   .00    -.08 (.08)   .32
14    -1.54 (.24)   .00   AR    1.63 (.23)   .00     .67 (.05)   .00    -.09 (.08)   .24
15     -.83 (.21)   .00   AR    1.23 (.21)   .00     .62 (.06)   .00     .25 (.07)   .00
17    -2.20 (.39)   .00   AR    2.31 (.45)   .00     .73 (.07)   .00    -.13 (.10)   .17
18     -.20 (.17)   .26   AR     .55 (.15)   .00     .44 (.09)   .00     .50 (.06)   .00
23     1.08 (.22)   .00   AR   -1.12 (.22)   .00    -.51 (.07)   .00     .08 (.10)   .43
25      .73 (.18)   .00   PC    1.11 (.19)   .00    -.09 (.11)   .42     .52 (.07)   .00
28      .85 (.23)   .00   PC     .98 (.23)   .00    -.24 (.09)   .01     .38 (.08)   .00
31    -1.98 (.39)   .00   AR    2.91 (.47)   .00     .88 (.04)   .00     .14 (.09)   .11
33      .39 (.17)   .02   PC    1.25 (.21)   .00     .16 (.10)   .10     .71 (.05)   .00
35    -1.01 (.19)   .00   AR     .92 (.20)   .00     .45 (.08)   .00    -.14 (.09)   .11
36      .23 (.18)   .21   PC     .99 (.19)   .00     .23 (.10)   .02     .66 (.06)   .00
39     1.23 (.21)   .00   PC    1.21 (.20)   .00    -.36 (.10)   .00     .44 (.08)   .00
40    -1.02 (.18)   .00   AR     .95 (.19)   .00     .42 (.07)   .00    -.17 (.08)   .04
41      .96 (.24)   .00   PC    1.31 (.24)   .00    -.18 (.11)   .09     .54 (.07)   .00
43    -1.48 (.24)   .00   AR    1.65 (.25)   .00     .65 (.06)   .00    -.10 (.09)   .23
44     1.01 (.19)   .00   PC    1.29 (.21)   .00    -.28 (.09)   .00     .47 (.08)   .00
45      .02 (.17)   .90   PC     .69 (.15)   .00     .33 (.09)   .00     .53 (.06)   .00
49     1.20 (.28)   .00   AR    -.89 (.28)   .00    -.40 (.11)   .00     .36 (.11)   .00
50     -.81 (.17)   .00   AR     .86 (.17)   .00     .45 (.07)   .00     .02 (.09)   .81
51     -.58 (.22)   .01   AR    1.06 (.19)   .00     .66 (.08)   .00     .52 (.09)   .00
52     1.67 (.31)   .00   PC    1.52 (.29)   .00    -.40 (.11)   .00     .51 (.10)   .00
53    -1.13 (.21)   .00   AR    1.22 (.21)   .00     .57 (.06)   .00    -.01 (.07)   .85
54      .69 (.25)   .01   PC    1.34 (.29)   .00    -.03 (.11)   .79     .61 (.07)   .00
56     1.00 (.20)   .00   PC    1.47 (.26)   .00    -.16 (.10)   .10     .58 (.08)   .00
57    -1.39 (.23)   .00   AR    1.54 (.25)   .00     .66 (.05)   .00    -.01 (.06)   .94
59    -2.02 (.37)   .00   AR    2.30 (.42)   .00     .75 (.05)   .00     .00 (.04)   .97
60     -.63 (.21)   .00   AR    1.02 (.20)   .00     .62 (.07)   .00     .36 (.08)   .00
61     1.13 (.21)   .00   PC     .96 (.18)   .00    -.38 (.09)   .00     .36 (.08)   .00
63    -1.24 (.29)   .00   AR    1.35 (.30)   .00     .60 (.08)   .00     .02 (.09)   .85
69    -1.35 (.22)   .00   AR    1.32 (.22)   .00     .54 (.07)   .00    -.18 (.09)   .04
70     1.17 (.19)   .00   AR   -2.91 (.59)   .00    -.39 (.09)   .00     .39 (.08)   .00
71     2.34 (.53)   .00   PC    1.08 (.20)   .00    -.82 (.05)   .00    -.02 (.11)   .87
73      .36 (.17)   .03   PC     .88 (.17)   .00     .16 (.09)   .09     .55 (.06)   .00
76    -1.05 (.29)   .00   AR    1.41 (.28)   .00     .68 (.06)   .00     .17 (.09)   .06
78      .51 (.17)   .00   PC     .98 (.20)   .00     .00 (.05)   .94     .53 (.06)   .00
79      .68 (.24)   .00   PC    1.54 (.26)   .00     .02 (.11)   .86     .68 (.07)   .00
81      .44 (.16)   .01   PC     .91 (.17)   .00     .07 (.10)   .48     .54 (.06)   .00
82     2.29 (.45)   .00   AR   -1.76 (.42)   .00    -.59 (.09)   .00     .38 (.10)   .00
83    -1.06 (.26)   .00   AR    1.24 (.26)   .00     .63 (.06)   .00     .08 (.09)   .37
84      .00 (.15)  1.00   AR     .31 (.15)   .04     .33 (.09)   .00     .51 (.06)   .00
85     1.69 (.35)   .00   AR   -1.56 (.32)   .00    -.60 (.08)   .00     .20 (.11)   .08
86    -2.88 (.65)   .00   AR    3.23 (.74)   .00     .83 (.07)   .00    -.06 (.11)   .57
87     -.79 (.18)   .00   AR     .72 (.18)   .00     .32 (.09)   .00    -.23 (.08)   .00
88     1.14 (.41)   .01   PC    1.29 (.49)   .01    -.31 (.12)   .01     .46 (.12)   .00
89      .74 (.17)   .00   PC    1.04 (.20)   .00    -.11 (.10)   .30     .45 (.08)   .00
91      .81 (.25)   .00   PC    1.18 (.27)   .00    -.13 (.12)   .27     .56 (.08)   .00
92     1.80 (.33)   .00   PC    1.62 (.35)   .00    -.46 (.11)   .00     .49 (.10)   .00
93    -1.03 (.19)   .00   AR    1.16 (.21)   .00     .54 (.06)   .00     .00 (.07)   .97
94     1.49 (.29)   .00   AR   -1.22 (.28)   .00    -.49 (.09)   .00     .31 (.09)   .00
96    -1.76 (.42)   .00   AR    1.62 (.41)   .00     .59 (.09)   .00    -.22 (.11)   .04
99     -.75 (.27)   .01   AR    1.08 (.29)   .00     .57 (.08)   .00     .32 (.10)   .00
100    1.86 (.29)   .00   AR   -1.53 (.27)   .00    -.57 (.08)   .00     .36 (.09)   .00
103     .64 (.21)   .00   PC    1.06 (.25)   .00    -.05 (.11)   .67     .53 (.08)   .00
106   -1.36 (.29)   .00   AR    1.56 (.31)   .00     .70 (.06)   .00     .05 (.08)   .55
107    1.18 (.24)   .00   PC    2.04 (.35)   .00    -.20 (.10)   .04     .65 (.07)   .00
Note. 2PL = two-parameter graded response model with 61 items; AR + PC 2PL = two separate
two-parameter graded response models combined, as indicated by AR = “acceptance/rejection”
model (36 items) and PC = “psychological control” model (25 items). MIRT = exploratory
multidimensional Rasch model for two factors. AR/PC Factor = EFA factors/dimensions. αi =
item discrimination parameter estimate. θjAR, θjPC = item latent trait parameter estimates for the
“acceptance/rejection” dimension and the “psychological control” dimension, respectively.

Table 10
Item Separation Coefficient and Index (Reliability)
                    Separation Coefficient     Separation Index
Child
2PL 0.91 .45
AR + PC 2PL .89 .44
MIRT AR .99 .50
MIRT PC .99 .50

Mother
2PL .89 .44
AR + PC 2PL .88 .44
MIRT AR .99 .50
MIRT PC .99 .49
Note. 2PL = two-parameter graded response model with 61 items; AR + PC 2PL = two separate
two-parameter graded response models combined, as indicated by AR = “acceptance/rejection”
model (36 items) and PC = “psychological control” model (25 items). MIRT = exploratory
multidimensional Rasch model for two factors. Only significant discrimination parameter
standard errors were included in the calculation. Larger values indicate better fit and reliability.
Separation coefficient = ratio of observed to estimated variance. Separation index = item
parameter estimate reliability.
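As an interpretive note (an assumption based on common Rasch reporting conventions rather than a definition given in the table), the two columns of Table 10 are consistent with

\text{Index} = \frac{G^{2}}{1 + G^{2}},

where G is the separation coefficient. For example, the child 2PL row gives 0.91^2 / (1 + 0.91^2) ≈ .45, and the MIRT rows give 0.99^2 / (1 + 0.99^2) ≈ .50, matching the tabled values.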

Table 11
Correlation Matrix of Item Discrimination Parameter Estimates between Models
1 2 3 4 5 6 7
Child
1. 2PL -
2. AR + PC 2PL   .64**   -
3. MIRT AR .97** .65** -
4. MIRT PC -.83** -.35** -.71** -
Mother
5. 2PL -.90** -.55** -.85** .82** -
6. AR + PC 2PL   .53**   .85**   .56**   -.27*   -.60**   -
7. MIRT AR .90** .58** .89** -.75** -.96** .64** -
8. MIRT PC -.48** -.15 -.34** .78** .58** -.03 -.40**
Note. *p < .05. **p < .01. 2PL = two-parameter graded response model with 61 items; AR + PC 2PL =
two separate two-parameter graded response models combined, as indicated by AR =
“acceptance/rejection” model (36 items) and PC = “psychological control” model (25 items).
MIRT AR = exploratory multidimensional Rasch model AR dimension. MIRT PC = exploratory
multidimensional Rasch model PC dimension.

Table 12
Correlation Matrix of Person Scores
1 2 3
Child
1. 2PL -
2. MIRT 0.94 -
Mother
3. 2PL -0.40 -0.32 -
4. MIRT 0.36 0.31 -0.93
Note. *p < .05. **p < .01. Sign not meaningful. 2PL = two-parameter graded response model
with 61 items. MIRT = exploratory multidimensional Rasch model for two factors.

Table 13
Chi-Square Test of Difference in Model Fit

Model Comparison           χ²         df     p
Metric v. Configural       275.70     58     <.001
Scalar v. Configural       1452.97    176    <.001
Scalar v. Metric           1194.09    118    <.001
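
The p values in Table 13 can be checked against the tabled statistics with a generic chi-square tail computation. The brief Python sketch below is illustrative only; the invariance models themselves were estimated in Mplus, and the WLSMV-adjusted difference testing used there is not reproduced here.

from scipy.stats import chi2

# Chi-square difference statistics and degrees of freedom from Table 13.
comparisons = {
    "Metric v. Configural": (275.70, 58),
    "Scalar v. Configural": (1452.97, 176),
    "Scalar v. Metric": (1194.09, 118),
}

for label, (stat, df) in comparisons.items():
    p = chi2.sf(stat, df)  # upper-tail (survival) probability
    print(f"{label}: chi2({df}) = {stat:.2f}, p = {p:.2e}")

Each of the resulting probabilities falls well below .001, consistent with the table.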

Figures

Step 1
Inspect raw data.
Shorten to 74 items.

Step 2
Principal Components Analysis to explore the dimensionality of the 74-item CRPBI.
Shorten to 61 items.

Step 3
Interpret and compare parameters and model fit between the unidimensional and multidimensional models.
Choose the multidimensional model.

Step 4
GDIF
The CRPBI is not group invariant across children and mothers.

Step 5
Propose a measurement model for the CRPBI.
These CRPBI data should be interpreted with a multidimensional IRT model.

Figure 1. Sequence of analyses.
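
Step 2 of Figure 1 turns on comparing the observed eigenvalues with eigenvalues expected from random data (the Parallel Analysis columns of Tables 4 and 5). The sketch below illustrates that comparison in Python with NumPy; it assumes a complete numeric item matrix named responses and shows Horn's parallel analysis in general form, not the Parallel Analysis Engine (Patil et al., 2007) used in this study.

import numpy as np

def parallel_analysis(responses, n_sims=100, seed=0):
    """Retain components whose observed eigenvalue exceeds the mean
    eigenvalue from random normal data of the same dimensions."""
    rng = np.random.default_rng(seed)
    n, p = responses.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(responses, rowvar=False)))[::-1]
    random_mean = np.zeros(p)
    for _ in range(n_sims):
        sim = rng.standard_normal((n, p))
        random_mean += np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    random_mean /= n_sims
    n_retain = int(np.sum(observed > random_mean))
    return n_retain, observed, random_mean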



Figure 2. Scree plot for the child dataset (74 items).



Figure 3. Scree plot for the mother dataset (74 items).



Figure 4. Path diagram for the unidimensional measurement model, where θj is the latent trait; xi
are data points or items; and parenting is the measurement outcome.

Figure 5. Path diagram for the two separate unidimensional measurement models, where θAR is
the latent trait acceptance/rejection and θPC is the latent trait psychological control; xi are data
points or items; and parenting is the measurement outcome.

Figure 6. Path diagram for the multidimensional measurement model, where θAR is the latent trait
acceptance/rejection and θPC is the latent trait psychological control, both contributing to all
items; xi are data points or items; and parenting is the measurement outcome.

Figure 7. Path diagram for the multigroup, multidimensional confirmatory measurement model,
where θAR is the latent trait acceptance/rejection and θPC is the latent trait psychological control;
xic are child data points or items and xim are mother data points or items; and parenting is the
measurement outcome.
