Presented to
Dr. Dennis O. Dumrique
Polytechnic University of the Philippines
In Partial Fulfillment
of the Requirements for the Course
Curriculum Theories, Principles, and
Instructional Designs
By
Dinnes A. Masubay
October 2016
RESEARCH PROBLEMS
The main purpose of this study was to examine the utility of word-problem
CBMs in predicting math outcomes for third-grade students. In particular, the
researchers (Swanson et al., 2014) wanted to investigate the extent to which
word-problem CBMs were predictive of high-stakes criterion measures of math
word-problem solving. Therefore, the California STAR (California’s Standardized
Testing and Reporting measure) math test (California Department of Education,
2009) and two norm-referenced tests (the word-problem subtests from the Key Math
and Comprehensive Math Abilities Test) were used as criterion measures. Third-grade
students were selected because word problems become a major point of emphasis
in third-grade math curricula.
RESEARCH QUESTIONS
Students had 2 min to work on the CBM version they were given. The 2-min
time limit was selected because it was assumed to mirror the format of timed
tests in high-stakes assessments as well as norm-referenced measures. To help
control for possible order effects, one of three presentation orders was
randomly assigned to each classroom. Although a comparison of the three
presentation orders showed no significant order effect favoring any one
presentation order, presentation order was included as a covariate in the
subsequent analyses.
While the CBM ROI across the three time points emerged as a significant
predictor of performance on both criterion measures, there were no statistically
significant differences between the slopes of at-risk students and students who
were not at risk. We have two explanations for these findings.
First, the standard deviations for the slope estimates of the respective groups
were rather large in comparison with the means (SDs = 2.00–3.60), indicating
substantial variance associated with these slope estimates. This outcome is likely
because only three data points were used to estimate slope in this study. Christ
(2006) demonstrated that standard errors of the estimate (SEEs) decreased
continuously as more data points were included in calculating rates of
improvement; therefore, with more sessions, differences between risk groups
may have emerged. Second, our ability to accurately measure growth may be a
function of our methodology. In this study, we elected to use the total number of
correct responses as the outcome variable. Others (e.g., Foegen et al., 2008) have
used alternative scoring methodologies that reward students for accurately
completing parts of the word-problem process (i.e., identifying correct numbers,
choosing the correct algorithm, etc.). Rewarding students for properly executing
parts of the word-problem process would likely increase the range of possible scores
and be more sensitive to growth. Future research in this area should examine these
possibilities. Despite these issues with measurement error, however, it is important
to note that the CBM slope emerged as a significant predictor of the criterion
measures. This is an important finding, as it demonstrates not only that we can
measure growth in word-problem skills in an efficient manner but also that growth
rates can provide valuable information about end-of-year success in math.
Thus, as teachers shift toward incorporating more activities related to math
word-problem solving in their lesson plans, word-problem CBMs hold promise as a
means of evaluating the effectiveness of that instruction. We recommend that, in
addition to CBM scores, teachers use ROI to gain more information when assessing
their students’ progress.
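The ROI discussed above is simply the slope of a line fit to a student's repeated CBM scores. A minimal sketch of that calculation, using hypothetical scores and measurement weeks rather than data from the study:

```python
# Estimating a rate of improvement (ROI) from three CBM data points
# with ordinary least squares. All numbers are hypothetical.
import numpy as np

weeks = np.array([0.0, 4.0, 8.0])    # three measurement occasions (weeks)
scores = np.array([3.0, 5.0, 8.0])   # correct responses on each 2-min probe

# np.polyfit with degree 1 fits a straight line, returning (slope, intercept)
slope, intercept = np.polyfit(weeks, scores, 1)
print(round(slope, 3))  # → 0.625 items gained per week
```

With only three points, a single unusual score can shift the slope substantially, which is consistent with Christ's (2006) finding that SEEs shrink as sessions are added.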
The results below are organized by the research questions that guided the study.
Students were grouped by ability status. Those with scores below the 25th
percentile on the California STAR or word-problem composite (CMAT and Key Math)
criterion measures were considered at risk. A one-way ANOVA was computed to
compare the rates of improvement (ROI) of at-risk students with the ROI of
students who were not at risk for both criterion measures. (All assumptions of
ANOVA were tested and met.) No significant differences in ROI emerged
between the two groups as a function of risk status on the STAR, F(1, 141) =
1.16, p > .05, or the problem-solving composite, F(1, 141) = 0.83, p > .05. For the total
sample, the mean ROI was 0.54 items per week, with a standard deviation of 1.44.
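A one-way ANOVA of this kind can be sketched as follows; the group ROIs below are simulated for illustration and are not the study's data:

```python
# One-way ANOVA comparing the ROIs of two groups, analogous to the
# at-risk vs. not-at-risk comparison above. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
roi_at_risk = rng.normal(loc=0.5, scale=1.4, size=40)       # hypothetical
roi_not_at_risk = rng.normal(loc=0.6, scale=1.4, size=103)  # hypothetical

f_stat, p_value = stats.f_oneway(roi_at_risk, roi_not_at_risk)
print(f"F(1, {40 + 103 - 2}) = {f_stat:.2f}, p = {p_value:.3f}")
```

A p value above .05 would indicate no detectable difference in growth between the groups, which is the pattern the study reported.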
Results from the previous hierarchical regression model indicated that CBM
level and CBM slope were the only significant predictors of the STAR measure, while
calculation, CBM level, and CBM slope were all significant predictors of the word
problem composite. Therefore, these variables were included in the logistic
regression models, and the non-significant predictors from the previous analysis
(i.e., estimation, reading comprehension) were excluded.
The models specified for the STAR measure and word problem composite
produced AUCs of .80 and .83, respectively. This means that when students were
identified as members of the not-at-risk group as a function of their initial CBM level
and slope, they yielded scores that were greater than students who were identified
as at risk 80% of the time for the STAR measure, and 83% of the time for the word
problem composite. The AUC indicates the extent to which the model is able to
discriminate between members of the at-risk group and not-at-risk group. The
observed values of .80 and .83 indicate that the models are discriminating at levels
greater than chance (e.g., Pepe, Longton, & Janes, 2009).
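The logistic-regression-plus-AUC analysis described above can be sketched with scikit-learn; the CBM predictors and risk labels below are simulated for illustration, not the study's data:

```python
# Logistic regression predicting risk status from CBM level and slope,
# evaluated with AUC. All data are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 143
cbm_level = rng.normal(5.0, 2.0, n)    # initial CBM score (hypothetical)
cbm_slope = rng.normal(0.54, 1.44, n)  # ROI, items per week (hypothetical)

# Simulate risk status so that lower level and slope raise the risk odds
logit = 4.0 - 0.8 * cbm_level - 0.5 * cbm_slope
at_risk = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([cbm_level, cbm_slope])
model = LogisticRegression().fit(X, at_risk)
auc = roc_auc_score(at_risk, model.predict_proba(X)[:, 1])
print(f"AUC = {auc:.2f}")
```

An AUC of .5 indicates chance-level discrimination; values approaching 1 indicate that students classified as not at risk almost always outscore those classified as at risk, which is how the observed .80 and .83 should be read.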
MAJOR FINDINGS
Large-scale reviews of the math literature (e.g., Gersten, Chard, et al., 2009;
NMAP, 2008) have highlighted the need to utilize formative assessment practices in
schools to improve math education. Formative assessment practices have been
most effective when teachers use performance assessments to evaluate specific
academic skills, and subsequently use those data to make instructional changes;
effects are strengthened further when guidance is given to teachers on using
assessment data to make instructional changes (Gersten, Chard, et al., 2009). Word-
problem CBMs hold the promise of assisting in this process because they can
provide low-inference information to teachers on students’ word-problem skills and
be used in a repeated fashion. This study examined the extent to which word-
problem CBMs predict math achievement, specifically criterion-referenced
measures of word-problem solving and the California STAR standardized test,
beyond that of
traditional measures such as calculation, estimation, and reading abilities. This
investigation found that word-problem CBM accounted for approximately one
quarter of the variance in the STAR test and over one half of the variance in the
word-problem composite measure. Predictive correlations with the STAR math test
and word-problem composite ranged from .46 to .62 (similar to correlations reported
in other studies investigating word-problem tasks; for example, Fuchs et al., 2011;
Fuchs et al., 2012; Jitendra et al., 2005), providing early evidence of predictive
validity for word-problem CBM. The researchers acknowledged that this is an initial
step in the validation process and that further research will be necessary to provide
more empirical support to the psychometric properties of word-problem CBMs.
SUGGESTIONS
In this study, only word-problem accuracy was assessed by the CBM
measure. The addition of other problem types, such as calculation (e.g., addition,
subtraction) may be more representative of what is taught in schools and on high-
stakes tests. However, because of the limited research on word-problem CBM
measures, one of the purposes of this study was to try to determine whether word-
problem CBM predicted word-problem performance and the California STAR high-
stakes test beyond that of calculation skills.
The 2-min time limit that students had to complete the CBM measure is
questionable. This limit may have caused the measure to assess other areas that
are related to word-problem solving, including reading fluency, processing speed,
and working memory (e.g., Andersson, 2007; Swanson & Beebe-Frankenberger, 2004;
Vilenius-Tuohimaa, Aunola, & Nurmi, 2008) rather than word-problem-solving
ability. That is, students who have better
reading comprehension, faster phonological processing speed, and/or more working
memory capacity may be able to answer each question more quickly and answer
more questions. However, one advantage of the time limit is that it can be easily
implemented by teachers because it takes little time to administer. Despite the
limitations of the 2-min administration time limit addressed in the previous section,
findings from the current study suggest that 8- to 10-min samples of word-problem
performance may not be necessary for the purposes of screening. The word-problem CBM was
able to distinguish between students who were at risk for word-problem difficulties
and students with relatively low risk of word-problem difficulties, all within a
2-min time frame. Future research in this area might examine how much time is
necessary to obtain an adequate sample of word-problem skills for the purposes of
screening and progress monitoring. Measures that require less administration time
also take less instructional time away from teachers; they are therefore more likely
to be accepted by teachers.