Maggin Et Al 2015 A Systematic Evidence Review of The Check in Check Out Program For Reducing Student Challenging

Literature Review

A Systematic Evidence Review of the Check-In/Check-Out Program for Reducing Student Challenging Behaviors

Daniel M. Maggin, PhD1, Jamie Zurheide, MA1, Kayci C. Pickett, MA1, and Sara J. Baillie, MA1

Journal of Positive Behavior Interventions
2015, Vol. 17(4) 197–208
© Hammill Institute on Disabilities 2015
Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/1098300715573630
jpbi.sagepub.com

Abstract
Three-tiered models of prevention are increasingly being adopted by schools to address the behavioral needs of students.
Secondary interventions, a critical component of multitiered systems, are used with students in need of behavioral support
but who are not candidates for individualized interventions. Despite the importance of secondary interventions, questions
remain regarding which approaches have sufficient empirical support to warrant their use. The purpose of this review
was, therefore, to examine the research underlying the Check-In/Check-Out (CICO) program, a widely used secondary
intervention, to determine the strengths, limitations, and generality of the accumulated research. The What Works
Clearinghouse (WWC) procedures for evaluating single-case and group-based research were applied with results indicating
mixed support for the program. Specifically, there were a sufficient number of single-case research studies to deem
the CICO program as evidence-based, while the group-based research had no demonstrated effects. These findings are
discussed in terms of future research on the CICO program and the broader implications for selecting and implementing
secondary interventions in school settings.
Keywords
check-in/check-out, evidence-based practice, positive behavioral interventions and supports, secondary interventions,
systematic review
1University of Illinois at Chicago, USA
Corresponding Author: Daniel M. Maggin, University of Illinois at Chicago, 1040 W Harrison St., Chicago, IL 60607, USA.
Email: dan.m.maggin@gmail.com
Action Editor: Lee Kern

The three-tiered model of prevention has become increasingly prevalent in schools to address the behavioral needs of
students. These systems emphasize the use of data-based decision-making procedures to increase the efficiency of
identifying students in need of additional behavioral supports beyond those provided to all students (Sugai & Horner,
2009). The stratification of behavioral supports into distinct tiers allows for school personnel to allocate resources more
readily to those students with greater behavioral needs. As such, intermediary interventions are a critical element of the
three-tiered model because these methods are used to ensure students who are non-responsive to primary prevention
strategies are efficiently provided additional supports and monitoring (Ervin, Schaughency, Matthews, Goodman,
& McGlinchey, 2007). Ideally, these secondary interventions consist of a standard set of procedures and are continuously
available within a school to provide eligible students with prompt access to needed supports (Scott, Alter, Rosenberg,
& Borgmeier, 2010). Moreover, secondary interventions assist school personnel in distinguishing between students in
need of moderate levels of support and those who require more intensive, individualized behavior plans. As such, the
adoption and successful implementation of secondary interventions has important implications for the overall strength
of multitiered systems of behavioral support and prevention.
The importance of secondary interventions to the overall functioning of these three-tiered systems of behavioral support
leads to important questions regarding which approaches should be adopted. Broadly speaking, secondary interventions
represent a diverse set of procedures for addressing the variety of behavioral challenges demonstrated by students
(Hawken, Adolphson, MacLeod, & Schumann, 2009). As such, school personnel charged with intervention selection
must weigh several factors, including the behavior or skill being targeted and the resources available, to ensure the
adopted strategy can be successfully implemented (Mitchell, Stormont, & Gage, 2011). Moreover, school personnel must
be cognizant of federal initiatives that emphasize the use of
practices with sufficient empirical support (Individuals With Disabilities Education Act [IDEA], 2004; No Child Left
Behind [NCLB], 2002). It is therefore essential for practitioners to consider the extent to which selected practices are
supported through rigorous and appropriate research methods. The present review provides an evaluation of the
research underlying check-in/check-out, a widely used secondary intervention, to assist with describing the strengths,
limitations, and generality of the accumulated research for this program.

Check-In/Check-Out
The Check-In/Check-Out (CICO) program has emerged as a model secondary intervention for students who do not
respond to universal, preventive methods (Crone, Hawken, & Horner, 2010). Based on the principles of contingency
management, the premise of CICO is to provide students with more frequent and structured access to positive
consequences contingent on the demonstration of appropriate behavior. The procedures of CICO have been outlined in a
program manual (Crone et al., 2010) and include five core components: (a) the morning check-in, during
which the target student meets with a member of the school faculty and receives positive, non-contingent attention and
encouragement to meet the specified behavioral expectations; (b) the daily behavior point card, which is given to the
student during the morning check-in and provides school personnel with a means for monitoring the extent to which
students are meeting the behavioral expectations; (c) structured teacher feedback, which is provided to students
throughout the school day at regularly scheduled intervals and is delivered through both verbal interaction and point
card ratings; (d) the afternoon check-out, during which the student’s point card is reviewed to determine the percentage
of points earned, with a reward such as verbal praise or a small tangible item delivered contingent on whether the
student met their goal; and (e) a home–school collaboration component in which the student carries their point card
home, which is then signed by the parent or guardian. The point card system, therefore, provides school personnel
with a structured method for giving behavioral feedback to students and also assists school personnel in collecting data
to evaluate the effectiveness of the intervention. Moreover, there have been numerous reports investigating the
program’s effectiveness, which have contributed to its widespread use and adoption (e.g., Fairbanks, Sugai, Guardino,
& Lathrop, 2007; Filter et al., 2007; McIntosh, Campbell, Carter, & Dickey, 2009; Simonsen, Myers, & Briere, 2011).
Given the burgeoning research on CICO, there is a need to evaluate the evidence underlying the program to determine
whether and under what conditions the practice can be considered effective.
Three issues have emerged within the CICO research with each having important implications for evaluating the
evidence base. First, researchers investigating the effects of CICO have questioned whether the student’s behavioral
function serves to moderate the effects of the program (e.g., Hawken, MacLeod, & Rawlings, 2007). There seems to be
consensus that CICO is most appropriate for students seeking adult or peer attention (Campbell & Anderson, 2011),
although there has yet to be a formal analysis across studies to assess the validity of these claims. Second, there is a
need to evaluate the consistency with which the procedures and components of the program have been implemented
across studies. Variation in the implementation of program components or low levels of implementation fidelity would
raise questions regarding the comparability of the research. This issue is particularly important given recent CICO
research investigating the effects of adapting, modifying, or supplementing the core procedures and components of the
program (e.g., Swoszowski, Jolivette, Fredrick, & Heflin, 2012; Turtura, Anderson, & Boyd, 2014). Third, the CICO
program is unique, in that it has been researched using both single-case and group-based experimental methods.
Conducting parallel reviews of the research from these methodological paradigms can be challenging because each is
based on a unique set of principles and procedures (Gast, 2010). Given the inherent methodological differences, it is
necessary to interpret the results within the context of both the review findings and the research paradigms used. These
aspects of the CICO research base will be addressed within the context of the current review.

Review Purpose
The purpose of this review was to evaluate the research underlying the CICO program to determine its strengths,
limitations, and generality. The following research questions have been developed to guide the present review:

Research Question 1: Does the single-case and group-based experimental research provide sufficient evidence
that the CICO program is effective to warrant classification as an evidence-based practice?
Research Question 2: What student and setting characteristics have been associated with successful
demonstrations of the CICO program?
Research Question 3: Have the core procedural components of the CICO program been used consistently and
with fidelity across research studies?

Method

Study Identification Procedures
The studies for the present review were identified using (a) an electronic database search of seven widely used
educational and social sciences databases including Dissertation Abstracts, Educational Resources Information
Center (ERIC), ProQuest, PsycINFO, PubMed, Scopus,
and Sociological Abstracts using the two names associated with the program: “Check-In Check-Out” and “Behavior
Education Program”; (b) an ancestral search of the reference lists of eligible studies; (c) a review of the citation
lists of relevant review articles and the program manual; and (d) a review of the citation list by the program
developers to ensure all relevant citations were identified. These search procedures led to the identification of 30
research reports deemed potentially eligible for review.

Eligibility Criteria
The 30 research reports identified through the initial search were reviewed using the following five inclusion criteria: (a)
the CICO program was used regardless of whether it was implemented with another intervention or adapted to broaden
its potential reach, (b) the study utilized an experimental or quasi-experimental research design based on either group-
or single-case research methods, (c) the study was conducted with students enrolled in kindergarten through 12th grade,
(d) the study was conducted in a school setting, and (e) the outcome variable was related to problematic student
behavior or academic engagement. Application of these criteria led to the removal of 8 reports that were not conducted
using experimental methods. The remaining 22 studies were then reviewed for the present report.

Study Coding Procedures
The present review contained both single-case and group-based research studies. Because these research paradigms
have distinct conceptual underpinnings and methodological features, the studies from these methodologies were
reviewed and summarized separately. All coding was completed by the secondary authors with discrepancies reviewed
by the lead author who made a final determination on the appropriate code. The procedures used to review the
eligible studies for each methodological framework were based on recommendations from the What Works Clearinghouse
(WWC). These procedures use a bifurcated process, which includes (a) an initial evaluation of the methodological
rigor for eligible studies and (b) a subsequent assessment of the strength of observed intervention effects for those
studies deemed sufficiently rigorous. The classification as an evidence-based practice was then made based on the
number of studies meeting design standards and providing evidence. For the present review, the studies providing
evidence of effectiveness were examined further to evaluate the conditions and contexts under which CICO was found
to be effective. The specific procedures and analyses for each methodological framework are described in the
following sections.

Single-case research coding procedures and analyses. The 22 eligible studies included a total of 17 single-case research
reports with 79 student cases. Each student case was evaluated on design and evidence standards independently. The
total number of studies providing evidence was then used as the basis for evaluating the overall strength of evidence per
the WWC guidelines. The specific procedures for evaluating the single-case research are described in the following
sections.

Application of single-case design standards. The WWC single-case design standards consist of six criteria
(Kratochwill et al., 2010). The initial five design criteria were scored on a dichotomous scale (i.e., present or not
present). These include the following: (a) The independent variable was systematically manipulated with the researcher
determining when and how the conditions changed, (b) each dependent variable was measured repeatedly over time,
(c) a measure of interobserver agreement was reported for each dependent variable for no fewer than 20% of sessions,
(d) the reported level of interobserver agreement was equal to or greater than 80% for percentage agreement indices,
and (e) the case had at least three attempts to demonstrate an intervention effect. Failure to meet any of these criteria
led to the classification of the case as not meeting design standards. Those cases that met these five criteria were then
subjected to the sixth design standard, which relates to the number of data points per phase. Consistent with the
recommendations of the WWC, this criterion is scored on a three-point scale and provides the context for distinguishing
between cases that meet standards with and without reservations. Specifically, cases are determined to meet design
standards if there are five or more data points per phase, meet design standards with reservations if there are three or
four data points per phase, and not meet design standards if there are fewer than three data points per phase.

Application of single-case evidence standards. The student cases within each study meeting design standards with or
without reservations were then visually analyzed to evaluate the strength of evidence. For the present study, visual
analyses were conducted using a protocol that prompted coders to consider the range of data features used to assess
single-case research. Specifically, visual analysts examined the within- and between-phase data patterns to make
judgments regarding intervention effects. Within-phase analyses were used to assess the level, trend, and variability of
the data within each phase to determine whether there was a sufficient basis to make predictions regarding future
performance. Between-phase analyses were used to determine whether there were discriminable differences across
phases and whether they reflected behavioral improvement. Finally, visual analysts were prompted to consider
the ratio of phase contrasts with and without effects. Per the WWC evidence standards, a ratio greater than 3:1
indicates that the case provides evidence without reservations, a ratio of 3:1 indicates that the case provides evidence
with reservations, and a ratio of less than 3:1 indicates that the case provides no evidence. The visual analysts
incorporated the information pertaining to the number of data points, the within- and between-phase data patterns, and
the ratio of effects to non-effects to determine the overall strength of the visual evidence. The evidence standards,
therefore, are based on the traditional processes of visual analysis, such as the examination and comparison of within-
and between-phase data, and on specific criteria outlined in the WWC guidelines, such as the number of data points and
ratio of effects to non-effects. Each student case was then classified as providing evidence, evidence with reservations,
or no evidence. The WWC considers an intervention to be evidence-based if there are at least 20 student cases
providing evidence with or without reservations from five different studies conducted by three independent research
teams. The strength of these results was further examined through computing a ratio of those cases meeting design
standards that either provided or did not provide evidence.

Quantitative evaluation of single-case research. The visual analyses were supplemented with quantitative methods
to provide a more transparent evaluation of intervention effects. It is important to note that there remains a lack
of consensus regarding the most appropriate statistical methods to evaluate single-case research data (Shadish,
Rindskopf, & Hedges, 2008). As such, visual analyses were used as the sole basis for classifying studies into
evidence categories and the statistical analyses were applied only for supplemental purposes. Moreover, the unsettled
nature of statistical analyses of single-case data led to the use of four metrics. The four metrics were applied to
data extracted using the Engauge software program (M. Mitchell, 2007). The four metrics included (a) the
non-overlap of all pairs (NAP), which provides a summary of data overlap between all the data points within the phases
being compared, with a value of 65% or below associated with a weak effect, between 66% and 92% associated
with a moderate effect, and 93% and above considered a strong effect (Parker & Vannest, 2009); (b) the
improvement rate difference (IRD), which also provides a summary of data overlap between phases with values of 70%
or above associated with a large effect, between 60% and 70% considered moderate, and below 60%
associated with small effects; (c) the Busk and Serlin (1992) no assumptions standardized mean difference (SMD) metric,
which provides a standardized measure of the difference between baseline and intervention data with values greater
than 2.0 considered clinically significant; and (d) the raw data multilevel analysis (RDMA) procedures described by
Van den Noortgate and Onghena (2008), which also provides a standardized estimate of the difference between
baseline and intervention phases and is assessed with the same interpretative guidelines used for the SMD. Overall
estimates for each index were computed using multilevel modeling procedures, which accounted for variability at
the student and study levels.

Group-based research coding procedures and analyses. The 22 eligible studies included a total of five group-based
research reports. Consistent with the review of single-case research, the coding of methodological quality was followed
by an evaluation of the strength of evidence for those studies meeting design standards with and without reservations.
The evaluation of methodological quality and estimation of effect size metrics to assess the strength of findings are
described in the following sections.

Application of group-based design standards. The WWC group-based design standards consist of five criteria (What
Works Clearinghouse, 2013). These criteria include (a) the random assignment of individuals to intervention and
control, (b) the reporting of overall and differential rates of attrition that are within acceptable limits, (c)
the demonstration that students remaining and leaving the study are comparable on conceptually relevant variables,
(d) the demonstration that students in treatment and control conditions are comparable on conceptually relevant
variables, and (e) the use of outcome variables that are sufficiently valid and reliable, and were collected using the same
procedures for both groups. Design classifications were based on the extent to which the studies met these criteria
with those using random assignment and demonstrating low levels of attrition considered to meet standards without
reservations; those studies not using random assignment but having established group equivalence considered meeting
standards with reservations; and those studies that are non-randomized without establishing group equivalence
considered not meeting standards.

Application of group-based evidence standards. The means and standard deviations for the treatment and control groups
of studies meeting design criteria were extracted. This information was then used to calculate an SMD effect size, which
is computed by subtracting the intervention group’s mean on the dependent variable from the control group’s mean on
the dependent variable and dividing by the pooled standard deviation of the dependent variable. The resulting index
provides an estimate of the difference between the groups in standard deviation units. Per the WWC guidelines, effect
sizes of .25 or larger with demonstrated statistical significance at .05 are considered substantively important.
Corrections for misaligned analyses and multiple outcomes were implemented as needed and described below within
the context of the review process. The WWC considers a practice to be evidence-based if there are two or more
studies meeting standards with positive effects that are statistically significant.
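
The effect size computations described in this section can be illustrated with a short sketch. It is not the authors' analysis code: the function names, the rounding of NAP to whole percentages before applying the interpretive bands, and all example data are assumptions made for illustration. NAP is computed as the share of baseline and intervention pairs showing improvement, with ties counted as half (Parker & Vannest, 2009), and the group-based SMD follows the order of subtraction worded above (control mean minus intervention mean, divided by the pooled standard deviation).

```python
from itertools import product
from math import sqrt


def nap(baseline, intervention, increase_is_improvement=True):
    """Non-overlap of all pairs: proportion of (baseline, intervention)
    pairs in which the intervention point is an improvement.
    Ties count as half an improvement."""
    if not baseline or not intervention:
        raise ValueError("both phases need at least one data point")
    score = 0.0
    for a, b in product(baseline, intervention):
        if a == b:
            score += 0.5
        elif (b > a) == increase_is_improvement:
            score += 1.0
    return score / (len(baseline) * len(intervention))


def nap_label(value):
    """Interpretive bands quoted in the text. Rounding to a whole
    percentage first is an assumption of this sketch."""
    pct = round(value * 100)
    if pct <= 65:
        return "weak"
    if pct <= 92:
        return "moderate"
    return "strong"


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2))


def group_smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Control mean minus intervention mean over the pooled SD,
    following the wording of the paragraph above. Reversing the
    subtraction only flips the sign."""
    return (mean_c - mean_t) / pooled_sd(sd_t, n_t, sd_c, n_c)


if __name__ == "__main__":
    # Made-up problem-behavior counts; lower values are better.
    baseline = [8, 7, 9, 6, 8]
    intervention = [5, 4, 6, 3, 7, 4]
    value = nap(baseline, intervention, increase_is_improvement=False)
    print(f"NAP = {value:.2f} ({nap_label(value)})")
    # Made-up group summary statistics.
    print(f"SMD = {group_smd(10.0, 4.0, 20, 12.0, 4.0, 20):.2f}")
```

With the order of subtraction used here, a positive SMD means the intervention group scored lower than the control group, which is the desired direction for a problem-behavior outcome.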
Coding study characteristics. The studies meeting evidence standards with or without reservations were then reviewed
on a variety of study characteristics to identify the conditions to which the effects of CICO might apply. These
characteristics were coded into domains related to specific features of the sample, setting, and program
implementation. Items pertaining to the sample and setting domain related to general demographic characteristics of the
sample (e.g., gender, age, and grade), the identified function of student behavior, and the type of settings in which the
study was conducted. Items included within the program implementation domain related to the specific CICO components
implemented and the methods for collecting fidelity data. These items were applied to each student case included in
single-case reports and each study for group-based reports.

Coder Reliability
Eight studies (36%) were randomly selected for recoding by a second rater. This random selection of studies included
five single-case and three group-based studies, which represented approximately 29% and 60% of the eligible reports
from these methodologies, respectively. A series of percentage agreement indices were calculated to evaluate the
consistency of coder ratings for each criterion and domain with agreement for single-case and group-based research
reports computed separately. The mean percentage agreement across all items for the single-case coding protocol was
94% with agreement on items relating to the design (M = 99%, range = 94%–100%), evidence (M = 89%, range = 84%–100%),
and study characteristics (M = 93%, range = 87%–100%) indicating strong agreement across coders. Overall
mean percent agreement for items of the group-based protocol was 96% with agreement on items relating to the design
(M = 100%, range = 100%–100%), evidence (M = 100%, range = 100%–100%), and study characteristics (M = 88%, range =
82%–96%) also indicating strong agreement across coders.

Results

Evaluation of Single-Case Research Reports
Single-case design evaluation. A schematic outlining the application of the WWC design and evidence standards is
provided in Figure 1. The 17 single-case studies eligible for review included a total of 79 students. Forty-four (56%)
student cases were classified as not meeting design standards. The most common reasons for their exclusion related
to issues with interobserver agreement (n = 29) and not providing a sufficient number of data points per phase to meet
design standards (n = 15). This application of design standards resulted in the reduction of the original pool from 79
to 35 student cases and 17 to 9 studies eligible for evidence review.

Single-case evidence evaluation. Following the application of design standards, the graphed data were visually analyzed.
The visual analyses were conducted on each eligible dependent variable reported for all student cases. Results of the
visual analyses indicated that 28 (80%) of the 35 cases meeting design standards provided visual evidence of
intervention effects. As such, there was a 4:1 ratio of cases meeting design standards providing visual evidence to those
not. All of the cases met evidence standards with reservations because of either too few data points within phases or
the presence of non-effects. It is also worth noting that the seven cases not meeting visual evidence
standards did not provide any indication of harmful effects. As such, there was no evidence that the CICO program led
to increases in problematic behavior or decreases in academic engagement. Rather, the data patterns indicated the
presence of null effects. The 28 cases providing evidence were drawn from a total of 8 studies across 4 independent
research teams, indicating that the practice can be labeled as evidence-based according to the WWC single-case
design standards.

Single-case quantitative evaluation. The visual analyses were augmented with the computation of quantitative indices of
treatment effect for cases providing evidence with reservations in accordance with the WWC standards. Results of
these analyses were mixed with those based on data overlap indicating a moderate intervention effect and those based on
a more traditional statistical framework indicating a weak effect. For problem behavior, the NAP and IRD estimates
were found to be 83% (SE = .03) and 62% (SE = .05), respectively, with both found to be statistically significant
at the p < .05 level. The SMD for problem behavior was found to be 1.46 (SE = .27), while the RDMA was 1.38 (SE
= .21) with both being statistically significant at the p < .001 level. The results for academic engagement were
similar with NAP and IRD estimates of 88% (SE = .02) and 74% (SE = .05), respectively, with both significant at the p
< .001 level. The SMD for academic engagement was estimated at 1.77 (SE = .14) and the RDMA was estimated to
be 1.91 (SE = .67) with both significant at the p < .01 level. These quantitative results are consistent with the visual
evidence evaluations indicating moderate levels of support. Specifically, the visual analyses of baseline data confirmed
that all cases providing evidential support merited intervention given the levels of problem behavior, though many also
had moderate variability at baseline. This variability would have an effect on both the overlap and parametric statistics.
That is, the increased variability in baseline increases the chance of overlapping data points for the overlap methods
and the error terms in the parametric statistics.

Study characteristics evaluation of single-case research. Following the application of the WWC evidence standards,
the 28 cases providing evidence with reservations were reviewed to assess the potential generality of the research.
An overview of study characteristics is provided in Table 1. A majority of the students for whom the intervention was
found to be effective were enrolled in a school with a three-tiered model of behavior support (n = 22, 79%).
Moreover, 20 (71%) were found to be exposed to CICO in general education with the other students enrolled in either
special education classrooms (n = 5, 18%) or schools (n = 3, 11%). In terms of demographics, most of the students
responding to the program were also found to be male (n = 24, 86%) with students ranging from kindergarten
through 11th grade. With respect to disability status, most students did not have a disability (n = 17, 61%), and
others had learning disabilities (n = 5, 18%), emotional and behavioral disorders (n = 3, 11%), speech and
language impairment (n = 1, 4%), and two had unspecified disabilities (7%). In addition, the function of student
behavior was identified for all 28 students found to be responsive. The most common function of behavior was
either teacher or peer attention (n = 20, 71%) with some student behavior hypothesized to be maintained through
escape from challenging academic work (n = 8, 29%). For those students with attention-maintained behaviors, the
behavior of 11 students was maintained solely through adult attention, 4 only through peer attention, 2 through a
combination of peer and adult attention, with the remaining 3 not specified. Interestingly, there were four students
in studies meeting design criteria for whom functional assessment data indicated escape-maintained problem
behavior and who did not respond to CICO. This represents a much larger proportion of non-responding students
in studies meeting design standards than those with attention-maintained problem behavior.

Total studies eligible for review: n = 22
Design evaluation
  Eligible for review: 17 single-case studies with 79 student cases; 5 group-based studies
  Meets design standards: 6 single-case studies with 25 student cases; 1 group-based study
  Meets design standards with reservations: 3 single-case studies with 10 student cases; 1 group-based study
  Does not meet design standards: 8 single-case studies with 44 student cases; 3 group-based studies
Evidence evaluation
  Evidence: 0 single-case studies with 0 student cases; 0 group-based studies
  Evidence with reservations: 8 single-case studies with 28 student cases; 0 group-based studies
  No evidence: 9 single-case studies with 51 student cases; 5 group-based studies
Figure 1. Schematic overview of the What Works Clearinghouse (WWC) design and evidence standards as applied to the database
of single-case and group-based research eligible for review.
Note. Consistent with the guidelines, single-case research studies were reviewed at the case level and group-based studies reviewed at the study level.
Those studies not meeting design standards were deemed as not providing evidence in accordance with the WWC standards.
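
The single-case counts reported in the Results can be checked against the WWC decision rules described in the Method with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the thresholds (five or more, or three to four, data points per phase; at least 20 cases from five studies by three research teams) are taken from the text rather than from any WWC software.

```python
from fractions import Fraction


def design_rating(min_points_per_phase):
    """Sixth single-case design standard: rating by the smallest number
    of data points in any phase (thresholds as stated in the Method)."""
    if min_points_per_phase >= 5:
        return "meets standards"
    if min_points_per_phase >= 3:
        return "meets standards with reservations"
    return "does not meet standards"


def is_evidence_based(cases, studies, teams,
                      min_cases=20, min_studies=5, min_teams=3):
    """Threshold for calling a practice evidence-based from single-case
    research, as summarized in the Method."""
    return cases >= min_cases and studies >= min_studies and teams >= min_teams


def evidence_ratio(with_evidence, without_evidence):
    """Ratio of cases providing evidence to cases not providing it.
    Returns None when no case lacks evidence (ratio undefined)."""
    if without_evidence == 0:
        return None
    return Fraction(with_evidence, without_evidence)


if __name__ == "__main__":
    # Counts reported in the review: 35 cases met design standards,
    # 28 of them provided evidence, from 8 studies by 4 research teams.
    met_design, with_evidence = 35, 28
    print(evidence_ratio(with_evidence, met_design - with_evidence))
    print(with_evidence / met_design)
    print(is_evidence_based(cases=28, studies=8, teams=4))
```

Run on the reported counts, the ratio reduces to 4:1 and all three thresholds are satisfied, which matches the classification given in the text.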
Table 1. Overview of Study Characteristics for Single-Case Research Studies Meeting Evidence Standards With Reservations.

| Study | Student case | Dependent variable | Age (year) | Grade | Primary disability | Behavioral function | CICO program |
|---|---|---|---|---|---|---|---|
| Campbell and Anderson (2008) | Joe | DB | 10 | N/R | None | PA | Adapted, function-based support |
| | Kyle | DB | 10 | N/R | None | PA | Adapted, function-based support |
| Campbell and Anderson (2011) | Kyle | AE, DB | N/R | 2nd | LD | AA | Standard |
| | Nick | AE, DB | N/R | 5th | SLI | AA | Standard |
| | Paul | AE, DB | N/R | 5th | LD | AA | Standard |
| Hawken and Horner (2003) | Jalen | DB | 13 | 6th | LD | PA | Standard |
| | Martin | DB | 13 | 6th | None | AA, PA | Standard |
| | Ryan | DB | 13 | 6th | LD | AA, PA | Standard |
| | Scott | DB | 12 | 6th | None | PA | Standard |
| Mong, Johnson, and Mong (2011) | Lauren | DB | 8 | 3rd | None | AA | Standard |
| | Andrew | DB | 8 | 3rd | None | AA | Standard |
| | Pam | DB | 9 | 3rd | None | AA | Standard |
| | Stanley | DB | 8 | 3rd | None | AA | Standard |
| Swain-Bradway (2009) | Donovan | AE | 15 | 10th | None | EDW | Adapted, study skills class |
| | Joy | AE | 16 | 11th | NS | EDW | Adapted, study skills class |
| | Lee | AE | 14 | 9th | NS | EDW | Adapted, study skills class |
| | Ricky | AE | 14 | 9th | None | EDW | Adapted, study skills class |
| | Travis | AE | 15 | 10th | None | EDW | Adapted, study skills class |
| Swoszowski, Jolivette, Fredrick, and Heflin (2010) | Daniel | DB | 12 | 7th | EBD | A, NS | Standard |
| | Leo | DB | 14 | 9th | EBD | A, NS | Standard |
| | Tyrone | DB | 12 | 6th | EBD | A, NS | Standard |
| Todd, Campbell, Meyer, and Horner (2008) | Chad | DB | N/R | 1st | None | AA | Standard |
| | Eric | DB | N/R | Kindergarten | None | AA | Standard |
| | Kendall | DB | N/R | 2nd | None | AA | Standard |
| | Trevor | DB | N/R | 3rd | LD | AA | Standard |
| Turtura, Anderson, and Boyd (2014) | Katie | AE, DB | N/R | 7th | None | EDW | Adapted, modified expectations |
| | Nick | AE, DB | N/R | 6th | None | EDW | Adapted, modified expectations |
| | Toby | AE, DB | N/R | 8th | None | EDW | Adapted, modified expectations |

Note. CICO = Check-In/Check-Out; DB = disruptive behavior; N/R = not reported; PA = peer attention; AE = academic engagement; LD = learning disability; AA = adult attention; SLI = speech and language impairment; EDW = escape from difficult work; EBD = emotional and behavioral disorder; A, NS = attention, not specified.
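
As a cross-check, the disability percentages reported in the text for the 28 student cases can be recomputed from the Primary disability column of Table 1. The following is a minimal sketch in Python, assuming the column has been transcribed correctly; the variable names are illustrative and are not part of the original review.

```python
from collections import Counter

# Primary disability codes transcribed from Table 1, one entry per student case.
# Grouping by study is only for readability; it does not affect the tally.
disabilities = (
    ["None", "None"]                          # Campbell and Anderson (2008)
    + ["LD", "SLI", "LD"]                     # Campbell and Anderson (2011)
    + ["LD", "None", "LD", "None"]            # Hawken and Horner (2003)
    + ["None"] * 4                            # Mong, Johnson, and Mong (2011)
    + ["None", "NS", "NS", "None", "None"]    # Swain-Bradway (2009)
    + ["EBD"] * 3                             # Swoszowski et al. (2010)
    + ["None", "None", "None", "LD"]          # Todd et al. (2008)
    + ["None"] * 3                            # Turtura et al. (2014)
)

total = len(disabilities)
counts = Counter(disabilities)

# Whole-number percentages, rounded the same way the text reports them.
percentages = {code: round(100 * n / total) for code, n in counts.items()}

for code, n in counts.most_common():
    print(f"{code}: n = {n}, {percentages[code]}%")
```

Under this transcription, the tally gives 17 cases with no disability, 5 with LD, 3 with EBD, 1 with SLI, and 2 not specified, which agrees with the counts reported in the text.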
The evaluation of single-case research studies concluded with an implementation analysis to determine whether the program was consistently administered across studies. An overview of the implementation analysis is provided in Table 2. All eight studies with cases meeting evidence standards with reservations reported using the core CICO components of (a) implementing a student check-in with an adult at the beginning of the day, (b) providing students with a daily point card with behavioral expectations, (c) having teachers provide feedback to students using the point card, and (d) implementing a student check-out with an adult at the end of the day. Moreover, seven of the studies contained information pertaining to the extent to which the core program components were implemented with generally positive results although only three provided fidelity for each program component. There was variability observed in the percentage of sessions for which fidelity data were collected ranging from nearly 35% to approximately 10% of sessions for each participant. The procedures for collecting fidelity data also varied across studies. Specifically, three of the studies used direct observation, four used permanent product data, and one used a combination of both methods. Researchers using direct observation methods were more likely to evaluate the accuracy of the fidelity data by reporting interobserver agreement statistics with each demonstrating high rates of agreement across observers. Finally, additional components were used to augment the CICO
Table 2. An Overview of Implementation Fidelity Reporting and Methods for Single-Case Studies Meeting Evidence Standards With Reservations.

The first four data columns give fidelity results and methods; the last five give fidelity reported for specific components.

| Study | Reported fidelity | Proportion of sessions | Data collection | IOA [a] | Morning check-in | Daily point card | Teacher feedback | Afternoon check-out | Home collaboration |
|---|---|---|---|---|---|---|---|---|---|
| Campbell and Anderson (2008) | 100% | 34% | Direct observation | N/R | 100% | 100% | 100% | 100% | N/R |
| Campbell and Anderson (2011) | 97% | 27% | Direct observation | 97% | N/R | N/R | N/R | N/R | N/R |
| Hawken and Horner (2003) | 85% | 17% | Permanent product | N/R | 83% | 92% | 92% | 75% | 67% |
| Mong, Johnson, and Mong (2011) | 92% | 33% | Both [b] | 100% | N/R | N/R | N/R | N/R | N/R |
| Swain-Bradway (2009) [c] | 28% | 10% | Permanent product | N/R | N/R | N/R | N/R | N/R | N/R |
| Swoszowski, Jolivette, Fredrick, and Heflin (2010) | 94% | 22% | Direct observation | 99% | N/R | N/R | N/R | N/R | N/R |
| Todd, Campbell, Meyer, and Horner (2008) | N/R | N/R | N/R | N/R | N/R | N/R | N/R | N/R | N/R |
| Turtura, Anderson, and Boyd (2014) | 92% | N/R | Permanent product | N/R | 96% | 100% | 93% | 88% | 84% |

Note. N/R = not reported.
[a] IOA = interobserver agreement, with percentages referring to the reported level of interobserver agreement for fidelity checks.
[b] Used direct observation for check-in and check-out and permanent product for the other procedures.
[c] Values are estimated from available data.
program in three of the studies providing evidence with reservations. Adaptations included the use of function-based reinforcement for appropriate behavior in the classroom (Campbell & Anderson, 2011), providing academic assistance (Swain-Bradway, 2009), and modifying behavioral expectations to be more specific to the student (Turtura et al., 2014). These modifications to the standard CICO protocol were most prevalent in studies that found the intervention to be effective for students with escape-maintained behaviors, indicating that such adaptations might be used to broaden the reach of the program.

Evaluation of Group-Based Research Reports

Group-based design evaluation. A schematic overview of the application of design and evidence standards is provided in Figure 1. The application of the WWC group-based standards revealed that four of the five studies did not use random assignment of subjects and, therefore, were reviewed as quasi-experimental designs. Three of these quasi-experimental designs did not provide information on the pretest comparison of students, resulting in their classification as not meeting design standards. The final quasi-experimental design provided information pertaining to pretest means, standard deviations, and sample sizes, allowing for analysis (Simonsen et al., 2011). Results revealed that there were no pretest differences between the treatment and comparison groups, leading the study to be classified as meeting design standards with reservations. The fifth study randomly assigned students to either the intervention or comparison group and met all remaining criteria for group experimental studies (Cheney et al., 2009). As a result, there were two studies meeting the design criteria, with the quasi-experimental study meeting them with reservations and the randomized trial meeting them without reservations.

Group-based evidence evaluation. Following the application of the design standards, the posttest effect sizes for the studies meeting design criteria were calculated for all behavioral outcome variables from the data available in the research report. Before describing these results, however, it is necessary to review the analytic procedures used for each study. First, the authors of the qualifying quasi-experimental study reported SMD effect sizes based on change scores (Simonsen et al., 2011). The use of change scores presents both conceptual and statistical challenges that make it less ideal than the SMD, particularly with pretest group equivalence established (McGaw & Glass, 1980). Because the authors reported pretest and posttest means and standard deviations, the more reliable and interpretable SMD effect sizes could be readily computed and were used for the present review. Results of these analyses revealed that the SMD effect sizes ranged from small for social skills ratings (d = .02, 95% CI [−0.61, 0.65]) to moderate (d = .40, 95% CI [−0.24, 1.03]) with none of the estimates indicating a statistically significant difference. Second, the
authors of the randomized trial reported results separately for students who were responsive to the intervention and those who were not (Cheney et al., 2009). The reported means and standard deviations for these groups were, therefore, pooled and then compared with those of the control group. Moreover, the significance tests associated with the pretest and posttest effect sizes were further adjusted for clustering due to the misalignment between the unit of random assignment and the unit of analysis using the procedures developed by Hedges (2007). Finally, the Benjamini and Hochberg (1995) correction for multiple comparisons was used because information for several outcome measures was reported in each of the qualifying studies. The comparison of the cluster-corrected probability values with those derived from the correction for multiple comparisons revealed that no differences were present in the analyses. Specifically, there were eight posttest comparisons reported, with the teacher-reported measure of internalizing behavior problems (p = .013) the only one found to be below the p < .05 threshold following the clustering correction. This probability value was then compared with the recommended threshold associated with the correction for multiple comparisons, which was p < .006. Because the probability value associated with the analyses was larger than the one associated with the correction for multiple comparisons, there was no evidence of an intervention effect. None of the other cluster-corrected estimates were found to be below the p < .05 threshold, indicating no posttest differences between groups. Given that the group-based research on the CICO program produced no positive findings, the evaluation of study characteristics and implementation analysis for these studies was not conducted.

Discussion

The purpose of the present review was to describe the strengths, limitations, and generality of the research underlying the CICO program. The results of the review were mixed with several single-case studies providing evidence with reservations and moderate visual and quantitative support, while none of the group-based studies provided evidence due to a lack of rigor and results. Drawing meaningful conclusions from these results requires careful consideration of these research paradigms while also taking into account other important characteristics of the intervention and research. The principal findings of the review are summarized in the following sections with implications for practice subsequently described.

Summary of Principal Findings

Mixed results across research methods. The principal finding of the present review was that the single-case research provided sufficient empirical support for the CICO program, while the group-based research did not. These discrepant findings require careful consideration of the characteristics of both the intervention and the research methods used to evaluate the CICO program. For instance, the single-case research indicated that the student’s behavioral function moderated the effectiveness of CICO (Hawken et al., 2007). That is, students with attention-maintained problem behaviors were more likely to respond to the standard CICO protocol than those with escape-maintained behaviors. Unfortunately, this differential responding for students based on behavioral function was not formally investigated in either of the two group-based studies meeting design standards. It is worth noting that Simonsen et al. (2011) did establish that both groups were comprised of an equivalent number of students with both attention- and escape-maintained problem behaviors though differential responding was not formally investigated. Moreover, because both the group-based studies investigated the effects of the standard CICO protocol, it is possible that the effectiveness of the program was diluted due to the inclusion of students with both attention- and escape-maintained behavioral functions. In fact, descriptive group-based research has provided some support to the importance of considering behavioral function when using the CICO program (e.g., McIntosh et al., 2009). Future experimental group-based research might provide a context to verify the role of function within the standard CICO framework with more rigorous studies that exclusively sample students with attention-maintained behavior problems.

Function-based adaptations. The results of the present review seem to indicate that the standard CICO procedures tend to be most effective for students with attention-maintained behavioral problems. However, there was a subset of studies demonstrating that students with escape-maintained behavior problems responded positively to the intervention when it was adapted. For instance, Turtura et al. (2014) made adjustments to the standard set of procedures to better align the intervention framework for students with escape-maintained behavioral issues. Specifically, these authors developed behavioral expectations that directly addressed issues of escape and avoidance. Results of this study were positive with all students demonstrating reductions in disruptive behavior and increases in work completion. These preliminary findings are encouraging, and although they require additional research, might be meaningful for school personnel seeking standard adaptations for increasing the pool of students for whom CICO might work.

Implementation of core components. The research indicating that adaptations of the standard CICO components might increase the generality of the program must be considered within the context of the extent to which those components were implemented. As such, it is a notable strength of the
CICO research base that nearly all authors documented the fidelity with which core components were implemented. Fidelity was consistently high; however, only two of the studies contained fidelity data for all standard components of CICO. Parental signatures and afternoon checkout were not implemented as consistently as other core components, which raises questions regarding whether these components are needed given the general positive findings of the single-case research. There is currently insufficient information to determine the role of these components, which might be investigated through subsequent research. As such, all core components should continue to be implemented until there is sufficient information to determine which components are necessary for success of the intervention.

Limitations

Findings of this review should be considered with the following limitations in mind. First, although every effort was made to retrieve all studies pertaining to the use of the CICO program, it is possible that some studies remained unidentified. Second, although the use of the WWC criteria to evaluate the methodological and evidential strength of included studies provides some assurance that only the most rigorous studies provided the basis for assessing the generality of CICO, recent research has indicated that rubrics used to evaluate methodological strength often do not agree. As such, the results of the present review should be viewed with the understanding that the use of other rubrics with different criteria and scoring procedures might lead to disparate findings (Maggin, Briesch, Chafouleas, Ferguson, & Clark, 2014). Third, visual analysis remains the preferred method for evaluating single-case research reports, although visual analyses lack transparency because the graphed data are not readily available to the reader (Kratochwill et al., 2013). The visual analyses were supported with supplemental quantitative metrics; however, the absence of a reliable measure is a limitation (Shadish et al., 2008).

Practical Implications

The results of the current investigation indicate that the CICO program can be considered an effective secondary intervention when used to directly address the student’s behavioral function. In fact, the procedures seem to be robust even outside of a multitiered system of behavioral support when considering the function of behavior. The capacity of the CICO program to lead to improved student behavior irrespective of a broader system of support is encouraging and likely relates to the general effectiveness of contingency management programs broadly (Fisher & Bouxsein, 2011). These are indeed positive findings, but school personnel must consider the totality of the research to maximize the potential benefits of the intervention. Specifically, school personnel should be aware that the research consistently suggests that the standard CICO protocol is most likely to be successful for students whose behavior is maintained through access to adult attention (e.g., Campbell & Anderson, 2011). As such, the standard approach to CICO is likely applicable for only a subset of students. Considered within the context of secondary interventions, which are meant to address the needs of a majority of students not responding to preventive supports, school personnel should prioritize the enrollment of students with attention-maintained behavior problems. Moreover, school personnel should also strongly consider implementing assessments of behavioral function within the decision-making process to maximize the potency of the intervention. Fortunately, there are assessments of behavioral functioning that are both brief and have good psychometric properties (March et al., 2000). The information drawn from these assessments can also be used to fashion the CICO program to meet the needs of students with escape-maintained behavior problems or identify another intervention that addresses these student issues. It is worth noting that the positive findings related to adapting the CICO program for students with escape-maintained behavior problems require more research to fully validate these adaptations, though the early returns are encouraging.

The results of the present review revealed that none of the group-based research contributed to the evidence base. These results are certainly concerning given that secondary tier interventions are meant to be implemented across students rather than represent an intensive, individualized intervention (Hawken et al., 2007). As discussed previously, however, there might have been sampling issues that attenuated the potential effects of the CICO program. Specifically, the samples within the group-based research were generally not screened for their behavioral function, which has been identified as an important moderating variable in the single-case research. Potential reasons for the lack of findings of the group-based research were described earlier with particular attention paid to the need to consider student function as a possible moderator of intervention success. In addition, the lack of intervention effects in the group-based research might relate to the reduced levels of control that researchers have when using these types of designs as compared with single-case research methods (Gast, 2010). Research has shown that as the control over the variables diminishes, so does the strength of the intervention (Hulleman & Cordray, 2009). As such, school personnel might consider factors to increase their control over the delivery of the intervention. For instance, the use of a brief functional assessment prior to enrolling students into CICO would increase control over which students would receive the intervention. Other examples of factors under the control of school personnel include the delivery of
teacher training, coaching, and feedback to ensure the intervention is being delivered as intended or the development of specific guidelines for the administration of the program. The development of these structures would be designed to increase the control of school personnel over the intervention. As it currently stands, the research on CICO is promising, but additional considerations are needed to increase its effectiveness across students.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, 57, 289–300.
Busk, P. L., & Serlin, R. C. (1992). Meta-analysis for single-case research. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case research design and analysis: New directions for psychology and education (pp. 187–212). Hillsdale, NJ: Lawrence Erlbaum.
Campbell, A., & Anderson, C. M. (2008). Enhancing effects of check-in/check-out with function based support. Behavioral Disorders, 33, 233–245.
Campbell, A., & Anderson, C. M. (2011). Check-in/check-out: A systematic evaluation and component analysis. Journal of Applied Behavior Analysis, 44, 315–326.
Cheney, D. A., Stage, S. A., Hawken, L. S., Lynass, L., Mielenz, C., & Waugh, M. (2009). A 2-year outcome study of the check, connect, and expect intervention for students at risk for severe behavior problems. Journal of Emotional and Behavioral Disorders, 17, 226–243.
Crone, D. A., Hawken, L. S., & Horner, R. H. (2010). Responding to problem behavior in schools: The Behavior Education Program (2nd ed.). New York, NY: Guilford Press.
Ervin, R. A., Schaughency, E., Matthews, A., Goodman, S. D., & McGlinchey, M. T. (2007). Primary and secondary prevention of behavior difficulties: Developing a data-informed problem-solving model to guide decision making at a school-wide level. Psychology in the Schools, 44, 7–18.
Fairbanks, S., Sugai, G., Guardino, D., & Lathrop, M. (2007). Response to intervention: Examining classroom behavior support in second grade. Exceptional Children, 73, 288–310.
Filter, K. J., McKenna, M. K., Benedict, E. A., Horner, R. H., Todd, A., & Watson, J. (2007). Check in/check out: A post hoc evaluation of an efficient, secondary-level targeted intervention for reducing problem behaviors in schools. Education and Treatment of Children, 30, 69–84.
Fisher, W. W., & Bouxsein, K. J. (2011). Developing function-based reinforcement procedures for problem behavior. In W. W. Fisher, C. C. Piazza, & H. S. Roane (Eds.), Handbook of applied behavior analysis (pp. 335–355). New York, NY: Guilford Press.
Gast, D. L. (2010). Scientific research in educational and clinical settings. In D. L. Gast (Ed.), Single subject research methodology in behavioral science (pp. 20–31). New York, NY: Routledge.
Hawken, L. S., Adolphson, S. L., Macleod, K. S., & Schumann, J. (2009). Secondary-tier interventions and supports. In W. Sailor, G. Dunlap, G. Sugai, & R. Horner (Eds.), Handbook of positive behavior support (pp. 395–420). New York, NY: Springer.
Hawken, L. S., & Horner, R. H. (2003). Evaluation of a targeted intervention within a schoolwide system of behavior support. Journal of Behavioral Education, 12, 225–240.
Hawken, L. S., MacLeod, S. K., & Rawlings, L. (2007). Effects of the Behavior Education Program (BEP) on office discipline referrals of elementary school students. Journal of Positive Behavior Interventions, 9, 94–101.
Hedges, L. V. (2007). Correcting a significance test for clustering. Journal of Educational and Behavioral Statistics, 32, 151–179.
Hulleman, C. S., & Cordray, D. S. (2009). Moving from the lab to the field: The role of fidelity and achieved relative intervention strength. Journal of Research on Educational Effectiveness, 2, 88–110.
Individuals With Disabilities Education Act (IDEA), 20 U.S.C. § 1400 (2004).
Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D., & Shadish, W. R. M. (2010). Single case designs technical documentation. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf
Kratochwill, T. R., Hitchcock, J. H., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2013). Single-case intervention research standards. Remedial and Special Education, 34, 26–28.
Maggin, D. M., Briesch, A. M., Chafouleas, S. M., Ferguson, T. D., & Clark, C. (2014). A comparison of rubrics for identifying empirically supported practices with single-case research. Journal of Behavioral Education, 23, 287–311. doi:10.1007/s10864-013-9187-z
March, R. E., Horner, R. H., Lewis-Palmer, T., Brown, D., Crone, D., Todd, A. W., & Carr, E. (2000). Functional assessment checklist for teachers and staff (FACTS). Eugene, OR: Educational and Community Supports.
McGaw, B., & Glass, G. V. (1980). Choice of the metric for effect size in meta-analysis. American Educational Research Journal, 17, 325–337.
McIntosh, K., Campbell, A. L., Carter, D. R., & Dickey, C. R. (2009). Differential effects of a tier two behavior intervention based on function of problem behavior. Journal of Positive Behavior Interventions, 11, 82–93.
Mitchell, B. S., Stormont, M., & Gage, N. A. (2011). Tier two interventions implemented within the context of a tiered prevention framework. Behavioral Disorders, 36, 241–261.
Mitchell, M. (2007). Engauge digitizer: A free open-source software to extract data points from a graphed image. Available from http://digitizer.sourceforge.net/
Mong, M. D., Johnson, K. N., & Mong, K. W. (2011). Effects of check-in/check out on behavioral indices and mathematics generalization. Behavioral Disorders, 36, 225–240.
No Child Left Behind (NCLB) Act of 2001, Pub. L. No. 107-110, § 115, Stat. 1425 (2002).
Parker, R. I., & Vannest, K. (2009). An improved effect size for single-case research: Nonoverlap of all pairs. Behavior Therapy, 40, 357–367.
Scott, T. M., Alter, P. J., Rosenberg, M., & Borgmeier, C. (2010). Decision-making in secondary and tertiary interventions of school-wide systems of positive behavior support. Education and Treatment of Children, 33, 513–535.
Shadish, W. R., Rindskopf, D. M., & Hedges, L. V. (2008). The state of the science in the meta-analysis of single-case experimental designs. Evidence-Based Communication Assessment and Intervention, 2, 188–196.
Simonsen, B., Myers, D., & Briere, D. E. (2011). Comparing a behavioral check-in/check-out (CICO) intervention to standard practice in an urban middle school setting using an experimental group design. Journal of Positive Behavior Interventions, 13, 31–48.
Sugai, G., & Horner, R. H. (2009). Responsiveness-to-intervention and school-wide positive behavior supports: Integration of multi-tiered system approaches. Exceptionality, 17, 223–237.
Swain-Bradway, J. L. (2009). An analysis of a secondary level intervention for high school students at risk of school failure: The high school Behavior Education Program (Doctoral dissertation, 10262). Retrieved from University of Oregon Libraries, Eugene.
Swoszowski, N. C., Jolivette, K., Fredrick, L. D., & Heflin, L. J. (2012). Check in/check out: Effects on students with emotional and behavioral disorders with attention- or escape-maintained behavior in a residential facility. Exceptionality, 20, 163–178.
Todd, A. W., Campbell, A. L., Meyer, G. G., & Horner, R. H. (2008). The effects of a targeted intervention to reduce problem behaviors: Elementary school implementation of check in–check out. Journal of Positive Behavior Interventions, 10, 46–55.
Turtura, J. E., Anderson, C. M., & Boyd, R. J. (2014). Addressing task avoidance in middle school students: Academic behavior check-in/check-out. Journal of Positive Behavior Interventions, 16, 159–167.
Van den Noortgate, W., & Onghena, P. (2008). A multilevel meta-analysis of single-subject experimental design studies. Evidence-Based Communication Assessment and Intervention, 2, 142–151.
What Works Clearinghouse. (2013, April 25). Procedures and standards handbook (Version 3.0). Washington, DC. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/reference_resources/wwc_procedures_v3_0_draft_standards_handbook.pdf
