Article

A Systematic Review of the Evidence Base for Instructional Choice in K–12 Settings

Behavioral Disorders
2017, Vol. 42(3) 89–107
© Hammill Institute on Disabilities 2016
Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0198742916688655
https://doi.org/10.1177/0198742916688655
journals.sagepub.com/home/bhd

David J. Royer, MS¹, Kathleen Lynne Lane, PhD, BCBA-D¹, Emily D. Cantwell, MEd¹, and Mallory L. Messenger, MEd²

Abstract
We conducted this systematic literature review to explore the current evidence base of instructional choice, a low-intensity,
teacher-delivered strategy to support academic engagement and decrease challenging behaviors. Specifically, we focused
on school-based settings, applying quality indicators (QIs) and evidence-based standards of the Council for Exceptional
Children (CEC). Included articles met five criteria: (a) independent variables included instructional choice; (b) dependent
variables included behavior (e.g., disruptive behavior, problem behavior, aggression), time on task/academic engaged time,
and/or academic performance (e.g., task initiation, completion, accuracy, fluency); (c) interventions occurred with school-age
students in traditional school settings; (d) the study followed an experimental design; and (e) the article was published in English
in a refereed journal. Twenty-five articles (26 studies) from 13 journals met inclusion criteria. Findings indicated providing
students instructional choices increased desired academic behavior while decreasing instances of disruptive behavior. Of the
26 studies, three met all QIs, with nine additional studies addressing 80% or more. Due to small participant numbers, effect
sizes, and other factors in these studies, we classified instructional choice into the CEC evidence-based category of insufficient
evidence. We conclude with a discussion of limitations and directions for future inquiry.

Keywords
choice, evidence-based practice, instructional choice, quality indicators

¹University of Kansas, Lawrence, USA
²Miami University, Oxford, OH, USA

Corresponding Author:
David J. Royer, University of Kansas, 1122 W. Campus Rd., 521, Lawrence, KS, USA.
Email: david.royer@ku.edu

Since the Individuals With Disabilities Education Improvement Act (IDEA; 2004), more and more schools have implemented tiered systems to support academic, behavioral, and social needs of all students (Prasse et al., 2012; Sailor, 2015). Such tiered systems include academic response to intervention (RtI; Fuchs & Fuchs, 2006), positive behavioral interventions and supports (PBIS; Horner & Sugai, 2015), and comprehensive, integrated, three-tiered (Ci3T; Lane, Oakes, & Menzies, 2014) models of prevention, all working to intervene at the first sign of student difficulty using systematic data-based decision making. These graduated systems of support generally include three levels. At Tier 1, supports such as research- or evidence-based core curricula, effective classroom management and organization, and effective teaching strategies; school-wide expectations reinforced with behavior-specific praise; and validated social skills curricula (e.g., antibullying, antidrug, character building) are provided to all students. For students needing additional support as evidenced, for example, by scoring at moderate or high risk on a universal behavior screener (e.g., Student Risk Screening Scale; Drummond, 1994) or scoring below grade level during academic benchmarking, Tier 2 supports are available, usually within small groups or as other low-intensity interventions. Tier 3 supports are intensive, individualized interventions provided to the few students with the greatest risk elements, who need more than Tier 1 and Tier 2 provisions (Fairbanks, Sugai, Guardino, & Lathrop, 2007).

Within tiered systems, many educators seek low-intensity strategies to support student academic and behavioral success, including instructional choice, scaffolding independent work, peer tutoring, increasing opportunities to respond, and behavior-specific praise (Niesyn, 2009; Simonsen, Fairbanks, Briesch, Myers, & Sugai, 2008), and student-level strategies such as self-monitoring and behavior contracts (Lane, Menzies, Bruhn, & Crnobori, 2011). Instructional choice is a versatile strategy that can be incorporated into daily planning at Tier 1 and be used as an intervention for students who need Tier 2 support (Lane, Menzies, Ennis, & Oakes, 2015).

Instructional choice occurs when “the student is provided with two or more options, is allowed to independently select an option, and is provided with the selected option . . . within
naturally occurring classroom events” (Jolivette, Stichter, & McCormick, 2002, p. 28). Dibley and Lim (1999) separated types of choice into between-task choices (e.g., which assignment to complete first, “Do you want to make a video or do a debate to show what you learned?”) and within-task choices (e.g., which materials to use, where to work, with whom to work). Offering a choice of reinforcer to work toward after a task is completed (e.g., “When you complete your assignment, would you like iPad time, to run an errand for me, or to tell the class a joke?”) is another form of choice paired with instruction that can provide extra motivation to increase task completion rate, task accuracy, and appropriate on-task behavior (Lannie & Martens, 2008; Mechling, Gast, & Cronin, 2006; Skerbetz & Kostewicz, 2015).

In theory, offering choice helps all students develop self-esteem, self-determination, and feelings of control and independence in life that in turn become positive behavioral supports toward preventing problem behaviors (Shogren, Faggella-Luby, Bae, & Wehmeyer, 2004). Historically, students with disabilities have had limited opportunities in daily life to make choices compared with typically developing same-age peers (Ramsey, Jolivette, Patterson, & Kennedy, 2010; Skerbetz & Kostewicz, 2013), particularly students with emotional and/or behavior disorders (EBD) who struggle to negotiate the school context successfully (Walker, Forness, & Lane, 2014). Providing choice-making opportunities to students with disabilities equates to more chances to increase independence and quality of life (Nota, Ferrari, Soresi, & Wehmeyer, 2007; Rispoli et al., 2013), sense of empowerment and self-determination (Kern, Bambara, & Fogt, 2002; Rispoli et al., 2013; Shogren et al., 2004), and internal locus of control (Ramsey et al., 2010; Romaniuk & Miltenberger, 2001).

Providing students with opportunities to make choices may also lessen reliance on punishment and extinction (Harding, Wacker, Berg, Barretto, & Rankin, 2002). Although the connection between opportunities for choice, higher rates of task engagement, and lower rates of problem behavior is not clear, evidence is growing to show choice has supplementary effects not elucidated by preference alone (Skerbetz & Kostewicz, 2013). It is possible providing choice increases student “buy-in,” meaning students take more ownership over self-selected assignments and tend to be more engaged, more likely to finish tasks, and have less opportunities to behave inappropriately. It is not known whether the act of choosing is empowering enough to motivate increased task engagement, whether choosing a preferred task or reinforcer is what increases engagement, or whether it is a combination of the two. Katz and Assor (2007) posited choice is motivating because of the element of control it presents students, but only when selections are relevant, limited (not overwhelming in number), and culturally congruent.

Research on choice has included evaluating student preferences (King & Kostewicz, 2014), determining teacher-provided opportunities for student choice (Flowerday & Schraw, 2000; Jolivette, Stichter, Sibilsky, Scott, & Ridgley, 2002), increasing choice making to promote self-determination (Algozzine, Browder, Karvonen, Test, & Wood, 2001), and assessing choice as an intervention to increase academic skills (e.g., time on task, task completion, accuracy) and/or decrease problem behavior (Kern et al., 1998). Studies involving choice have been conducted in the family home (Rispoli et al., 2013), residential facility (Ramsey et al., 2010), isolated public school classroom (Smeltzer, Graff, Ahearn, & Libby, 2009), and university-affiliated school/clinic for severe problem behaviors (Kern et al., 2002). Although choice making has been studied with students of various ages and disabilities in these more controlled settings, less inquiry has been conducted examining instructional choice in typical classroom contexts within or outside of tiered systems of supports (e.g., Skerbetz & Kostewicz, 2013).

Cordova and Lepper (1996) studied the effects of contextualization, personalization, and choice on learning outcomes for 70 fourth- and fifth-grade students in general education using variations of a computer game to teach the order of mathematical operations (e.g., parentheses, exponents, multiplication and division, addition and subtraction). Although choices provided within the computer game were not relevant to instruction, adding choice resulted in significantly greater motivation and learning. Prusak (2004) also studied choice within a general education context, examining how choice of activities in physical education class affected various measures of motivation for 1,110 seventh- and eighth-grade girls in 42 classes across five schools in two districts. Results showed the choice group had greater intrinsic motivation and sense of self-regulation, and less feelings of being extrinsically controlled (Cohen’s d ≥ 0.928). Prusak suggested having activity choices increased responsibility, allowed students to find activities more relevant to their interests, and improved enjoyment of the physical activity.

A few studies have examined the effects of choice making when working with students with EBD, a group of students whose internalizing and externalizing behavior patterns often pose substantial challenges for general and special educators alike (Walker et al., 2014). Dunlap et al. (1994) worked with kindergarten and fifth-grade students in self-contained classrooms, providing a menu of spelling tasks or books to choose from. Compared with when teachers made choices for students, student choice increased the percentage of task engagement and completion while lowering disruptive behavior. Cole, Davenport, Bambara, and Ager (1997) assisted three boys in a university-affiliated laboratory school for students with EBD and autism spectrum disorder (ASD), providing choice of vocational tasks. Effects of choice did not differ significantly from when preferred tasks were assigned, but task engagement was significantly lower when nonpreferred tasks were assigned. These results added to the debate of whether having choice, accessing preferred tasks, or both make more difference. Jolivette, Wehby, Canale, and
Massey (2001) provided three 7-year-old boys the opportunity to choose the order to complete three math worksheets in a self-contained public school classroom for students with internalizing EBD, resulting in increased appropriate behavior and task engagement for two boys, whereas the third maintained high engagement and a low rate of problem behavior across conditions (see also Ramsey et al., 2010). These studies demonstrated choice can be an effective strategy to help students with EBD increase desirable academic habits while decreasing disruptive behavior, likely due to increased autonomy and control in their environment (Jolivette et al., 2001; Ramsey et al., 2010).

Knowing instructional choice can improve outcomes for students with EBD in controlled settings, students with ASD at home (Rispoli et al., 2013), students with pervasive developmental disorder-not otherwise specified and fragile X syndrome in a 1:1 instructional classroom (Smeltzer et al., 2009), and typically developing students (Cordova & Lepper, 1996; Prusak, 2004), we sought to determine the evidence base of choice for educators seeking low-intensity strategies to support academic and behavioral success for all learners (e.g., typically developing, at risk of EBD or learning disabilities, with identified disabilities) in typical pre-K–12 settings. We focused on traditional school settings believing instructional choice to be a versatile practice easily applied to daily instruction (e.g., whole class, small group, and/or individually) in general or special education with a range of students, including those with and at risk for EBD.

Purpose

We conducted this systematic literature review to examine the evidence base of instructional choice for increasing academic success and/or reducing problem behaviors in traditional pre-K–12 educational settings. Too often strategies are either (a) not researched with sufficient breadth or rigor to establish them as evidence-based practices (EBPs) or (b) existing EBPs are not adopted and implemented with fidelity (Slavin, 2008). Teachers may benefit from professional learning on simple, low-intensity strategies with enough empirical support to suggest that, if implemented with integrity, they will yield desired shifts in student performance. To this end, we employed quality indicators (QIs) and standards set forth by the Council for Exceptional Children (CEC; 2014) to determine whether instructional choice could be considered an EBP for increasing academic success and/or reducing problem behaviors in traditional pre-K–12 educational settings. Meeting rigorous methodological checks are essential to examining individual studies and overall bodies of evidence, both of which are necessary components to determine whether a strategy, practice, or program is indeed an EBP. CEC’s (2014) QI standards are highly stringent (i.e., having to meet 100% of QIs for a study to be considered methodologically sound) given the complexities of actually achieving these standards when conducting school-based inquiry, which differs markedly from research conducted in highly controlled clinical settings (Brown, 1992). In our analyses, we acknowledge the formidable task of researching strategies in applied settings by employing weighted evaluation criteria established by Lane, Kalberg, and Shepcaro (2009). This approach involves considering a study methodologically sound if it meets 80% of QIs. Our research questions were twofold.

Research Question 1: To what extent did instructional choice studies address CEC (2014) QIs?
Research Question 2: What is the nature of the evidence base supporting instructional choice according to CEC (2014) guidelines, applying an 80% minimum criterion for methodologically sound studies (Lane et al., 2009)?

Method

Search Procedures

The first author began by searching electronic databases to identify instructional choice intervention studies with prekindergarten through Grade 12 students, with the search replicated with 100% reliability by the third author. Included databases were Academic Search Complete, Education Abstracts, Educational Resources Information Center (ERIC), ProQuest Research Library, PsycARTICLES, PsychINFO, and Psychology and Behavioral Sciences Collection. Boolean search terms were used to include all possible combinations and derivatives of choice* AND (instruction* OR academic OR assign* OR task* OR activit*) AND (“time on task” OR “task completion” OR “academic engagement” OR “problem behavio*” OR “disruptive behavio*” OR “destructive behavio*” OR “behavio* problem*” OR “inappropriate behavio*” OR “challenging behavio*” OR “behavio* deficits” OR “aggressive behavio*” OR “undesirable behavio*” OR “task initiation” OR “on task behavio*” OR “academic completion” OR “academic accuracy” OR “academic performance” OR “fluency”).

Article Selection

Next, we read 1,413 article titles and abstracts detected in the first step to determine which articles should be read in full to see whether inclusion criteria (described subsequently) were met. Articles were excluded when the title or abstract did not include choice making in an education context, if it was not an intervention study, or when participants were not in a traditional pre-K–12 setting. A second reader reviewed all title and abstract coding to assess reliability. This check resulted in 96.74% agreement (κ = .68, 95% CI = [.59, .77], indicating substantial agreement; Cohen, 1960; Landis & Koch, 1977). We used Microsoft Excel to list all titles and abstracts, coding each article as “1” (include) or “0” (exclude) for meeting criteria, with disagreements resolved by another
author. A total of 82 articles from 43 journals were selected for possible inclusion (see Figure 1, search and article selection flow diagram).

Figure 1.  Flow diagram illustrating search procedures and article inclusion.

Then, we read the 82 articles in full to determine which met inclusion criteria, again having a second reader review each article. Interrater reliability (IRR) was 82.93% (κ = .66, 95% CI = [.49, .82], indicating substantial agreement). Disagreements arose when choice was imbedded within a larger research design (e.g., functional assessment-based intervention, self-monitoring, timing of choice of reinforcer), when the research design was vague or there were no clear method and results sections, and when the intervention setting was not explicitly stated. A third reader resolved disagreements, resulting in 37 articles from 13 journals meeting inclusion criteria.

Following reading articles in full, the first, third, and fourth authors conducted a hand search of two major university library stacks. Journals publishing two or more included studies (Behavioral Disorders, Education and Treatment of Children, Journal of Applied Behavior Analysis, Journal of Behavioral Education, and Journal of Experimental Education) were hand searched to verify electronic search results and identify additional articles missed by the electronic search. A second author hand searched two of the five journals to examine reliability. With the trend toward online journal access, many volumes were not physically available in either university’s library stacks and were “hand searched” by clicking through each journal’s online version, issue by issue. One additional article was identified in this step (Killu, Clare, & Im, 1999), illustrating the importance of hand searching as part of a systematic literature review. Hand search IRR was 99.83% with one disagreement discussed and resolved by establishing consensus (κ = .82, 95% CI = [.69, .94], indicating near perfect agreement; Cohen, 1960; Landis & Koch, 1977).

Next, two authors conducted independent ancestral searches of the 38 included articles’ references (n = 862) and identified seven potential studies for inclusion (IRR = 99.07%; κ = .96, 95% CI = [.93, .99], indicating near perfect agreement). After obtaining and reading the full articles, we determined none met inclusion criteria, with 100% agreement between authors.

Inclusion criteria were then refined based on articles read during search steps and our having gained an understanding of available instructional choice literature. For example, we initially included any study with choice of reinforcer until reading articles, where reinforcer choice was offered after task completion and in others, offered before tasks began. We ultimately included studies where choice of reinforcer was offered before a task began, as choice could then serve as an antecedent intervention able to affect dependent variables (DVs). Of the 38 articles initially identified, 13 were removed using updated inclusion criteria due to (a) choice of reinforcer given after task completion, (b) choice included option to not work on a task (e.g., take a break), (c) discrete trial training method, (d) intervention agent was technology (e.g., computer program offered choices), and (e) intervention did not take place in a traditional school setting or was unable to be determined. A total of 25 articles from 13 journals were retained.

Finally, we contacted the corresponding author of each included study and the editors of the journals from which studies were obtained, and asked whether they were aware of any studies accepted for peer review, in press, or published recently that would not have been found by our searches. No additional articles were identified.

Inclusion Criteria

To be included in this review, studies had to meet five conditions:

1.  The independent variable (IV) was some form of instructional choice provided by an adult, defined as “the student is provided with two or more options, is allowed to independently select an option, and is provided with the selected option” (Jolivette, Stichter, & McCormick, 2002, p. 28). We
further defined choice as across-task choice (choose which task to complete or the order to complete assigned tasks), within-task choice (options of how to complete an assigned task), or choice of reinforcement for when task is completed. Options to not complete a task (e.g., take a break) or providing choice of reinforcement after task completion were not considered to be instructional choice. Studies where computers or other technology devices provided choices (e.g., Cordova & Lepper, 1996) were excluded as we were interested in choice as a low-intensity strategy directly implemented by classroom teachers.

2.  The DV included one or a combination of behavior (e.g., disruptive behavior, problem behavior, aggression), time on task/academic engaged time, or academic performance (e.g., task initiation, completion, accuracy, fluency).

3.  Interventions occurred with school-age students in traditional pre-K–12 school settings ranging along the spectrum of services, including general education, resource classrooms or pull-out instruction, self-contained classrooms, and magnet/specialty buildings. Studies conducted in residential treatment centers (e.g., Ramsey et al., 2010), home settings (e.g., Harding, Wacker, Berg, Winborn-Kemmerer, & Lee, 2009), or clinics resembling classroom settings (e.g., two of the four students in Rispoli et al., 2013) were excluded along with those using discrete trial training, as they were highly controlled settings, varying substantially from traditional school settings.

4.  The study used a single case or group comparison experimental or quasi-experimental research design. Nonexperimental (e.g., case study, descriptive study, meta-analysis) studies on instructional choice were coded as an article of relevance for reference.

5.  The article was published in English in a peer-reviewed journal. Theses, dissertations, books, and chapters were excluded from review as the rigor of their review process could not be determined.

Coding Procedures for QIs

Articles meeting inclusion criteria were read and coded by the first and second authors for the presence or absence of QIs of methodologically sound interventions. We applied the eight QIs stipulated in CEC’s (2014) Standards for Evidence-Based Practices in Special Education related to context and setting, participants, intervention agent, description of practice, implementation fidelity, internal validity, outcome measures/DVs, and data analysis. Coding procedures for each QI are reviewed below, with the guiding principle being, “Did the article contain enough information to enable replication?” First and second authors compared coding, resolved discrepancies (n = 45 of 728), and calculated IRR for articles and each QI component by dividing the sum of agreements by the number of applicable components or by 26 (number of studies; Dunlap et al., 1994, contained two studies), respectively, times 100 to obtain a percentage. Mean IRR for QI components was 90.55% (range = 33.33%–100%) and for studies was 92.26% (range = 81.82%–100%). Overall, Cohen’s κ was .87 (95% CI = [.83, .91]), indicating near perfect agreement (Landis & Koch, 1977). Because all four authors were contributors to Lane, Royer, et al. (2015), a researcher at another university familiar with using CEC (2014) QIs agreed to be a third rater to ensure objectivity and provided reliability for reconciled coding, which matched 100%.

Training.  The first author was initially trained in QI coding by the second author during a summer university course in single-case research design (SCRD). Training consisted of reading multiple sources for QIs (i.e., CEC, 2014; Gast & Ledford, 2014; Gersten et al., 2005; Horner et al., 2005; Kennedy, 2005) and compiling a QI matrix for single-case methodology (Lane, Common, Royer, & Muller, 2014) later revised to include QIs for group comparison methodology. We defined each QI more succinctly (see following sections) to enhance uniformity between raters and add clarity for readers, with the goal of coding procedures minimizing the use of inference, basing decisions on what was explicitly published in each study. Practice articles were then coded, results compared, and disagreements discussed and resolved. The first and second authors, plus a researcher working on a review for increasing opportunities to respond (Common et al., 2016), completed additional training in spring by comparing independently coded SCRD articles not included in this study with criterion set at three consecutive articles with IRR ≥ 85%. For the final three practice SCRD articles, mean IRR was 92.42% (range = 86.36%–95.45%). We next coded practice group design articles in the same manner, resulting in a mean IRR of 97.10% (range = 91.30%–100%).

QI 1.0. Context and setting.  According to CEC (2014), to meet QI 1.1, a study had to describe critical features of context/setting relevant to the review (e.g., geographic region, type of school, type of classroom, physical layout). Although we did not set a quantitative criterion as to how many features were required to meet QI 1.1, we required sufficient detail to allow other researchers to replicate the study. IRR for QI 1.1 was 96.15%.

QI 2.0. Participants.  The participants QI contained two components. To meet 2.1, a study had to describe relevant participant demographics, such as grade level, age, gender, race/ethnicity, socioeconomic status, and/or language status. We considered this component met if at least one element was reported. To meet 2.2, a study had to describe participant disability or risk status and method of determination (i.e., who applied what criteria). For this review, it was not sufficient to only state a diagnosis (e.g., emotionally disturbed, attention deficit disorder), and when describing the method used to establish the diagnosis, we required authors to state who
(e.g., multidisciplinary team, family physician) applied what intervention (e.g., beginning, middle, end of the intervention
specific criteria (e.g., Diagnostic and Statistical Manual of period), and (b) for each interventionist, each setting, and
Mental Disorders; 4th ed.; DSM-IV; American Psychiatric each participant or other unit of analysis” (CEC, 2014, p. 4).
Association, 1994; IDEA). For example, having a learning To add clarity to QI 5.3, a member of the CEC workgroup
disability as determined by a multidisciplinary team’s appli- who developed the 2014 standards indicated almost any
cation of district, state, or federal criteria was sufficient, but mention of assessing implementation fidelity at different
teacher nomination to a classroom for students with behavior times would suffice (B. Cook, personal communication,
concerns was not, unless quantitative information from sys- March 14, 2015), in contrast to a study only mentioning
tematic procedures enabling replication was provided. We broad assessment without time points. For example, a study
also considered intellectual disability (ID) sufficient when that assessed 30% of sessions for implementation fidelity
paired with an IQ score. When all students in a classroom would not meet 5.3, but a study that assessed 30% of each
were participants, we considered QI 2.2 met, as students with phase for each participant would. We further clarified studies
disabilities were not the focus of such studies. IRR for QI 2.1 did not have to report a measure of fidelity for each condi-
was 100% and 2.2 was 84.62%. tion/phase if an aggregate measure of fidelity was reported
from sufficient time points throughout the study. IRR was
QI 3.0. Intervention agent. To meet QI 3.1, a study had to 100% for QI 5.1, 96.15% for 5.2, and 88.46% for 5.3.
describe the intervention agent’s role (e.g., teacher,
researcher, paraprofessional, parent; electronic devices were QI 6.0. Internal validity.  To meet QI 6.1, researchers had to
excluded) and background (e.g., race/ethnicity, educational demonstrate control and systematic manipulation of the IV.
background, licensure). This component was met for our For our review, we determined this QI could only be met if
review if at least one background variable was mentioned in implementation fidelity (QI 5.1) was established, allowing
addition to the agent’s role. To meet QI 3.2, a study had to (a) the possibility of experimental control (Carroll et al., 2007;
describe specific training the intervention required or quali- Slaughter, Hill, & Snelgrove-Clarke, 2015). To meet QI
fications (e.g., special education teaching credential) required 6.2, baseline conditions had to be described, though alter-
to implement the intervention and (b) indicate the interven- nating treatment designs were not required to have a base-
tion agent achieved them at some criterion. If the intervention agent was the study author and designed the method and measures (e.g., Stenhoff, Davey, & Lignugaris-Kraft, 2008), he or she would not need to be “trained” how to implement the intervention, so QI 3.2 was coded as met. Although offering instructional choice might first appear to some people to not need training, in practice, we found training was required to confirm understanding and for systematic implementation with consistent integrity (e.g., Lane, Royer, et al., 2015). IRR for QI 3.1 was 76.92% and 84.62% for 3.2.

QI 4.0. Description of practice. QI 4.1 required studies to describe intervention procedures and intervention agent actions (or cite a source where such information may be found) with enough detail to be replicated. Description of materials used was required to meet QI 4.2, or an appropriate citation. IRR was 96.15% for QI 4.1 and 100% for QI 4.2.

QI 5.0. Implementation fidelity. To meet QI 5.1, a study had to assess and report implementation fidelity related to faithfulness using a direct, reliable measure such as an observation checklist of critical intervention components (e.g., teacher offered students a choice between X and Y; teacher allowed students to choose order of tasks). For 5.2, studies had to assess and report implementation fidelity related to dosage or exposure also using a direct, reliable measure. In general, we coded QI 5.2 met if 5.1 was met, dosage was described (not necessarily reported by checklist), and a graph of DV data was present. To meet QI 5.3, a study had to assess fidelity “(a) regularly throughout implementation of the

line (though still recommended; Gast & Ledford, 2014). To meet QI 6.3, authors had to indicate participants had no or very limited access to intervention components (i.e., choice) in control/comparison conditions or during baseline and withdrawal phases. We required this to be obvious by physical location (groups at different schools) or explicitly stated in some manner (e.g., “no choice-making opportunities were provided during baseline”; “choices were removed during withdrawal phase”), and articles that only indicated choices were provided during intervention phases did not meet QI 6.3. QI 6.4 applied only to group designs, and required researchers to describe how participants were assigned to groups, and if not done so randomly, to describe procedures used to ensure functional equivalence of groups (e.g., matched on various scores and/or demographics, statistically control for differences).

SCRD studies were required to employ a design that allowed for the possibility of three demonstrations of experimental effect at three different times (e.g., A-B-A-B withdrawal, multiple baseline, alternating treatment, changing criterion) to meet QI 6.5. When study authors stated use of a particular design (e.g., A-B-A-B), we confirmed graphed results matched (e.g., was not A-B-A). To meet QI 6.6, SCRD studies with a baseline phase were required to include at least three baseline data points unless justification otherwise was provided (e.g., ethical considerations, countertherapeutic trend). QI 6.7 was met when the SCRD controlled for common threats to internal validity and allowed alternative rationalizations for results to be sensibly ruled out (e.g., A-B-A-B withdrawal, multiple baseline, alternating treatment,
changing criterion). In this review, we again required QI 5.1 (implementation fidelity) be established as a prerequisite for QI 6.7, as an SCRD could only control for threats to internal validity if it was implemented with integrity. QI 6.8 was met for group studies when overall attrition across groups was low, which we defined as <30%, and QI 6.9 was met when between-group attrition was low, defined as <10%, or statistically controlled for. IRR scores for QIs 6.1 to 6.9 were 92.31%, 100%, 73.08%, 66.67%, 86.96%, 91.30%, 95.65%, 100%, and 100%, respectively.

QI 7.0. Outcome measures/DVs. To meet QI 7.1, outcomes of the intervention had to be socially valid, which for this review could be demonstrated by a social validity questionnaire or if a strong case was made in the study’s introduction or discussion. For QI 7.2, a study had to define DVs and describe a direct, reliable measurement system used with enough detail to allow replication (Cook et al., 2015). To meet QI 7.3, all outcome measures had to be reported (clearly graphed data were sufficient). To meet QI 7.4, frequency and timing of outcome measures had to be appropriate, meaning for SCRD a minimum of three data points per phase were collected or otherwise justified (e.g., ethical considerations, countertherapeutic trend) and for group designs, assessment measures were conducted relatively near the intervention chronologically. QI 7.5 was met when reliability of outcome measures was evidenced, such as by test–retest reliability or interobserver agreement (IOA) ≥80% or κ ≥ 60%. For this review, a mean IOA was acceptable provided no reported IOA scores fell below 60%. If the low end IOA range was consistently <70% when using total agreement (Kennedy, 2005), QI 7.5 was not considered met because total agreement often results in inflated percentages as opposed to interval-by-interval agreement (Cooper, Heron, & Heward, 2007). QI 7.6 was applicable to group designs and met when validity was evidenced, such as through content or construct validity. IRR for QIs 7.1 to 7.6 were 100%, 100%, 96.15%, 96.15%, 80.77%, and 100%, respectively.

QI 8.0. Data analysis. QI 8.1 and 8.3 applied only to group designs, and were met when appropriate data analyses were utilized to compare group performance and when an effect size statistic was reported (or enough data to where an effect size could be calculated), respectively. QI 8.2 applied to SCRD, which required a graph of DV data from all phases allowing visual analysis techniques (i.e., mean, level, trend, overlap, consistency of data patterns across phases) for making conclusions about experimental control. IRR for QI 8.1 and 8.2 was 100% and 33.33% for QI 8.3 (see “Discussion” section).

Evaluation Procedures for Determining EBP

After coding articles, we applied CEC (2014) criteria to determine whether instructional choice could be categorized as an evidence-based practice, potentially evidence-based practice, mixed evidence, insufficient evidence, or negative effects. Studies can be considered for EBP classifications when they are methodologically sound, which for our review meant meeting 80% of group or SCRD QIs. Such studies are then classified as having positive, neutral or mixed, or negative effects based on effect sizes for group designs (e.g., Cohen’s d ≥ 0.25 = positive, d ≤ −0.25 = negative, −0.25 < d < 0.25 = neutral or mixed) or for SCRD the number and proportion of participants: With a minimum of three total cases, effects are positive when at least 75% of cases demonstrate a therapeutic change as a result of a functional relation between the IV and DVs, effects are negative if at least 75% demonstrate countertherapeutic changes, and effects are neutral or mixed when changes occur in either direction for less than 75% of cases. An EBP must have positive effects in (a) two randomized group studies with 60 or more participants, (b) four nonrandomized group studies with 120 or more participants, (c) five SCRDs with 20 or more participants, (d) one randomized group study with 30 or more participants plus three SCRDs with 10 or more participants, or (e) two nonrandomized group studies with 60 or more participants plus three SCRDs with 10 or more participants. No studies can have negative effects and the ratio of positive to neutral or mixed effects must be 3:1 or greater. For potentially EBP, mixed evidence, and negative effects classification requirements, please see CEC (2014). A practice is considered for the insufficient evidence category when there are not enough studies to meet criteria for any of the previous evidence-based categories.

After coding QIs, we applied an 80% criterion (Lane et al., 2009) to determine which studies were methodologically sound. This method gave recognition to articles that met most QI components by proportionally weighting scores to contribute to a composite score. For example, QI 7.0 has five components for SCRD, so each component contributes 20% to the total score for QI 7.0. If one component of QI 7.0 was not met, rather than score the QI as zero, weighted scores for remaining components contribute to a composite score of .80 (.20 + .20 + .20 + .20 = .80), meeting the 80% weighted coding criterion. We considered a total score of 6.40 (.80 × 8 QIs = 6.40) as indicating sufficient rigor to categorize the study as methodologically sound.

Results

This systematic review of the literature sought to determine the extent instructional choice interventions in traditional K–12 school settings (no pre-K studies were identified) adhered to the eight CEC (2014) QIs, and to what degree studies meeting QIs support instructional choice as an EBP following CEC guidelines. Dunlap et al. (1994) conducted two experiments, which we coded separately, bringing the total number of studies reviewed to 26.
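
As an illustration only, the 80% weighted coding criterion and the effect classification rules described under the evaluation procedures can be sketched in Python. This is a minimal sketch, not the coding tool used in the review: the function names, the dictionary layout, the per-QI component counts, and the example study are all hypothetical. Only the thresholds (.80 × 8 QIs = 6.40, d ≥ 0.25, d ≤ −0.25, 75% of cases, minimum of three cases) come from the text.

```python
from typing import Dict, List


def qi_score(components_met: List[bool]) -> float:
    """Proportional score for one QI: share of its applicable components met."""
    return sum(components_met) / len(components_met)


def composite_score(study: Dict[str, List[bool]]) -> float:
    """Sum of proportional QI scores (maximum equals the number of QIs)."""
    return sum(qi_score(parts) for parts in study.values())


def is_methodologically_sound(study: Dict[str, List[bool]],
                              criterion: float = 0.80) -> bool:
    """True when the composite reaches criterion x number of QIs (6.40 for 8 QIs)."""
    # A small tolerance guards against floating-point rounding at the cut point.
    return composite_score(study) >= criterion * len(study) - 1e-9


def classify_group_effect(d: float) -> str:
    """Group designs: classify by Cohen's d using the thresholds quoted in the text."""
    if d >= 0.25:
        return "positive"
    if d <= -0.25:
        return "negative"
    return "neutral or mixed"


def classify_scrd_effect(therapeutic: int, countertherapeutic: int,
                         total_cases: int) -> str:
    """SCRD: classify by the proportion of cases showing each kind of change."""
    if total_cases < 3:
        # The text requires a minimum of three total cases; raising an error
        # here is this sketch's own choice, not something the text specifies.
        raise ValueError("at least three cases are required")
    if therapeutic / total_cases >= 0.75:
        return "positive"
    if countertherapeutic / total_cases >= 0.75:
        return "negative"
    return "neutral or mixed"


# Hypothetical SCRD study: one of two QI 3.0 components and one of five
# QI 7.0 components were not met; everything else was met.
example = {
    "QI 1.0": [True],
    "QI 2.0": [True, True],
    "QI 3.0": [True, False],
    "QI 4.0": [True, True],
    "QI 5.0": [True, True, True],
    "QI 6.0": [True] * 6,
    "QI 7.0": [True, True, True, True, False],
    "QI 8.0": [True],
}

print(round(composite_score(example), 2))   # 7.3
print(is_methodologically_sound(example))   # True
```

In this hypothetical case QI 3.0 contributes .50 and QI 7.0 contributes .80, so the composite of 7.30 exceeds 6.40 and the study would count as methodologically sound under the weighted criterion.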

QI 1.0. Context and Setting

Results revealed all studies met QI 1.0, sufficiently describing context and setting. All studies reported type of program or classroom, and all but three studies reported type of school. Additional descriptors were sometimes described, with eight studies (30.77%) reporting area type (e.g., large urban city) and 10 studies (38.46%) reporting geographic location (see Table 1 for descriptive statistics).

QI 2.0. Participants

A large variety of participant demographics were reported (see Table 1 for a full listing of student characteristics). All studies met QI 2.1 participant demographics, and 15 studies (57.69%) additionally met QI 2.2 disability status (see Figure 2). Studies met QI 2.2 when they described the disability (or risk status) and provided a description of who applied what criteria to make the determination. For example, Seybert, Dunlap, and Ferro (1996) met QI 2.2 as IQ scores were reported along with ID diagnoses; Romaniuk et al. (2002) followed teacher nominations with functional analyses; and Cole and Levinson (2002) conducted observations to confirm high rates of challenging behaviors reported by teachers’ nominations. In studies where the whole class participated (Bicard, Ervin, Bicard, & Baylot-Casey, 2012; Clifford, 1975; Myrow, 1979; Patall, Cooper, & Wynn, 2010), QI 2.0 was considered met.

QI 3.0. Intervention Agent

Seven studies (26.92%) met both components of QI 3.0. Twelve (46.15%) sufficiently described the role and background of the intervention agent to meet QI 3.1, and eight (30.77%) reported required training to meet QI 3.2. In most studies (n = 18; 69.23%), intervention agents were either the classroom teacher, paraeducator, or a school therapist, whereas in other studies, experimenters implemented choice interventions. In Myrow (1979), the author and graduate student implemented the study, so training was not required as they designed the intervention. Hua, Lee, Stansbery, and McAfee (2014) described how teacher training used a model, prompt, check method with procedural checklist, meeting QI 3.2. Lane, Royer, et al. (2015) provided a table of intervention agent demographics to meet QI 3.1 and detailed the remote training method (e.g., voiced-over presentations, quizzes, data collection video practice, teleconferences) used to reach criterion before intervention implementation, meeting QI 3.2.

QI 4.0. Description of Practice

All studies met QI 4.1 intervention procedures and agent actions whereas 25 (96.15%) also met QI 4.2 materials. For example, Smeltzer et al. (2009) described the activity preference assessment used to identify low-preference tasks (e.g., dot-to-dot worksheets, math, stringing beads, pattern blocks, wiping trays), and how tasks were presented to participants on a student choice board for the choice condition and on a teacher’s choice board for the yoked no-choice condition, with each board ending in free time. Hua et al. (2014) provided detailed descriptions of materials and procedures for each 5-min session at the end of math instruction, including size of worksheets and paper slips used and how multiplication problems were generated.

QI 5.0. Implementation Fidelity

Overall, 11 studies (42.31%) met all components of QI 5.0. Twelve studies (46.15%) met QI 5.1 implementation fidelity adherence, 23 studies (88.46%) met QI 5.2 implementation fidelity dosage, and 11 studies (42.31%) met QI 5.3 assessing and reporting implementation fidelity regularly throughout the study. Jolivette et al. (2001) videotaped all sessions for two observers to later assess for teacher implementation of procedures. Rispoli et al. (2013) described a procedural task analysis for at least 30% of sessions across all conditions and described frequency and length of each session (i.e., intervention dosage). Lane, Royer, et al. (2015) met QI 5.1 with component checklists for each phase and condition, including one for baseline and withdrawal phases to ensure (a) baseline components remained in effect during choice conditions and (b) addition of choice was the only change in intervention phases.

QI 6.0. Internal Validity

Eleven studies (42.31%) met all applicable components of QI 6.0. Thirteen studies (50.00%) met QI 6.1 having systematically manipulated the IV, all met QI 6.2 having described control/comparison or baseline conditions, and 17 (65.38%) met QI 6.3 avoiding baseline or control/comparison contamination. All group design studies described assignment to groups and met QI 6.4, whereas two (66.67%) met 6.8 and 6.9 for acceptable attrition across and between groups. For SCRD, 21 (91.30%) met QI 6.5 having used an experimental design that allowed for demonstration of experimental effects at three different time points, all met QI 6.6 having included three data points in all baseline phases, and 11 (47.83%) met QI 6.7 with an experimental design appropriate for controlling for threats to internal validity.

Smeltzer et al. (2009) and Hua et al. (2014) used alternating treatment designs that do not require baseline, but each no-choice condition was described and reported at least three data points so QI 6.2 and 6.6 were met. For QI 6.3, avoiding contamination of control/comparison or baseline, Cole and Levinson (2002) used treatment integrity data to show no-choice conditions only had directive prompts and no choice questions; Patall et al. (2010) kept students unaware of the treatment condition classmates were in, and directed them to not discuss aspects of the study.
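
The percentages reported in these results follow from simple proportions, and a short Python check makes the denominators explicit. This is a hedged sketch rather than anything taken from the article: the counts of 23 SCRD and 3 group studies are inferred here from the reported figures (21 of 23 = 91.30%; 2 of 3 = 66.67%), and the names `percent_met`, `reported`, and `mismatches` are hypothetical.

```python
TOTAL_STUDIES = 26   # stated in the text
SCRD_STUDIES = 23    # assumption inferred from 21 (91.30%) and 11 (47.83%)
GROUP_STUDIES = 3    # assumption inferred from two (66.67%)


def percent_met(n_met: int, n_applicable: int) -> float:
    """Percentage of applicable studies meeting a QI, to two decimals."""
    return round(100 * n_met / n_applicable, 2)


# (label, studies meeting the QI, applicable studies, percentage in the text)
reported = [
    ("QI 2.2", 15, TOTAL_STUDIES, 57.69),
    ("QI 3.1", 12, TOTAL_STUDIES, 46.15),
    ("QI 3.2", 8, TOTAL_STUDIES, 30.77),
    ("QI 4.2", 25, TOTAL_STUDIES, 96.15),
    ("QI 5.1", 12, TOTAL_STUDIES, 46.15),
    ("QI 5.2", 23, TOTAL_STUDIES, 88.46),
    ("QI 5.3", 11, TOTAL_STUDIES, 42.31),
    ("QI 6.1", 13, TOTAL_STUDIES, 50.00),
    ("QI 6.3", 17, TOTAL_STUDIES, 65.38),
    ("QI 6.5", 21, SCRD_STUDIES, 91.30),
    ("QI 6.7", 11, SCRD_STUDIES, 47.83),
    ("QI 6.8/6.9", 2, GROUP_STUDIES, 66.67),
]

# Collect any row whose recomputed percentage differs from the reported one.
mismatches = [
    label
    for label, n_met, n_applicable, pct in reported
    if abs(percent_met(n_met, n_applicable) - pct) > 0.005
]
print(mismatches)  # []
```

Under these assumed denominators every recomputed value matches the reported one, which suggests component-level percentages for QI 6.5 to 6.9 are based on the design-specific subsets rather than all 26 studies.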
Table 1. Descriptive Results of Instructional Choice Intervention Studies.

Clifford (1975)
QI 1.0: 18 fifth-grade and 16 sixth-grade classes from 15 school systems in Iowa
QI 2.0: 811 students (410 boys, 401 girls), plus partial data for 46 more students; classes chosen based on teacher volunteers
QI 3.0: Classroom teachers; training = instruction manual
QI 4.0: Four formats of vocabulary study lists with 20 words; next day = multiple-choice quiz on half the words
QI 5.0: Not described
QI 6.0: Group pre–post experimental design with random assignment of classes to choice or no-choice conditions
QI 7.0: DVs = multiple-choice vocabulary quizzes, task-liking measure, perceived-learning measure, retention measure; SV not reported
QI 8.0: Choice decreased performance and did not affect attitude; only with vocabulary format 3 did students in the choice condition score higher than students in the no-choice condition

Myrow (1979)
QI 1.0: Nine English classes at Urbana HS, Illinois
QI 2.0: ~200 11th- and 12th-grade students in English; 133 completed all tasks; teachers volunteered full classes of students based on department head invitation
QI 3.0: Author and graduate student research assistant
QI 4.0: Six reading packet options at 11th-grade reading difficulty level; on day 2 choice and no-choice situations reversed
QI 5.0: Not described
QI 6.0: Group pre–post quasi-experimental crossover design; high attrition rate >30%
QI 7.0: DVs = 25-item multiple-choice tests, personal causation scale, affective measure, continuing motivation measure, study time; IOA and SV not reported
QI 8.0: Three of four choice conditions scored higher than no-choice conditions on test; significant main effect of choice on affect, motivation, and time spent studying, but not on personal causation

Dunlap et al. (1994)
QI 1.0: Public ES, self-contained SpEd classes for students with ED
QI 2.0: Study 1 = Wendall & Sven, fifth grade; Study 2 = Ahmad, K; students referred by homeroom teacher
QI 3.0: Study 1 unclear; Study 2 = behavioral consultant; no background variables or training described
QI 4.0: Two 15-min independent seatwork sessions, English or spelling; menu of six to 10 academic activities
QI 5.0: Not described
QI 6.0: Wendall and Ahmad = A-B-A-B; Sven = A-B-A; during no-choice phases teacher selected assignments or book
QI 7.0: DVs = task engagement and DB; IOA for 38%–57% of sessions; SV not reported
QI 8.0: Greater % of task completion and intervals with task engagement during choice phases and lower % of DB intervals

Seybert, Dunlap, and Ferro (1996)
QI 1.0: HS in suburbs of large metropolitan area in southeastern United States; large vocational classroom or courtyard on school campus
QI 2.0: Three students with ID: Scott = 14 yo, Bob = 15 yo, Maria = 21 yo also deaf and with a physical disability
QI 3.0: Exp; no background variables or training described
QI 4.0: Two tasks 7 min each for each session; representative materials for four to six vocational tasks were placed on a table, students were asked to select one each session
QI 5.0: Tracked exp interactions (redirections) only
QI 6.0: Multiple-baseline design across participants with reversal: A-B-A design (Bob) or A-B-A-B design (Scott and Maria)
QI 7.0: DVs = % of intervals with PB and task engagement; IOA for 30%–34% of sessions; SV not reported
QI 8.0: Introduction of choice reduced mean rate of problem behaviors; task engagement increased during choice conditions; interest and happiness affect measures were higher or same in choice conditions

Cole, Davenport, Bambara, and Ager (1997)
QI 1.0: University-affiliated laboratory school for students with EBD and autism: 75 students with serious ED and 25 with severe ID and PB
QI 2.0: Three boys: Abe = 12 yo, Ben = 13 yo, Sam = 11 yo; informal observations confirmed teacher reports of off-task and DB
QI 3.0: Exp; no background variables or training described
QI 4.0: 15-min sessions with a 5-min break between; choices of tasks included stapling, sealing envelopes, bagging utensils, collating paper, and stuffing folders with paper
QI 5.0: Not described
QI 6.0: Alternating treatment design; students experienced three conditions each day during vocational period
QI 7.0: DVs = task engagement, DB, work productivity, accuracy; 32% of experimental sessions assessed for IOA; SV not reported
QI 8.0: Task engagement comparable for assigned preferred and choice tasks; productivity had variable results

Powell and Nelson (1997)
QI 1.0: GE classroom during afternoon ELA with teacher and one to two observers and 23 students
QI 2.0: Evan = 7 yo second-grade boy with ADHD
QI 3.0: Teacher; no background variables or training described
QI 4.0: Three assignment choices each session included spelling lists, silent reading, grammar and punctuation exercises, and others
QI 5.0: Not described
QI 6.0: A-B-A-B design; no academic choices given during baseline and withdrawal phases
QI 7.0: DV = undesirable behavior recorded by 10-s momentary time sampling; IOA for 70% of sessions; SV not reported
QI 8.0: Undesirable behaviors decreased during choice conditions

Vaughn and Horner (1997)
QI 1.0: Two public ES SpEd classrooms
QI 2.0: Four students, 7–12 yo, ID, Down syndrome and ID, ASD and moderate to severe ID, Angelman syndrome and moderate to severe ID
QI 3.0: Teachers; no background variables or training described
QI 4.0: 10-min sessions, either choice between two higher preference tasks or choice between two lower preference tasks
QI 5.0: Videotaped and coded on computers
QI 6.0: A-B-A-B design, with choice conditions compared with those where the teacher selected the same tasks
QI 7.0: DV = PB/DB; IOA for 39%–48% of sessions; SV not reported
QI 8.0: Mean rates of PB per minute during choice between low-preference tasks were lower, with high-preference tasks, the rates of PB were variable but relatively lower regardless of choice or no choice

Table 1. (continued)

Killu, Clare, and Im (1999)
QI 1.0: Public MS with 250 students; self-contained SpEd classrooms during independent spelling work
QI 2.0: Three boys, 12–13 yo, LD, ID; teacher reports of distractibility and off-task behavior verified through independent observation for study inclusion
QI 3.0: Female teacher presented choices to students; teacher and paraeducator reviewed work for accuracy and completion; training not described
QI 4.0: 30-min sessions 4 days per week; five spelling task options
QI 5.0: Not described
QI 6.0: A-B-C-D-E-F design: choice of preferred tasks, choice of nonpreferred, no choice preferred, no choice nonpreferred, no choice preferred yoked, no choice nonpreferred yoked
QI 7.0: DV = task engagement; IOA for 29%–31% of sessions; SV not reported
QI 8.0: Higher task engagement with high-preference tasks regardless of choice or no choice; minimal effects of choice with nonpreferred tasks

Jolivette, Wehby, Canale, and Massey (2001)
QI 1.0: Public ES; self-contained SpEd classroom for students with EBD
QI 2.0: Three boys with ED, math level 1–2 yo below grade level, Nicky = 7.10 yo, first grade; John = 7.6 yo, second grade; Bruce = 7.10 yo, second grade
QI 3.0: Female SpEd teacher; training = restated, modeled, and practiced delivery
QI 4.0: First 15 min of independent math period; chose order of three math worksheets
QI 5.0: All sessions videotaped and assessed for teacher implementation of procedures (100% for all)
QI 6.0: Multiple-baseline design with B-A-B withdrawal; during no-choice sessions the teacher randomly assigned order of worksheet completion
QI 7.0: DVs = task engagement, off-task, disruption, attempted task problems and problems correct; IOA for 25%–29% of sessions; SV = TARF-R
QI 8.0: Two of three students showed increased appropriate behaviors and task engagement with moderate effects with different results for each variable

Cole and Levinson (2002)
QI 1.0: University-affiliated laboratory school for students with EBD who displayed internalizing behaviors
QI 2.0: Two boys, 7 and 8 yo, one with PDD, Tourette’s syndrome, and ID, one with Down syndrome, PDD, ADD, and ID
QI 3.0: Three female paras; 30-min training prior to each condition
QI 4.0: Paras embedded choice opportunities in daily instructional routine task analyses
QI 5.0: Type of verbal prompt; no-choice conditions = 100% directive prompts; choice condition sessions needed ≥80% choice questions
QI 6.0: A-B-A-B design; during all no-choice conditions only directive prompts were given
QI 7.0: DVs = % of task analysis steps with PB, number of task analysis steps prior to PB, and (for Wally) % of independently initiated steps; SV not reported
QI 8.0: Lower % of task analysis steps with PB and larger number of steps completed prior to PB during choice; Wally increased independently initiated steps

Kern, Bambara, and Fogt (2002)
QI 1.0: University-affiliated private MS for students with severe behavioral challenges, science classroom
QI 2.0: Six boys, 13–14 yo, labels of severe ED and other diagnoses
QI 3.0: Teacher; first or second author provided training the day before each phase
QI 4.0: 21- to 35-min sessions during science; group and individual choice of activity
QI 5.0: Type of activity, choice provided, and high-interest activity; teacher plan log compared with observer data
QI 6.0: A-B-A-B design; during baseline lesson activities teacher directed all students to complete same assignments same format
QI 7.0: DVs = engagement and destructive behavior; IOA for 29% of sessions; SV = TARF-R and class evaluation sheet
QI 8.0: Mean class engagement increased 30% first intervention; mean destructive behavior decreased from 12% to 0%

Romaniuk et al. (2002)
QI 1.0: Public elementary school, classrooms serving children with behavior problems
QI 2.0: Three boys and four girls, 5–10 yo, variety of diagnoses, range of classroom settings
QI 3.0: Therapist; no background variables or training described
QI 4.0: Choice of eight to 10 teacher-nominated academic tasks
QI 5.0: Not described
QI 6.0: A-B-A-B design; functional analysis condition that produced highest rate of PBs served as the first no-choice condition
QI 7.0: DV = % of session time with PB (Riley’s DV = frequency of PB because PBs were brief); IOA for students across 25% of sessions; SV not reported
QI 8.0: Mean % of PB time reduced 46.6% for four students

Carson and Eckert (2003)
QI 1.0: Northeastern urban ES (K–6), 515 students, 53.7% White, 41.5% Black; ~50% FRL
QI 2.0: Three students: Tamara and Catrina = 9 yo, female, Black; Devin = 10 yo, male, White; referred by teacher, verified by three math probes
QI 3.0: Four graduate students in school psychology; trained in study procedures and demonstrated 100% accuracy
QI 4.0: One or two 15-min sessions three times per week with 3-min break; in phase 2, students chose math condition
QI 5.0: Scripted procedural protocols; 33% of sessions assessed by audiotape = 100% integrity
QI 6.0: Phase 1 = multielement assessment; phase 2 = alternating treatment design; baseline condition was one math CBM for 2 min with no other intervention or feedback
QI 7.0: DV = digits computed correctly per minute; interscorer agreement for 100% of sessions; SV not reported
QI 8.0: All students chose contingent reinforcement; for two of three students, choice conditions were higher than baseline but not exp chosen condition

Daly, Garbacz, Olson, Persampieri, and Ni (2006)
QI 1.0: Public MS; small room next to SpEd classroom
QI 2.0: Two seventh-grade, 13 yo students with EBD reading at fourth-grade level referred by teacher; Tina = female, Black; Jacob male, White; assessed for inclusion
QI 3.0: Graduate students in a school psychology training program with prior training in behavioral measurement and CBM procedures
QI 4.0: 3 days per week, 10 min of instruction then test; chose type and time of reading instruction and reward
QI 5.0: Checklist protocol for each condition including scripted verbal instructions; 100% of sessions assessed by audiotape
QI 6.0: Multiple-probe design across tasks (reading passages); during baseline and maintenance students read aloud while exp marked errors
QI 7.0: DVs = correct read words and errors per 30 s; IOA for 100% of sessions by audio recording; SV not reported
QI 8.0: Jacob showed change in trend; Tina change in level; cannot separate choice from opportunities to respond and motivational variables

Table 1. (continued)

Mechling, Gast, and Cronin (2006)
QI 1.0: Public middle school, self-contained classroom for students with ASD
QI 2.0: Two boys: Jackson = 13.2 yo, diagnosed with autism and mild ID; Donald = 14.4 yo, diagnosed with autism and moderate ID
QI 3.0: Intervention agent unclear
QI 4.0: Two 30-min independent work sessions per day; three work tasks completed per session; choice of reinforcer
QI 5.0: Videotaped 33% of sessions across conditions
QI 6.0: A-B-A-B design; identical tasks each condition; A = only tangible reinforcers; B = video presentation of preferred items and choice of videos
QI 7.0: DVs = task duration, task errors; IOA for 33% of sessions across all conditions; SV not reported
QI 8.0: Mean task duration decreased with choice in each intervention phase

Lannie and Martens (2008)
QI 1.0: Large mid-Atlantic urban district; ES with 615 students (79% Black, 19% Latino, 0.8% White; 86.9% FRL); fifth-grade GE during math instruction
QI 2.0: Two boys and three girls, fifth grade, Black, ages 10.7 to 12.4, below 16th or 25th percentile on WJ-III calculation subtest, on task <60%
QI 3.0: First author and two UG students; trained prior to study, demonstrated proficiency with all procedures
QI 4.0: 5- to 7-min sessions for 10 weeks; audio cues for students to self-monitor during math probe completion; received chosen reinforcer after five successful intervals
QI 5.0: Scripted protocols; 35% sessions observed; 100%, 99%, 99%, 98% TI for each student
QI 6.0: Multiple-baseline design; during baseline students completed math probes at frustration level for 5 min
QI 7.0: DVs = digits correct per minute, accuracy, on-task behavior; IOA for 35% of sessions; SV = CIRP
QI 8.0: Digits correct/minute: Willie = +4.56, Ellen = +0, Debra and Charles increased during self-monitoring; accuracy results = little evidence

Stenhoff, Davey, and Lignugaris-Kraft (2008)
QI 1.0: High school resource biology classroom with 15 students classified with LD or behavior disorders
QI 2.0: 15-year-old male, ninth grade, with LD
QI 3.0: SpEd teacher (researcher)
QI 4.0: 40-min sessions; class instruction followed by assignment to whole class; student given 15 s to make choice between class assignment or alternate assignment
QI 5.0: Not described
QI 6.0: A-B-A-B design; A = no-choice condition, B = choose between two assignments
QI 7.0: DVs = % of assignment completed and accuracy (IOA for 31%–62% of sessions); SV = course grade before and after
QI 8.0: % task completion increased 87% with 75% accuracy; reintroduction task completion = 99% with 81% accuracy

Smeltzer, Graff, Ahearn, and Libby (2009)
QI 1.0: Public elementary school in a separate classroom used for 1:1 instruction
QI 2.0: Three boys: Will = 8 yo, second grade, with PDD-NOS; Dan = 6 yo, K, with ASD; Frank = 6 yo, K, with Fragile X
QI 3.0: Exp; no background variables or training described
QI 4.0: Activity preference assessment = three low-preference tasks on choice board then teacher’s board (no-choice condition, yoked), each ending in free time
QI 5.0: Not described
QI 6.0: Yoked condition alternating treatment then concurrent operants (picked who chose order of tasks), followed by a three-condition alternating treatment design
QI 7.0: DVs = PB rate or % (IOA for 30%–33% of sessions); on-task (IOA for 33% of sessions); duration to complete tasks (IOA not reported); SV not reported
QI 8.0: On-task behavior higher with choice for Dan, mixed for Will, Frank = no change; rate of PB lower; task order choice reduced task completion time

Patall, Cooper, and Wynn (2010)
QI 1.0: 14 classrooms in two southeastern urban HSs; 55.5% White, 28% Black, 7% Asian, 3% Latino, 1.5% Native American, 5% Other
QI 2.0: 207 students, 54% female, Grades 9–12, 14 different classrooms, six subjects
QI 3.0: Preservice teachers completing teaching internship; received training from first author
QI 4.0: Homework assignments over 4 weeks; choice between two homework assignments; no-choice students yoked to choice students and completed the same assignment
QI 5.0: End of unit assessment of student-perceived choice regarding homework
QI 6.0: Group pre–post experimental crossover design, randomly assigned students to choice or no-choice conditions by drawing names out of a hat
QI 7.0: DVs = intrinsic motivation inventory, homework assignments, unit tests; IRR not provided; SV not reported
QI 8.0: Choice = predictor of interest and enjoyment, perceived competence, test scores, and homework completion, but not effort, value, or pressure during homework

Bicard, Ervin, Bicard, and Baylot-Casey (2012)
QI 1.0: Southeastern parochial ES; fifth-grade class with 20 students average per day
QI 2.0: 21 fifth-grade students, 10 boys and 11 girls, 10–11 yo; two students received SpEd services
QI 3.0: Teacher (second author) working on master’s degree
QI 4.0: Individual seating and group seating choice conditions; students chose seats at the beginning of the week for that week
QI 5.0: Secondary observer checked 30% of the sessions
QI 6.0: A-B-A-B reversal design for individual seating and again for group seating
QI 7.0: DV = DB measured by tally per student in 1-min intervals and rate calculated for whole class; IOA for 28% of sessions; SV not reported
QI 8.0: DB occurred less when teacher selected seats; during group seating with choice, DB occurred >twice as often; during individual seating choice, DB occurred >three times as often

Rispoli et al. (2013)
QI 1.0: Alex = resource classroom during ELA; Dylan = self-contained classroom; Kelly = university-based autism clinic; Eddie = kitchen table at home
QI 2.0: Three boys and one girl, 5–11 yo, with ASD
QI 3.0: Dylan’s teacher; trained graduate student researcher for other participants
QI 4.0: 5-min sessions two to four times per day, 2–4 days per week; across-task choice = choose from two to four activities; within-task choice = choice of work location, materials, response
QI 5.0: TI data collected 30% of sessions using procedural task analysis
QI 6.0: A-B-A-B alternating treatment (across-task and within-task choice by random assignment) design; A = no choice
QI 7.0: DVs: off-task, screams, aggression; elopement, delayed echolalia, destruction, verbal protesting; IOA 20% sessions; SV not reported
QI 8.0: All participants’ rates of PB decreased in choice phases, across-activity choice associated with lowest rate of PB

Table 1. (continued)

Skerbetz and Kostewicz (2013)
  QI 1.0: Public charter school in large urban city, general education classroom, vocabulary component of ELA; 20 other students present and data collector
  QI 2.0: Two boys and three girls, eighth-grade, Black, 13 yo, six to 26 ODRs, identified as ED or at risk of ED
  QI 3.0: Classroom teacher; no background variables or training described
  QI 4.0: 10 s to choose one of four vocabulary assignments from a packet to complete in 7 min
  QI 5.0: Observations videotaped and scored by exp. using checklist; procedures were followed 100%
  QI 6.0: A-B-A-B design, A = no choice, B = choice condition; during baseline students given one vocabulary assignment without choice to complete in 7 min
  QI 7.0: IOA for 45%–55% of sessions; DVs = task engagement/nonengagement, task accuracy, task completion time; high SV survey results
  QI 8.0: Average task engagement increased 15.4%; accuracy rose an average 10.2%; completion time decreased an average of 1.4 min

Hua, Lee, Stansbery, and McAfee (2014)
  QI 1.0: Public ES in large urban Pennsylvania district, resource room reading and math with one SpEd teacher, one part-time paraeducator, and 10 to 12 students; small group
  QI 2.0: Two boys and one girl, fourth-grade, White, 10 yo, identified with SLD
  QI 3.0: Classroom SpEd teacher; researcher trained using model, prompt, check method with procedural checklist; teacher scored 100% before beginning the study
  QI 4.0: 5-min sessions end of math; choice of math paper size; order of conditions counterbalanced across days
  QI 5.0: Not described
  QI 6.0: Alternating treatment design for math worksheet, paper slip, and student choice
  QI 7.0: DVs = digits correct per minute, task format selected; 33% of assignments scored by an independent observer; SV not reported
  QI 8.0: No obvious differences between conditions for digits correct/min or task accuracy

Lane, Royer, et al. (2015)
  QI 1.0: Large, suburban, Midwest public ES, ~600 K–5 students; GE first-grade, 25 students, writing block, with SpEd teacher supporting students with disabilities
  QI 2.0: One boy, one girl, first-grade, moderate risk on SRSS; Neal = 6 yo, White, with ASD; Tina = 7 yo, Asian
  QI 3.0: General (female, 34 yo, White), special (female, 23 yo, White), and instructional support (female, 37 yo, White) teachers trained via distance technology
  QI 4.0: Across-task (choose order of tasks) or within-task (choose supplies, where to work, with whom to work) choice during independent writing
  QI 5.0: TI data collected 100% of sessions with checklists for each phase; 25% IOA each phase
  QI 6.0: A-B-A-B alternating treatment design (across-task and within-task choice); A = no choice, B = alternating across- and within-task choice by random assignment
  QI 7.0: IOA calculated for at least 25% of sessions within each phase; DVs = academic engaged time and DB; SV = IRP-15 and CIRP
  QI 8.0: Improvements in academic engaged time and reduced DB for Tina; Neal = no functional relation (increased engagement for intervention phase 2)

Skerbetz and Kostewicz (2015)
  QI 1.0: Charter school in large urban city, northeast United States, ~300 K–6 students; GE fifth-grade, 23 students, groups of four to five during independent math review activity
  QI 2.0: Two boys and two girls, fifth-grade, 10–11 yo, one to eight ODRs, diagnosis of either ADHD or ED
  QI 3.0: Exp and SpEd teacher; no background variables or training described
  QI 4.0: 5-min warm up math computation probes with either choice of reinforcement, reinforcement with no choice, or no reinforcement
  QI 5.0: 100% of videotaped sessions scored using an exp. checklist; procedures followed 100%
  QI 6.0: Multielement design; baseline = no choice no reinforcer; phase 1 = independent level math, alternating no reinforcer, no choice reinforcer, choice of reinforcer; phase 2 = instructional level
  QI 7.0: IOA for 54% of sessions for seconds of engagement and frequency of engagement, 100% of session for accuracy of digits correct and incorrect; SV = teacher and student surveys with high SV results
  QI 8.0: Phase 2: better separation between conditions with one student displaying an improving trend for seconds and frequency of engagement

Note. QI = quality indicator; HS = high school; ES = elementary school; SpEd = special education; ED = emotional disturbance; EBD = emotional and behavioral disorder; ID = intellectual disability; PB = problem behavior; GE =
general education; ELA = English language arts; K = kindergarten; yo = years old; DB = disruptive behavior; ADHD = attention deficit hyperactivity disorder; ASD = autism spectrum disorder; exp = experimenter; DV = dependent
variable; SV = social validity; IOA = interobserver agreement; MS = middle school; FRL = free or reduced-price lunch; LD = learning disability; PDD = pervasive developmental disorder; ADD = attention deficit disorder; CBM =
curriculum-based measure; TARF-R = Teacher Acceptability Rating Form–Revised (Reimers, Wacker, Cooper, & DeRaad, 1992); WJ-III = Woodcock–Johnson Tests of Achievement Third Edition (Woodcock, McGrew, & Mather, 2001);
NOS = not otherwise specified; UG = undergraduate; TI = treatment integrity; CIRP = Children’s Intervention Rating Profile (Witt & Elliott, 1985); IRR = interrater reliability; ODR = office discipline referral; SLD = specific learning
disability; SRSS = Student Risk Screening Scale (Drummond, 1994); IRP-15 = Intervention Rating Profile–15 (Witt & Elliott, 1985).

[Figure 2 graphic not reproduced. Abscissa: Study. Primary ordinate: CEC (2014) quality indicator component (1.1 through 8.3). Secondary ordinate: Number of CEC (2014) quality indicators met (0.0 to 8.0). Legend: QIs met: Absolute coding; QIs met: Weighted coding. Studies shown: Daly et al. (2006); Kern et al. (2002); Smeltzer et al. (2009); Patall et al. (2010); Stenhoff et al. (2008); Jolivette et al. (2001); Bicard et al. (2012); Skerbetz & Kostewicz (2015); Carson & Eckert (2003); Skerbetz & Kostewicz (2013); Lane, Royer et al. (2015); Killu et al. (1999); Powell & Nelson (1997); Rispoli et al. (2013); Myrow (1979); Vaughn & Horner (1997); Cole & Levinson (2002); Dunlap et al. #1 (1994); Dunlap et al. #2 (1994); Seybert et al. (1996); Mechling et al. (2006); Lannie & Martens (2008); Hua et al. (2014); Romaniuk et al. (2002); Clifford (1975); Cole et al. (1997).]

Figure 2.  Instructional choice studies (abscissa) and CEC (2014) individual QI components met (primary ordinate; shaded cells =
component met, white cells = component not met).
Note. Secondary ordinate (right y axis) displays number of QIs met by absolute coding (triangles; 8.0 QIs required) and weighted coding (circles), which
required 6.4 QIs (80%) to be considered a methodologically sound study. Components not applicable to a study’s design are marked NA. Dunlap et al.
(1994) contained two studies. CEC = Council for Exceptional Children; QI = quality indicator.
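
The difference between the two coding schemes named in the Figure 2 note can be illustrated with a short sketch. It assumes that weighted coding gives each QI partial credit equal to the share of its applicable components met, and that absolute coding credits a QI only when every applicable component is met. This scoring rule, the function name score_study, and the example study (including its pattern of NA components) are illustrative assumptions, not the coding matrix used in this review.

```python
# Sketch of absolute vs. weighted QI coding.
# Assumed scoring rule and hypothetical data; not the review's actual matrix.

ABSOLUTE_REQUIRED = 8.0   # all eight QIs fully met
WEIGHTED_REQUIRED = 6.4   # 80% of eight QIs


def score_study(components):
    """Return (absolute, weighted) QI totals for one study.

    components maps a QI label to a list of marks for its components:
    True = met, False = not met, None = not applicable (NA).
    """
    absolute = 0.0
    weighted = 0.0
    for qi, marks in components.items():
        applicable = [m for m in marks if m is not None]
        if not applicable:
            raise ValueError(f"QI {qi} has no applicable components")
        share_met = sum(applicable) / len(applicable)
        weighted += share_met                          # partial credit
        absolute += 1.0 if share_met == 1.0 else 0.0   # all-or-nothing
    return absolute, weighted


# Hypothetical study: QI 5.0 misses one of three components and
# QI 7.0 misses one of five applicable components.
example = {
    "1.0": [True],
    "2.0": [True, True],
    "3.0": [True, True],
    "4.0": [True, True],
    "5.0": [True, True, False],
    "6.0": [True] * 7 + [None, None],
    "7.0": [True, True, True, False, True, None],
    "8.0": [None, True, None],
}

absolute, weighted = score_study(example)
print(f"absolute = {absolute:.1f}, weighted = {weighted:.2f}")
print("methodologically sound (absolute):", absolute >= ABSOLUTE_REQUIRED)
print("methodologically sound (weighted):", weighted >= WEIGHTED_REQUIRED)
```

Under these assumptions the hypothetical study earns 6 + 2/3 + 4/5, or about 7.47, weighted QIs and clears the 6.4 criterion, but it is credited with only 6 QIs under absolute coding and so falls short of the 8.0 requirement.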

QI 7.0. Outcome Measures/DVs

Seventeen studies (65.38%) met all applicable components of QI 7.0. All met QI 7.1 and 7.3, reporting or justifying social validity and reporting outcomes for all measures, respectively. Twenty-five studies (96.15%) met QI 7.2 for defining and describing DVs. Twenty-three studies (88.46%) met QI 7.4 frequency and timing of outcome measures, 20 (76.92%) met QI 7.5 reporting adequate internal/interobserver reliability, and all group studies met QI 7.6 evidence of validity. For QI 7.2, Lane, Royer, et al. (2015) included examples and nonexamples of DV behaviors that helped ensure primary and secondary data collectors were in agreement. For QI 7.5, Skerbetz and Kostewicz (2015) described how a trained observer collected IOA data for more than half of all sessions, the method used to calculate IOA, and the IOA percentages for each DV. The three group studies (Clifford, 1975; Myrow, 1979; Patall et al., 2010) met QI 7.6 by describing validity coefficients for measures and demonstrating content validity.

QI 8.0. Data Analysis

All group studies met QI 8.1 and 8.3 with appropriate data analysis techniques and enough information to allow effect sizes to be computed. For example, Myrow (1979) reported means but not standard deviations, but included F statistics, degrees of freedom, and mean squares from analysis of variance tests, allowing for effect size calculation. All SCRD studies met QI 8.2 by including single-case graphs that distinctly indicated DV outcome data across phases and allowed determination of intervention effects through traditional visual analysis of mean, level, trend, overlap, and pattern.

Determination of Instructional Choice as an EBP

The second research question regarded categorizing the evidence base of instructional choice using CEC’s (2014) standards. We applied weighted coding as suggested by Lane et al. (2009), recognizing studies that met 80% of the eight QIs as methodologically sound. Composite scores ranged from 4.97 to 8.00, with 12 studies (46.15%) ≥6.40 and, therefore, eligible to consider when classifying the evidence base of instructional choice in traditional school settings. See Figure 2 for QI components present and weighted coding results for each study.

When we applied CEC (2014) guidelines to classify the 12 studies meeting 80% of the QIs as having positive, neutral or mixed, or negative effects, two of the three group studies (one randomized, one nonrandomized) were included but only two SCRDs met the minimum requirement of having three or more participants (see Table 2). Rispoli et al. (2013) did have four participants, but only two had sessions in traditional school settings, the focus of this review (as recommended by CEC, 2014, our review considered only participants who were part of the target population). Carson and Eckert (2003) had three participants, but a functional relation was only established for two, as one student’s alternating treatment design did not have the requisite four repetitions (Kratochwill et al., 2013). Two SCRDs were not eligible for other reasons. Kern et al. (2002) had six participants, but we were unable to determine the percentage who demonstrated therapeutic change because data were reported in aggregate. Lannie and Martens (2008) had four participants, but we were unable to determine whether the functional relation between IVs and DVs was due to changes in self-monitoring type or due to choice of reinforcer.

We used an Excel-based effect size calculator (DeFife, 2009) to calculate effect size for the two group studies (Myrow, 1979; Patall et al., 2010). We converted F statistics to d = 0.113 for Myrow (1979), and used means and standard deviations across units to calculate d = 0.154 for homework completion rate and d = 0.190 for unit test scores for Patall et al. (2010). According to Cohen (1988), effect sizes >0.20 could be considered small, and for this review, we set the minimum threshold at 0.25 for a group study to be classified as having positive effects. We, therefore, considered these two group studies to be in the neutral or mixed effects category (CEC, 2014).

Jolivette et al. (2001) had three participants, two of whom had a functional relation between the choice IVs and DVs and demonstrated a positive therapeutic trend. The third participant, however, maintained high levels of engagement and low rates of off-task behavior across conditions, so we classified this study as neutral or mixed effects. The final study (Bicard et al., 2012) presented a unique case. Students were able to choose where to sit on Mondays for the week, but on alternating weeks when the teacher assigned seats, students’ disruptive behavior was less, classifying this study’s choice intervention as having negative effects (see “Limitations” section).

With no eligible studies demonstrating positive effects, we were not able to categorize instructional choice into the mixed evidence category, and therefore concluded it fits the insufficient evidence category based on CEC’s (2014) evidence-based category specifications.

Discussion

Choice can increase motivation when a student has clear preferences, appreciates the options provided, enjoys the act of choosing, and benefits from the outcome of choosing (Patall, 2012). Despite the insufficient evidence category classification, it bears noting the majority of study authors reported positive participant outcomes (see Table 1, QI 8.0 for each study’s data analysis). Looking specifically at the 12 methodologically sound studies, the two group studies did report positive, albeit small, effects. Jolivette et al. (2001) had positive outcomes for all three participants but one maintained those outcomes across phases. Five studies did not have enough participants to contribute to the evidence base for instructional choice but reported positive outcomes for how choice can reduce problem behavior for students with comorbid

Table 2. Evaluation of the Evidence Base for Instructional Choice.

                                                        Methodologically sound coding
                                                        Absolute            Weighted
No. of studies                                          3 (11.54%)          12 (46.15%)

Study characteristic                                    SCRD     Group      SCRD     Group
Total no. of participants                               25       0          48       340
No. of studies eligible for effect classification (a)   1        0          2        2
No. of participants in eligible studies                 21       0          24       340

Study effect classification
  Positive                                              0        0          0        0
  Mixed or neutral                                      0        0          1        2
  Negative                                              1        0          1        0

Note. See CEC (2014) for more details. SCRD = single-case research design; CEC = Council for Exceptional Children.
(a) SCRD studies require three or more participants for whom it can be determined if therapeutic or nontherapeutic trends were established.
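
The group-study classifications counted in Table 2 depend on a standardized mean difference compared against the 0.25 criterion set for this review. The sketch below shows one conventional way to obtain d from means and standard deviations, or from a two-group F statistic, and then applies the criterion to the d values reported in this review. The formulas are standard textbook conversions and are assumed here; they are not taken from the DeFife (2009) calculator, and the F conversion holds only for a two-group comparison. The function names are hypothetical, and treating the negative cutoff as the mirror image of the positive one is an assumption.

```python
import math

POSITIVE_THRESHOLD = 0.25  # minimum d for "positive effects" in this review


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_from_f(f_stat, n_t, n_c):
    """Magnitude of d from a two-group F (1 numerator df).

    F carries no sign, so the direction must be read from the group means.
    """
    return math.sqrt(f_stat * (n_t + n_c) / (n_t * n_c))


def classify(d):
    """Classify a group-study effect; the negative cutoff is an assumption."""
    if d >= POSITIVE_THRESHOLD:
        return "positive"
    if d <= -POSITIVE_THRESHOLD:
        return "negative"
    return "neutral or mixed"


# Effect sizes as reported in this review.
reported = {
    "Myrow (1979)": 0.113,
    "Patall et al. (2010), homework completion rate": 0.154,
    "Patall et al. (2010), unit test scores": 0.190,
}
for label, d in reported.items():
    print(f"{label}: d = {d:.3f} -> {classify(d)}")

# Hypothetical check that the two routes agree: means 82 vs. 80, SD 10, n = 30 each.
d_means = cohens_d(82, 10, 30, 80, 10, 30)
d_f = d_from_f(0.6, 30, 30)  # F = t^2 for the same hypothetical data
print(f"d from means = {d_means:.2f}, d from F = {d_f:.2f}")
```

All three reported values fall below 0.25, which is why both group studies are counted under mixed or neutral effects in the weighted columns of Table 2.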

disabilities or ASD (Cole & Levinson, 2002; Rispoli et al., 2013); produce increased oral reading fluency rates for students with EBD (Daly, Garbacz, Olson, Persampieri, & Ni, 2006); result in faster task completion for puzzles, math, color by number, and sorting for students with ID (Mechling et al., 2006); and increase academic engaged time while reducing disruptive behavior for a first-grade general education student (but only increased engagement in intervention phase 2 for a student with ASD; Lane, Royer, et al., 2015). Two other methodologically sound studies reported positive outcomes but could not be considered when classifying the evidence base: Choice increased mean engagement 30% and reduced destructive behavior from 12% to 0% for six boys with severe EBD but data were reported in aggregate (Kern et al., 2002), and choice increased math digits computed correctly per minute for three fourth-grade general education students but one student’s alternating treatment design was missing a pattern replication (Carson & Eckert, 2003).

This systematic literature review suggests the need for better designed studies to withstand the challenges associated with conducting experimental designs in the reality of traditional K–12 settings. More recent studies appear to be designed to meet more QIs: One of nine (11.11%) studies published from 1975 to 2000 met 80% of QIs, whereas 11 of 17 (64.71%) studies published after 2000 met 80% of QIs. Although the methodological quality of the research base appears to be improving, refinement is still needed. Some authors have expressed concern that applying stringent QIs may exclude meaningful research when classifying the evidence base for practices and strategies (e.g., Moeller, Dattilo, & Rusch, 2015). Indeed, the CEC (2014) requirement for studies to meet 100% of QIs would exclude nine studies we considered to be of high quality from being considered methodologically sound.

It is a talented research team that can address this delicate balance between experimental control and the realities of day-to-day life in schools (Brown, 1992). Even with QIs in special education available for over a decade for researchers to use as they design studies (e.g., Gersten et al., 2005; Horner et al., 2005), the challenge of implementing interventions with fidelity remains in applied settings where clinical control is not always possible. For example, SCRDs need three participants with a functional relation between IVs and DVs for EBP classification consideration. However, one of three students in a study may become ill or move and not be available to finish a design replication sequence, eliminating the study from EBP classification consideration. Or, the same study may have four participants but one is absent from enough sessions to prevent QI 7.4 (frequency and timing) from being met, preventing the study from being methodologically sound (absolute coding) even though it is otherwise of high quality. For such reasons, weighted coding with an 80% criterion (Lane et al., 2009) may be a useful option. Essentially, this application allows for “partial credit” to be given to studies, subsequently enabling a broader array of still rigorous studies to be examined when determining whether a strategy, practice, or program is indeed an EBP.

Although the majority of studies did not meet 80% of the QIs, they did reflect areas of methodological strengths. For example, though we only required studies to include one or more background variables to meet QI 1.1 (context and setting), 2.1 (participant demographics), and 3.1 (intervention agent), some studies reported many (see Table 1), such as tables for school characteristics, student participants, and intervention agent background (e.g., Lane, Royer, et al., 2015). All studies described procedures and actions sufficiently for replication, with 25 (96.15%) also describing materials used. All studies also met the requirement for having socially important outcomes, though we hoped more studies would have explicitly used some version of a social validity measure. For example, Kern et al. (2002) had students complete a class evaluation sheet daily with questions

asking how much they enjoyed class that day, and the teacher completed the Teacher Acceptability Rating Form–Revised (Reimers et al., 1992). Lannie and Martens (2008) and Lane, Royer, et al. (2015) had students complete the Children’s Intervention Rating Profile (Witt & Elliott, 1985) and Skerbetz and Kostewicz (2015) used teacher and student surveys with high results.

It should also be noted we clarified the role of treatment integrity in our coding. Specifically, we required QI 5.1 (implementation fidelity) be met before QI 6.1 or 6.7 (systematic manipulation of the IV and internal validity) could be considered, and for QI 6.3, we required researchers to explicitly state control/comparison or baseline/withdrawal conditions were not contaminated by choice components. Although logical and reasonable, these conditions were not required by CEC (2014) and made these QIs more difficult to meet. We recommend future review teams also consider implementation fidelity as a prerequisite for these QIs, as it must first be established the intervention was implemented as designed.

As we consider the educational implications of this review, we caution the reader to consider the possibility that practices, including instructional choice, can be effective, yet not currently classified as evidence based. Specifically, given core QIs were introduced only 10 years ago (e.g., Gersten et al., 2005; Horner et al., 2005), it is likely not enough high-quality studies have been conducted to clearly demonstrate the true effectiveness of current practices, strategies, and programs (Cook et al., 2015). It is imperative for the scientific and practitioner communities to commit to additional inquiry in pre-K–12 classrooms to build the body of evidence needed to determine the extent to which instructional choice is effective in supporting various students, demonstrating various target behaviors, in various contexts, and with the support of various treatment agents. Studies presented in this review provide an important starting point for examining the utility of instructional choice as a low-intensity support. We recommend interested readers consider the following limitations and future directions before building additional evidence to meet the call of determining how well instructional choice works.

Limitations

Results of this review should be interpreted with caution given additional studies used instructional choice but were not included. Our primary limitation was the focus on traditional classroom settings, which excluded clinical, home, residential schools, or other highly controlled environments (e.g., discrete trial training; Newman, Needelman, Reinecke, & Robek, 2002). A second limitation was our inclusion criteria allowed studies where choice was not provided for each task. For example, Bicard et al. (2012) gave students a choice of seats on Mondays, which they kept for the week. This week-long choice was different from choice of seating for the duration of a specific task (e.g., floor, desk, beanbag; with a partner or alone), followed by the student returning to teacher-selected seating. In this study, as most experienced teachers have seen, allowing students to choose with whom to sit was more likely to set conditions for disruptive behavior. In other words, the benefits of choice did not help when offering a choice known to negatively impact the outcome measure. Future author teams may consider refining criteria to include studies where choice is provided for each task or, at minimum, each day.

Third, we did not calculate effect sizes for SCRD studies. Although not required by CEC (2014) to determine the evidence base of a strategy, calculating standardized effect sizes would aid future comparisons of findings across studies containing different outcome measures (Shadish, Hedges, Horner, & Odom, 2015). A fourth limitation was our low IRR on QIs 3.1, 6.3, 6.4, and 8.3, which were below 80%. We had six and seven disagreements for QIs 3.1 and 6.3, intervention agent role/background and IV isolated from control/baseline, for reasons including inferring the intervention agent role, missing a background variable, and missing treatment integrity data indicating the IV was isolated from no-choice conditions. The low IRR for QIs 6.4 and 8.3 was the result of a low base rate of group design studies included (n = 3). For example, with one or two disagreements, IRR for these group study QIs dropped to 2/3 or 1/3.

Finally, the broad focus of our review might be considered a limitation. That is, examining both behavioral and academic outcomes for all learners in traditional K–12 classroom settings may prevent generalizing review results to a specific outcome or population of students. For example, if we had found instructional choice to be an EBP within these broad parameters, we could not assume it was evidence based for specific populations.

Conclusion

The CEC (2014) expert panel set forth rigorous criteria for methodologically sound studies, leaving flexibility within each QI for some interpretation, recognizing the substantial variety of studies to which QIs would be applied. We applied and interpreted a few QIs differently when unique elements of intervention studies were encountered, even after trainings and check-ins to ensure accuracy of our coding (e.g., independently coding three to four studies and discussing results before coding the next three to four studies). Overall, this strategy was successful. Out of 728 possible QI coding components (26 studies × 28 QI components = 728), we had 45 discrepancies (93.82% agreement; κ = .87, 95% CI = [.83, .91], near perfect agreement) discussed and resolved. We encourage other systematic literature review research teams to also design systematic training with the CEC (2014) QIs before beginning each new review.

Given challenges researchers face implementing methodologically sound interventions in school settings, it was encouraging to see practitioners implement choice interventions with high procedural and treatment fidelity and

minimal university support. For example, Lane, Royer, and colleagues (2015) provided training to teachers in another state via web technology to introduce the instructional choice strategy and train primary and secondary data collectors, allowing teachers to support each other, with university researchers advising on when to implement phase changes based on collected data. We believe this is a promising direction for future studies to take to help establish instructional choice and other low-intensity strategies as EBPs teachers can confidently implement in multiple settings with minimal university support.

Although we anticipated being able to draw more definite conclusions about instructional choice as an EBP, we were pleased to see high-quality studies demonstrating success in different ways, in multiple settings and contexts, and with various populations. The CEC (2014) guidelines for methodologically rigorous studies are available to all, and we encourage the research community to use them when partnering with practitioners to design interventions. We especially encourage focused inquiry on traditional school settings with a variety of students in different content areas so in the end we will know how to best use instructional choice.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

References marked with an asterisk indicate studies included in the literature review.

Algozzine, B., Browder, D., Karvonen, M., Test, D. W., & Wood, W. M. (2001). Effects of interventions to promote self-determination for individuals with disabilities. Review of Educational Research, 71, 219–277. doi:10.3102/00346543071002219
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.
*Bicard, D. F., Ervin, A., Bicard, S. C., & Baylot-Casey, L. (2012). Differential effects of seating arrangements on disruptive behavior of fifth grade students during independent seatwork. Journal of Applied Behavior Analysis, 45, 407–411. doi:10.1901/jaba.2012.45-407
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. The Journal of the Learning Sciences, 2, 141–178.
Carroll, C., Patterson, M., Wood, S., Booth, A., Rick, J., & Balain, S. (2007). A conceptual framework for implementation fidelity. Implementation Science, 2, Article 40. doi:10.1186/1748-5908-2-40
*Carson, P. M., & Eckert, T. L. (2003). An experimental analysis of mathematics instructional components: Examining the effects of student-selected versus empirically-selected interventions. Journal of Behavioral Education, 12, 35–54. doi:10.1023/A:1022370305486
*Clifford, M. M. (1975). Affective and cognitive effects of option in an educational setting. Journal of Experimental Education, 43(3), 1–5. doi:10.1080/00220973.1975.10806327
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. doi:10.1177/001316446002000104
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
*Cole, C. L., Davenport, T. A., Bambara, L. M., & Ager, C. L. (1997). Effects of choice and task preference on the work performance of students with behavior problems. Behavioral Disorders, 22, 65–74.
*Cole, C. L., & Levinson, T. R. (2002). Effects of within-activity choices on the challenging behavior of children with severe developmental disabilities. Journal of Positive Behavior Interventions, 4, 29–52. doi:10.1177/109830070200400106
Common, E. A., Lane, K. L., Cantwell, E. D., Brunsting, N., Oakes, W. P., & Bross, L. A. (2016). Teacher-delivered strategies to increase students’ opportunities to respond: A systematic methodological review. Manuscript in preparation.
Cook, B. G., Buysse, V., Klingner, J., Landrum, T. J., McWilliam, R. A., Tankersley, M., & Test, D. W. (2015). CEC’s standards for classifying the evidence base of practices in special education. Remedial and Special Education, 36, 220–234. doi:10.1177/0741932514557271
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.). Upper Saddle River, NJ: Pearson Education.
Cordova, D. I., & Lepper, M. R. (1996). Intrinsic motivation and the process of learning: Beneficial effects of contextualization, personalization, and choice. Journal of Educational Psychology, 88, 715–730. doi:10.1037/0022-0663.88.4.715
Council for Exceptional Children. (2014). Council for Exceptional Children standards for evidence-based practices in special education. Arlington, VA: Author.
*Daly, E. J., III, Garbacz, S. A., Olson, S. C., Persampieri, M., & Ni, H. (2006). Improving oral reading fluency by influencing students’ choice of instructional procedures: An experimental analysis with two students with behavioral disorders. Behavioral Interventions, 21, 13–30. doi:10.1002/bin.208
DeFife, J. (2009). Effect size calculator and conversions. Emory University. Retrieved from http://web.cs.dal.ca/~anwar/ds/Excel4.xlsx
Dibley, S., & Lim, L. (1999). Providing choice making opportunities within and between daily school routines. Journal of Behavioral Education, 9, 117–132. doi:10.1023/A:1022888917128
Drummond, T. (1994). The Student Risk Screening Scale (SRSS). Grants Pass, OR: Josephine County Mental Health Program.
*Dunlap, G., dePerczel, M., Clarke, S., Wilson, D., Wright, S., White, R., & Gomez, A. (1994). Choice making to promote adaptive behavior for students with emotional and behavioral challenges. Journal of Applied Behavior Analysis, 27, 505–518. doi:10.1901/jaba.1994.27-505
Fairbanks, S., Sugai, G., Guardino, D., & Lathrop, M. (2007). Response to intervention: Examining classroom behavior support in second grade. Exceptional Children, 73, 288–310. doi:10.1177/001440290707300302

Flowerday, T., & Schraw, G. (2000). Teacher beliefs about behavior: A review of the literature. Journal of Behavioral
instructional choice: A phenomenological study. Journal of Education, 8, 151–169. doi:10.1023/A:1022831507077
Educational Psychology, 92, 634–645. doi:10.1037/0022- *Killu, K., Clare, C. M., & Im, A. (1999). Choice vs. preference:
0663.92.4.634 The effects of choice and no choice of preferred and non pre-
Fuchs, D., & Fuchs, L. S. (2006). Introduction to response to inter- ferred spelling tasks on the academic behavior of students with
vention: What, why, and how valid is it? Reading Research disabilities. Journal of Behavioral Education, 9, 239–253.
Quarterly, 41, 93–99. doi:10.1598/rrq.41.1.4 King, S. A., & Kostewicz, D. E. (2014). Choice-based stimulus
Gast, D. L., & Ledford, J. R. (Eds.). (2014). Single case research preference assessment for children with or at-risk for emotional
methodology: Applications in special education and behav- disturbance in educational settings. Education and Treatment
ioral sciences (2nd ed.). New York, NY: Routledge. of Children, 37, 531–558. doi:10.1353/etc.2014.0026
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M., Greenwood, Kratochwill, T. R., Hitchcock, J. H., Horner, R. H., Levin,
C., & Innocenti, M. S. (2005). Quality indicators for J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R.
group experimental and quasi-experimental research in (2013). Single-case intervention research design stan-
special education. Exceptional Children, 71, 149–164. dards. Remedial and Special Education, 34, 26–38.
doi:10.1177/001440290507100202 doi:10.1177/0741932512452794
Harding, J. W., Wacker, D. P., Berg, W. K., Barretto, A., & Rankin, B. (2002). Assessment and treatment of severe behavior problems using choice-making procedures. Education and Treatment of Children, 25, 26–46.
Harding, J. W., Wacker, D. P., Berg, W. K., Winborn-Kemmerer, L., & Lee, J. F. (2009). Evaluation of choice allocation between positive and negative reinforcement during functional communication training with young children. Journal of Developmental and Physical Disabilities, 21, 443–456. doi:10.1007/s10882-009-9155-7
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., & Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71, 165–179. doi:10.1177/001440290507100203
Horner, R. H., & Sugai, G. (2015). School-wide PBIS: An example of applied behavior analysis implemented at a scale of social importance. Behavior Analysis in Practice, 8, 80–85. doi:10.1007/s40617-015-0045-4
*Hua, Y., Lee, D., Stansbery, S., & McAfee, J. (2014). The effects of assignment format and choice on task completion. Journal of Education and Learning, 3, 101–110. doi:10.5539/jel.v3n1p101
Individuals With Disabilities Education Improvement Act of 2004, 20 U.S.C. 1400 et seq. (December 3, 2004).
Jolivette, K., Stichter, J. P., & McCormick, K. M. (2002). Making choices—Improving behavior—Engaging in learning. TEACHING Exceptional Children, 34(3), 24–29.
Jolivette, K., Stichter, J. P., Sibilsky, S., Scott, T. M., & Ridgley, R. (2002). Naturally occurring opportunities for preschool children with or without disabilities to make choices. Education and Treatment of Children, 25, 396–414.
*Jolivette, K., Wehby, J. H., Canale, J., & Massey, N. G. (2001). Effect of choice-making opportunities on the behavior of students with emotional and behavioral disorders. Behavioral Disorders, 26, 131–145.
Katz, I., & Assor, A. (2007). When choice motivates and when it does not. Educational Psychology Review, 19, 429–442. doi:10.1007/s10648-006-9027-y
Kennedy, C. H. (2005). Single-case designs for educational research. Boston, MA: Pearson Education.
Kern, L., Bambara, L., & Fogt, J. (2002). Class-wide curricular modification to improve the behavior of students with emotional or behavioral disorders. Behavioral Disorders, 27, 317–326.
*Kern, L., Vorndran, C., Hilt, A., Ringdahl, J., Adelman, B., & Dunlap, G. (1998). Choice as an intervention to improve
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
Lane, K. L., Common, E. A., Royer, D. J., & Muller, K. (2014). Council for Exceptional Children 2014 group comparison and single-case research design standards quality indicator matrix. Unpublished tool.
Lane, K. L., Kalberg, J. R., & Shepcaro, J. C. (2009). An examination of the evidence base for function-based interventions for students with emotional and/or behavioral disorders attending middle and high schools. Exceptional Children, 75, 321–340. doi:10.1177/001440290907500304
Lane, K. L., Menzies, H. M., Bruhn, A. L., & Crnobori, M. E. (2011). Managing challenging behaviors in schools: Research-based strategies that work. New York, NY: Guilford Press.
Lane, K. L., Menzies, H. M., Ennis, R. P., & Oakes, W. P. (2015). Supporting behavior for school success: A step-by-step guide to key strategies. New York, NY: Guilford Press.
Lane, K. L., Oakes, W. P., & Menzies, H. M. (2014). Comprehensive, integrated, three-tiered models of prevention: Why does my school—and district—need an integrated approach to meet students’ academic, behavioral, and social needs? Preventing School Failure: Alternative Education for Children and Youth, 58, 121–128. doi:10.1080/1045988X.2014.893977
*Lane, K. L., Royer, D. J., Messenger, M. L., Common, E. A., Ennis, R. P., & Cantwell, E. D. (2015). Empowering teachers with low-intensity strategies to support academic engagement: Implementation and effects of instructional choice for elementary students in inclusive settings. Education and Treatment of Children, 38, 473–504. doi:10.1353/etc.2015.0013
*Lannie, A. L., & Martens, B. K. (2008). Targeting performance dimensions in sequence according to the instructional hierarchy: Effects on children’s math work within a self-monitoring program. Journal of Behavioral Education, 17, 356–375. doi:10.1007/s10864-008-9073-2
*Mechling, L. C., Gast, D. L., & Cronin, B. A. (2006). The effects of presenting high-preference items, paired with choice, via computer-based video programming on task completion of students with autism. Focus on Autism & Other Developmental Disabilities, 21, 7–13. doi:10.1177/10883576060210010201
Moeller, J. D., Dattilo, J., & Rusch, F. (2015). Applying quality indicators to single-case research designs used in special education: A systematic review. Psychology in the Schools, 52, 139–153. doi:10.1002/pits.21801
*Myrow, D. L. (1979). Learner choice and task engagement. Journal of Experimental Education, 47, 200–207. doi:10.1080/00220973.1979.11011682
Newman, B., Needelman, M., Reinecke, D. R., & Robek, A. (2002). The effect of providing choices on skill acquisition and competing behavior of children with autism during discrete trial instruction. Behavioral Interventions, 17, 31–41. doi:10.1002/bin.99
Niesyn, M. E. (2009). Strategies for success: Evidence-based instructional practices for students with emotional and behavioral disorders. Preventing School Failure: Alternative Education for Children and Youth, 53, 227–234. doi:10.3200/psfl.53.4.227-234
Nota, L., Ferrari, L., Soresi, S., & Wehmeyer, M. (2007). Self-determination, social abilities and the quality of life of people with intellectual disability. Journal of Intellectual Disability Research, 51, 850–865. doi:10.1111/j.1365-2788.2006.00939.x
*Patall, E. A. (2012). The motivational complexity of choosing: A review of theory and research. In R. M. Ryan (Ed.), The Oxford handbook of human motivation (pp. 248–279). Oxford, UK: Oxford University Press.
Patall, E. A., Cooper, H., & Wynn, S. R. (2010). The effectiveness and relative importance of choice in the classroom. Journal of Educational Psychology, 102, 896–915. doi:10.1037/a0019545
*Powell, S., & Nelson, B. (1997). Effects of choosing academic assignments on a student with attention deficit hyperactivity disorder. Journal of Applied Behavior Analysis, 30, 181–183. doi:10.1901/jaba.1997.30-181
Prasse, D. P., Breunlin, R. J., Giroux, D., Hunt, J., Morrison, D., & Thier, K. (2012). Embedding multi-tiered system of supports/response to intervention into teacher preparation. Learning Disabilities: A Contemporary Journal, 10, 75–93.
Prusak, K. A. (2004). Adolescent girls in physical education. Journal of Teaching in Physical Education, 23, 19–29.
Ramsey, M. L., Jolivette, K., Patterson, D. P., & Kennedy, C. (2010). Using choice to increase time on-task, task-completion, and accuracy for students with emotional/behavior disorders in a residential facility. Education and Treatment of Children, 33, 1–21. doi:10.1353/etc.0.0085
Reimers, T. M., Wacker, D. P., Cooper, L. J., & DeRaad, A. O. (1992). Clinical evaluation of the variables associated with treatment acceptability and their relation to compliance. Behavioral Disorders, 18, 67–76.
*Rispoli, M., Lang, R., Neely, L., Camargo, S., Hutchins, N., Davenport, K., & Goodwyn, F. (2013). A comparison of within- and across-activity choices for reducing challenging behavior in children with autism spectrum disorders. Journal of Behavioral Education, 22, 66–83. doi:10.1007/s10864-012-9164-y
*Romaniuk, C., Miltenberger, R., Conyers, C., Jenner, N., Jurgens, M., & Ringenberg, C. (2002). The influence of activity choice on problem behaviors maintained by escape versus attention. Journal of Applied Behavior Analysis, 35, 349–362. doi:10.1901/jaba.2002.35-349
Romaniuk, C., & Miltenberger, R. G. (2001). The influence of preference and choice of activity on problem behavior. Journal of Positive Behavior Interventions, 3, 152–159. doi:10.1177/109830070100300303
Sailor, W. (2015). Advances in schoolwide inclusive school reform. Remedial and Special Education, 36, 94–99. doi:10.1177/0741932514555021
*Seybert, S., Dunlap, G., & Ferro, J. (1996). The effects of choice-making on the problem behaviors of high school students with intellectual disabilities. Journal of Behavioral Education, 6, 49–65. doi:10.1007/BF02110477
Shadish, W. R., Hedges, L. V., Horner, R. H., & Odom, S. L. (2015). The role of between-case effect size in conducting, interpreting, and summarizing single-case research. Washington, DC: National Center for Education Research, Institute of Education Sciences, U.S. Department of Education.
Shogren, K. A., Faggella-Luby, M. N., Bae, S. J., & Wehmeyer, M. L. (2004). The effect of choice-making as an intervention for problem behavior: A meta-analysis. Journal of Positive Behavior Interventions, 6, 228–237. doi:10.1177/10983007040060040401
Simonsen, B., Fairbanks, S., Briesch, A., Myers, D., & Sugai, G. (2008). Evidence-based practices in classroom management: Considerations for research to practice. Education and Treatment of Children, 31, 351–380. doi:10.1353/etc.0.0007
*Skerbetz, M. D., & Kostewicz, D. E. (2013). Academic choice for included students with emotional and behavioral disorders. Preventing School Failure: Alternative Education for Children and Youth, 57, 212–222. doi:10.1080/1045988X.2012.701252
*Skerbetz, M. D., & Kostewicz, D. E. (2015). Consequence choice and students with emotional and behavioral disabilities: Effects on academic engagement. Exceptionality, 23, 14–33. doi:10.1080/09362835.2014.986603
Slaughter, S. E., Hill, J. N., & Snelgrove-Clarke, E. (2015). What is the extent and quality of documentation and reporting of fidelity to implementation strategies: A scoping review. Implementation Science, 10, Article 129. doi:10.1186/s13012-015-0320-3
Slavin, R. E. (2008). Perspectives on evidence-based research in education: What works? Issues in synthesizing educational program evaluations. Educational Researcher, 37, 5–14. doi:10.3102/0013189X08314117
*Smeltzer, S. S., Graff, R. B., Ahearn, W. H., & Libby, M. E. (2009). Effect of choice of task sequence on responding. Research in Autism Spectrum Disorders, 3, 734–742. doi:10.1016/j.rasd.2009.02.002
*Stenhoff, D. M., Davey, B. J., & Lignugaris-Kraft, B. (2008). The effects of choice on assignment completion and percent correct by a high school student with a learning disability. Education and Treatment of Children, 31, 203–211. doi:10.1353/etc.0.0011
*Vaughn, B. J., & Horner, R. H. (1997). Identifying instructional tasks that occasion problem behaviors and assessing the effects of student versus teacher choice among these tasks. Journal of Applied Behavior Analysis, 30, 299–312. doi:10.1901/jaba.1997.30-299
Walker, H. M., Forness, S. R., & Lane, K. L. (2014). Design and management of scientific research in applied school settings. In B. Cook, M. Tankersley, & T. J. Landrum (Eds.), Advances in learning and behavioral disabilities: Special education past, present, and future: Perspectives from the field (Vol. 27, pp. 141–169). Bingley, UK: Emerald.
Witt, J. C., & Elliott, S. N. (1985). Acceptability of classroom intervention strategies. In T. R. Kratochwill (Ed.), Advances in school psychology (Vol. 4, pp. 251–288). Mahwah, NJ: Routledge.
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock-Johnson III. Rolling Meadows, IL: Riverside.
