
Received: 30 August 2021    Revised: 15 December 2021    Accepted: 20 December 2021

DOI: 10.1002/bin.1861

RESEARCH ARTICLE

Using an intervention package with percentile schedules to increase on-task behavior

Daniel Kwak | Adel C. Najdowski | Svetlana Danielyan

Graduate School of Education and Psychology, Pepperdine University, Malibu, California, USA

Correspondence
Adel C. Najdowski, Graduate School of Education and Psychology, Pepperdine University, Malibu, CA, USA.
Email: adel.najdowski@pepperdine.edu

Abstract
Improving on-task behavior can allow individuals to access more learning opportunities and is especially relevant for individuals with developmental disabilities. The current study examined the efficacy of an intervention package that used percentile schedules of reinforcement, feedback, and application of lower limits to changes in criteria to increase on-task behaviors in children with developmental disabilities. Using a nonconcurrent multiple baseline across participants design, data revealed a functional relationship between the implementation of the intervention and an increase in the percentage of intervals that participants were on task. Overall, the participants' on-task behavior improved from a mean of 32% during baseline to 68% during intervention, with a proportionate percentage change of 118%. Although there is no widely accepted method for shaping, using percentile schedules as part of an intervention package appears to be a promising way to shape behavior.

KEYWORDS
attending, engagement, on task, percentile schedule, shaping

On-task behavior involves allocating time to tasks by interacting with the environment in a way that serves the intended purposes of the tasks (Engstrom et al., 2015; McWilliam & Bailey, 1995; Rivera et al., 2015). Addressing problems with on-task behavior is especially important for individuals with developmental disabilities (DDs) because they may experience difficulties with engagement across different areas of their lives including academic tasks, daily living tasks, play activities, and employment tasks (Boswell et al., 2013; Dotson et al., 2013; Junod et al., 2006; Palmen & Didden, 2012). Additionally, improving on-task behavior can provide individuals with increased opportunities to complete tasks and help them to learn skills more efficiently (Karweit & Slavin, 1981; Lee et al., 1999; Ponitz et al., 2009).

We thank Bryan Acuña for his assistance with this project.

Behavioral Interventions. 2022;1–14. wileyonlinelibrary.com/journal/bin © 2022 John Wiley & Sons Ltd.

A variety of interventions have been used to increase on-task behaviors for individuals with DD. Stimulus prompts have been used by providing participants with activity schedules, which included step-by-step procedural guides for completing or studying specified academic content, to promote on-task behavior (Cirelli et al., 2016; Massey & Wheeler, 2000; Rafferty et al., 2011). Response prompts have been used in which teachers in the classroom provided verbal prompts to ask questions when students encountered difficult problems (Knapczyk & Livingston, 1974).
Researchers have also used technological devices like the MotivAider to prompt completion of tasks and attending
behavior (Mechling et al., 2009) as well as self-management interventions wherein individuals were required to track
and record their own on-task behavior when alerted by external stimuli (Coyle & Cole, 2004; Holifield et al., 2010;
Slattery et al., 2016). Although the use of shaping procedures is likely a viable method for increasing on-task behavior,
there is a lack of published studies that evaluate the effectiveness of the method for this target behavior.
Shaping involves the use of differential reinforcement to successively approximate the terminal behavior of interest by progressively changing the criteria for delivering reinforcement (Cooper et al., 2020). A notable component of published studies that used shaping within response topographies is that successive criteria or goals were increased or decreased by a fixed increment or percentage (Howie & Woods, 1982; Jackson & Wallace, 1974; Rea & Williams, 2002). The rationales for determining successive goals are not always specified by the authors, but they likely consider changes in criteria that are not too small or "easy" and not too large or "difficult" (Hartmann & Hall, 1976). However, the appropriate size of the changes in criteria may be difficult to determine objectively. Although the use of clinical judgment to determine the successive criteria or simply using fixed increments to change the criteria are ways that shaping procedures have been used (Howie & Woods, 1982; Jackson & Wallace, 1974; Rea & Williams, 2002), behaviors can also be shaped by systematically considering individuals' behavioral history using percentile schedules of reinforcement (Galbicka, 1994).
Percentile schedules of reinforcement provide a systematic way of specifying successive criteria and tailoring the criteria based on an individual's current level of responding (Galbicka, 1994). Accordingly, percentile schedules involve continuous calculation of the criteria by taking into consideration a set number of recent observations (Galbicka, 1994). The criterion is calculated using the following formula: k = (m + 1)(1 − w). In this equation, m is the number of previous observations that are considered in determining the criterion for reinforcement, and w is the "criterional probability," or the expected likelihood that a response will meet the criterion (Galbicka, 1994; Hall et al., 2009). The identification of the two values allows for the calculation of k, or the "criterion rank," which represents the criterion that needs to be exceeded for the behavior to be reinforced (Galbicka, 1994). Essentially, a response is reinforced if it is "better" than a certain percentage of the most recent responses (Platt, 1973).
For example, if a researcher decides to use the five most recent observations (m = 5) to develop criteria and expects to reinforce approximately 50% of an individual's responses (w = 0.50), these numbers would be entered into the equation (k = [5 + 1][1 − 0.50]), and the resulting k would be 3. This value of k indicates that when an individual displays behavior that is "better" than the third highest value out of the five observations, the behavior is reinforced. If the goal is to increase the duration of walking and an individual walked for 5, 6, 1, 2, and 4 min during the first five sessions, the third highest numerical value (i.e., 4 min) would be selected as the criterion for reinforcement. This means that, during the sixth session, the individual must walk for more than 4 min to receive reinforcement. Since the five most recent observations (or sessions) are considered, only data collected from sessions two through six would be used to develop the criterion for the seventh session. The criterion is calculated for every session based on the most recent observations, which means that the criterion for reinforcement can change every session.
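The calculation in the worked example above can be sketched in a few lines of Python. This is an illustration only; the function name and structure are ours, not part of any published procedure.

```python
def percentile_criterion(observations, m=5, w=0.50):
    """Return the reinforcement criterion under a percentile schedule.

    k = (m + 1) * (1 - w) gives the criterion rank; a response is
    reinforced only if it exceeds the k-th highest of the m most
    recent observations (Galbicka, 1994).
    """
    recent = observations[-m:]               # window of m most recent observations
    k = round((m + 1) * (1 - w))             # criterion rank; 3 when m = 5, w = 0.50
    return sorted(recent, reverse=True)[k - 1]

# Worked example from the text: walking durations (min) in sessions 1-5.
print(percentile_criterion([5, 6, 1, 2, 4]))   # criterion for session 6: 4
```

Because the window slides, appending each new session's value and calling the function again yields the continually changing criterion described above.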
Thus far, percentile schedules of reinforcement have been used to shape smoking cessation (e.g., Lamb et al., 2004; Lamb et al., 2010; Romanowich & Lamb, 2014), increase physical activity (e.g., Adams et al., 2013; Adams et al., 2017; Hustiyi et al., 2011; Valbuena et al., 2015; Washington et al., 2014), increase the duration of eye contact (Gannon et al., 2018; Hall et al., 2009), increase fruit and vegetable consumption (Jones et al., 2014; Joyner et al., 2017), increase behavioral variability in game play (Miller & Neuringer, 2000), increase the number of words read (Bradley & Noell, 2018), and decrease latency to writing accurate sentences (Clark et al., 2016). Athens et al. (2007) is the only study, to our knowledge, that used percentile schedules to increase
on-task behavior. Athens et al. (2007) examined the effects of percentile schedules in increasing the duration of
writing and the differential effects of the intervention when the values of m (number of observations considered to
develop successive criteria) were manipulated across participants through a reversal design. Results demonstrated
a clear and consistent increase in the duration of writing when 20 recent observations were considered for criteria
development. However, when 10 or 5 most recent observations were used to develop successive criteria, an initial
increase in the duration of writing during the intervention phase was followed by a downward trend or highly variable
durations of writing without a clear departure from baseline levels of responding.
Although using a greater value of m may have been feasible in Athens et al. (2007) because the criterion changed after each bout of responding (each bout typically lasting seconds to a few minutes in the study), with task engagement scored using an onset and offset criterion of 3 s, studies that change the criterion following 10- or 15-min sessions can take longer to conduct. For instance, if 20 observations are considered and sessions are conducted once per day, 5 days per week, it would take 4 weeks to develop the first criterion for reinforcement and for the intervention to begin. The delay in the implementation of the intervention may be unrealistic in some cases. Therefore, incorporating a greater number of observations (m) may be more appropriate when specified tasks require little time and many sessions can be conducted each day.
The purpose of the current study was to examine the efficacy of an intervention package using percentile schedules of reinforcement, feedback, and application of lower limits to changes in criteria to increase on-task behavior. We aimed to incorporate components that would increase the feasibility of the intervention and prevent reinforcement of continued decreases in behavior as observed in Athens et al. (2007) and Washington et al. (2014). Considering the feasibility of the intervention, the ability to implement the intervention without much delay, and the opportunity to re-examine the use of this parameter, the five most recent observations (m = 5) were used to develop successive criteria for reinforcement in the current study. The value of 0.50 was selected for w because this parameter has been frequently used within percentile schedules (Athens et al., 2007; Bradley & Noell, 2018; Clark et al., 2016; Gannon et al., 2018; Hall et al., 2009; Hustiyi et al., 2011) and is predicted to provide a balanced density of reinforcement and extinction to facilitate the shaping process. We also considered a lower limit to changes in criteria because both Athens et al. (2007) and Washington et al. (2014) noted that in their studies, as responding decreased, the criterion for reinforcement decreased as well. Washington et al. (2014) explained that a shortcoming of percentile schedules is that they allow for reinforcement of continued decreases in behavior and recommended that future studies address this potential issue by setting a reasonable minimum. One way they discussed this can be done is by disallowing future criteria to be set lower. Despite this recommendation, no study, to our knowledge, has yet incorporated a minimum component into percentile schedules. Therefore, we developed a rule for lower limits to changes in criteria and applied it in our experiment. According to the rule for lower limits, if the participant's responses produce a new criterion that is lower than the previously set criterion, the new criterion remains at the previously determined criterion. However, if the participant does not meet the criterion and does not contact reinforcement across five consecutive sessions, the new criterion can be set to a previously successful criterion.

1 | METHOD

1.1 | Participants and settings

Melvin was a 9-year-old, Latino male diagnosed with intellectual disability and attention-deficit/hyperactivity disorder living in a middle-class neighborhood. Melvin received approximately 10 h of behavioral intervention per week in the home. He primarily spoke English and some words in Spanish, but his overall vocal-verbal repertoire was limited. Generally, Melvin communicated vocally using one- to three-word phrases. Based on the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP; Sundberg, 2008), Melvin was classified as an early Level 3 learner, which corresponds with the approximate verbal skills of a typically developing 3-year-old. The reported concern from his parent was that Melvin had difficulty maintaining attention to tasks shortly after initiating them. Typically, Melvin's disengagement with presented tasks involved repeatedly asking questions unrelated to the task, looking around the room, or drawing on paper when a writing utensil and paper were available. The experimental sessions were implemented in Melvin's home at the kitchen table.
Nick was a 16-year-old, White male diagnosed with autism spectrum disorder (ASD) living in a middle-class neighborhood. Nick received approximately 20 h of behavioral intervention per week in the home. He received group occupational therapy and speech therapy in the school setting as well as occupational therapy in a center-based setting. He primarily spoke English and some Armenian but had limited vocal-verbal skills and communicated using a communication device. Based on the VB-MAPP (Sundberg, 2008), Nick was classified as a Level 2 learner, which corresponds with the approximate verbal skills of a typically developing 2-year-old. The reported concern was that Nick had difficulty maintaining attention to tasks. Nick's off-task behaviors involved looking away from the relevant material, initiating social interactions with the individual presenting tasks, and not initiating tasks. The experimental sessions were implemented in Nick's home at the dining table.
Olivia was a 9-year-old, Latina and White female diagnosed with ASD living in an upper-middle-class neighborhood. Olivia received approximately 12 h of behavioral intervention per week in the home. Additionally, she received about 1 hr each of speech and occupational therapy per week in the school setting. She spoke English but had a limited vocal-verbal repertoire. Olivia was able to emit three-word vocal mands with prompts, spontaneously emit one- to two-word vocal mands, follow one- to two-step instructions, and imitate motor actions. The reported concern from her parent was that Olivia was consistently distracted by looking away and engaging in unrelated tasks. The experimental sessions were implemented in Olivia's home at the dining table.
Orlando was a 4-year-old, White male diagnosed with ASD living in a middle-class neighborhood. Orlando received approximately 30 h of behavioral intervention per week in the home. Additionally, he received speech therapy for 1 hr per week in the home through remote sessions. He primarily spoke English and some words in Russian and had some listener skills in Armenian. However, his overall vocal-verbal repertoire was limited. Based on the VB-MAPP (Sundberg, 2008), Orlando was classified as a Level 2 learner. The reported concern was that Orlando had difficulty attending to independent activities. The experimental sessions were implemented in Orlando's home on the living room floor.

1.2 | Experimenters

At the time of the study, the first and third authors implemented experimental sessions with the participants. The first author, who implemented experimental sessions with Melvin, identified as Asian and male, primarily spoke English, was 27 years old, was a master's student in a behavior analysis program with a master's degree in the field of education, and held a registered behavior technician (RBT) certification with 3 years of experience in the field of behavior analysis. The third author, who implemented experimental sessions with Nick, Olivia, and Orlando, identified as Armenian and female, primarily spoke English, was 25 years old, was a master's student in behavior analysis, and held an RBT certification with 3 years of experience in the field of behavior analysis. Both experimenters had previous experience with shaping on-task behavior and providing feedback but did not have experience using percentile schedules.

2 | DATA COLLECTION AND INTEROBSERVER AGREEMENT

Melvin's sessions were 15 min in length, and he copied sentences on worksheets that were created with the third-grade oral reading fluency passages from the Dynamic Indicators of Basic Early Literacy Skills (DIBELS), Sixth Edition (Good & Kaminski, 2002). Melvin was considered on task when he was looking at the worksheet, refraining from vocally calling out others' names in the home, and keeping the worksheet face up. A mobile application
(Radloff, 2018) was used to collect data on Melvin's on-task behavior, and it immediately displayed the number of intervals in which he was on task throughout the session. The number of intervals was converted to a percentage by referring to a criterion guide developed by the first author. The criterion guide provided a list of percentages that corresponded with the number of intervals the participant was on task. The mobile application and criterion guide were used by the experimenter so that the reinforcer could be delivered quickly if the criterion was met.
Nick's sessions were 15 min in length, and he engaged in tracing letters on worksheets and coloring pages of a book. The two tasks were presented in the same order every session: tracing worksheets were presented first, and after 7.5 min, the researcher presented pages from the coloring book. Nick was considered on task when he was sitting on the chair, positioning his head within approximately 45 degrees of the task material, looking at the task materials, and manipulating the material in a way that served the intended purposes of the tasks. With Nick, Olivia, and Orlando, the updated version of the mobile application (Radloff, 2019) was used, and it immediately displayed the percentage of on-task behavior during the observation, so the criterion guide was not needed.
Olivia's sessions were 15 min in length, and she engaged in the same task as Melvin. Olivia was considered on task when she was sitting on the chair, positioning her head toward the task material, and writing.
Orlando's sessions were 10 min in length, and he engaged in an activity that involved putting together a puzzle set, which included 40-piece spelling puzzles with pictures and three- to four-letter words (e.g., tree, shoe, and dog). Orlando was considered on task when he was staying in the area (i.e., the living room floor), orienting toward the task materials, and manipulating the task materials in a way that served their intended purposes.
All topographies of behaviors described in the operational definitions had to occur simultaneously. Each 10-s interval was scored as on task if a minimum of 7 s of the interval met the definition for on-task behavior, and the terminal goal for all participants was on-task behavior during at least 80% of intervals across three consecutive sessions.
Interobserver agreement (IOA) was determined using interval-by-interval agreement and was collected by a second independent observer. The percentage of IOA was calculated by dividing the number of intervals in agreement by the total number of intervals and multiplying the resulting quotient by 100%. IOA was collected across all participants and phases. For Melvin, IOA was collected for 33% of baseline and 36% of intervention sessions with an IOA of 92% and an average IOA of 93% (range, 90%–96%), respectively. For Nick, IOA was collected for 33% of baseline and 40% of intervention sessions with an average IOA of 93% and 85% (range, 72%–92%), respectively. For Olivia, IOA was collected for 50% of baseline and 50% of intervention sessions with an average IOA of 98% (range, 92%–100%) and 90% (range, 86%–96%), respectively. For Orlando, IOA was collected for 33% of baseline and 38% of intervention sessions with an average IOA of 98% (range, 97%–98%) and 93% (range, 87%–98%), respectively.
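The interval-by-interval IOA calculation reduces to a one-line comparison of the two observers' records; a minimal sketch (variable names are ours):

```python
def interval_by_interval_ioa(primary, secondary):
    """Percentage of intervals on which two observers' records agree."""
    agreements = sum(a == b for a, b in zip(primary, secondary))
    return 100 * agreements / len(primary)

# Example: observers agree on 9 of 10 intervals.
primary   = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
secondary = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
print(interval_by_interval_ioa(primary, secondary))   # 90.0
```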

2.1 | Procedural fidelity

Procedural fidelity was evaluated using a checklist, which assessed components of the baseline and intervention phases such as the preference assessment, verbal feedback, delivery of reinforcement, and criterion calculation. During the baseline phase, procedural fidelity data were collected on six components, which involved the experimenter providing materials, providing expectations, and refraining from providing verbal feedback during and at the end of sessions. During the intervention phase, procedural fidelity data were collected on 12 components, which involved the experimenter conducting preference assessments, providing materials, and providing expectations. Also, procedural fidelity data were collected on the accuracy of the implementers' praise and feedback during sessions, delivery of reinforcement or feedback at the end of sessions, and calculation of the criteria for reinforcement. Procedural fidelity was represented by the percentage of procedural components completed correctly within the checklist; the percentage was calculated by dividing the number of components marked as "yes" by the number of applicable components (i.e., the number of components marked as "yes" or "no").
For Melvin, procedural fidelity was collected for 33% of baseline and 36% of intervention sessions with mean fidelity of 100% and 90%, respectively. For Nick, procedural fidelity was collected for 33% of baseline and 90% of intervention sessions with mean fidelity of 100% and 98% (range, 90%–100%), respectively. For Olivia, procedural fidelity was collected for 50% of baseline and 50% of intervention sessions with mean fidelity of 100% for both phases. For Orlando, procedural fidelity was collected for 33% of baseline and 38% of intervention sessions with mean fidelity of 100% for both phases.

2.2 | Social validity

Social validity of the goals and procedures, the effects of the intervention, and the potential generality of the intervention was evaluated by providing a researcher-developed questionnaire, consisting of questions on a 4-point Likert scale ("Strongly agree," "Agree," "Disagree," and "Strongly disagree"), to parents of the participants within a week after the last intervention session. The social validity questionnaire is available from the first author upon request.

2.3 | Experimental design and analysis

A nonconcurrent multiple baseline across participants design was used to evaluate the effects of the intervention. Effects were primarily determined through visual analysis of changes in level, trend, and variability of the data, immediacy of the effect, overlap, and consistency in data patterns across similar conditions. The log response ratio, along with its confidence intervals (CIs), was calculated as an effect size to supplement visual analysis. It was selected because the dependent variable was on a ratio scale, the parametric effect size was sensitive to the change in behavior found in this study, and the description of proportionate change was considered a meaningful way to describe the effects of the intervention (Pustejovsky, 2018). The log response ratio estimate quantifies the change in the dependent variable and the magnitude of the functional relation in proportionate terms, allowing the observed percentage change to be reported (Pustejovsky, 2018).
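For a single case, the basic log response ratio computation reduces to the sketch below. The published analysis pools estimates across cases and computes confidence intervals per Pustejovsky (2018), which this simplified version omits.

```python
import math

def log_response_ratio(baseline, intervention):
    """Simple log response ratio for an outcome where an increase is improvement.

    Returns the effect size and the proportionate percentage change it
    implies: 100 * (exp(LRR) - 1).
    """
    lrr = math.log(sum(intervention) / len(intervention)) \
          - math.log(sum(baseline) / len(baseline))
    return lrr, 100 * (math.exp(lrr) - 1)

# Illustration only: a phase mean of 32% rising to 68% corresponds to a
# proportionate change of 112.5% under this simple (unpooled) formula.
lrr, pct = log_response_ratio([32], [68])
print(round(pct, 1))   # 112.5
```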

2.4 | Procedure

2.4.1 | Baseline

Participants were provided with the materials needed to engage in the tasks (e.g., worksheet and pencil) and given instructions (e.g., "You will be writing the sentences in the blank lines below"). The participants were told how long they were expected to engage in the task and were provided a cue to start working on the task. Then, the experimenter collected data on the participants' on-task behavior. No praise, feedback, or reinforcement was provided.

2.4.2 | Intervention

The criterion to receive reinforcement was determined and a brief preference assessment was conducted prior to the start of each session. The criterion was calculated by taking the median of the five most recent observations. If the baseline phase consisted of only three or four observations, which applied only to Melvin, the median of the existing data points was used until data for five observations across phases were available. During the preference assessment, the experimenter presented three to four stimuli and asked the participant to choose a preferred stimulus. The stimuli included in the preference assessment were selected based on nominations made by individuals familiar with the participants (e.g., caregivers, behavioral therapists, and supervisors), who nominated stimuli that had been used as reinforcers in the past or other stimuli that they reported would likely serve as reinforcers. When the participant chose a stimulus by vocally referring to or pointing to the stimulus, the identified stimulus was set aside to be provided contingent upon meeting the session criterion.
During sessions, participants were provided with the materials needed to engage in the tasks and given instructions. The participants were told how long they were expected to engage in the tasks, told that they could earn the selected item or activity for attending to the tasks, and provided a cue to start working on the task. Then, the experimenter collected data while providing praise every time the participant was on task for three consecutive intervals and verbal feedback when the definition for off-task behavior was met (e.g., looking away from the task materials for 3 s). At the end of each session, the experimenter delivered the identified reinforcer and provided praise if the predetermined criterion was met or provided feedback (e.g., "You did not get the balloon this time. Let's show more attention next time.") if the criterion was not met.
Percentile schedules of reinforcement were used to determine the criteria for successive sessions. The experimenter considered the five most recent observations (m = 5), and the "criterional probability" was set to 50% (w = 0.50). This means that, for all intervention sessions, the participants' on-task behavior needed to exceed the third best (which happened to be the median) percentage of intervals on task out of the five most recent sessions to meet the criterion. The criterion was calculated for every session, and if the new criterion produced for the following session was lower than the previously set criterion, the criterion simply remained the same. However, if the participant did not meet the criteria across five consecutive sessions, the new criterion was set to the previously successful criterion.
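Putting the percentile schedule and the lower-limit rule together, the session-by-session criterion logic can be sketched as follows. The parameter names and bookkeeping are ours; the study tracked the same quantities procedurally rather than in code.

```python
def next_criterion(history, last_criterion, consecutive_misses, last_met_criterion, m=5):
    """Criterion for the upcoming session under the study's rules (a sketch).

    history: percentages of intervals on task, most recent last.
    last_criterion: criterion used for the previous session (None at intervention start).
    consecutive_misses: sessions in a row without reinforcement.
    last_met_criterion: most recent criterion the participant met.
    """
    # Percentile schedule with m = 5, w = 0.50: the 3rd highest of the
    # five most recent observations, i.e., their median.
    proposed = sorted(history[-m:], reverse=True)[2]
    if last_criterion is not None and proposed < last_criterion:
        if consecutive_misses >= 5:
            return last_met_criterion   # reset to a previously successful criterion
        return last_criterion           # lower-limit rule: hold the criterion
    return proposed
```

For example, with recent scores of 40, 30, 20, 25, and 35 and a standing criterion of 50, the schedule would propose 30, but the lower-limit rule holds the criterion at 50 unless five consecutive sessions go unreinforced.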

3 | RESULTS

The effects of the intervention package on the participants' percentage of intervals on task are shown in Figure 1. Sessions were implemented approximately one to two times per week and were 15 min in length (10-min sessions for Orlando) as measured by whole-interval recording. Overall, the participants showed low levels of on-task behavior in the baseline phase, with a grand mean of 32%, and higher levels of on-task behavior in the intervention phase, with a grand mean of 68%, when data across all four participants are considered. The effect size estimate using the log response ratio indicated that the percentage change in the level of on-task behavior from the baseline to the intervention phase, considering the intervention effects across all four participants, was 118%, 95% CI = [12%, 326%]. Immediacy of effect and minimal overlap of data points between the baseline and intervention phases were observed across the participants. Additionally, the overlapping baseline data points remained constant or changed in the countertherapeutic direction when the intervention was introduced for each of the participants.
Melvin's data in baseline displayed moderate levels of on-task behavior (M = 45%, SD = 3%, range = 42%–49%). During intervention, there was an immediate increase in the level of on-task behavior (M = 78%, SD = 8%, range = 60%–86%). The percentage change in the level of on-task behavior from the baseline to the intervention phase was 71%, 95% CI = [55%, 90%], and was 90%, 95% CI = [74%, 107%], when only the last three intervention data points were compared to the baseline data.
Nick's data in baseline displayed a low level of on-task behavior (M = 22%, SD = 10%, range = 9%–36%). The data during intervention showed an immediate increase, followed by an initial decreasing trend across the first four intervention data points, and then an increasing trend with high levels of on-task behavior toward the end of the intervention (M = 68%, SD = 17%, range = 32%–87%). The percentage change in the level of on-task behavior from the baseline to the intervention phase for Nick was 199%, 95% CI = [100%, 347%], and was 283%, 95% CI = [165%, 455%], when only the last three intervention data points were compared to the baseline data.
Olivia's baseline data displayed low levels of on-task behavior (M = 13%, SD = 12%, range = 0%–32%). The data during intervention displayed moderate levels of on-task behavior (M = 47%, SD = 5%, range = 38%–51%). The
percentage change in the level of on-task behavior from the baseline to the intervention phase for Olivia was 238%, 95% CI = [83%, 523%]. Sessions ended before Olivia reached the mastery criterion because services were terminated due to the COVID-19 pandemic.
Orlando's data in baseline were variable and displayed moderate levels of on-task behavior (M = 49%, SD = 30%, range = 8%–92%). Upon implementation of the intervention, on-task responding stabilized at a high level (M = 79%, SD = 7%, range = 67%–85%). The percentage change in the level of on-task behavior from the baseline to the intervention phase for Orlando was 60%, 95% CI = [7%, 139%], and was 69%, 95% CI = [13%, 152%], when only the last three intervention data points were compared to the baseline data.

3.1 | Social validity

Parents either strongly agreed or agreed that the intervention goals and procedures were acceptable, that the intervention was effective, and that there is potential generality to the intervention (M = 3.58, range, 3–4). These results reflect the responses from Melvin's, Nick's, and Orlando's parents because Olivia's parent was not able to complete the social validity questionnaire, given the abrupt cessation of sessions due to the COVID-19 pandemic.

4 | DISCUSSION

The primary purpose of this study was to examine the effects of an intervention package that used percentile schedules of reinforcement, feedback, and lower limits to changes in criteria in improving the on-task behaviors of individuals with developmental disabilities. The intervention included a brief preference assessment, feedback for on-task and off-task behaviors during the session, and delivery of the stimulus identified during the preference assessment contingent on meeting the percentile criterion. The percentile criterion was determined for each session by percentile schedules in which the five most recent observations were ranked, and sessions with on-task behavior scoring above the third highest observation resulted in the delivery of the reinforcer.
Results indicated that there was a functional relationship between the implementation of the intervention and an increase in the percentage of intervals that the participants were on task. All the participants met the mastery criterion of 80% of intervals on task across three consecutive sessions except for Olivia, whose sessions were terminated due to concerns regarding the COVID-19 pandemic. Still, an effect of the intervention was observed for Olivia despite our not being able to collect more data points during the intervention phase. Overall, there were four demonstrations of effects, and there is evidence to suggest a functional relationship between the intervention implemented in the current study and the percentage of intervals that the participants were on task. Additionally, the mean procedural fidelity of the intervention and the ratings of social validity were high.
Percentile schedules allow experimenters to systematically consider learners' current levels of responding, set
initial goals that are within learners' current distribution of behavior, and develop continually changing criteria toward
terminal goals (Galbicka, 1994). There has been a growing interest in using percentile schedules to shape behaviors, and this study adds to the body of literature that promotes their use (Adams et al., 2017; Gannon et al., 2018; Joyner et al., 2017; Romanowich & Lamb, 2014). Consistent with previous research, the use of 0.50 for w (Athens et al., 2007; Bradley & Noell, 2018; Clark et al., 2016; Gannon et al., 2018; Hall et al., 2009; Hustyi et al., 2011) and 5 for m (Bradley & Noell, 2018; Hall et al., 2009; Hustyi et al., 2011) when using percentile schedules was shown to
result in positive outcomes.

F I G U R E 1   Effect of the intervention package with percentile schedules on on-task behavior. The asterisks on the percentile criterion line indicate sessions in which the percentile schedule suggested a lower criterion for reinforcement relative to the previous criterion, so the rule for lower limits was applied (i.e., the use of the previously determined criterion)

The current study most significantly adds to the literature by using percentile schedules to shape on-task behavior, as there is, to our knowledge, just one published study (Athens et al., 2007) that used this technology to increase on-task behavior. A direct comparison between Athens et al. (2007) and the current study
cannot be made because several methodological differences were present in the current study such as measurement
of on-task behavior by session, provision of rules before the start of sessions, provision of praise for on-task behavior,
and application of lower limits to changes in criteria. However, the current study provides a method of implementing
an intervention package with percentile schedules that can be used in clinical contexts.
The current study also extends the literature on percentile schedules by applying lower limits to the criteria, which has not yet been examined. The rule for lower limits was incorporated partially because Athens et al. (2007) and Washington et al. (2014) described that the criteria are susceptible to changes in responding and that decreases in responding can lead to decreases in the criteria. Additionally, Washington et al. (2014) noted that reinforcing a continued decrease in behavior can be a weakness of percentile schedules. In the current study, we wanted to prevent the criteria from decreasing once a participant had already responded at the previously set criteria. However, it is likely that the rule for lower limits did not affect the participants' on-task behavior in this study because the differences between the suggested percentile criteria and the adjusted criteria were minimal (less than 5%). When analyzing both the on-task behavior and the percentile criteria retrospectively, the adjusted criteria would not have changed whether the criterion was met in any session. Although the rule for lower limits may not have been essential for the participants in this particular study, it could certainly be for other individuals. Also, it is reasonable for researchers and practitioners to generally disallow decreases in criteria during the shaping process, even when percentile schedules suggest otherwise.
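The lower-limit rule amounts to taking the larger of the suggested and previous criteria; a minimal sketch (the helper name and values are hypothetical, not the study's code):

```python
def adjusted_criterion(suggested, previous):
    """Lower-limit rule: never let the new criterion fall below the
    previously set criterion, even if the percentile schedule suggests
    a decrease. (Hypothetical helper, not from the study's materials.)"""
    return max(suggested, previous)

# If the percentile schedule suggests 28% but the previous criterion was 30%,
# the previous criterion is retained; suggested increases pass through.
adjusted = adjusted_criterion(28, 30)
```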
Another finding from examining the data retrospectively is that the actual density of reinforcement differed from the expected density (w). In the current study, we set the expected density to 50% (w = 0.5). However, the reinforcer was actually delivered at the end of 77.14% of intervention sessions, substantially exceeding the expected density. This may have occurred through the mechanisms, described below, that likely produced the immediacy of effects across participants. Across all participants, the on-task behavior observed during the first several intervention sessions was considerably higher than the percentile criteria. Had no procedural elements been added to the intervention beyond the percentile schedule of reinforcement, the levels of on-task behavior, at least in the initial sessions, may have been lower, with approximately 50% of sessions ending in delivery of the reinforcer. Therefore, researchers and practitioners should consider that when other intervention components are added alongside percentile schedules, the initial criteria may be easily met, and the density of reinforcement may be higher than expected.
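The obtained density can be checked against the programmed w by counting sessions that exceeded their criterion; a sketch with illustrative data (not the study's):

```python
def reinforcement_density(session_scores, criteria):
    """Proportion of sessions ending in reinforcer delivery (obtained density)."""
    earned = sum(score > crit for score, crit in zip(session_scores, criteria))
    return earned / len(session_scores)

# Illustrative data: when early sessions far exceed their criteria, the
# obtained density outstrips the programmed w = 0.5.
on_task = [45, 50, 60, 52, 70, 58, 75]   # on-task % per session
criteria = [30, 40, 45, 55, 55, 60, 65]  # percentile criterion per session
density = reinforcement_density(on_task, criteria)  # 5/7, well above 0.5
```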
Although we did not conduct a component analysis, the effects of feedback, and perhaps provision of rules, can
be observed especially when examining the first intervention data points across participants. If the intervention only
depended on the stimulus that was provided at the end of each session based on the percentile criterion, its effects
would not have been observed until the second intervention data point, at the earliest, because the participants
would not have contacted the contingency. However, we observed immediate effects across all four participants. We
believe that the immediacy of effects was influenced by (a) praise provided for on-task behavior during the intervention session; (b) feedback provided for off-task behavior during the intervention session; and (c) a rule provided prior to the start of sessions that indicated a contingency between the behavior and consequence. The participants
were told before the start of session that they may receive a highly preferred stimulus or opportunity to engage in a
selected activity contingent on on-task behavior (e.g., “If you write and focus well, you can play with the play-doh”).
Therefore, a potential mechanism that may have influenced on-task behavior is rule governance: conducting the preference assessment and then stating the contingency may have established a rule for engaging in increased on-task behavior. Still, the effects of the percentile-schedule contingency can be seen in the generally increasing trend in on-task behavior. Additionally, on-task behavior often increased
following sessions in which the participant did not meet criteria, which may reflect the participants' learning of the
contingency. Although the participants likely were not able to identify the exact percentage that they had to meet,
they were encouraged to “do better” next time (e.g., “You did not get the play-doh. Let's show more focus next time”)
following the session in which they did not meet criteria. Therefore, we believe that the full intervention package was
needed for the participants to ultimately meet the mastery criteria.
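The flow of the full package described above can be sketched as a single session loop; everything here is a hypothetical illustration of the described components, not the study's protocol, and the names and interval count are invented:

```python
def run_session(history, observe_interval, preferred_item, m=5, w=0.5):
    """One intervention session: rule statement, interval-by-interval
    monitoring with feedback, then the percentile contingency at the end."""
    # (c) Rule stated before the session, naming the contingency
    print(f"If you focus well, you can have {preferred_item}.")

    # Criterion: k-th lowest of the last m sessions, k = (m + 1)(1 - w)
    criterion = sorted(history[-m:])[int((m + 1) * (1 - w)) - 1]

    on_task_intervals = 0
    total_intervals = 10
    for _ in range(total_intervals):
        if observe_interval():      # True if the child was on task
            on_task_intervals += 1  # (a) praise would be delivered here
        # (b) corrective feedback for off-task behavior would go here

    score = 100 * on_task_intervals / total_intervals
    earned = score > criterion      # percentile contingency at session end
    history.append(score)           # the session feeds the next criterion
    return score, earned
```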
One of the limitations of the current study is that we did not evaluate generalization and maintenance effects.
Therefore, we do not know whether the intervention effects would maintain over time or generalize across tasks or
settings. Future studies should assess maintenance of on-task behavior following the intervention phase and evalu-
ate the use of percentile schedules with other types of tasks such as employment-related tasks and daily living tasks.
Another limitation is that, due to the COVID-19 pandemic, we were unable to continue data collection until Olivia met the terminal goal. Although there was immediacy in effects, the pattern of data she would have shown after achieving the terminal goal could have provided additional insight into the effect of the intervention. A practical limitation is that calculating the criterion can be somewhat laborious compared to increasing the criterion by a fixed amount. However, using only five observations and selecting the median as the criterion for reinforcement, as was done in the current study, can simplify the process. One reason we incorporated only five observations to inform our criterion was to increase the feasibility of using the intervention. During the study, the procedural fidelity checklist indicated that the implementers never miscalculated the percentile criterion. Although technological aids are available (e.g., mobile applications and computer software), we used no aid other than Microsoft Excel, and only to list the percentages of on-task behavior across sessions. Lastly, a limitation of the study is that we did not collect
social validity data with the participants, and future studies should aim to include the recipients of the intervention
in the social validation process.
The use of percentile schedules within an intervention package to increase on-task behavior offers practitioners features that other interventions lack: objectively and systematically shaping on-task behavior by considering
learners' current levels of responding and setting continually changing goals that depend on the learners' responses.
Additionally, there is flexibility in the use of percentile schedules because we do not believe the "art" of shaping must be completely removed from the process. For example, we used the rule for lower limits in the current study, which does not purely abide by the rules of percentile schedules. We also
believe some flexibility can be incorporated during clinical treatment. For example, if high variability is observed across extended sessions, practitioners may consider using a higher value of m, thereby basing the criteria on a larger number of sessions. If an individual is mostly not contacting reinforcement across
sessions, the w or expected density of reinforcement may be increased. In Washington et al. (2014), the criterion was made more stringent when the participants demonstrated success with a previous percentile schedule. Alternatively, the decision may be to use a more general shaping procedure to ensure some stability of responding before reimplementing percentile schedules. Although researchers and practitioners can certainly choose to strictly abide by the
rules of percentile schedules, we do not believe the schedules must be applied without clinical judgment or without exceptions to the rules.
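As a concrete illustration of the point about increasing w: under Galbicka's (1994) formulation, raising w lowers the rank of the observation used as the criterion, making reinforcement easier to contact (the helper below is illustrative, not a published tool):

```python
def criterion_rank(m, w):
    """Rank (1 = lowest of the m ranked observations) of the value used
    as the criterion, k = (m + 1)(1 - w) per Galbicka (1994)."""
    return round((m + 1) * (1 - w))

criterion_rank(5, 0.5)  # 3: the median of the last five observations
criterion_rank(5, 0.7)  # 2: a lower rank, i.e., a more lenient criterion
```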
In conclusion, the use of percentile schedules of reinforcement within an intervention package provides an efficacious way to shape behavior. Because individuals' performance is considered systematically, percentile schedules may allow for changes in criteria that are not too "lenient" (i.e., small increases in criteria,
which may lead to a slow shaping process) or too “stringent” (i.e., large increases in criteria, which has the potential
to extinguish responding; Galbicka, 1994). The current study demonstrated that percentile schedules combined with
feedback and setting of lower limits to changes in criteria successfully increased the on-task behaviors of individuals
with DD. Such results are promising; however, the feasibility of their use and the benefits of percentile schedules over traditional shaping procedures for increasing on-task behavior warrant further investigation.

ACKNOWLEDGMENT
No funding has been provided for this research.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.

ETHICS STATEMENT
Informed consent was obtained from all human participants using a consent form approved by Pepperdine University's
IRB.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

ORCID
Adel C. Najdowski https://orcid.org/0000-0002-2512-0397

REFERENCES
Adams, M. A., Hurley, J. C., Todd, M., Bhuiyan, N., Jarrett, C. L., Tucker, W. J., Hollingshead, K. E., & Angadi, S. S. (2017).
Adaptive goal setting and financial incentives: A 2 × 2 factorial randomized controlled trial to increase adults’ physical
activity. BMC Public Health, 17(1), 286. https://doi.org/10.1186/s12889-017-4197-8
Adams, M. A., Sallis, J. F., Norman, G. J., Hovell, M. F., Hekler, E. B., & Perata, E. (2013). An adaptive physical activity inter-
vention for overweight adults: A randomized controlled trial. PLoS One, 8(12), e82901. https://doi.org/10.1371/journal.
pone.0082901
Athens, E. S., Vollmer, T. R., & St. Peter Pipkin, C. (2007). Shaping academic task engagement with percentile schedules. Journal of Applied Behavior Analysis, 40(3), 475–488. https://doi.org/10.1901/jaba.2007.40-475
Boswell, M. A., Knight, V., & Spriggs, A. D. (2013). Self-monitoring of on-task behaviors using the MotivAider® by a mid-
dle school student with a moderate intellectual disability. Rural Special Education Quarterly, 32(2), 23–30. https://doi.
org/10.1177/875687051303200205
Bradley, R. L., & Noell, G. H. (2018). The effectiveness of supplemental phonics instruction employing constant time delay
instruction for struggling readers. Psychology in the Schools, 55(7), 880–892. https://doi.org/10.1002/pits.22148
Cirelli, C. A., Sidener, T. M., Reeve, K. F., & Reeve, S. A. (2016). Using activity schedules to increase on-task behavior in children at risk for attention-deficit/hyperactivity disorder. Education and Treatment of Children, 39(3), 283–300. https://doi.org/10.1353/etc.2016.0013
Clark, A. M., Schmidt, J. D., Mezhoudi, N., & Kahng, S. (2016). Using percentile schedules to increase academic fluency. Be-
havioral Interventions, 31(3), 283–290. https://doi.org/10.1002/bin.1445
Cooper, J. O., Heron, T. E., & Heward, W. L. (2020). Applied behavior analysis (3rd ed.). Pearson.
Coyle, C., & Cole, P. (2004). A videotaped self-modelling and self-monitoring intervention program to decrease off-task be-
haviour in children with autism. Journal of Intellectual and Developmental Disability, 29(1), 3–16. https://doi.org/10.108
0/08927020410001662642
Dotson, W. H., Richman, D. M., Abby, L., Thompson, S., & Plotner, A. (2013). Teaching skills related to self-employment to
adults with developmental disabilities: An analog analysis. Research in Developmental Disabilities, 34(8), 2336–2350.
https://doi.org/10.1016/j.ridd.2013.04.009
Engstrom, E., Mudford, O. C., & Brand, D. (2015). Replication and extension of a check-in procedure to increase activi-
ty engagement among people with severe dementia. Journal of Applied Behavior Analysis, 48(2), 460–465. https://doi.
org/10.1002/jaba.195
Galbicka, G. (1994). Shaping in the 21st century: Moving percentile schedules into applied settings. Journal of Applied Behav-
ior Analysis, 27(4), 739–760. https://doi.org/10.1901/jaba.1994.27-739
Gannon, C. E., Britton, T. C., Wilkinson, E. H., & Hall, S. S. (2018). Improving social gaze behavior in fragile X syndrome using a
behavioral skills training approach: A proof of concept study. Journal of Neurodevelopmental Disorders, 10(1), 25. https://
doi.org/10.1186/s11689-018-9243-z
Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills (6th ed.). Institute for the Develop-
ment of Educational Achievement. Retrieved from http://dibels.uoregon.edu/
Hall, S. S., Maynes, N. P., & Reiss, A. L. (2009). Using percentile schedules to increase eye contact in children with fragile X syndrome. Journal of Applied Behavior Analysis, 42(1), 171–176. https://doi.org/10.1901/jaba.2009.42-171
Hartmann, D. P., & Hall, R. V. (1976). The changing criterion design. Journal of Applied Behavior Analysis, 9(4), 527–532. https://
doi.org/10.1901/jaba.1976.9-527
Holifield, C., Goodman, J., Hazelkorn, M., & Heflin, L. J. (2010). Using self-monitoring to increase attending to task and aca-
demic accuracy in children with autism. Focus on Autism and Other Developmental Disabilities, 25(4), 230–238. https://
doi.org/10.1177/1088357610380137
Howie, P. M., & Woods, C. L. (1982). Token reinforcement during the instatement and shaping of fluency in the intervention
of stuttering. Journal of Applied Behavior Analysis, 15(1), 55–64. https://doi.org/10.1901/jaba.1982.15-55
Hustyi, K. M., Normand, M. P., & Larson, T. A. (2011). Behavioral assessment of physical activity in obese preschool children.
Journal of Applied Behavior Analysis, 44(3), 635–639. https://doi.org/10.1901/jaba.2011.44-635
Jackson, D. A., & Wallace, R. F. (1974). The modification and generalization of voice loudness in a fifteen-year-old retarded
girl. Journal of Applied Behavior Analysis, 7(3), 461–471. https://doi.org/10.1901/jaba.1974.7-461
Jones, B. A., Madden, G. J., & Wengreen, H. J. (2014). The FIT Game: Preliminary evaluation of a gamification approach
to increasing fruit and vegetable consumption in school. Preventive Medicine, 68, 76–79. https://doi.org/10.1016/j.
ypmed.2014.04.015
Jones, B. A., Madden, G. J., Wengreen, H. J., Aguilar, S. S., & Desjardins, E. A. (2014). Gamification of dietary decision-making
in an elementary-school cafeteria. PLoS One, 9(4), e93872. https://doi.org/10.1371/journal.pone.0093872
Joyner, D., Wengreen, H., Aguilar, S., Spruance, L., Morrill, B., & Madden, G. (2017). The FIT Game III: Reducing the operating
expenses of a game-based approach to increasing healthy eating in elementary schools. Games for Health Journal, 6(2),
111–118. https://doi.org/10.1089/g4h.2016.0096
Junod, R. E. V., DuPaul, G. J., Jitendra, A. K., Volpe, R. J., & Cleary, K. S. (2006). Classroom observations of students with
and without ADHD: Differences across types of engagement. Journal of School Psychology, 44(2), 87–104. https://doi.
org/10.1016/j.jsp.2005.12.004
Karweit, N., & Slavin, R. E. (1981). Measurement and modeling choices in studies of time and learning. American Educational
Research Journal, 18(2), 157–171. https://doi.org/10.3102/00028312018002157
Knapczyk, D. R., & Livingston, G. (1974). The effects of prompting question-asking upon on-task behavior and reading comprehension. Journal of Applied Behavior Analysis, 7(1), 115–121. https://doi.org/10.1901/jaba.1974.7-115
Lamb, R. J., Kirby, K. C., Morral, A. R., Galbicka, G., & Iguchi, M. Y. (2010). Shaping smoking cessation in hard-to-treat smokers.
Journal of Consulting and Clinical Psychology, 78(1), 62–71. https://doi.org/10.1037/a0018323
Lamb, R. J., Morral, A. R., Kirby, K. C., Iguchi, M. Y., & Galbicka, G. (2004). Shaping smoking cessation using percentile sched-
ules. Drug and Alcohol Dependence, 76(3), 247–259. https://doi.org/10.1016/j.drugalcdep.2004.05.008
Lee, S. W., Kelly, K. E., & Nyre, J. E. (1999). Preliminary report on the relation of students' on-task behavior with completion of school work. Psychological Reports, 84(1), 267–272. https://doi.org/10.2466/pr0.1999.84.1.267
Massey, N. G., & Wheeler, J. J. (2000). Acquisition and generalization of activity schedules and their effects on task engage-
ment in a young child with autism in an inclusive pre-school classroom. Education and Training in Mental Retardation and
Developmental Disabilities, 35(3), 326–335.
McWilliam, R. A., & Bailey, D. B. (1995). Effects of classroom social structure and disability on engagement. Topics in Early
Childhood Special Education, 15(2), 123–147. https://doi.org/10.1177/027112149501500201
Mechling, L. C., Gast, D. L., & Seid, N. H. (2009). Using a personal digital assistant to increase independent task completion
by students with autism spectrum disorder. Journal of Autism and Developmental Disorders, 39(10), 1420–1434. https://
doi.org/10.1007/s10803-009-0761-0
Miller, N., & Neuringer, A. (2000). Reinforcing variability in adolescents with autism. Journal of Applied Behavior Analysis, 33(2),
151–165. https://doi.org/10.1901/jaba.2000.33-151
Palmen, A., & Didden, R. (2012). Task engagement in young adults with high-functioning autism spectrum disorders: Gen-
eralization effects of behavioral skills training. Research in Autism Spectrum Disorders, 6(4), 1377–1388. https://doi.
org/10.1016/j.rasd.2012.05.010
Platt, J. R. (1973). Percentile reinforcement: Paradigms for experimental analysis of response shaping. In G. H. Bower (Ed.),
Psychology of learning and motivation (Vol. 7, pp. 271–296). Academic Press. https://doi.org/10.1016/S0079-7421(08)
60070-5
Ponitz, C. C., Rimm-Kaufman, S. E., Grimm, K., & Curby, T. W. (2009). Kindergarten classroom quality, behavioral engagement,
and reading achievement. School Psychology Review, 38(1), 102–120.
Pustejovsky, J. E. (2018). Using response ratios for meta-analyzing single-case designs with behavioral outcomes. Journal of
School Psychology, 68, 99–112. https://doi.org/10.1016/j.jsp.2018.02.003
Radloff, L. (2018). Insight: Observation timer. (Version 1.2) [Mobile app]. Retrieved from https://itunes.apple.com
Radloff, L. (2019). Insight: Observation timer. (Version 1.3.2) [Mobile app]. Retrieved from https://itunes.apple.com
Rafferty, L. A., Arroyo, J., Ginnane, S., & Wilczynski, K. (2011). Self-monitoring during spelling practice: Effects on spelling ac-
curacy and on-task behavior of three students diagnosed with attention deficit hyperactivity disorder. Behavior Analysis
in Practice, 4(1), 37–45. https://doi.org/10.1007/BF03391773
Rea, J., & Williams, D. (2002). Shaping exhale durations for breath CO detection for men with mild mental retardation. Journal of Applied Behavior Analysis, 35(4), 415–418. https://doi.org/10.1901/jaba.2002.35-415
Rivera, C. J., Mason, L. L., Jabeen, I., & Johnson, J. (2015). Increasing teacher praise and on task behavior for stu-
dents with autism using mobile technology. Journal of Special Education Technology, 30(2), 101–111. https://doi.
org/10.1177/0162643415617375
Romanowich, P., & Lamb, R. J. (2014). The effects of percentile versus fixed criterion schedules on smoking with
equal incentive magnitude for initial abstinence. Experimental and Clinical Psychopharmacology, 22(4), 348–355.
https://doi.org/10.1037/a0036935
Slattery, L., Crosland, K., & Iovannone, R. (2016). An evaluation of a self-management intervention to increase on-task be-
havior with individuals diagnosed with attention-deficit/hyperactivity disorder. Journal of Positive Behavior Interventions,
18(3), 168–179. https://doi.org/10.1177/1098300715588282
Sundberg, M. L. (2008). Verbal behavior milestones assessment and placement program: The VB-MAPP. AVB Press.
Valbuena, D., Miltenberger, R., & Solley, E. (2015). Evaluating an internet-based program and a behavioral coach for increasing physical activity. Behavior Analysis: Research and Practice, 15(2), 122–138. https://doi.org/10.1037/bar0000013
Washington, W. D., Banna, K. M., & Gibson, A. L. (2014). Preliminary efficacy of prize-based contingency management to
increase activity levels in healthy adults. Journal of Applied Behavior Analysis, 47(2), 231–245. https://doi.org/10.1002/
jaba.119

How to cite this article: Kwak, D., Najdowski, A. C., & Danielyan, S. (2022). Using an intervention package
with percentile schedules to increase on-task behavior. Behavioral Interventions, 1–14. https://doi.
org/10.1002/bin.1861
