
USING BEHAVIORAL SKILLS TRAINING TO TEACH BEHAVIORAL

INTERVENTIONS AND MILIEU TEACHING: A SYSTEMATIC REVIEW OF THE

LITERATURE AND EMPIRICAL INVESTIGATION

by

MYLISSA SLANE

(Under the Direction of Rebecca Lieberman-Betz)

ABSTRACT

Language impairments in children are associated with later impairments in

cognitive, language, and academic domains (Johnson et al., 1999). The prevalence rate

for language impairments is high among community samples (7% to 17%; King et al.,

2005), and speech and language disorders are often co-morbid with other

neurodevelopmental disorders (Rosenbaum & Simon, 2016). Thus, one way to ensure

greater access to services and increased intervention dosage is to train natural

implementers (those who are already part of the child’s typical environment; e.g.,

teachers in a classroom, parents/guardians in a home) to deliver evidence-based language

interventions. The purpose of the following two studies was to (a) systematically review

and synthesize the literature examining the use of behavioral skills training (BST) to train

natural implementers (i.e., teachers and other professionals) to implement various

interventions and (b) extend the current literature by utilizing BST to train teachers to

implement primary components of a language intervention, milieu teaching (MT), with

fidelity. Results of the systematic review showed that BST could be effectively used to

train teachers and staff to implement a variety of interventions (e.g., reading racetrack,
the picture exchange communication system [PECS], discrete trial teaching [DTT], the

natural language paradigm [NLP]) targeting a variety of skills and deficits. However,

only a handful of studies had sufficient rigor, quality, and interpretable outcomes to infer

a functional relation. The second study was an empirical investigation examining the

effects of BST on implementation of MT. Two teachers were taught to

implement MT using BST and both teachers learned to implement three core MT

techniques: following the child’s lead (FTCL), teaching social routines (TSR), and the

system of least prompts (SLP). A functional relation was demonstrated across each tier

for one teacher, with two out of three behaviors (FTCL and TSR) replicated across two

teachers. Results from the systematic literature review and the empirical investigation

have implications for future research in that both studies suggested natural implementers

(teachers and staff) can and should be taught to implement interventions with fidelity,

thereby increasing access to evidence-based interventions for children with disabilities.

INDEX WORDS: behavioral skills training, milieu teaching, teachers, intervention


USING BEHAVIORAL SKILLS TRAINING TO TEACH BEHAVIORAL

INTERVENTIONS AND MILIEU TEACHING: A SYSTEMATIC REVIEW OF THE

LITERATURE AND EMPIRICAL INVESTIGATION

by

MYLISSA MARY SLANE

Bachelor of Arts, Bloomsburg University of Pennsylvania, 2011

Master of Science, Bucknell University, 2013

A Dissertation Submitted to the Graduate Faculty of The University of Georgia in Partial

Fulfillment of the Requirements for the Degree

DOCTOR OF PHILOSOPHY

ATHENS, GEORGIA

2020
© 2020

Mylissa Mary Slane

All Rights Reserved



Major Professor: Rebecca Lieberman-Betz

Committee: A. Michele Lease


Amy Reschly
Joel Ringdahl

Electronic Version Approved:

Ron Walcott
Dean of the Graduate School
The University of Georgia
December 2020
ACKNOWLEDGEMENTS

I would like to thank my major professor, Dr. Rebecca Lieberman-Betz, for her

unwavering support and assistance. I never would have been able to complete this project

without her guidance, advice, and assistance. I would also like to thank my academic

advisor and committee member, Dr. Michele Lease, for believing in me and for pushing

me to continue even when things seemed impossible or insurmountable. Thank you as

well to my committee members, Dr. Joel Ringdahl and Dr. Amy Reschly, who provided

me with helpful feedback and support throughout this process. They challenged me to

think critically and helped improve my project immensely. I cannot express my gratitude

enough to those who helped me with data and reliability coding, Maggie Molony, Ali

Zelan, and Kelsie Tyson. Their hard work and dedication were truly admirable and

without them, this project would not have been possible. I would also like to thank my

family, who have supported me through this process and throughout all of graduate

school. They have always been there for me, especially when times were trying or

difficult. They are my rock and inspiration and without them, I would never have had the

courage to push myself as far as I have. I would also like to thank the school where I

completed my project for their support and cooperation throughout the project. In

addition, I want to extend my sincerest gratitude to my participants, without whom this

project would not have been possible. Thank you to all of the friends I made throughout graduate

school and especially to my cohort for always being there for one another and supporting

one another. We truly made a great team! Thank you all!

TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

1 INTRODUCTION

    Naturalistic Developmental Behavioral Interventions

    Milieu Teaching

    Behavioral Skills Training

    Purpose of the Studies

2 STUDY 1: USING BEHAVIORAL SKILLS TRAINING TO TEACH BEHAVIORAL INTERVENTIONS: A SYSTEMATIC REVIEW

    Abstract

    Introduction

    Method

    Results

    Discussion

3 STUDY 2: TEACHING THE TEACHER: USING BEHAVIORAL SKILLS TRAINING TO TRAIN TEACHERS TO IMPLEMENT MILIEU TEACHING TECHNIQUES

    Abstract

    Introduction

    Method

    Results

    Discussion

4 GENERAL DISCUSSION

REFERENCES

APPENDICES

A OUTCOME CODING DESCRIPTIONS FOR SINGLE CASE ANALYSIS REVIEW AND FRAMEWORK (SCARF)

B OBSERVATION DATA COLLECTION SHEET

C INTERVENTION FIDELITY SHEETS

D DATA COLLECTION SHEETS

E TEACHER DEMOGRAPHICS FORM

F CHILD DEMOGRAPHICS FORM

G SOCIAL VALIDITY MEASURE

LIST OF TABLES

Table 2.1: Participant Demographics

Table 2.2: Study Outcomes

Table 2.3: Rigor Coding Questions from the Single Case Analysis Review and Framework (SCARF)

Table 2.4: Quality & Breadth of Measurement Coding Questions from the Single Case Analysis Review and Framework (SCARF)

Table 2.5: SCARF Quality, Rigor, and Outcome Scores

Table 3.1: Teacher Participant Demographics

Table 3.2: Child Participant Demographics

Table 3.3: IOA Agreement by Tier and Condition

LIST OF FIGURES

Figure 2.1: Preferred Reporting for Systematic Reviews and Meta Analyses (PRISMA) Flow Diagram

Figure 2.2: SCARF Quality and Rigor of Primary Outcomes

Figure 2.3: SCARF Quality and Rigor of Generalized Outcomes

Figure 2.4: SCARF Quality and Rigor of Maintained Outcomes

Figure 3.1: Accurate Use of the Prompting Hierarchy

Figure 3.2: Fidelity of Implementation Across Behaviors for Ms. Smith

Figure 3.3: Fidelity of Implementation Across Behaviors for Mr. Parker

CHAPTER 1

INTRODUCTION

Language delays in children can have serious negative implications for future

development, including educational and social development (Peterson, 2004). Prevalence

rates of speech and language disorders in children vary based on child age and diagnostic

criteria, but these disorders are estimated to affect between 3% and 16% of children in the US

(Rosenbaum & Simon, 2016). They are also found to frequently co-occur with other

neurodevelopmental disabilities, such as autism spectrum disorder (ASD; Rosenbaum &

Simon, 2016). Given the impact of language delays on daily functioning and social

interactions, and the importance of language development for growth and development in

other areas, the need for intervention for children with language delays cannot be

overstated.

Discrete trial teaching (DTT) became one of the most widely implemented

interventions for communication delays and many other developmental needs for

individuals with ASD (Schreibman et al., 2015). During the process of DTT, a targeted

skill is broken down into several components and the child and interventionist work on

learning one component at a time until the target behavior is mastered (Schreibman et al.,

2015). However, several limitations with the use of DTT in individuals with ASD were

identified including lack of generalization, challenging behaviors, a lack of spontaneity,

and a heavy dependence on prompts for performance (Schreibman et al., 2015).

According to Schreibman et al. (2015), the identification of these shortcomings, along

with a surge in the literature examining developmental interventions for young children

with social-communication disorders such as ASD, led to increased interest in more

naturalistic interventions.

Naturalistic Developmental Behavioral Interventions

More naturalistic teaching strategies began to be used after a surge in

developmental research suggested that children with ASD learn on a developmental

trajectory that is more similar to, rather than different from, typical development

(Schreibman et al., 2015). This, in combination with the limitations of DTT, led to an

incorporation of more naturalistic intervention-based strategies for children with ASD

(Schreibman et al., 2015). Naturalistic intervention strategies use natural reinforcers (as

compared to arbitrary), use materials that children prefer, reinforce child attempts at

communicating or approximations of a target response, and intervene in a more natural

context rather than one that is contrived. Naturalistic developmental behavioral

interventions (NDBIs) combine these naturalistic intervention strategies with

developmental interventions and principles of applied behavior analysis to create a class

of interventions all its own (Schreibman et al., 2015). According to Schreibman et al.

(2015), there are several common features of NDBIs, including (1) a three-part

contingency; (2) manualized practice; (3) individualized treatment goals; (4) ongoing

measurement of progress; (5) child-initiated teaching episodes; (6) environmental

arrangement; (7) use of prompting and prompt fading; (8) modeling; (9) adult imitation

of the child’s language, play, or body movements; and (10) broadening the attentional

focus of the child. Naturalistic interventions, including NDBIs, incorporate not only the

natural environments of children (e.g., classroom, home), but also intervention

implementers who are part of children’s natural environments (e.g., parents, teachers).

Training natural implementers to carry out NDBIs has the potential to increase

dosage substantially for children with disabilities receiving intervention within the

natural environment (Peterson, 2004). Parents and teachers have many more

opportunities throughout the day and the week to implement intervention techniques and

to help address children’s identified needs. Teachers have been identified as one of the

prime candidates in terms of natural implementers and several studies have demonstrated

their ability to implement various interventions with fidelity. For example, teachers have

been trained to implement prelinguistic milieu teaching (PMT; McCathren, 2000),

manualized interventions targeting joint attention (Kaale et al., 2012), enhanced milieu

teaching (Olive et al., 2007), naturalistic language teaching (Smith & Camarata, 1999),

and symbolic play and joint attention interventions (Wong, 2013). Indeed, several

intervention techniques including time delay, the mand-model procedure, and milieu

teaching have been identified as particularly suited to teacher implementation (Peterson,

2004).

Milieu Teaching

Milieu Teaching (MT) is an intervention technique that is conversation-based and

focuses on the child’s interests to encourage communication from the child (Kaiser et al.,

1993). MT has been successfully implemented with children with speech/language and

other communication delays. There are three main mechanisms to MT, which include

environmental arrangement, responsive interaction techniques, and milieu teaching

procedures (time delay, modeling, and mand-modeling; Peterson, 2004). As part of

environmental arrangement, the teacher or interventionist sets up the environment in such

a way as to encourage communicative acts on the part of the child (Peterson, 2004). For

example, an interventionist may place an object out of reach of the child or in a place

that the child cannot easily access without the teacher’s help. It is the hope that such

object placement will occasion the child to communicate with the teacher in order to

receive help obtaining the desired item. Responsive interaction techniques include:

following a child’s lead, turn-taking, providing descriptive statements, imitating the

child’s verbalizations, and expanding on the child’s statements (Peterson, 2004). Time

delay is a procedure in which one uses nonvocal cues to occasion vocal responding from

a child (Peterson, 2004). During this procedure a teacher identifies something that the

child wants or desires and looks at them expectantly in the hope of occasioning a vocal

response. If this method does not work, the teacher then moves on to the mand-model

procedure. The mand-model procedure is one that involves both manding (making a

request of the child) and modeling (demonstrating for the child what he/she is

expected to do). This is a teacher-initiated strategy in which the teacher directly asks the

child what he/she wants and then models the appropriate vocal response if no response is

given.

MT has a strong research base supporting its efficacy in teaching children with

language delays new language targets (Bolzani et al., 1990). Mand-model and incidental

teaching techniques improved communication in children with mild intellectual disability

(Warren & Gazdag, 1990), improved spontaneous production of multiple and single

words in children who experienced prenatal cocaine exposure (Bolzani et al., 2009),

increased language in children with developmental disabilities (Togram & Erbas, 2010),

and taught a photo exchange system to a child with ASD (Ogletree et al., 2012). In

addition to having trained implementers utilize MT to increase language gains in young

children with developmental disabilities, several studies have taught teachers to

implement MT with young children and also demonstrated positive effects on children’s

language development (e.g., Kaiser et al., 1993). Therefore, research has shown that

children can benefit from teachers’ implementation of MT. However, the methods used

for training teachers to implement such interventions vary across studies.

Behavioral Skills Training

Behavioral Skills Training (BST) is a comprehensive training package that

incorporates four main elements: instruction, modeling, rehearsal, and

feedback (Kornacki et al., 2013; Ward-Horner & Sturmey, 2012). During the instruction

phase, the trainer provides information on the target intervention (usually in the form of

written directions or a slideshow). During modeling, the trainer demonstrates how to

perform the steps of the desired skill accurately for the learner. The rehearsal portion

gives the learner the opportunity to practice the desired skill with the trainer, so that

he/she may become more comfortable performing the target behaviors. Finally, the

trainer provides the trainee with corrective feedback as he/she implements the target

behaviors (Krumhus & Malott, 1980; Nuernberger et al., 2013).

BST has been used not only to train individuals to perform new behaviors or

engage in new tasks, but it has also been used to train teachers and other professional

staff to conduct a variety of interventions. For example, BST has been used to train

teachers to implement the Picture Exchange Communication System (PECS) with their

students (Homlitas et al., 2014), to train staff to implement mand training (Nigro-Bruzzi

& Sturmey, 2010), to train teachers and staff to use discrete trial teaching (DTT; Jull &

Mirenda, 2016; Sarokoff & Sturmey, 2004), to train teachers to implement specific goals

from students’ behavioral intervention plans (BIP; Madzharova & Sturmey, 2018), and to

train teachers to implement response interruption and redirection (RIRD) with students

exhibiting self-stimulatory behaviors (Giles et al., 2018).

Additionally, BST has been used to train implementers of several NDBIs in the literature,

including training teachers to implement the Natural Language Paradigm (NLP) and response

chaining (Seiverling et al., 2010), training teachers to effectively use NLP

(Gianoumis et al., 2012), and training paraprofessionals to implement the system of least

prompts using an embedded teaching procedure (Toelken & Miltenberger, 2012). The latter is

a key study because the use of a brief, embedded teaching procedure ensured that the

staff could perform the intervention effectively while still performing their regular

classroom duties, thereby ensuring that the intervention did not interfere with their

primary teaching duties.

Thus, BST has been used to train a variety of NDBI techniques with a variety of

implementers, including teachers and other professional staff. Similarly, MT has been

shown to be effective in improving children’s language gains and outcomes. However,

studies examining MT report using a variety of training techniques, but do not report

utilizing BST as a training package.

Purpose of the Studies

The purpose of the following studies is to: (1) provide a systematic review of the

current literature regarding the use of BST to train teachers and other professional staff to

implement various behavioral interventions; and (2) conduct an empirical investigation

examining the effects of BST on teachers’ fidelity of implementation of various MT

techniques. First, Study 1 sought to determine whether BST has been used effectively to

train other individuals (teachers and other professionals) to implement interventions with

children ages birth to 21 through a systematic review of the current literature. It also

sought to determine the level of quality of the literature base. Finally, Study 1 sought to

determine whether there were any variables that impact the efficacy of BST training on

fidelity of intervention implementation. Second, Study 2 sought to determine whether

BST was effective in training teachers to implement several MT techniques in the

classroom with fidelity. This study also examined whether the fidelity of implementation

of intervention techniques generalized to another set of toys and whether it maintained

over time. The combined results of these two studies have the potential to increase the

confidence in the use of teachers and other professional staff as natural implementers

who can implement interventions with fidelity when properly trained. This has the

potential to increase dosage for students receiving interventions if they can be properly

implemented by teachers, who spend a great deal more time with students than most

other interventionists.

CHAPTER 2

STUDY 1: USING BEHAVIORAL SKILLS TRAINING TO TEACH

BEHAVIORAL INTERVENTIONS: A SYSTEMATIC REVIEW¹

¹ Slane, M. M., & Lieberman-Betz, R. To be submitted to Behavioral Interventions.
Abstract

Behavioral skills training (BST) is a well-researched, established set of principles that has

been used to train a variety of individuals to complete numerous behavioral tasks, skills,

strategies, and interventions (DiGennaro et al., 2018). It includes the four main

components of instruction, modeling, rehearsal, and feedback (DiGennaro et al., 2018)

and has been used to teach individuals from a variety of backgrounds (including natural

implementers) to implement various interventions (e.g., Madzharova & Sturmey, 2018;

Nigro-Bruzzi & Sturmey, 2010; Sarokoff & Sturmey, 2004). However, there has not been

a comprehensive review that has synthesized and analyzed the existing literature on the

use of BST to train staff and teachers working with children, adolescents, and young

adults to implement interventions. Therefore, the current review aimed to close this gap

in the literature by conducting a systematic review of studies utilizing BST to train

teachers and other professionals to implement various interventions with children ages

birth to 21. A total of 19 studies from 17 articles were included in the review. The

SCARF protocol (Ledford et al., 2016) was utilized to rate article quality/rigor and

outcomes of studies. All studies showed positive outcomes, suggesting that teachers and

other professional staff can be effectively taught using BST to implement a variety of

interventions with fidelity. However, only seven articles were found to have sufficient

quality/rigor scores in their primary outcomes to allow for interpretation of findings with

confidence. This indicates that additional high-quality studies are needed to examine the

efficacy of BST to teach others to implement intervention to support skill development in

individuals with disabilities. Implications for future research and intervention are

discussed.

INDEX WORDS: behavioral skills training, systematic literature review, natural

implementers, teachers

Using Behavioral Skills Training to Teach Behavioral Interventions: A Systematic

Review

Behavioral skills training (BST) is a well-researched, established set of principles

that has been used to train a variety of individuals to complete numerous behavioral

tasks, skills, strategies, and interventions (DiGennaro et al., 2018). In one of the earliest

studies to examine BST, Koegel et al. (1977) trained teachers to implement several

behavior modification procedures, including prompting, shaping, discrete trials, and

proper implementation of consequences. BST procedures involved reading a training

manual and watching video recordings of proper instructional techniques; implementing

the behavioral modification strategies and receiving live feedback from trained staff

regarding their performance; and receiving praise for correct performance, and corrective

feedback and modeling to rectify improper implementation. Using these procedures, the

authors were able to train teachers to implement behavioral modification techniques with

fidelity. Alden et al. (1978) further expanded the procedures that comprise BST, using

modeling and rehearsal as part of the initial training procedures rather than the corrective

feedback procedure. Around this time, recognition of the potential of BST to teach

important behaviors grew, and numerous studies examining its utility were published.

Several studies have utilized BST to increase knowledge and safety skills in

young children (e.g., Kolko et al., 1991; Miltenberger et al., 2009; Miltenberger &

Thiesse-Duffy, 1988; Wurtele & Owens, 1997; Wurtele, 1990), to teach children how to

find help when lost (Pan-Skadden et al., 2009), to teach safety skills aimed at preventing

sexual abuse (Wurtele et al., 1986), and to prevent child abduction (e.g., Bromberg &

Johnson, 1997; Johnson et al., 2005, 2006). BST has also been used to help promote

knowledge and prevent the risk of HIV/AIDS (e.g., Adams et al., 1992; Boyer &

Kegeles, 1991; Lawrence et al., 1995), to prevent gun play in young children (e.g., Himle

& Miltenberger, 2004; Miltenberger et al., 2004, 2005), and to encourage smoking

cessation (Glasgow & Lichtenstein, 1987). In recent years, the use of BST has further

expanded to include training teachers, staff, and parents to implement behavior analytic

principles as well as other interventions.

BST has been used to train teachers to implement complex behavior intervention

plans in the classroom (Madzharova & Sturmey, 2018); to train staff to use mand training

with children (Nigro-Bruzzi & Sturmey, 2010); to use discrete-trial teaching (Sarokoff &

Sturmey, 2004); and to improve the use of positive reinforcement and error correction and

increase opportunities for responding (Palmen et al., 2010). BST has also been used to

train intervention-naïve adults to implement the picture exchange communication system

(PECS; Rosales et al., 2009). Additionally, researchers have used BST to train parents to

implement a variety of behavior analytic techniques. Parents have been taught to improve

food selectivity (Seiverling et al., 2012), promote social skills development (Hassan et

al., 2018), implement guided compliance (Miles & Wilder, 2009), and implement

discrete-trial teaching (Ward-Horner & Sturmey, 2008).

Because of the extensive use of BST to train individuals to use or implement new

skills, several studies have sought to identify the most potent components of the

intervention package. Krumhus and Malott (1980) independently analyzed three

components of BST, including (1) instructions, (2) modeling, and (3) feedback. Although

use of instructions alone showed slight improvements in accuracy, use of modeling

drastically increased accuracy, and use of feedback led to further increases. In a follow-up

study, Ward-Horner and Sturmey (2012) found that while modeling was an important

component of BST, feedback was the most effective and necessary component of the

training package. However, Kornacki et al. (2013) found that the key component for

BST success varied by individual participants. Given the results of these studies, BST is

now generally considered a four-component training package consisting of (1)

instruction, (2) modeling, (3) rehearsal, and (4) feedback (DiGennaro et al., 2018).

The BST literature has grown tremendously over the last 40 years, with its uses,

trainees, and contexts increasingly expanding. The field has identified the critical

components of this training package and has established it as a well-supported, evidence-

based training package. The contexts in which BST can be applied are constantly

increasing, expanding both the utility and applicability of BST to numerous behavior

analytic principles and interventions. In addition, BST has evolved from a training

package to change specific target behaviors to one that can be used to train other

individuals to implement a variety of intervention procedures. However, there has not

been a study that has synthesized and analyzed the existing literature on the use of BST to

train staff and teachers working with children, adolescents, and young adults to

implement various interventions. Such an analysis could benefit the field in several ways.

First, it would provide a comprehensive overview of the current BST-intervention

literature, including the populations with which it has been implemented, the topics of

focus (i.e., interventions, behavioral principles), and the outcomes of BST

implementation. Second, it would bring to light variables or factors that may influence

the efficacy/effectiveness of BST to teach others to implement interventions. Third, such

a review could aid future researchers and clinicians in determining whether BST is an

appropriate training package for training others to implement a target intervention and

highlight any significant considerations. Fourth, a review would provide a sense of the

quality of the current literature and in turn provide directions for future research to help

increase the quality of future studies.

Purpose of the Review

The purpose of the current review is to systematically synthesize and analyze the

BST-intervention literature to help guide both research and practice. This review will

support the translation of research into practice and will help guide clinical decision

making for assessing and evaluating whether BST is the appropriate training package

based on type of intervention, outcome variables, population, context, and quality of

published studies. The following research questions will be addressed:

1. Is BST efficacious when used to train other individuals to implement

interventions with children, adolescents, and young adults (ages birth to 21)?

2. What is the quality of studies comprising the current literature base examining the

use of BST to teach interventions?

3. What, if any, are the pertinent intervention variables that influence the

efficacy/effectiveness of BST on acquisition of intervention skills?

Method

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)

guidelines were used to guide decision making throughout the literature review. The

PRISMA guidelines were developed in an effort to ensure scientific rigor and structural

commonality among systematic reviews and meta-analyses (Moher et al., 2009).

Article Search

A keyword search of the following databases was conducted to identify studies

for review: (1) PsycINFO, (2) Psychology and Behavioral Sciences Collection, and (3)

Education Research Complete. Search terms were entered as follows: Line 1, “behavioral

skills training”; AND Line 2, “intervention.” If available, the following options were also

selected: publication type, peer-reviewed journals; language, English. Eligible studies were

published in or before December 2019, when the search was conducted. In addition

to the database search, the reference lists of all eligible studies were reviewed for relevant

articles to increase the scope of the search.

The following inclusion criteria were used to identify eligible studies: (1) the

study must have been published in a peer-reviewed journal (dissertations and theses were

excluded); (2) the study must either have been written in or translated to English; (3) the

study must have been quantitative in nature, utilizing either a group or single case design;

(4) the phrase “behavioral skills training” must have appeared somewhere in the article

(not the references section alone); and (5) the study must have used BST to train teachers,

paraprofessionals, or other staff to implement some form of intervention with children,

adolescents, or young adults (ages birth to 21 years).

A total of 172 studies were identified through the initial online search, with an

additional 21 studies identified through other sources, for a total of 193 studies to be

screened for full inclusion criteria. After duplicates were removed, a total of 146 studies

remained. The abstracts of these studies were reviewed using the full eligibility criteria

and a total of 124 were excluded. The remaining 22 full text articles were examined to

confirm eligibility and five were excluded due to failure to meet full inclusion criteria. In

the end, a total of 19 studies from 17 articles were included in the review (see Figure 2.1

for a complete diagram of study inclusion/exclusion).
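
The screening counts reported above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original review procedure; the variable names are illustrative, and the number of duplicates removed is derived from the reported totals rather than stated in the text.

```python
# Counts reported in the text; variable names are illustrative assumptions.
database_hits = 172          # studies from the initial online search
other_sources = 21           # studies identified through other sources
after_duplicates = 146       # studies remaining after duplicates were removed
excluded_at_abstract = 124   # excluded during abstract review
excluded_at_full_text = 5    # excluded during full-text review

identified = database_hits + other_sources
duplicates_removed = identified - after_duplicates        # derived, not reported
full_text_reviewed = after_duplicates - excluded_at_abstract
articles_included = full_text_reviewed - excluded_at_full_text

print(identified, duplicates_removed, full_text_reviewed, articles_included)
# prints: 193 47 22 17
```

The final value corresponds to the 17 included articles; the 19 included studies reflect articles that reported more than one study.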

Article Coding

Descriptive Information

Studies were reviewed and coded for descriptive characteristics. Data were

collected on trainees (e.g., age and gender), intervention recipients (e.g., age, diagnoses,

gender, demographic information), BST implementation (components used and fidelity),

intervention type, and intervention quality (setting, target behaviors, fidelity, and

effectiveness).

Evaluation Criteria for Study Quality

Because all included studies used a single case design (SCD), the methodology of each was

evaluated using the Single-Case Analysis and Review Framework (SCARF; Ledford et

al., 2016). The SCARF was designed to assess study (1) rigor, (2) quality, and (3)

outcomes. Outcome scores on the SCARF of 3.0 or higher are consistent with

confidence in the demonstration of a functional relation. Similarly, quality/rigor scores of

2.0 or higher are considered to be strong enough for the study results to be interpreted.

The three areas assessed are discussed in more detail below.

Rigor. The three quality indicators for study rigor are reliability, fidelity, and

sufficiency of data. Evidence of the reliability of the dependent variable is determined

through examining the collection, reporting, and levels of interobserver agreement (IOA)

data. Evidence of fidelity of the independent variable is determined by looking at the

fidelity of implementation, sufficiency of the fidelity data, the frequency of fidelity data

collection, and the use of inter-observer agreement for fidelity data. Finally, evidence of

the sufficiency of the data is determined by examining the number of data points per

condition and the overall trend of the data when switching between conditions.

Quality of measurement. The seven indicators for study quality are social and

ecological validity, participant descriptions, condition descriptions, dependent variable

descriptions, two forms of generalization measurement, and measurement of

maintenance. Evidence of the social and ecological validity of the study is determined

through examining feasibility and acceptability data, psychometric properties of utilized

measures, normative comparisons for dependent variables, and the environment in which

the study is implemented. Evidence for participant description is determined by

examining quality of demographic data, reporting of formal test results, general

participant information, and study inclusion criteria. Condition descriptions are evaluated

by analyzing the description of condition procedures, the dosage, the setting, and the

demographics and training characteristics for the individuals implementing the

interventions.

The dependent variables are evaluated in terms of their operational definitions,

examples of target and non-target behaviors, and the description of the measurement

system and its use. Evidence for generalization is determined by evaluating the

description of generalization across contexts, materials, individuals, and settings in

addition to the programming of behavior generalization. Finally, evidence for

maintenance is evaluated by examining the reporting of continued data collection and

behavior change, the number of times maintenance is evaluated, and the time frame

during which maintenance is assessed.

Outcomes. The three quality indicators for study outcomes involve reporting for

primary outcomes, generalized outcomes, and the maintenance of outcomes. Evaluation

of these quality indicators requires the examination of the type of measurement used to

determine outcomes, the strength and evidence for treatment efficacy/effectiveness, and

the generalization and maintenance of reported study outcomes.

Interrater Reliability

First, interrater reliability (IRR) was examined for inclusion of studies.

Approximately 30% of all considered studies were reviewed by a second rater to

determine IRR (n = 43). Studies were judged to be included, excluded, or uncertain by

the second rater. Any study that was placed in the uncertain group and any disagreements

on inclusion were fully reviewed, discussed, and resolved by the two reviewers until a

consensus was reached. When screening a study for inclusion, raters first searched the

article for the phrase “behavioral skills training.” If the study did not include this phrase,

it was excluded from further eligibility review. If the study did include this phrase, the

raters then reviewed it to ensure it satisfied all five of the inclusion criteria described

above for inclusion in the review. IRR for inclusion/exclusion of studies was 88%.

For studies meeting criteria for full review, 30% were reviewed by a second rater

to determine IRR for article quality coding using the SCARF (n = 6). The two raters had

to demonstrate 80% reliability on two consecutive training studies before reliability

coding could begin. If an article’s IRR rating fell below 80%, discrepancies were

reviewed, discussed, and resolved by the two raters until a consensus was reached. IRR

was determined by dividing the number of agreements by the number of agreements plus

disagreements and multiplying by 100. Average IRR was 80% (range: 74% to 85%).
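
The agreement formula described above can be expressed as a short function. This is a minimal sketch under stated assumptions: the function name and the example tallies are hypothetical and are not data from the review; only the formula and the 80% criterion come from the text.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Agreements divided by (agreements + disagreements), multiplied by 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("no coded items to compare")
    return 100 * agreements / total


# Hypothetical tallies for one double-coded article (not data from the review).
example_irr = percent_agreement(agreements=34, disagreements=6)

# The text states that ratings below 80% triggered discussion until consensus.
needs_consensus = example_irr < 80

print(f"{example_irr:.1f}%", needs_consensus)
# prints: 85.0% False
```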

Data Analysis

The following methods were used to address each of the proposed research

questions:

1. To determine whether BST was efficacious when used to train other individuals to

implement some form of intervention with children, adolescents, and young adults

(ages birth to 21), the primary, generalized, and maintenance outcome SCARF

ratings for each study were analyzed and synthesized. Outcome scores of 3.0

or higher are consistent with confidence in the demonstration of a functional

relation. This information allowed for the determination of BST

efficacy/effectiveness in improving immediate intervention outcomes as well as

generalized and maintained outcomes.

2. In order to evaluate the quality of studies in the current literature base, all SCARF

variables were used to assign overall quality/rigor ratings. Scores of 2.0 or higher

are considered to be strong enough for the study results to be interpreted.

3. To determine the pertinent intervention variables that influence the

efficacy/effectiveness of BST on acquisition of intervention skills, procedural

fidelity and use of BST components were examined.
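
The two SCARF thresholds used in this analysis can be combined into a simple decision rule. The sketch below is an illustration only: the class, function, and category labels are hypothetical paraphrases, and the official SCARF scoring procedure (Ledford et al., 2016) is not reproduced here. Only the 2.0 and 3.0 cut-offs come from the text.

```python
from dataclasses import dataclass

QUALITY_RIGOR_MIN = 2.0  # quality/rigor threshold stated in the text
OUTCOME_MIN = 3.0        # outcome threshold stated in the text


@dataclass
class ScarfScores:
    """Hypothetical container for one study's summary scores."""
    quality_rigor: float
    outcome: float


def interpret(scores: ScarfScores) -> str:
    """Return an illustrative label based on the two thresholds."""
    strong_outcome = scores.outcome >= OUTCOME_MIN
    interpretable = scores.quality_rigor >= QUALITY_RIGOR_MIN
    if strong_outcome and interpretable:
        return "high-quality evidence of positive effects"
    if strong_outcome:
        return "low-quality evidence of positive effects"
    return "functional relation not demonstrated"


# Made-up scores for illustration; these are not values from Table 2.5.
print(interpret(ScarfScores(quality_rigor=2.4, outcome=3.5)))
print(interpret(ScarfScores(quality_rigor=1.6, outcome=3.5)))
```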

Results

Research Design

All included studies used a multiple baseline or multiple probe experimental

design (see Table 2.1). Eleven studies utilized a multiple baseline across participants

design (one of which was nonconcurrent). One study used a multiple baseline across

behaviors design. Of the seven studies using multiple probe designs, all were across

participants.

Participant Characteristics

As shown in Table 2.1, 11 studies examined the use of BST to train teachers as

implementers of interventions, seven studies examined the use of BST to train clinic staff

to implement intervention, and one study examined the use of BST to train a swimming

instructor to implement intervention with their student. Across all 19 studies, 74 natural

implementers were trained using BST. Natural implementers ranged in age from 19 to 50 years;

however, eight of the included studies did not report the age range for the participants

trained using BST. Implementers trained using BST had a variety of years of experience

ranging from no experience to multiple years of experience. Implementers' race/ethnicity

was only reported in one study (Chazin et al., 2018).

Across 17 studies, a total of 71 intervention recipients participated, with an age

range of 2 to 12 years; however, two studies did not report the number of intervention

recipients (Aherne & Beaulieu, 2019; Palmen et al., 2010) and an additional three studies

did not include their age ranges (Davenport et al., 2019; Hassan et al., 2017; Hogan et al.,

2015). Child intervention recipient race/ethnicity data were not reported in any studies

included in this review. The majority of child intervention recipients had a diagnosis of

autism spectrum disorder (ASD; n = 12 studies). Two studies listed developmental

disability as the primary diagnosis; one listed multiple disabilities, including global

developmental delay; one listed multiple physical disabilities; and diagnosis was not

reported in three studies (Aherne & Beaulieu, 2019; Davenport et al., 2019; Hogan et al.,

2015).

General Intervention Characteristics

The majority of the studies were conducted in a classroom/school environment (n

= 13). The remaining occurred at a home/training center (n = 1), a clinic/training center

(n = 1), a clinic (n = 1), a community pool (n = 1), a home (n = 1), and a treatment center

for individuals with ASD (n = 1). BST was used to target a variety of intervention skills,

including discrete trial teaching (DTT; n = 4), fidelity of behavior intervention plan (BIP)

implementation (n = 2), the natural language paradigm (NLP; n = 2), reading racetrack

intervention (n = 1), incidental teaching (n = 1), and others. Sixty-eight percent of studies

(n = 13) also reported outcomes for child recipients, such as unprompted functional

communications, number of sight words read correctly, percentage of correct responses,

vocalizations, and stereotypy (see Table 2.2). All 19 studies reported using all four BST

intervention components (instruction, modeling, rehearsal, and feedback).

All 19 studies reported positive effects of BST, with BST effectively improving

teachers’ and staff members’ implementation of the trained intervention. Implementation measures included

percentage of correct responses for discrete trial teaching (DTT), percentage of Reading

Racetrack intervention steps implemented correctly, percentage of steps performed

correctly in a natural language paradigm (NLP) intervention, and number of errors in

implementing response interruption and redirection (RIRD) for stereotypy. See Table 2.2

for a complete list of the foci of BST and recipient outcomes measured. Interestingly,

Aherne and Beaulieu (2019) reported the use of a self-evaluation procedure to help with

maintenance of learned skills. This is similar to the self-recording of performance

described by Nabeyama and Sturmey (2010); together, these studies suggest that BST can be

combined with self-monitoring procedures to improve outcomes. Seiverling et al. (2010)

combined general-case training with BST in order to improve NLP and response chaining

performance. Finally, Chazin et al. (2018) found that BST effectively improved staff

implementation of student behavior intervention plans when training was combined with

coaching but not when training was delivered alone. Thus, while the majority of studies demonstrated

that BST alone was effective at improving teacher and staff fidelity of implementation of

various interventions, several studies noted the inclusion of other training components as

well.

Study Rigor

All 19 studies reported using all four BST intervention components (instruction,

modeling, rehearsal, and feedback). Thus, all key components were included for BST as

it was implemented in the studies. Ledford et al. (2016) describe study rigor as reliability

of the dependent variable, procedural fidelity, and the data itself (see Table 2.3 for coding

details). Overall, 100% of studies reported dependent variable reliability data (n = 19);

84.2% reported collecting reliability data in both primary comparison conditions for at

least 20% of sessions overall and reported greater than 80% agreement. Thus, the studies

in this review had mostly strong reliability data. In contrast, only 42.1% (n = 8) of studies

reported collecting fidelity data for the independent variable, and of those eight studies, only

37.5% (n = 3) reported collecting fidelity data in both primary conditions. However,

100% (n = 8) of these studies reported fidelity data of 80% or higher. Thus, the studies

that did report procedural fidelity reported good procedural fidelity with high ratings.

Even so, fewer than half of the studies in this review included fidelity data and only three

reported collecting such data in both primary conditions. This lack of fidelity data across

over half the studies calls into question whether the interventions were implemented as

intended, and thus impacts the confidence with which we can attribute positive findings

to the BST training. Regarding sufficiency of data, 68.4% (n = 13) of studies had at least

three data points per condition and 84.2% (n = 16) had enough data to infer a functional

relation. For the studies that did not satisfy these criteria, however, inferences of a

functional relation between the independent and dependent variables are limited. A greater

number of data points allows for the detection of patterns, variability, and trends in the data,

whereas only two data points can suggest a trend in one direction or another that may prove

highly misleading once additional data points are collected.

Study Quality

Study quality is composed of participant, condition, and dependent variable

descriptions; the examination of social/ecological validity; and maintenance and

generalization (see Table 2.4 for coding details). When describing participants, only

42.1% (n = 8) of studies gave complete descriptions of participants including the number

of participants, ages, gender, etc. In fact, only one study included race/ethnicity in its

description of participants, but only for implementers and not for intervention recipients.

This was a relative weakness for the studies included in this review, making it difficult to

determine for whom BST may be an appropriate and effective intervention. Conditions

for both primary comparison conditions were adequately described in 84.2% (n = 16) of

studies, demonstrating a relative strength for this group of studies. Similarly, authors

adequately described observable characteristics of dependent variables (i.e., operational

definitions) in 78.9% (n = 15) of studies, demonstrating yet another relative strength for

the studies included in this review. Of note, failure to include operational definitions is a

serious problem for reporting measurement of the dependent variable: it not only

makes the study difficult to replicate but also makes it difficult to determine the specific

behaviors that were measured as part of the study.

Overall, 73.7% (n = 14) of studies reported examining social/ecological validity.

Therefore, this is a relative strength of the studies included in this review. A total of

47.4% of studies (n = 9) assessed generalization in some form. One study assessed

generalization across contexts, four studies assessed generalization across individuals,

and five studies assessed generalization across responses. Of note, one study assessed

generalization across both responses and individuals (Sarokoff & Sturmey, 2008). Thus,

fewer than half of the studies in this review assessed generalization in any form, indicating a

weakness for this group of studies. Similarly, only 42.1% of studies (n = 8) assessed

maintenance of outcomes. With fewer than half of the studies having assessed maintenance,

this also represents a weakness for this group of studies.
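
The percentages in this section are study counts expressed out of the 19 included studies. The following minimal sketch reproduces them from the reported counts; the helper name and dictionary keys are illustrative assumptions, while the counts themselves are taken from the text.

```python
TOTAL_STUDIES = 19  # number of included studies reported in the review


def pct(n: int, total: int = TOTAL_STUDIES) -> float:
    """Percentage of studies, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Counts reported in the Study Quality section; keys are illustrative labels.
quality_counts = {
    "complete participant descriptions": 8,
    "both comparison conditions described": 16,
    "operational definitions provided": 15,
    "social/ecological validity examined": 14,
    "generalization assessed": 9,
    "maintenance assessed": 8,
}

for indicator, n in quality_counts.items():
    print(f"{indicator}: {pct(n)}% (n = {n})")
```

Running the sketch yields 42.1%, 84.2%, 78.9%, 73.7%, 47.4%, and 42.1%, matching the values reported above.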

Primary Outcomes

Study rigor, quality, and outcome scores are provided in Table 2.5. For a

complete description of study outcome coding, see Appendix A. Overall quality/rigor

scores of 2.0 or higher are considered to be strong enough for the study results to be

interpreted. In addition, outcome scores of 3.0 or higher are consistent with

confidence in the demonstration of a functional relation (Ledford et al., 2016). As shown

in Table 2.5, based on outcome data for primary outcomes, all but two studies

demonstrated an outcome score of 3.0 or higher. However, when combined with their

overall quality/rigor scores, only seven studies (36.8%) had sufficient overall

quality/rigor and outcome scores to infer a functional relation with confidence (Chazin et

al., 2018; Davenport et al., 2019; Fetherston & Sturmey, 2014 [Study 1]; Homlitas et al.,

2014; Nabeyama & Sturmey, 2010; Sarokoff & Sturmey, 2008; Seiverling et al., 2010),

while 63.2% of studies (n = 12) had scores indicating low quality evidence of positive

effects (see Figure 2.2). Studies with high enough overall quality/rigor and outcomes

scores on the SCARF to be considered rigorous and high-quality examined the use of

BST to train fidelity of implementation of behavior intervention plans (BIP; Chazin et al.,

2018), reading racetrack intervention (Davenport et al., 2019), the picture exchange

communication system (PECS; Homlitas et al., 2014), discrete trial teaching (DTT;

Fetherston & Sturmey, 2014 [Study 1]; Sarokoff & Sturmey, 2008), the natural language

paradigm (NLP) and response chaining (Seiverling et al., 2010), and guarding procedures

for patients with ambulatory difficulties (Nabeyama & Sturmey, 2010).

Generalization and Maintenance

Generalization was assessed in 47.4% (n = 9) of studies (outcome scores greater

than 0 in Table 2.5). Of those studies that assessed generalization, 88.9% (n = 8) had

sufficient overall quality/rigor and outcome ratings for their generalization findings to be

interpreted with confidence (Fetherston & Sturmey, 2014 [all three studies]; Gianoumis

et al., 2012; Nabeyama & Sturmey, 2010; Nigro-Bruzzi & Sturmey, 2010; Palmen et al.,

2010; Sarokoff & Sturmey, 2008). Generalization was assessed across responses (n = 4;

Fetherston & Sturmey, 2014 [all three experiments]; Palmen et al., 2010), individuals (n

= 3; Gianoumis et al., 2012; Nabeyama & Sturmey, 2010; Sarokoff & Sturmey, 2008),

and contexts (n = 1; Nigro-Bruzzi & Sturmey, 2010). For seven out of eight studies,

generalization was measured in the context of the study design, and for one study it was

measured pre- and post-intervention. Finally, all but one study received a rating indicating

that consistent, positive effects were shown via the context of the measurement design.

The remaining study received a rating indicating inconsistent or weak positive

generalization effects. According to Figure 2.3, which shows the overall

quality and rigor of generalization measurement and generalized outcomes, 88.9% (n = 8)

of studies that measured generalization demonstrated high quality evidence of positive

effects and 11.1% of studies (n = 1) that measured generalization had low quality

evidence of positive effects. Overall, these studies suggest that skills taught through BST

can generalize across responses, individuals, and contexts.

Maintenance was assessed in 42.1% (n = 8) of studies (outcome scores greater

than 0 in Table 2.5). Of those studies that assessed maintenance, 87.5% (n = 7) had

sufficient overall quality/rigor and outcome ratings for their maintenance findings to be

interpreted with confidence (see Figure 2.4; Aherne & Beaulieu, 2019; Davenport et al.,

2019; Hassan et al., 2017; Homlitas et al., 2014; Jimenez-Gomez et al., 2019; Nabeyama

& Sturmey, 2010; Palmen et al., 2010). Four out of seven studies received scores

indicating that maintenance data were collected at least one week but less than one month

after intervention was completed. The remaining three studies collected maintenance data

one or more months after the completion of intervention. Finally, all seven studies

showed maintenance data similar to intervention or criterion, and five of the seven

measured maintenance on more than one occasion. Additionally, 12.5% of studies (n = 1)

measuring maintenance had scores indicating low quality evidence of positive maintained

effects (see Figure 2.4). Overall, these studies indicate that skills learned from BST can

be maintained up to and over a month after intervention.

In sum, findings indicate that seven studies had both strong measurement

characteristics and primary outcomes that can be interpreted with confidence. Eight

studies indicated quality generalization measurement and outcomes and

seven studies indicated quality maintenance measurement and outcomes.

Summative Quality

Although most studies that measured generalization and maintenance did so in a

manner that allowed for the interpretation of their results with confidence, only a few

studies demonstrated sufficient overall quality/rigor and outcomes in multiple areas. The

only study to demonstrate sufficient quality/rigor and outcomes across primary outcomes,

generalization, and maintenance was Nabeyama and Sturmey (2010). Therefore, this

study was the most rigorous and comprehensive in terms of SCARF ratings, with scores

indicating high levels of quality in all three areas assessed. Two studies demonstrated

high quality/rigor in the areas of primary outcomes and generalization (Fetherston &

Sturmey, 2014; Sarokoff & Sturmey, 2008). Of note, this was only demonstrated in Study

1 of Fetherston and Sturmey (2014). Maintenance was not assessed in either of these

studies, so quality and outcome indicators for that area could not be determined.

Similarly, two studies demonstrated high quality/rigor in the areas of primary outcomes

and maintenance (Davenport et al., 2019; Homlitas et al., 2014). However, generalization

was not assessed in either of these studies, so quality and outcome indicators for that area

could not be determined.

Discussion

The current review evaluated 19 studies from 17 articles examining the effects of

BST on intervention implementation by natural implementers (teachers and other

professional staff) with children, adolescents, and young adults (ages birth to 21). These studies were coded and analyzed

using the SCARF coding procedure (Ledford et al., 2016). All studies utilized a multiple

baseline or multiple probe design. The majority of interventionist-recipient relationships

were teacher-child, and the majority of interventions took place in the classroom or

school environment. All studies reviewed utilized all of the BST training components, but

a few incorporated a self-evaluation or self-report component, which helped improve

performance. The reviewed studies had several strengths in their reporting, including

dependent variable reliability data, social/ecological validity, description of dependent

variable characteristics (i.e., operational definitions), description of both primary

comparison conditions, and sufficiency of data. However, there were also several

weaknesses in their data reporting, including BST fidelity reporting and participant

descriptive characteristics.

Regarding SCARF scoring, fewer than half of the studies produced high enough

overall quality/rigor and outcome ratings for their primary outcome data to be interpreted

and to infer a functional relation with confidence. Similarly, generalization and

maintenance were assessed in fewer than half of all studies. However, the majority of these

studies received sufficient overall quality/rigor and outcomes scores to interpret

generalization/maintenance findings with confidence. Only one study produced sufficient

ratings in overall quality/rigor and outcomes across all three areas assessed: primary

outcomes, generalization, and maintenance. Additionally, only four studies produced

sufficient ratings across two outcome areas assessed.

This review demonstrated that although previous research has shown that BST

can be used to train individuals to implement a large variety of interventions, only seven

of the 19 included studies were found to be of sufficient overall rigor/quality and

outcomes to infer a functional relation with confidence. These included studies that

taught natural implementers to implement behavior intervention plans (BIP), reading

racetrack intervention, the picture exchange communication system (PECS), discrete trial

teaching (DTT), the natural language paradigm (NLP), and guarding procedures for

patients with ambulatory difficulties. The studies that implemented these intervention

techniques were found to be of higher quality among the studies included in this review.

However, that is not to say that all included studies that implemented these interventions

were rated as having high quality.

Implications

This review has shown that although BST has an expansive literature base, not all

BST studies have the same level of quality. Consumers of the BST literature should

therefore be discerning in their review of existing and future BST studies.

Similarly, this is important for future researchers to help them ensure that their research

meets the highest quality standards to increase confidence in research findings. In

addition, although generalization and maintenance were included in several studies, less

than half of the studies reviewed incorporated these measures. These are important

components of BST research and future research should consider incorporating these

elements into the research design.

The identified studies had several strengths in their reporting. However, some

weaknesses in their data reporting were also noted, including reporting BST fidelity and

participant descriptive characteristics. This indicates a need for future research that

incorporates these elements into study design and reports on them more thoroughly in the

body of the text. Without intervention fidelity data, it is difficult to determine whether

BST was implemented as planned and thus whether it is responsible for changes observed

in the data. Similarly, without appropriate participant descriptions, it will be difficult for

future replication to occur, limiting an important part of scientific inquiry. In addition,

this limitation makes it difficult to collect information about the characteristics of

individuals for whom intervention may be effective. These two limitations highlight the

need for future research in which BST fidelity is measured and sufficient participant

characteristics are provided.

Every study included in this review reported some form of positive effects of BST

on interventions implemented by natural implementers (i.e., teachers or staff). Therefore,

this review has shown that several studies with well-established overall rigor/quality and

outcomes demonstrated that teachers and/or staff members can be effectively trained to

implement various interventions using the BST package. This is important because

teachers spend a great deal of time with children and thus, if taught to use various

interventions that target children’s needs, have the potential to increase the dosage of

intervention substantially. Rather than seeing an interventionist once or twice a week for

an hour (or even daily for an hour), children could have the opportunity to receive

intervention daily for several hours a day while at school. This would likely lead to faster

gains and improvements in skills. However, it would also be important to make using the

intervention in the context of the classroom feasible for teachers. Therefore, future

research should continue to focus on incorporating intervention strategies into daily

classroom routines and structure.

This review has also identified particular interventions that are more likely to

work well with BST based on high quality SCARF coding. These include behavior

intervention plans (BIP), the reading racetrack intervention, the picture exchange

communication system (PECS), discrete trial teaching (DTT), the natural language

paradigm (NLP), and guarding procedures for patients with ambulatory difficulties. This

finding is important for two main reasons. One, it helps to strengthen the existing BST

literature by highlighting interventions that are likely to be successful due to high quality

research. Two, it encourages future research to expand on those intervention areas that

did not receive high SCARF overall quality/rigor and outcome scores, such as incidental

teaching, activity schedules, cognitive behavioral therapy (CBT), and parent-child

interaction therapy (PCIT) skills such as verbal behaviors. Of note, some of these interventions

alone may have a strong literature base, but this review focused only on those studies that

involved natural implementers (teachers and staff).

In addition to interventions, this review also helped identify some characteristics

of participants that have benefitted from this type of BST training. Teachers and staff

members ranging in age from 19 to 50 with a wide range of experience and backgrounds

were included in the studies. Thus, implementers between the ages of 19 and 50, whether

they have no prior experience or multiple years of experience, may be

likely to benefit from BST training. Similarly, intervention

recipients ranged in age from 2 to 12 years, and the majority had a diagnosis of ASD.

Thus, children who are between the ages of 2 and 12 and who have a diagnosis of ASD

may be likely to benefit from intervention from natural implementers who have had BST

training. However, some studies did not report participant demographic information and

thus these data are missing from these ranges.

Limitations

The current review has several limitations worthy of note. First, the review is not

comprehensive of all studies involving training natural implementers with BST. Parents,

peers, and other natural implementers were excluded from the study due to a focus on

teachers and other professional staff members. Although important contributions to the

BST research, studies involving parents and other natural implementers are beyond the

scope of this review. Future reviews should focus on these other natural implementers.

Second, the method of score calculation for SCARF scores is not always inclusive

of all study information. For example, depending on whether a coder answers “yes” or

“no” to certain questions, the remaining questions in that section must be coded “NA.”

However, these questions are not always mutually exclusive, and questions that could

otherwise be answered “yes” and earn more points must instead be coded “NA,” which

lowers the overall score for the study. Thus, a study’s score can decrease because of the

coding rules rather than because the study lacks a given element. Such limitations are

not unexpected, as no systematic analysis framework is

without its limitations.

Conclusion

The current review demonstrated that BST can be successfully used to train

teachers and other professionals to implement a variety of interventions. However,

certain studies were ranked as having more rigor/quality, and thus warranting more confidence in

their outcomes, according to SCARF. These seven studies involved training the following

intervention techniques: behavior intervention plans (BIP), the reading racetrack

intervention, the picture exchange communication system (PECS), discrete trial teaching

(DTT), the natural language paradigm (NLP), and guarding procedures for patients with

ambulatory difficulties. However, all studies, even those that received lower SCARF

scores, showed some form of improvement in intervention implementation after BST

training. The current review has shown that teachers and other professional staff can be

effectively taught various intervention techniques using BST and can then use those

intervention techniques with fidelity, increasing the potential intervention dosage for

recipients.

Table 2.1

Participant Demographics

Study | Implementer/Recipient Relationship | Implementer N (age range) | Implementer Gender (%M) | Recipient N (age range) | Recipient Gender (%M) | Recipient Diagnoses
Aherne (2019) | Staff/Client | 3 (22-29) | 33% | NR | NR | NR
Chazin (2018): Study 1 | Teacher/Student | 4 (23-30) | 0% | 1 (3) | 100% | Multiple disabilities including GDD
Davenport (2019) | Teacher/Student | 3 (NR) | 0% | 3 (NR) | NR | NR
Fetherston (2014) Study 1 | Teacher/Student | 4 (NR) | NR | 4 (3-12) | NR | Developmental Disability
Fetherston (2014) Study 2 | Teacher/Student | 4 (NR) | NR | 4 (5-10) | NR | Developmental Disability
Fetherston (2014) Study 3 | Teacher/Student | 3 (NR) | NR | 3 (10-12) | NR | ASD
Gianoumis (2012) | Teacher/Student | 3 (25-34) | 0% | 6 (3-4) | NR | ASD
Giles (2018) | Teacher/Student | 3 (26-33) | NR | 3 (6-12) | 100% | ASD - Motor Stereotypy
Hassan (2017) | Staff/Client | 7 (22-32) | 29% | 7 (NR) | NR | ASD
Hogan (2015) | Staff/Client | 4 (25-34) | 0% | 2 (NR) | NR | NR
Homlitas (2014) | Teacher/Student | 3 (NR) | NR | 9 (2-7) | NR | ASD
Jimenez-Gomez (2019) | Staff/Client | 5 (NR) | NR | 3 (2-4) | 100% | ASD
Jull (2016) | Swimming Instructor/Child | 6 (19-30) | 17% | 8 (5-8) | 88% | ASD
Nabeyama (2010) | Staff/Client | 3 (21-24) | NR | 3 (7-8) | 100% | Multiple physical disabilities
Nigro-Bruzzi (2010) | Staff/Client | 6 (NR) | NR | 6 (2-6) | NR | ASD
Palmen (2010) | Staff/Client | 4 (41-50) | 50% | NR | NR | ASD
Sarokoff (2004) | Teacher/Student | 3 (NR) | NR | 1 (3) | NR | ASD
Sarokoff (2008) | Teacher/Student | 3 (NR) | 0% | 5 (M = 5) | 100% | ASD
Seiverling (2010) | Teacher/Student | 3 (23-42) | NR | 3 (3-4) | NR | ASD

Note. Studies identified by last name of first author and year. NR = not reported; ASD = autism spectrum disorder; GDD = global developmental delay.

Table 2.2

Study Outcomes

Study | Study Design | Study Setting | BST Focus | Trained Intervention Outcomes | Brief Results
Aherne (2019) | MB-P | Home/Training Center | DTT | NR | BST was effective in teaching DTT; outcomes were not maintained for 2/3 participants; a self-evaluation procedure helped with maintenance
Chazin (2018): Study 1 | MP-P | School | BIP implementation | Unprompted functional communication | BST improved staff implementation of student BIPs when training was combined with coaching but not with training alone
Davenport (2019) | MP-P | School | Reading Racetrack intervention | Number of sight words read correctly | BST taught teachers to implement the reading racetrack intervention
Fetherston (2014) Study 1 | MP-P | School | DTT | Percentage of correct responses by learners | BST was effectively used to train DTT
Fetherston (2014) Study 2 | MP-P | School | Incidental teaching | Percentage of correct responses by learners | BST was effectively used to train incidental teaching
Fetherston (2014) Study 3 | MP-P | School | Activity Schedules | Percentage of correct responses by learners | BST was effectively used to train staff to implement activity schedules
Gianoumis (2012) | MB-P | School | NLP | Child vocalizations and maladaptive behavior | BST was effectively used to train teachers to implement NLP
Giles (2018) | MB-P | School | RIRD | Stereotypy | Teachers were able to implement RIRD after BST
Hassan (2017) | MP-P | Clinic/Training Center | CBT intervention | NR | Staff improved CBT intervention over self-study alone
Hogan (2015) | MB-P | School | BIPs | NR | BST improved implementation of student BIPs
Homlitas (2014) | MB-P | School | PECS | NR | BST was used to train teachers to implement Phases 1, 2, and 3A of PECS
Jimenez-Gomez (2019) | MP-P | Clinic | PCIT type verbal behaviors | NR | BST was used to train staff to implement PCIT like verbal behaviors
Jull (2016) | NC MB-P | Community Swimming Pool | Compliance and swimming skills | Swimming skills | 5/6 instructors showed improvement in implementing swimming behavior techniques with BST
Nabeyama (2010) | MB-P | School | Ambulation guarding procedures | Distance ambulated | BST combined with self-recording of performance was used to train staff to correctly implement guarding procedures for clients with ambulatory difficulties
Nigro-Bruzzi (2010) | MB-P | School | Mand training | Unprompted mands | BST was used to effectively teach staff to implement mand training
Palmen (2010) | MB-B | Treatment Center for Individuals with ASD | Error correction, positive reinforcement, and initiating opportunities | Response efficiency and correct target behavior | BST was used to improve implementation of all behaviors and significantly improved error correction
Sarokoff (2004) | MB-P | Home | DTT | NR | BST effectively trained teachers to accurately implement DTT
Sarokoff (2008) | MB-P | School | To use BST to train staff to implement DTT | Correctly identifying target sight words | BST was used to effectively train staff to implement DTT
Seiverling (2010) | MB-P | School | NLP and response chaining | Emission of vocal chains | BST and GCT were effectively used to train NLP and response chaining

Note. Studies identified by last name of first author and year. MB-P = multiple baseline across participants; MB-B = multiple baseline across behaviors; NC MB-P = nonconcurrent multiple baseline across participants; MP-P = multiple probe across participants; DTT = discrete trial teaching; IMRF = instructions, modeling, rehearsal, and feedback; BST = behavioral skills training; NLP = natural language paradigm; RIRD = response interruption/re-direction; CBT = cognitive behavioral therapy; BIP = behavior intervention plan; PECS = picture exchange communication system; PCIT = parent child interaction therapy; ASD = autism spectrum disorder; GCT = general-case training.

Table 2.3

Rigor Coding Questions from the Single Case Analysis Review and Framework (SCARF)

Dependent Variable Reliability
1. Do authors report dependent variable reliability data?
2. Do authors report collection of agreement data in both primary comparison conditions and for at least 20% of sessions overall?
3. Are dependent variable reliability data (e.g., IOA data) calculated on a point-point basis, and is agreement higher than 80% (or higher than 0.60 Kappa) in each primary comparison condition?
4. Was agreement data collected by observers who were blind to study conditions and/or purpose?

Independent Variable Reliability (Fidelity)
1. Do authors report any data related to fidelity of implementation?
2. Do authors report the use of self-report fidelity only?
3. Do authors report fidelity data suggesting fidelity of more than 80% or evidence of differentiation between conditions?
4. Do authors report collecting fidelity in both primary comparison conditions? (e.g., baseline and intervention). If the measurement context is separate from the treatment context, PF should be collected in both.
5. Do authors report fidelity data collection in at least 20% of sessions?
6. Do authors report fidelity data separately for each primary comparison condition? Note: 100% fidelity and explicit collection in both conditions meets this criteria.
7. Do authors (a) collect agreement data on fidelity assessments (e.g., two observers assess fidelity and compare their assessments to get a percentage of agreement) or (b) are data collected by observers blind to study condition or purpose?

Sufficiency of Data
1. Do at least three data points exist in each primary comparison condition?
2. Is the design a multiple baseline or multiple probe design?
3. Did data collection begin simultaneously during initial baseline or probe conditions?
4. Are more data points needed in any primary comparison condition due to (a) within-condition variability, (b) within-condition changes in level or trend, or (c) potential covariation between tiers in a multi-tier design?
5. Do at least four data points exist in each primary comparison condition, or in conditions with only three data points is one of the following true: all points at baseline or ceiling levels, data reached a criterion level, or no overlap with adjacent conditions is present?
6. Do at least five data points exist in each primary comparison condition, or in conditions with only three data points is one of the following true: all points at baseline or ceiling levels, data reached a criterion level, or no overlap with adjacent conditions is present?

Note. IOA = interobserver agreement.
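
The dependent variable reliability criteria above refer to point-by-point agreement above 80% or a kappa above 0.60. As a minimal sketch of how these two statistics are conventionally computed from two observers' interval-by-interval records, the following may help. The function names and the sample records are hypothetical and are not drawn from SCARF or from any reviewed study.

```python
from collections import Counter


def point_by_point_agreement(obs_a, obs_b):
    """Percentage of intervals on which both observers recorded the same code."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(obs_a, obs_b))
    return 100.0 * agreements / len(obs_a)


def cohens_kappa(obs_a, obs_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(obs_a)
    p_observed = point_by_point_agreement(obs_a, obs_b) / 100.0
    counts_a, counts_b = Counter(obs_a), Counter(obs_b)
    # Chance agreement: product of each observer's marginal proportions per code
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical records for 10 intervals: 1 = behavior scored, 0 = not scored
primary = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
secondary = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1]

print(point_by_point_agreement(primary, secondary))  # 90.0
print(round(cohens_kappa(primary, secondary), 2))    # 0.8
```

With these illustrative records, both statistics would clear the thresholds named in question 3 (agreement above 80%, kappa above 0.60).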

Table 2.4

Quality & Breadth of Measurement Coding Questions from the Single Case Analysis Review and Framework (SCARF)

Social and Ecological Validity
1. Do authors report feasibility or acceptability ratings via interviews, questionnaires, or surveys?
2. Do authors report psychometric data for the interviews, questionnaires, or surveys; or do they provide a citation to another source that shows acceptable psychometric data?
3. Do authors report one or more of the following: (1) blind raters of importance of results, acceptability or feasibility of procedures, or acceptability of dependent variables, (2) normative comparisons?
4. Do authors report the use of typical environments and/or report the use of indigenous implementers or social partners?

Participant Descriptions
1. Do authors report demographic information, including age and diagnosis or eligibility category, for all participants?
2. Do authors report formal test results (e.g., IQ, language competence, achievement)?
3. Do authors report general information about participants (e.g., educational placement, problem behaviors, functional repertoire of behaviors, areas of strength and weakness)?
4. Do authors report inclusion criteria or pre-intervention behaviors for all participants?

Condition Descriptions
1. Are procedures for both primary comparison conditions adequately described?
2. Is dosage adequately described?
3. Is setting described for both primary comparison conditions general (i.e., if relevant: location, individuals in environment, physical characteristics)?
4. Are implementers adequately described in terms of training and demographic characteristics? If indigenous implementers are used, "yes" on this question requires authors to report (a) how implementers were trained, and (b) evidence that the training was completed as described (e.g., implementation fidelity).

Dependent Variable Descriptions
1. Do authors describe observable characteristics of dependent variables (e.g., operational definitions)?
2. Do authors provide examples and/or non-examples of target behaviors?
3. Do authors adequately describe measurement system? (e.g., counts, duration, 5-s partial interval system, 15-s momentary time sampling)
4. Do authors describe how system was used? (e.g., Were data collected by implementers or another individual? Were data collected in-vivo, via audio or video?)

Generalization Measurement 1
1. Do authors report assessment of a target behavior performed in a context that is different than training/primary outcome measurement?
2. Do authors report assessment of a target behavior performed with materials that are separate from those used in training/primary measurement context?
3. Do authors report assessment of a target behavior performed with a different social partner than those used in training/primary measurement context?
4. How do authors measure generalization across materials, social partners, or settings?

Generalization Measures 2
1. Do authors measure a behavior that is a generalized tendency, in addition to the primary outcome of interest?
2. Do authors teach one specific behavior or type of behavior, but measure a different specific (not generalized) behavior as a measure of generalization (response generalization)?
3. How do authors measure generalized behavior? (e.g., either by measuring a generalized tendency or generalization of an explicitly taught behavior).

Maintenance Measurement
1. Do authors report evidence of continued behavior change, during post-intervention sessions?
2. Is this maintenance measured on more than one occasion?
3. When is maintenance measured?

Table 2.5

SCARF Quality, Rigor, and Outcome Scores

Study | Rigor | Quality | Overall Quality/Rigor | Primary Outcome | Generalization Quality/Rigor | Generalization Outcome | Maintenance Quality/Rigor | Maintenance Outcome
Aherne (2019) | 1.0 | 1.6 | 1.2 | 5.0 | 0.0 | 1.0 | 3.0 | 5.0
Chazin (2018): Study 1 | 3.0 | 2.0 | 2.7 | 4.0 | 0.0 | 1.0 | 1.0 | 5.0
Davenport (2019) | 3.3 | 1.9 | 2.8 | 5.0 | 0.0 | 1.0 | 3.0 | 4.0
Fetherston (2014) Study 1 | 2.7 | 1.6 | 2.3 | 5.0 | 4.0 | 5.0 | 0.0 | 1.0
Fetherston (2014) Study 2 | 1.3 | 1.6 | 1.4 | 5.0 | 4.0 | 5.0 | 0.0 | 1.0
Fetherston (2014) Study 3 | 1.3 | 1.6 | 1.4 | 4.2 | 4.0 | 4.2 | 0.0 | 0.2
Gianoumis (2012) | 1.3 | 2.1 | 1.6 | 3.8 | 2.0 | 4.8 | 0.0 | 0.8
Giles (2018) | 2.0 | 1.7 | 1.9 | 3.0 | 0.0 | 1.0 | 0.0 | 1.0
Hassan (2017) | 1.7 | 1.9 | 1.7 | 2.0 | 0.0 | 1.0 | 4.0 | 5.0
Hogan (2015) | 1.0 | 1.0 | 1.0 | 3.0 | 0.0 | 1.0 | 0.0 | 1.0
Homlitas (2014) | 2.7 | 1.9 | 2.4 | 5.0 | 0.0 | 1.0 | 4.0 | 5.0
Jimenez-Gomez (2019) | 0.7 | 1.7 | 1.0 | 4.6 | 1.0 | 3.6 | 3.0 | 3.6
Jull (2016) | 0.7 | 1.9 | 1.1 | 4.0 | 0.0 | 1.0 | 0.0 | 1.0
Nabeyama (2010) | 2.3 | 2.6 | 2.4 | 5.0 | 4.0 | 5.0 | 4.0 | 5.0
Nigro-Bruzzi (2010) | 2.0 | 1.7 | 1.9 | 2.5 | 3.0 | 4.5 | 0.0 | 0.5
Palmen (2010) | 0.7 | 2.6 | 1.3 | 2.0 | 3.0 | 3.0 | 3.0 | 5.0
Sarokoff (2004) | 2.3 | 1.0 | 1.9 | 5.0 | 0.0 | 1.0 | 0.0 | 1.0
Sarokoff (2008) | 2.3 | 2.4 | 2.4 | 4.7 | 4.0 | 4.7 | 0.0 | 0.7
Seiverling (2010) | 3.0 | 1.9 | 2.6 | 4.9 | 0.0 | 0.9 | 0.0 | 0.9

Note. Studies identified by last name of first author and year. SCARF = Single Case Analysis Review and Framework.

Figure 2.1.

Preferred Reporting for Systematic Reviews and Meta Analyses (PRISMA) Flow Diagram

Identification: Records identified through database searching (n = 172); additional records identified through other sources (n = 21).
Screening: Records after duplicates removed (n = 146); records screened (n = 146); records excluded (n = 124).
Eligibility: Full-text articles assessed for eligibility (n = 22); full-text articles excluded, with reasons (n = 5).
Included: Studies included in qualitative synthesis (n = 19 in 17 articles).

Note. This figure shows the number of studies that were considered for eligibility in the

review and how they were reviewed and analyzed to obtain the final number of studies.
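
The counts reported in the flow diagram can be checked for internal consistency. The short sketch below reproduces the arithmetic from records identified to articles included, using only the numbers given in the figure. The variable names are illustrative, and the number of duplicates removed is derived here rather than reported in the source.

```python
# Counts as reported in the PRISMA flow diagram (Figure 2.1)
database_records = 172
other_source_records = 21
after_duplicates_removed = 146
records_excluded = 124
full_text_assessed = 22
full_text_excluded = 5
articles_included = 17  # reported as 19 studies in 17 articles

# Derived values (not stated in the figure)
total_identified = database_records + other_source_records
duplicates_removed = total_identified - after_duplicates_removed


def flow_is_consistent():
    """Check that each stage's count follows from the previous stage."""
    screening_ok = after_duplicates_removed - records_excluded == full_text_assessed
    eligibility_ok = full_text_assessed - full_text_excluded == articles_included
    return screening_ok and eligibility_ok


print(total_identified, duplicates_removed, flow_is_consistent())  # 193 47 True
```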

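Figure 2.2

SCARF Quality and Rigor of Primary Outcomes

[Scatterplot not reproduced. Horizontal axis: Overall Study Quality & Rigor. Vertical axis: Primary Outcomes. The plot area is divided into four numbered quadrants, with quadrants 1 and 2 in the upper half and quadrants 3 and 4 in the lower half.]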

Note. SCARF = Single Case Analysis Review and Framework. Filled-in circle data

points represent individual studies. The graph is interpreted as follows: Data points that

fall in quadrant one indicate low quality evidence of positive effects. Data points that fall

in quadrant two indicate high quality evidence of positive effects. Data points that fall in

quadrant three indicate low quality evidence of negative or minimal effects. Finally, data

points that fall in quadrant four indicate high quality evidence of negative or minimal

effects. In sum, the highest quality studies with the best outcomes fall in quadrant two.
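
The quadrant reading described in this note can be expressed as a simple rule. The sketch below is illustrative only: the function name is hypothetical, and the cutoffs that separate low from high quality and minimal from positive effects are left as required parameters because the note does not state where the quadrant boundaries fall.

```python
def scarf_quadrant(quality_rigor, outcome, quality_cutoff, outcome_cutoff):
    """Return the quadrant (1-4) for one study, following the figure note.

    The cutoffs are caller-supplied assumptions. Scores at or above a cutoff
    are treated as "high" (quality) or "positive" (outcome).
    """
    high_quality = quality_rigor >= quality_cutoff
    positive = outcome >= outcome_cutoff
    if positive:
        return 2 if high_quality else 1
    return 4 if high_quality else 3


QUADRANT_LABELS = {
    1: "low quality evidence of positive effects",
    2: "high quality evidence of positive effects",
    3: "low quality evidence of negative or minimal effects",
    4: "high quality evidence of negative or minimal effects",
}

# Hypothetical scores and a hypothetical cutoff of 2.0 on both axes
print(QUADRANT_LABELS[scarf_quadrant(3.0, 3.5, 2.0, 2.0)])
```

Keeping the cutoffs as explicit arguments makes the assumption visible wherever the function is called, so a different boundary can be substituted without changing the rule itself.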

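Figure 2.3

SCARF Quality and Rigor of Generalized Outcomes

[Scatterplot not reproduced. Horizontal axis: Quality & Rigor of Generalization Measurement. Vertical axis: Generalized Outcomes. The plot area is divided into four numbered quadrants, with quadrants 1 and 2 in the upper half and quadrants 3 and 4 in the lower half.]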

Note. SCARF = Single Case Analysis Review and Framework. Only studies that

measured generalization are included. Filled-in circle data points represent individual

studies. The graph is interpreted as follows. Data points that fall in quadrant one indicate

low quality evidence of positive effects. Data points that fall in quadrant two indicate

high quality evidence of positive effects. Data points that fall in quadrant three indicate

low quality evidence of negative or minimal effects. Finally, data points that fall in

quadrant four indicate high quality evidence of negative or minimal effects. In sum, the

highest quality studies with the best outcomes fall in quadrant two.

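Figure 2.4

SCARF Quality and Rigor of Maintained Outcomes

[Scatterplot not reproduced. Horizontal axis: Quality & Rigor of Maintenance Measurement. Vertical axis: Maintained Outcomes. The plot area is divided into four numbered quadrants, with quadrants 1 and 2 in the upper half and quadrants 3 and 4 in the lower half.]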

Note. SCARF = Single Case Analysis Review and Framework. Only studies that

measured maintenance are included. Filled-in circle data points represent individual

studies. The graph is interpreted as follows. Data points that fall in quadrant one indicate

low quality evidence of positive effects. Data points that fall in quadrant two indicate

high quality evidence of positive effects. Data points that fall in quadrant three indicate

low quality evidence of negative or minimal effects. Finally, data points that fall in

quadrant four indicate high quality evidence of negative or minimal effects. In sum, the

highest quality studies with the best outcomes fall in quadrant two.

CHAPTER 3

STUDY 2: TEACHING THE TEACHER: USING BEHAVIORAL SKILLS

TRAINING TO TRAIN TEACHERS TO IMPLEMENT MILIEU TEACHING

TECHNIQUES¹

¹ Slane, M. M., & Lieberman-Betz, R. To be submitted to Behavioral Interventions.
Abstract

Behavioral skills training (BST) is a common set of four core principles that are used to

train others to implement various skills, behaviors, and interventions; it is well-

researched and has a strong literature base (DiGennaro et al., 2018). However, BST has

only begun to be examined in the literature on Naturalistic Developmental Behavioral

Interventions (NDBIs). NDBIs are a class of interventions that combine principles of

applied behavior analysis and the developmental sciences to improve skills in children

with disabilities, from more basic developmental skills such as joint attention and eye

contact to more complex skills such as language and social interaction (Schreibman et

al., 2015). One NDBI that has led to improvements in children’s early language

development is Milieu Teaching (MT; e.g., Bolzani et al., 2009; Warren & Gazdag,

1990). The present study sought to investigate the utility of BST in the training of MT

techniques. Two teachers were trained to implement three MT techniques with children

who were minimally verbal: following the child’s lead, teaching social routines, and the

system of least prompts (Fey, 2008). A concurrent multiple baseline across behaviors

replicated across teacher participants was used. One teacher showed an increase in

fidelity of implementation of all three techniques when BST was introduced, effectively

implementing each of the MT components. The second teacher showed an increase in

fidelity of implementation for the first two techniques with limited evidence of a

functional relation for the third technique. These improvements in skill also generalized

to a new set of toys and materials and maintained over time. Thus, the present study

showed that BST can be effectively used to train natural implementers to carry out MT

techniques with fidelity.

INDEX WORDS: milieu teaching, behavioral skills training, naturalistic interventions,

teachers

Teaching the Teacher: Using Behavioral Skills Training to Train Teachers to

Implement Milieu Teaching Techniques

Language impairments in children are fairly common with community prevalence

estimates ranging from 7% to 17% (King et al., 2005). In addition, language impairments

are also frequently co-morbid with neurodevelopmental disabilities such as autism

spectrum disorder (ASD; Rosenbaum & Simon, 2016). These delays have been

associated with lower performance in cognitive, academic, and language domains in later

development (Johnson et al., 1999). Thus, the importance of early intervention for

children demonstrating language delays cannot be overstated. However, most traditional

methods of intervention delivery, for example, pull-out sessions with speech/language

pathologists (SLPs) in schools, only allow for sessions to occur once or twice a week for

about an hour, which may not be enough to allow for children with language delays to

close the gap and catch up to their typically developing peers. This has led to a surge in

research involving natural implementers of intervention, such as teachers. Peterson

(2004) points out that teachers spend a great deal more time with children than traditional

interventionists. Therefore, if teachers were taught to implement various language

interventions, the dosage for these interventions has the potential to increase dramatically

with sessions happening multiple times a week, for multiple hours a day. However, in

order for this to occur, teachers must be properly trained in the various intervention

techniques so that they are able to implement these techniques with fidelity.

Behavioral Skills Training

Behavioral skills training (BST) is a comprehensive training package designed to

teach new skills and techniques to a variety of individuals (Kornacki et al., 2013;

Ward-Horner & Sturmey, 2012). BST comprises four primary components: (1) instruction,

(2) modeling, (3) rehearsal, and (4) feedback. During the first step, learners are given

instructions regarding how to perform the desired skill or behavior. Next, the therapist or

researcher models accurate completion of the target behavior. Then, the trainee and the

trainer practice with one another via role play (rehearsal). Finally, the trainer watches the

trainee implement the learned skill in the target environment and provides corrective

feedback (Krumhus & Malott, 1980; Nuernberger et al., 2013). Although all components

of BST have been established as important for proper implementation, modeling and

feedback have been identified as the critical components that increase the fidelity of

implementation of new skills (Krumhus & Malott, 1980; Ward-Horner & Sturmey,

2012).

BST has been used across a variety of settings and with a wide variety of

individuals. Studies have included using BST to increase on-task behavior for high-

functioning young adults with autism spectrum disorder (ASD; Palmen & Didden, 2012),

to improve undergraduate students’ ability to correctly implement functional analyses

(Iwata et al., 2000), and to train staff to implement Phases 1-3 of the Picture Exchange

Communication System (PECS; Homlitas et al., 2014). BST has also been used to teach

parents and caregivers of children with ASD to implement social skills training (Dogan et

al., 2017; Hassan et al., 2018), improve staff implementation of mand training and

subsequent unprompted mands in children (Nigro-Bruzzi & Sturmey, 2010), train

community staff and teachers to implement discrete trial training (Jull & Mirenda, 2016;

Sarokoff & Sturmey, 2004), train teachers to implement specific behavioral intervention

plan goals (Madzharova & Sturmey, 2018), and train teachers to implement response

interruption and redirection (Giles et al., 2018). Thus, BST is a well-established training

package that has demonstrated efficacy in a variety of contexts. However, researchers

have only recently begun to examine the use of BST to train professionals to implement

more naturalistic developmental behavioral interventions (NDBIs) with young children

with social-communication delays and disorders.

Naturalistic Developmental Behavioral Interventions

NDBIs are a class of interventions that combine principles of applied behavior

analysis and the developmental sciences (Schreibman et al., 2015). These

interventions are diverse in form and target multiple developmental domains, including

social, language, play, motor, and cognition (Schreibman et al., 2015). They also

typically target young children with social-communication disorders, such as those with

ASD (Schreibman et al., 2015). Schreibman et al. (2015) identified several common

features of NDBIs, including (1) three-part contingency; (2) manualized practice; (3)

individualized treatment goals; (4) ongoing measurement of progress; (5) child-initiated

teaching episodes; (6) environmental arrangement; (7) use of prompting and prompt

fading; (8) modeling; (9) adult imitation of the child’s language, play, or body

movements; and (10) broadening the attentional focus of the child. In a review of NDBIs

targeting pre-linguistic communication skills, Dubin and Lieberman-Betz (2020)

identified seven components that were common across the majority of studies reviewed:

(1) following the child’s lead, (2) prompting, (3) natural consequences (i.e., outcomes

that logically result from a behavior, such as providing a child with a desired object

immediately after he/she requests it), (4) instruction embedded in routines, (5)

environmental arrangement, (6) time delay, and (7) linguistic mapping. Thus, while there

are a multitude of NDBIs, there is also a set of common core features of NDBIs that

target early developing communication behaviors.

Teachers have been trained to implement a variety of NDBIs using various

training techniques, such as verbal explanation, modeling, and coaching; didactic training

sessions; and presentations. However, the degree to which these training techniques are

used and the exact nature of their use are not always clear. Previous studies have focused

on training teachers to implement NDBIs such as prelinguistic milieu teaching (PMT;

McCathren, 2000), manualized interventions targeting joint attention (Kaale et al., 2012),

enhanced milieu teaching (Olive et al., 2007), naturalistic language teaching (Smith &

Camarata, 1999), and symbolic play and joint attention (Wong, 2013). Despite the

growing number of NDBIs that have been implemented in the classroom by trained

teachers, there is a continued need to examine well-defined procedures that can be used

to train teachers to implement interventions successfully in their classrooms. BST is a

promising approach because it is a well-researched, routinized training procedure that

clearly lays out the process for training individuals, creating a more uniform standard for

training.

In an early study integrating BST and NDBIs, Seiverling et al. (2010) used a

multiple baseline across participants design to examine the use of BST to train preschool

teachers to use the Natural Language Paradigm (NLP) in their classrooms. After BST,

teachers successfully implemented NLP and response chaining with children with ASD

between 40 and 49 months of age. In a follow-up study, Gianoumis et al. (2012)

replicated these results, using a multiple baseline across participants design to show that

BST could be used to train preschool teachers to effectively use NLP with 3- to 4-year-old

children diagnosed with ASD. In yet another study designed to carve out a role for BST

in training adults to use naturalistic behaviorally based interventions, Toelken and

Miltenberger (2012) utilized a multiple-baseline across behaviors design to examine the

use of BST to train paraprofessionals to implement the system of least prompts using an

embedded teaching procedure with two children with ASD (ages 4 and 5 years old). The

use of a brief, embedded teaching procedure allowed the staff to implement the behaviors

without interfering with their regular duties or causing an undue burden. Because the

evidence for the use of BST to train teachers, caregivers, and other professionals to

implement naturalistic behaviorally based interventions is only emerging, it is important

to continue to examine its use to train teachers to use well-established intervention

procedures, such as milieu teaching.

Milieu Teaching

Milieu teaching (MT) is one of three component interventions that comprise

milieu communication teaching (MCT; Fey, 2008). MCT is an intervention package

composed of three main intervention strategies: (1) prelinguistic milieu teaching, (2)

milieu teaching, and (3) focused stimulation. The most appropriate technique is chosen

based on a child’s current level of early social communication development. In addition

to these three intervention strategies, MCT utilizes several core techniques throughout all

three intervention strategies. These include following the child’s lead (FTCL), teaching

social routines (TSR), and setting up the environment (Fey, 2008). The MT intervention

technique is selected once a child is producing a minimum of 5 words, with the outcome

goal of furthering the development of the child’s language (Fey, 2008). In MT, FTCL,

TSR, and setting up the environment are combined with time delay and a mand-model

prompting hierarchy and several other techniques to develop and elicit communication

from the child. In the present study, the focus was on three core techniques believed to be

most amenable to the classroom environment and that fit participants’ current language

levels: FTCL, TSR, and the system of least prompts (SLP).

Following the Child’s Lead

FTCL is a technique that involves allowing the child to direct and lead the

interactions while the adult follows along with the child (Fey, 2008). In order to maintain

the children’s interest during intervention sessions, interventionists are taught to follow

children’s attentional leads and to focus on objects and routines of interest to the children

(Fey et al., 2017; Fey, 2008). Given that young children, especially those with social

communication delays, tend to attend to and focus more on objects they find interesting,

FTCL involves allowing the child to direct the interaction and to select the play materials

(Fey et al., 2017). This practice helps to ensure that children remain interested and

engaged during interactions with adults. Adults often try to direct children’s play by

making suggestions, asking questions, or issuing commands. However, if communicative

acts are to be encouraged, then the adult must learn to engage in play without utilizing

these behaviors and dominating the interaction. In essence, the adult must learn to be the

respondent rather than the director.

Teaching Social Routines

Social routines involve a collection of events that repeatedly occur in the same

pattern; they are created when a particular manner of playing or interacting occurs in the

same sequence repeatedly (Fey, 2008). Using identified routines, adults learn to create

opportunities for shared interaction. Rather than simply imitating the child’s play and

engaging in parallel play as in FTCL, adults serve as active participants by inserting turns

into the interaction. The goal is to create a back and forth routine in which the child and

adult take turns initiating and responding to one another. During the adult’s turn, he/she

can then pause, creating an opportunity for the child to communicate. If the adult does

not complete his/her turn, the child may look up, or engage in other forms of

communication to continue the routine. For example, when engaging in FTCL, the adult

and child may engage in parallel play, each running separate cars down separate tracks.

In contrast, when building a social routine, the interventionist and child might use the

same car and track, requiring the child to take turns sending the car down the track. Thus,

the adult has increased the likelihood that the child will maintain interest in the

interaction by allowing him/her to select the toys and routine and has also created

opportunities for interaction by establishing a turn-based routine.

System of Least Prompts

The combined use of FTCL and TSR allows the adult to create an opening for a

learner to emit new words through the use of the system of least prompts (SLP), a third

component of MT. The SLP involves both time delay procedures and mand-model

procedures (Fey, 2008), which are followed in succession. First, target words are selected

based on the routines that an adult and a child have established, and the child’s current

language level (Fey, 2008). For example, if a child is using primarily single words, words

such as “car”, “track”, “up”, or “down” may be selected as the target during an activity

where an adult and child are taking turns running a car down a track. After following the

child’s interest in the car and track and establishing the social routine of taking turns with

the car, the adult occasions a response from the child by not giving the child the car

during his/her turn. The adult will then initiate the prompting hierarchy with time delay,

followed by a linguistic prompt, and then a linguistic mand-model prompt (Fey, 2008).

During SLP, the adult looks at the child expectantly when first withholding the object

(time delay), then asks the child for the correct response to be emitted (linguistic prompt)

and, if no response is given, provides a model while asking for the child to emit the

correct response (mand-model).

Milieu Teaching in the Literature

MT has a strong research base supporting its efficacy in teaching children with

language delays new language targets (e.g., Bolzani et al., 2009; Warren & Gazdag,

1990). In a study examining the effects of MT on communication development, Warren

and Gazdag (1990) found that MT, specifically mand-model and incidental teaching

techniques, successfully improved communication in children with mild intellectual

disability. In another study examining the effects of MT on communication development

in children with prenatal cocaine exposure, Bolzani et al. (2009) found that participants

benefitted from MT, improving their spontaneous production of single and multiple

words. Similarly, Togram and Erbas (2010) found the mand-model component of MT to

be effective in increasing language in children with developmental disabilities; these

effects were maintained 16 weeks after the conclusion of the study. Using four types of

MT language prompts, including models, commands, questions, and time delay, Ingersoll

et al. (2012) found that both MT and a combined condition (MT plus responsive

interaction) were superior to responsive interaction in increasing language targets in

children. Further demonstrating the effectiveness of MT, Christensen et al. (2013) found

that MT components (modeling, mand-model, time delay, and incidental teaching) were

effective for increasing language targets in preschool-aged children with ASD in an early

childhood special education classroom. In addition, MT has been used to teach and

promote a photo exchange system for a child with ASD (Ogletree et al., 2012). These

studies suggest that MT can be used to improve language abilities in children and that the

use of target components rather than the intervention as a whole can still be beneficial for

improving children’s language and early social communication.

In addition to having trained implementers utilize MT to increase language gains

in young children with developmental disabilities, several studies have also utilized

natural implementers and demonstrated similar positive language outcomes. Kaiser et al.

(1993) successfully taught teachers to implement environmental arrangement techniques

as well as MT in a classroom with nonvocal preschool children, leading to increases in

child communication. Similarly, Kaiser et al. (1995) were able to successfully train

parents to implement MT techniques with their children, demonstrating gains in child

communication abilities. In a more recent study, Aktas and Ciftcitekinarslan (2018)

reported a successful parent training program, where parents were trained to implement

the mand-model procedure of MT, resulting in language gains for their children

diagnosed with ASD.

These studies demonstrate that the mand-model procedure of MT can be used

effectively by parents and teachers if they are trained properly (i.e., with coaching and

feedback). Thus, it would be logical to conclude that a training package such as BST

would effectively train teachers in the classroom to implement other MT techniques (i.e.,

FTCL and TSR), in addition to SLP (including mand-model procedures). Although these

previous studies described using procedures similar to BST (e.g., coaching and

feedback), they do not explicitly state that BST was used to train parents or teachers in

learning to implement MT techniques. Even though there is a plethora of research

establishing the efficacy of BST and MT individually, there is a lack of research

examining whether BST can be used to effectively train teachers to implement MT

techniques in the classroom with fidelity.

The present study aimed to fill this gap in the literature by examining the efficacy

of BST to train teachers at a school for children with disabilities to implement several

core MT techniques, including: (1) FTCL, (2) TSR, and (3) SLP. Specifically, this study

sought to address the following questions:

1. Is BST effective in training teachers to implement MT techniques in the

classroom?

2. Will improvements in teachers’ abilities to implement MT techniques

generalize to novel materials and activities?

3. Will improvements in teachers’ abilities to implement MT techniques be

maintained over time?

Method

Participants

Two teacher-child dyads participated in this study. Teachers at a school for

children with disabilities in the southeastern United States were recruited to participate in

the current study. To be eligible for the study, teachers had to (a) be willing to

participate in the study and (b) have at least one eligible child participant enrolled in their

classroom. Teacher participants were 20 and 47 years of age, with 1 week and 15 years of

teaching experience, respectively. For a complete description of teacher participant

demographics, see Table 3.1. One child per enrolled teacher was recruited to participate

in this study. To be eligible for participation, children were required to (a) be enrolled in

the classroom of an eligible teacher at a school for children with disabilities, (b) be

between 2 and 9 years of age, and (c) produce fewer than 5 referential words. Potential

participants were nominated by teachers and staff, and eligibility was confirmed via a

parent report measure and a brief, 20 min observation conducted by the first author (see

Appendix B for observation data sheet). During the observation, children’s language was

measured and documented and used to determine eligibility. It was initially thought

prelinguistic targets would be appropriate for intervention. However, once enrolled in the

study, it became clear that language targets were most appropriate based on the students’

communication skills and language levels. Therefore, MT was selected as the most

appropriate intervention for both child participants. For a complete description of child

participant demographics, see Table 3.2.

Dyad 1: Ms. Smith and Sarah

Ms. Smith. Ms. Smith had her associate’s degree and was certified as a

paraprofessional and a tutor, and served as the lead teacher in her classroom. She reported

experience working with children with a variety of disabilities, including ASD,

intellectual disability, Down syndrome, emotional disturbance, specific learning

disabilities (SLD), other health impairment (OHI), speech and/or language impairment,

visual and hearing impairments, traumatic brain injury, attention deficit hyperactivity

disorder (ADHD), and social communication deficits. She reported teaching in both

special and general education classrooms as well as gifted classrooms. Ms. Smith had worked

at the school for a total of four years. She had nine students in her classroom, all of whom

had known diagnoses, including intellectual disability, speech and/or language

impairments, and several others. All nine children had an individualized Growth and

Performance plan (GPP), which was the school’s equivalent of an individualized

education program (IEP). She reported no experience with naturalistic social

communication interventions, but she had worked as a registered behavior technician

where she assisted lead teachers and board-certified behavior analysts (BCBAs) in

classroom and therapeutic settings. Ms. Smith reported minimal experience with

professional training, having participated in a previous study several years ago where she

was trained to perform various behavioral techniques, such as discrete trial teaching. Ms.

Smith reported using several strategies to support communication in the classroom,

including positive reinforcement, encouragement, motivation, differentiation, and peer

modeling.

Sarah. Sarah was 8 years, 0 months old and had diagnoses of Phelan-McDermid

syndrome, developmental delay, and sensory dysfunction. Her full-scale IQ according to

the Abbreviated Battery of the Stanford-Binet Intelligence Scales, Fifth Edition (SB-5)

was 47. Her nonverbal IQ was below 42 and her verbal IQ was below 43. Previously,

Sarah received interventions including speech and language therapy, occupational

therapy, and applied behavior analysis. She had been at the school for four years and had

a GPP with social communication goals. As such, at the start of the study, Sarah was

receiving speech/language therapy and occupational therapy. According to parent report,

she was not using any words functionally or meaningfully. However, based on examiner

observation, she produced and used approximately 5 words meaningfully and

functionally.

Dyad 2: Mr. Parker and Josh

Mr. Parker. Mr. Parker was brand new to both teaching and working at the

school and did not have any previous experience working with children. He had worked at the

school for a total of one week prior to enrolling in the study and worked as a

paraprofessional in the same classroom as Ms. Smith. He reported no experience with

naturalistic social communication interventions and no experience with professional

training. Mr. Parker did not report using any strategies to support communication in the

classroom at the beginning of the study.

Josh. Josh was 8 years, 7 months old and had a diagnosis of ASD. His full-scale

IQ according to the Abbreviated battery of the SB-5 was 47. His nonverbal IQ was below

42 and his verbal IQ was below 43. Josh was not receiving any interventions at the time

of the study. This was his first year attending the school and he had a GPP with social

communication goals, focusing on communication and language. According to parent

report, he was able to produce 166 words. However, based on examiner observation, he

produced and used approximately 3 words meaningfully and functionally. Much of his

language consisted of echolalia, and teachers reported that they had never heard him

use more than a few words functionally and meaningfully.

Materials

Training Materials and Data Collection

A PowerPoint presentation containing the targeted information and video

examples was developed for each intervention strategy. During training sessions, the

PowerPoint presentation and videos were displayed on the researcher’s laptop computer

or on a larger projector screen when available. Data collection sheets were developed and

used to collect data on fidelity of BST for all teacher training sessions (see Appendix C).

All observation sessions were video recorded to allow for primary and reliability

coding of teacher behavior. Data collection sheets were developed to record teacher

strategy use across baseline and intervention sessions (see Appendix D). At the beginning

of SLP data collection, teachers used their own watches or an iPad displaying

the time to remind them to prompt for language targets every 2 min. An interval timer

that the teachers could keep on their person and that vibrated every 2 min was introduced

partway through SLP intervention to make it easier and less cumbersome for teachers to

determine when the 2 min interval had expired. It was introduced during Session 31 for

Ms. Smith and Session 32 for Mr. Parker during SLP intervention.

Baseline, Intervention, and Maintenance Toys

Toys were selected by Ms. Smith and Mr. Parker based on their knowledge of

Sarah’s and Josh’s interests. Toys used during baseline and intervention included the

following for both Sarah and Josh: magnets, books, puzzles, counting blocks, monkey

string with shape cards (a sticky string-like substance that is pliable and can adhere to

surfaces), string blocks, color sorting bears, coloring, small balls, connecting people

(small people who created circles and chains by holding hands), and Colorino (a game

where you match colorful markers to the picture to make a 3D version of the picture). A

giant beachball also became available to Josh toward the end of baseline for FTCL and

the beginning of FTCL intervention, and a bike became available to Sarah during FTCL

intervention.

Generalization Toys

Generalization toys and activities for Sarah included a trampoline, a racetrack with cars, train

tracks, a vacuum cleaner, taking a walk, and a medium-sized ball. Toys for Josh included

a trampoline, a tricycle, a medium-sized ball, and pipe cleaners.

Setting

All baseline and intervention sessions were conducted in a 1:1 format in

classrooms and other available spaces (i.e., library, hallway, outdoor recreational area) at

a school for children with disabilities in the southeastern United States. Classrooms were

divided into several areas, including a reading area, an arts and crafts area, a play area,

and a block area. Baseline and intervention data collection was conducted during

unstructured free play in the play area of the classrooms and other available school

spaces. Training sessions were conducted 1:1 with the teachers in classrooms and/or

conference rooms at the school. No children were present during initial BST sessions.

Formal Measures

Demographic Questionnaire

Both teachers and parents of eligible children were asked to complete a brief

researcher-developed demographic questionnaire to gather background information (see

Appendices E & F).

Stanford-Binet Intelligence Scales, Fifth Edition (SB-5; Roid, 2003)

The SB-5 is an individually administered assessment of intellectual functioning. It

is appropriate for individuals ranging in age from 2 to 85+ years. Each child was

administered the SB-5 at the beginning of the study to determine his/her level of

intellectual functioning. The SB-5 demonstrates good internal consistency with values

ranging from .84 to .98, as well as good test-retest reliability, with values ranging from

.74 to .97 (Janzen et al., 2004). Similarly, the SB-5 demonstrates good concurrent validity

with a variety of tests, including the Woodcock Johnson-III Tests of Cognitive Abilities

(Woodcock et al., 2001a), the SB-IV (Thorndike et al., 1986), and many others with

values ranging from .78 to .90 (Janzen et al., 2004). The SB-5 also demonstrated good

predictive validity with the Woodcock Johnson-III Tests of Achievement (Woodcock et

al., 2001b) and the Wechsler Individual Achievement Test-II (Wechsler, 2005) in

addition to good construct validity (Janzen et al., 2004).

The MacArthur-Bates Communicative Development Inventories Words and Gestures

Form (CDI; Fenson et al., 2007)

The CDI is a parent report measure that assesses a child’s early developing

language, including use and understanding of words and phrases as well as gestures.

Parents were asked to complete the Words and Gestures form of the CDI. This form is

normed for use with children ages 8 months to 30 months; however, it can be used for

children who are older than 30 months if their communication and development are

delayed. The CDI demonstrates good internal consistency (ranging from .62 to .76), and

test-retest reliability (ranging from .59 to .99; Law & Roy, 2008). In addition, it has

shown good convergent validity: .52 with the Preschool Language Scale-Revised

(Zimmerman et al., 1979), .67 with the Peabody Picture Vocabulary Test, Third Edition

(Dunn & Dunn, 1997), and .82 with the Reynell Developmental Language Scales

(Fenson et al., 2007; Reynell & Gruber, 1990).

Social Validity

At the conclusion of the training and intervention process, teachers were asked to

complete an acceptability rating scale to help determine whether they found the

intervention acceptable, helpful, and practical (adapted from Hendrickson et al., 1993).

This measure was used to help gather information regarding teacher acceptability of the

training process, perceived effectiveness of the intervention techniques, and overall

feasibility of the techniques for use in the classroom (see Appendix G).

Response Definitions and Measurement Procedures

Teachers were taught three early social communication intervention strategies

from the Milieu Teaching (MT) intervention that were implemented in the classroom

during 20 min, pull-out sessions in which the teacher and child worked together

separately from the larger classroom. Three core techniques were targeted: (1) following

the child’s lead (FTCL), (2) teaching social routines (TSR), and (3) the system of least

prompts (SLP). All teachers were taught the intervention techniques in the same order,

beginning with the most foundational skill and progressing toward more complex,

response interactive strategies (Hemmeter & Kaiser, 1994). All intervention techniques

were based on those described by Fey (2008) and Fey et al. (2017).

Following the child’s lead

FTCL was coded if the teacher was engaged in any of the following behaviors:

imitating the child’s play with objects while engaging in play alongside him/her (parallel

play), commenting on the child’s behavior as the child plays, or responding appropriately

to child interactions (e.g., if the child holds out a toy in the teacher’s direction, the teacher

should accept the toy). FTCL was not coded if the teacher was engaged in any of the

following behaviors: asking questions, issuing commands or directions, using directive

speech (e.g., “my turn,” “your turn”), doing nothing, doing something other than what the

child was doing, playing with a different toy or activity than the child, prompting, doing

the opposite of what the child requested, correcting a behavior or response, or

commenting on their own behavior or actions.

Measurement and data collection. Based on research regarding the most effective

form of time sampling methods (Lane & Ledford, 2014) and optimal interval length, each

20-min data collection session was divided into 80, 15-s intervals. Momentary time

sampling was used when scoring intervals for FTCL. Thus, at the end of each interval,

the interventionist scored whether the teacher was correctly engaging in FTCL. If the

child and adult could not be seen in the video frame together, they were not considered to

be within arm’s reach and FTCL was not coded. All codes were based on the exact end of

the interval and not what came after (e.g., if the teacher moved her hand at 45.999 but did

not place it onto the child’s hand until 46, the code was based on the teacher moving her

hand, not on where the hand ended up). These data were then used to calculate a percentage

of intervals during which the teacher correctly implemented FTCL by dividing the

number of intervals containing FTCL by the total number of intervals and multiplying by

100. An interval was scored as containing FTCL if the operational definition of FTCL

was met at the boundary of the interval.
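
The percentage calculation described above can be illustrated with a minimal Python sketch. The function name and the list-of-booleans representation of coded intervals are assumptions made for illustration only; they are not taken from the study’s data collection sheets, and the example session is hypothetical.

```python
def percent_intervals(codes):
    """Return the percentage of intervals scored as containing the behavior.

    `codes` holds one boolean per interval: True if the operational
    definition was met at the moment the interval ended (momentary time
    sampling), False otherwise.
    """
    if not codes:
        raise ValueError("at least one interval is required")
    return 100 * sum(codes) / len(codes)


# A 20-min session split into 15-s intervals yields 80 observation points.
SESSION_SECONDS = 20 * 60
INTERVAL_SECONDS = 15
n_intervals = SESSION_SECONDS // INTERVAL_SECONDS

# Hypothetical session: behavior observed at 60 of the 80 interval ends.
example_codes = [True] * 60 + [False] * 20
example_percent = percent_intervals(example_codes)
```

Because only the instant at the end of each interval is scored, anything the teacher does between those instants does not change the resulting percentage.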

Teaching social routines

A social routine was operationally defined as: (1) successfully completing the steps in

a routine, (2) engaging in a routine, and (3) taking an imitative turn. TSR was coded as

correct if the teacher was successfully completing the steps in the identified routine,

engaging in the specified routine, taking an imitative turn, playing with the same toy as

the child and engaged in the same routine as the child, prompting the child to engage in

the routine (e.g., “my turn, your turn”), or if he/she was setting up or

maintaining the routine. TSR was still coded as correct if the routine was altered or

changed based on the child’s interests, because the teacher was still engaging in FTCL. TSR

was not coded if the teacher was engaged in parallel play (playing separately from the

child), engaged in an activity that did not allow for turn-taking (e.g., iPad or any game the

child played alone while the adult watched), engaged in a separate activity from the child,

was not within arm’s reach of the child, and/or was not visible in the same camera frame

as the child.

Measurement and data collection. Each 20-min data collection session was

divided into 80, 15-s intervals. Momentary time sampling was used when scoring

intervals. Thus, at the end of each interval, the interventionist scored whether the teacher

was correctly engaging in TSR. These data were then used to calculate a percentage of

intervals during which the teacher correctly implemented social routines by dividing the

number of intervals containing TSR by the total number of intervals and multiplying by

100. An interval was scored as containing TSR if the operational definition of TSR was

met at the boundary of the interval.

System of Least Prompts

The SLP is used to occasion a response from a child by moving through a

prompting hierarchy, starting with prompts that provide the least amount of support and

moving up to the greatest amount of support that is needed, or can be given, to occasion a

response. Teachers were told to start by giving an expectant look and waiting 3 s (time

delay: Step 1). If the children did not respond within 3 s, teachers then issued a linguistic

prompt, “What do you want?” (Step 2). If after another 3 s, children still did not respond,

teachers were instructed to give a linguistic model prompt and say, “X. You want X. Say

X.” encouraging the child to repeat the word after them (Step 3). See Figure 3.1 for a full

description of the prompting hierarchy. Note, this version of the prompting hierarchy was

adapted from Fey (2008). Several steps were removed from the hierarchy as described,

namely the cue step and the additional model step. This was done to ease the learning

burden for teachers and to make the SLP process more user-friendly. Seven steps were

defined and scored as correct or incorrect for the SLP variable. In order to have scored a

prompt, the teacher must have begun the prompting hierarchy with time delay. This

included withholding an object; pointing to or drawing the child’s attention to an object;

using gestures or vocalizations (e.g., “your turn”) and pausing for 3 s (range = 3 to 5 s);

and waiting in silence to see if the child vocalized. Additionally, the prompt must have

been designed to elicit a vocal response to be scored (e.g., conveyed the expectation that

the child responds vocally).
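
The three-step hierarchy that teachers were taught (time delay, linguistic prompt, model prompt) can be sketched in Python as a simple loop that stops as soon as the child responds. This is only an illustrative sketch: the names `HIERARCHY` and `steps_delivered` and the 1-based `child_responds_at` argument are hypothetical conveniences, not part of the procedure described by Fey (2008).

```python
# Prompt levels in order of increasing support; teachers waited about 3 s
# for a response before moving to the next level.
HIERARCHY = ("time delay", "linguistic prompt", "model prompt")


def steps_delivered(child_responds_at=None):
    """List the prompt levels delivered in one prompting episode.

    `child_responds_at` is the 1-based step after which the child responds
    (1 = during the time delay, 2 = after the linguistic prompt,
    3 = after the model prompt), or None if the child never responds.
    """
    delivered = []
    for step_number, step in enumerate(HIERARCHY, start=1):
        delivered.append(step)
        if child_responds_at == step_number:
            break  # stop prompting once the child has responded
    return delivered
```

Under these assumptions, a child who responds during the time delay receives only that first level, whereas a child who never responds receives all three levels before the episode ends.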

SLP was implemented once every 2 min. First, a correct response was coded if the

teacher only gave one prompt within the 2 min time frame. In contrast, an incorrect

response was coded if the teacher gave more than one prompt or no prompts at all within

the 2 min time frame. Second, time delay was coded as correct if the teacher drew the

child’s attention to a specific item and then waited in silence for 3-5 s before issuing a

linguistic prompt (e.g., “What color?”). It was coded as incorrect if the adult waited less

than 3 s or more than 5 s before moving on to the next level of the hierarchy or if time

delay was implemented incorrectly. Third, a linguistic prompt was coded as correct if the

linguistic prompt was implemented correctly (i.e., saying a pure linguistic prompt [e.g.,

“What color is this?”] after time delay and before a model prompt). This was also coded

as correct if the child correctly responded after time delay and the linguistic prompt was

not necessary provided that the teacher appropriately stopped the prompting hierarchy

there. A linguistic prompt was coded as incorrect if the teacher skipped the linguistic

prompt and went right to a model prompt (e.g., “This is red.”). For example, if the teacher

went right to modeling (e.g., “What color? Purple”) then the linguistic aspect of this

prompt was considered part of the model prompt and the pure linguistic prompt was not

given.

Fourth, a model prompt was scored as correct if the teacher implemented the

model prompt correctly (i.e., saying the word he/she wanted the child to say and asking

the child to repeat it). A model prompt was also coded correct if the child correctly

responded after the linguistic prompt, making the model prompt unnecessary. A model

prompt was coded as incorrect if the model prompt was implemented incorrectly (e.g.,

the teacher modeled the wrong word; or issued the model prompt after the child had

already given the correct response; or the teacher failed to attempt to get the child to

repeat the word) or if the adult did not model the desired vocalization and moved on to

another task/activity. Fifth, praise was operationally defined as encouraging sayings (e.g.,

“Great job! Nice work! Awesome! Way to go!”) and providing feedback (e.g., “That’s

correct. You got it right.”). Non-vocal forms of praise (e.g., high fives, fist bumps) were

acceptable if they were accompanied by vocal praise. Praise was coded as correct if the

teacher provided praise when the child correctly completed the behavior or attempted to

correctly complete the behavior, or the teacher did not provide praise if the child did not

complete or attempt to complete the behavior. Praise was coded as incorrect if the teacher

did not provide praise for successful completion of the target response, or if he/she

praised unsuccessful completion (i.e., the child did not emit the vocal response).

Sixth, the discontinuation of the prompting hierarchy was coded as correct if the

teacher concluded the prompting sequence when and only when the child

performed/attempted the requested vocalization or the teacher reached the end of the

prompting hierarchy. This column was coded as incorrect if the teacher discontinued

prompting before the child completed/attempted the vocalization or if the teacher

persisted with prompting even after the child had successfully completed/attempted the

vocalization. Of note, all prompts including time delay were coded regardless of whether

the prompts issued by the teacher were issued according to the order of the prompting

hierarchy. Seventh, sequence of prompts was coded to determine whether the teacher

implemented the steps of the prompting hierarchy in the appropriate and accurate order.

Prompting sequence was scored as correct if the sequence in which the prompt levels

were performed was done correctly (e.g., starting with time delay, moving to linguistic,

moving to modeling [if necessary]) regardless of whether each individual step was done

correctly. This step was coded incorrect if the prompts were delivered in an incorrect

sequence (e.g., modeling prompt performed before the linguistic prompt) regardless of

whether each individual step was done correctly.

Measurement and data collection. Accuracy of use of the system of least prompts

was measured using an event recording system. Each use of the prompting hierarchy was

given a percentage accuracy score based on the following criteria, which were marked as

yes or no: (1) one prompt was given every 2 min (i.e., prompts should be separated by 2

min, plus or minus 15 s), (2) time delay was implemented correctly, (3) linguistic mand

prompts were implemented correctly, (4) linguistic mand-model prompts were

implemented correctly, (5) praise was provided when appropriate, (6) prompting was

discontinued at the appropriate step, and (7) the sequence of prompts was followed correctly

(see Figure 3.1). All instances of prompting that met the 80% mastery criterion

(i.e., at least 6 of 7 steps completed correctly) were scored as correct (for a similar procedure, see

Wright & Kaiser, 2017). These data were then used to determine the percentage of

prompts used correctly in a session, calculated by dividing the number of prompts used

correctly by the total number of prompts and multiplying by 100.
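
The scoring rules above can be summarized in a short Python sketch. The criterion labels, function names, and example episodes are hypothetical and were chosen only for illustration; the sketch assumes that each prompting episode is recorded as seven yes/no scores and that an episode counts as correct when at least 6 of the 7 are yes.

```python
CRITERIA = (
    "one_prompt_per_2_min",
    "time_delay",
    "linguistic_prompt",
    "model_prompt",
    "praise",
    "discontinuation",
    "sequence",
)
MIN_STEPS_CORRECT = 6  # 6 of 7 steps, per the mastery criterion in the text


def prompt_is_correct(scores):
    """Return True if at least 6 of the 7 criteria were scored yes (True)."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return sum(bool(scores[c]) for c in CRITERIA) >= MIN_STEPS_CORRECT


def percent_prompts_correct(episodes):
    """Percentage of prompting episodes in a session that were scored correct."""
    if not episodes:
        raise ValueError("at least one prompting episode is required")
    return 100 * sum(prompt_is_correct(e) for e in episodes) / len(episodes)


def spacing_ok(previous_onset_s, onset_s, target_s=120, tolerance_s=15):
    """Check that two prompts are 2 min apart, plus or minus 15 s."""
    return abs((onset_s - previous_onset_s) - target_s) <= tolerance_s


# Hypothetical session with four prompting episodes.
all_yes = {c: True for c in CRITERIA}
one_miss = {**all_yes, "praise": False}
two_misses = {**one_miss, "sequence": False}
example_percent = percent_prompts_correct([all_yes, all_yes, one_miss, two_misses])
```

In this hypothetical session, three of the four episodes meet the 6-of-7 criterion, so 75% of prompts would be counted as used correctly.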

Experimental Design

A single-case, concurrent multiple-baseline across behaviors design replicated

across teachers was implemented. This design allowed for the detection of a functional

relation between BST implementation and changes in teachers’ use of intervention

strategies in the classroom. Experimental control is demonstrated in a multiple baseline

across behaviors design when the data level or trend change upon introduction of the

intervention to the first tier while the data remain stable or unchanged in the remaining

tiers, and this change is repeated through the process of intra-participant replication (Gast

et al., 2018). Thus, with an increasing number of demonstrations of effect upon

introduction of the intervention, confidence that the intervention is responsible for the

change in data trend or level increases, experimental control is established, and a

functional relation between the intervention and change in the data can be inferred.

Procedures

Baseline Condition

A minimum of five, 20 min baseline sessions were conducted and video-recorded

for each teacher based on the recommendation in the What Works Clearinghouse (WWC)

Standards for Pilot Single-Case Designs, Version 4. During the baseline condition,

teachers were instructed to interact with children as they normally would and any

spontaneous use of programmed intervention techniques was recorded (for data collection

sheets, see Appendix D), but no instruction, training, or feedback was provided. The

teacher and child were observed during free play time. At the beginning of each session,

the teacher laid out all the toys and allowed the child to choose the toys with which to

play. Once baseline data were stable and without trend in the therapeutic direction across

tiers, and the minimum number of baseline data points had been collected for the first

behavior, the first intervention strategy was introduced. The introduction of intervention

was made based on teachers’ individual data; therefore, baseline length varied across

teachers.

Teacher Training

Teacher training occurred across two 1:1 sessions immediately prior to

introduction of intervention for each MT technique using BST. The BST procedure

included: 1) instruction, 2) modeling, 3) rehearsal, and 4) in-situ feedback. Upon

introduction of each intervention strategy, two training sessions were held: one session

that lasted approximately 45 min and one session that lasted approximately 20 min. Two

sessions were dedicated to each target intervention technique for a total of six training

sessions per teacher. Each individual teacher selected a training time that worked best for

his/her schedule and received training at the selected time. The two training sessions for

each technique occurred no more than two weekdays apart. For example, if the first

training session for FTCL occurred on a Tuesday, the second training was conducted by

Thursday of that same week.

During the initial training session for each intervention technique, the instructor

introduced and described the strategy, broke down and described the process for

implementing the technique, and answered any questions that teachers had. Next,

teachers were shown video models implementing the technique. During rehearsal and

feedback, the researcher and the teacher practiced the target intervention technique with

the researcher providing immediate feedback. Feedback included positive comments

regarding what the teacher was doing well in addition to comments designed to help

improve the accuracy of the teachers’ implementation of the techniques. During the

second training session for FTCL, TSR, and SLP, the teacher was asked to implement the

technique with the child in the classroom environment while the researcher provided in-

situ feedback. The researcher modeled the techniques for each teacher as necessary and

practice continued until each teacher felt confident that he/she could implement the

technique appropriately on his/her own. During the second training session, the

researcher observed the teachers engaging in each target technique with the child and

provided immediate, in-situ feedback on their performance.

Following the Child’s Lead. During training of FTCL, teachers were trained to

allow the child to lead and direct the interaction. First, they were trained how to imitate

the child’s actions and engage in parallel play, a process where they play alongside the

child, imitating the child’s actions but playing independently. For example, the teacher

and the child might run two different cars down two separate tracks to play with the cars

rather than taking turns with the same car or playing on the same track. Next, teachers

were trained how to comment on the child’s behavior as they played (e.g., “You have a

car. That’s a red block.”) rather than to ask questions or give commands (e.g., “What

color is that block? What do you have?”). Finally, teachers were trained how to respond

appropriately to child initiations. For example, if a child offered an object to the teacher,

he/she should accept the object; similarly, if the child made a request, the teacher should

attempt to fulfill the request (within reason).

Teaching Social Routines. During training for TSR, the researcher worked

closely with teachers to identify and develop several routines for each child participant

depending on the selected toys. Teachers establish routines with children in a variety of

ways, such as imitating a child’s play with the same or similar toys, imitating the child’s

actions, performing an action that is complementary to the child’s action to create a turn

for the teacher within the interaction, engaging the child by performing an action or

activity that he/she finds funny or interesting, or pairing actions with singing or counting

(Fey, 2008). The development of these routines is critical for future techniques which

require the teacher to interrupt the established routine in order to create opportunities for

communicative interaction (e.g., prompting). For Ms. Smith, trained routines included

taking turns riding a bike throughout the school, putting magnets up on a board, coloring

a picture, and building with blocks. For Mr. Parker, trained routines included having Josh

request a turn to bounce on a giant ball and take turns playing with monkey string (i.e., a

pliable, sticky string that adheres to surfaces).

System of Least Prompts. The prompting hierarchy for SLP consisted of the

following prompting techniques: (1) time delay, (2) linguistic prompts, and (3) linguistic

mand-model prompts (adapted from Fey, 2008). Teachers were trained how to prompt in

the following manner: beginning with time delay, the teacher removed a toy of interest or

interrupted the routine, gave an expectant look, and then waited 3 s before delivering any

kind of instruction, giving the child (portrayed by the researcher) the opportunity to

communicate independently. Once the 3 s had elapsed, and the “child” had not provided

the correct response, the teacher then moved on to the next step in the prompting

hierarchy, linguistic prompts. These prompts are vocally issued prompts that encourage a

child to communicate with adults. For example, if after 3 s the “child” did not respond to

the initial disruption in routine, the teacher would say, “What do you want?” and pause

for another 3 s. If the “child” still did not respond to the linguistic prompt, the teacher

then issued a linguistic mand-model prompt, saying, “X. You want X. Say X,” with X being the

name of the toy. In general, linguistic mand-model prompts are used to tell a child what is

being asked or what is expected (e.g., asking the child specifically what he/she wants).

Teachers were trained to prompt once every 2 min following the aforementioned

hierarchy.
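
The order of the hierarchy described above can be sketched in code. This is an illustrative model only: the `child_responds` callable is a hypothetical stand-in for the child's behavior, and the 3-s wait is represented by a constant rather than an actual timer.

```python
from typing import Callable, List

# Prompt levels in the order described in the text.
HIERARCHY = ("time delay", "linguistic prompt", "linguistic mand-model prompt")
WAIT_SECONDS = 3  # wait after time delay and after the linguistic prompt


def deliver_prompts(child_responds: Callable[[str], bool]) -> List[str]:
    """Return the prompt levels delivered before the child responds.

    The teacher starts at the least intrusive level and moves to the
    next level only if the child has not responded.
    """
    delivered = []
    for level in HIERARCHY:
        delivered.append(level)
        if child_responds(level):
            break
    return delivered


# Hypothetical child who responds only once asked "What do you want?"
print(deliver_prompts(lambda level: level == "linguistic prompt"))
# ['time delay', 'linguistic prompt']
```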

As part of the initial training session for the SLP intervention technique, the

researchers and the teachers selected specific verbal goals for the individual child

participants. Because the children in this study were minimally verbal, single word

linguistic social-communication goals were identified for each child. Goal selection was

based on each child’s current level of social communication and communication

objectives in their GPPs. For example, if a teacher and child were engaged in a routine in

which they were rolling a ball back and forth but the teacher did not return the ball upon

the child’s turn, the primary prompting goal may be for the child to say, “ball,”

requesting the desired item. It would also be appropriate in this scenario for the child to

say, “turn,” requesting his/her turn with the ball. To establish linguistic targets, based on

the recommendations of Fey (2008), the interventionist sat with each teacher to examine

the routines that each child developed during that stage of the intervention and identified

several target words that the children could emit in order to request continuation of the

routines. For Sarah, the primary routine was riding on the bike and prompted words

included, “back”, “go”, “bike”, “turn”, and “push.” For Josh, the primary routines were

rolling on the beach ball and taking turns with the monkey string and prompted words

included, “push”, “ball”, “bounce”, and “monkey string.”

Intervention Condition

Procedures for teacher-child sessions during the intervention condition were

identical to those used in baseline (1:1 pull out sessions, 20 min in length) with the

exception that teachers used the trained techniques rather than playing as they normally

would. No training or help was provided by the researcher during intervention sessions.

Data on teacher use of MT strategies were collected via video-recordings of intervention

sessions. A minimum of five data points was required for the intervention condition

across MT techniques and teachers, and intervention continued until the teachers reached

a mastery criterion of three consecutive intervention sessions at 80% or higher.
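
The decision rule above can be expressed as a short check. The following is a sketch under stated assumptions: the function names are hypothetical, the example scores are invented for illustration and are not study data, and the rule is read as requiring both the five-session minimum and the three-session run before a phase may end.

```python
from typing import Optional, Sequence


def mastery_session(scores: Sequence[float],
                    criterion: float = 80.0,
                    run_length: int = 3) -> Optional[int]:
    """Return the 1-based session at which mastery is first reached.

    Mastery means `run_length` consecutive sessions at or above
    `criterion`. Returns None if the run never occurs.
    """
    streak = 0
    for session_number, score in enumerate(scores, start=1):
        streak = streak + 1 if score >= criterion else 0
        if streak == run_length:
            return session_number
    return None


def phase_can_end(scores: Sequence[float], min_sessions: int = 5) -> bool:
    """True when the minimum number of sessions and mastery are both met."""
    return len(scores) >= min_sessions and mastery_session(scores) is not None


# Invented scores for illustration only.
example = [85, 70, 80, 90, 82]
print(mastery_session(example))  # 5
print(phase_can_end(example))    # True
```

Note that a score of exactly 80% counts toward the run, consistent with the "80% or higher" wording.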

Given the importance of practice and continuing feedback in ensuring

maintenance of intervention skills (Ward-Horner & Sturmey, 2012), once the intervention

began, a 1:1 check-in with teachers was conducted after each data collection session in

order to ensure the continuation of intervention knowledge and accuracy. During the

check-in, the teacher and the researcher reviewed what went well during the session and

noted areas for improvement. The researcher also answered any questions the teacher had

and provided any additional practice/training as necessary. The researcher reviewed the

teachers’ graphed data with them as necessary (e.g., during a period of continued

performance decline or upon reaching a described goal) and discussed and explained their

performance based on visual analysis. Booster training sessions were also utilized during

periods of performance decline or after long breaks in data collection. During these

sessions, the researcher and teacher would practice the intervention techniques with the

child while the researcher provided immediate feedback and modeling as necessary.

Generalization and Maintenance

Generalization across materials was assessed once during each intervention phase

and during baseline for the TSR and SLP techniques. Teachers were given a novel set of

toys and were asked to use the trained intervention techniques with the novel toys. The

same set of novel toys was used during all three sessions for each teacher/child pair.

Generalization sessions were identical to intervention sessions. Data were collected using

the same procedures as during the baseline and intervention conditions, with a different

set of toys, and took place in the same environments. After completion of the

generalization probes, teachers were provided with feedback regarding their performance.

Maintenance of intervention techniques was assessed at two- and four-week follow-up

observations. Data were recorded on teachers’ use of all trained intervention techniques.

All maintenance sessions were identical to intervention sessions. The same data

collection systems used during previous data collection sessions were used during

maintenance sessions and all trained techniques were evaluated.

The study lasted for six months. In the beginning of the study, data were collected

once a day, two to three times per week. However, as the study progressed, data were

collected less frequently. There was a two-week break in data collection between sessions

14 and 15 for both teachers due to the Thanksgiving holiday and teacher availability. In

addition, after Session 23 for Ms. Smith and Session 24 for Mr. Parker, there was a five-

week break in data collection due to the holidays and teacher and researcher travel. Upon

returning from the five-week break, Mr. Parker requested a booster session, which was

performed between Sessions 24 and 25. Thus, a booster session covering FTCL and, primarily,

TSR was conducted during which the researcher provided in-situ feedback to Mr. Parker

as he interacted with Josh. Ms. Smith was also offered a booster session, but she

declined. There was one additional two-week break between Sessions 30 and 31 for Ms.

Smith and between Sessions 28 and 29 for Mr. Parker. On four occasions for both

teachers, two sessions were conducted in one day with 30 min to 1 hr in between

sessions. For Ms. Smith, the following session pairs were conducted on the same day: 26

and 27, 29 and 30, 31 and 32, and 33 and 34. For Mr. Parker, the following session pairs

were conducted on the same day: 25 and 26, 27 and 28, 30 and 31, and 32 and 33.

Procedural Fidelity

Fidelity checklists of BST were developed for each of the intervention training

techniques, detailing specific criteria that must be covered within the training sessions

(see Appendix C). These checklists were used to ensure that all teacher participants

received the same information and training. Average procedural fidelity for Ms. Smith

was 77% with a range of 60-96%. Similarly, average procedural fidelity for Mr. Parker

was 78% with a range of 65-96%.

Interobserver Agreement

In accordance with WWC (Version 4.1) standards, 20% of randomly selected

baseline, intervention, generalization, and maintenance sessions across MT techniques

and teacher participants were coded by a second observer to determine interobserver

agreement (IOA). All raters were required to demonstrate 80% accuracy or higher for 3

different training video sessions on all data collection instruments before coding video

recordings for IOA. Interval-by-interval interobserver agreement was used to determine

IOA for all data collected (i.e., FTCL, TSR, and SLP). This was calculated by dividing

the number of intervals for which the two observers agreed by the total number of

intervals for the session and multiplying by 100. The primary coder was blind to sessions

selected for IOA. If IOA was below 80% on a session coded for IOA, a discrepancy

discussion was held to re-calibrate, but the original agreement percentage was maintained

for calculation of mean IOA. For Ms. Smith, observers had an average agreement across

baseline and intervention conditions of 80% (range: 56-93%) for FTCL, 92% (83-100%)

for TSR, and 97% (81-100%) for SLP. For Mr. Parker, observers had an average

agreement across baseline and intervention conditions of 83% (range: 63-93%) for FTCL,

96% (86-100%) for TSR, and 96% (80-100%) for SLP. See Table 3.3 for full reporting of

IOA, broken down by tier and condition for each teacher.
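
The interval-by-interval agreement calculation described above can be illustrated with a brief sketch. The function name is hypothetical and the coded intervals below are invented for the example; they are not study data.

```python
from typing import Sequence


def interval_by_interval_ioa(primary: Sequence[bool],
                             secondary: Sequence[bool]) -> float:
    """Percentage of intervals on which two observers agree.

    Each sequence holds one code per interval (True = behavior
    scored as occurring). Agreement on non-occurrence also counts.
    """
    if len(primary) != len(secondary):
        raise ValueError("Observers must code the same number of intervals.")
    if not primary:
        raise ValueError("At least one interval is required.")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    return 100 * agreements / len(primary)


# Invented codes for a five-interval excerpt.
observer_1 = [True, True, False, True, False]
observer_2 = [True, False, False, True, False]
print(interval_by_interval_ioa(observer_1, observer_2))  # 80.0
```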

Visual Data Analysis

The dependent variables were graphed for each teacher. Both teachers had a

minimum of five data points per baseline condition. When training occurred on the first

MT strategy, baseline data continued to be collected for the remaining intervention tiers

(behaviors) and were analyzed for stability and absence of trends in the therapeutic

direction. With the introduction of training for each new intervention strategy, the

transition from baseline to intervention provided an opportunity for the replication of

effect, allowing for the detection of a functional relation (Gast et al., 2018). This process

was in accordance with standards issued by the What Works Clearinghouse (WWC) for

Pilot Single-Case Designs, Version 4.1, stating the need for a minimum of three

demonstrations of intervention effect at three different points in time. In addition, the use

of a multiple baseline design with a minimum of six intervention phases and a minimum

of five data points per phase meets the standards put forth by WWC for Pilot Single-Case Designs

Without Reservations. Visual analysis of data trend, level, and variability was conducted

for each teacher to determine whether BST led to increases in implementation of MT

techniques.

Results

Milieu Teaching Strategies

Ms. Smith

Following the child’s lead. During baseline, Ms. Smith averaged 31% of

intervals demonstrating FTCL (range: 16-43%; see Figure 3.2). Baseline data were

somewhat variable and were trending in the non-therapeutic direction when intervention

began. Once intervention was introduced, there was an immediate increase in level of

percentage of intervals demonstrating FTCL, with a mean percentage during intervention

of 74% (55-90%). Of note, Ms. Smith’s performance during intervention was somewhat

variable; she began with a high percentage of intervals with FTCL, and then her

performance began to decline. A new toy (bike) was inadvertently introduced into the

interaction during Session 11 of the FTCL intervention condition. It is worth noting that

Ms. Smith’s performance increased initially to above the 80% criterion when this

happened; however, her performance went back to below the mastery criterion after this

session. This soon triggered the need for a booster session where the researcher met with

the teacher to review the intervention procedures and allowed her to practice these

procedures once again. After the booster session, Ms. Smith’s data began to increase in

the therapeutic direction, and she was able to achieve the mastery criterion of three

consecutive sessions at 80% of intervals or higher. Despite this variability in her data,

there was no overlap between data points in the baseline and intervention conditions. The

change in level of the data suggests that the intervention was responsible for the change

in the teacher’s percentage of intervals demonstrating FTCL.

Upon introduction of BST to TSR, the teacher’s performance of FTCL

demonstrated continued variability, with a mean accuracy score of 76% (60-90%) and

dipping below the 80% criterion several times. However, this is not unexpected based on

the nature of TSR. Once TSR is introduced as a technique, the teacher is instructed to

structure the interaction so that a routine is being used regularly. This can make it

difficult to maintain FTCL with levels commensurate with initial intervention

performance.

Teaching social routines. During baseline, Ms. Smith’s percentage of intervals

demonstrating TSR was largely consistent, with a mean score of 3% and a range of

0-8%. Baseline data were low in level and were not trending in any direction after the first

few data points, which were initially trending in the therapeutic direction. Data were

largely stable around 0%. When intervention was introduced, percentage of intervals

demonstrating TSR showed an immediate increase in level from 0% to 88% and

remained above the 80% criterion for the remainder of the intervention condition. Her

mean percentage of intervals demonstrating TSR was 89% with a range of 86-94%. This

change in level suggests that the intervention was responsible for the teacher’s

improvement in percentage of implementation during sessions. Upon introduction of BST

to SLP, Ms. Smith’s performance became more variable with a mean percentage of

intervals demonstrating TSR of 71% with a range of 35-88%.

System of least prompts. During baseline, Ms. Smith’s performance was largely

consistent with an average percentage of prompts used correctly of 1% (range: 0-10%).

Baseline data were quite flat with a slight uptick in the therapeutic direction, which then

flattened once again. Data level remained low throughout baseline. Once intervention

was introduced, Ms. Smith’s performance actually decreased from 10% to 0% prompts

used correctly during the first session. This occurred because Ms. Smith forgot to

implement one component of the SLP process, resulting in all of her prompts being

scored as incorrect for this session. After meeting with the researcher to discuss this issue

and to re-practice prompts, her performance improved to 80% of prompts used correctly

during the session. Her average percentage of prompts used correctly during intervention

was 72% with a range of 0-100%. With the exception of one data point (the session with

0% accuracy), there was no overlap in the data between baseline and intervention

conditions, suggesting that the intervention was responsible for Ms. Smith’s increase in

accuracy of implementation.

Generalization. During the first generalization probe for FTCL during the

intervention condition, Ms. Smith demonstrated 85% of intervals coded for FTCL,

suggesting her skills had generalized to a new set of materials. This score decreased to

50%, and then increased to 84%. However, because there were no generalization sessions

conducted during baseline, evidence that BST contributed to generalization of FTCL

across materials is limited. Regarding TSR, baseline generalization performance was

10%. This increased to 79%, just below the 80% criterion, during the TSR intervention

condition, but decreased to 15% after introduction of BST to SLP. Thus, Ms. Smith was

able to generalize her TSR skills initially, but this generalization did not maintain.

Finally, during the two SLP baseline generalization probes, she performed at 0%. In

contrast, during the SLP intervention generalization probe, her performance improved to

80%, indicating that she was successfully able to generalize the SLP skills to a new set of

materials.

Maintenance. Regarding FTCL, Ms. Smith’s performance remained above the

80% criterion with values of 88% and 94%, at the two- and four-week follow ups,

respectively. Similarly, she maintained her scores for TSR at the two- and four-week

follow up sessions, with values of 84% and 80%, respectively. Finally, her SLP skills

maintained at the two-week follow up with a score of 90% but did not maintain at the

four-week follow up, with a score of 60%, falling below the 80% criterion.

Mr. Parker

Following the child’s lead. Mr. Parker demonstrated an average percentage of

intervals coded for FTCL of 30% (range: 16-40%) during baseline (see Figure 3.2).

Baseline data were variable but remained somewhat low in level. However, there was a

trend in the therapeutic direction toward the end of baseline. Once intervention was

introduced, there was an immediate increase in level, with a mean performance during

intervention of 77% (54-93%). Of note, Mr. Parker’s performance during intervention

was highly consistent in the beginning; however, as time went on, his performance

became more variable. A new toy (beach ball) was inadvertently introduced into the

interaction during Session 7 of the FTCL baseline condition. The ball was absent during

Session 8 because it was unavailable, but it was again present during Session 9

when FTCL intervention began. During Session 15, the ball was again unavailable during

data collection, which may have resulted in Mr. Parker’s low accuracy of implementation

of 54%. This triggered the need for a booster session where the researcher met with the

teacher to review the intervention procedures and to allow him to practice these

procedures once again. Technically, Mr. Parker reached the mastery criterion of three

consecutive sessions at 80% or higher during Session 13; however, intervention was

continued due to the variability and therapeutic trend that were present in his TSR

baseline data. Thus, intervention was continued in the hopes that his TSR baseline data

would stabilize. After the booster session, Mr. Parker’s FTCL data demonstrated

increases in the therapeutic direction, and he achieved the mastery criterion of three

consecutive sessions at 80% accuracy or higher. Despite some variability in his data,

there was no overlap between data points in the baseline and intervention conditions. The

change in level of the data suggests that the intervention was responsible for the change

in the teacher’s percentage of implementation of FTCL. Upon introduction of BST to

TSR, the teacher’s performance dropped, with a mean percentage of intervals scored for

FTCL of 30% (8-63%). His performance remained below the 80% criterion level once

TSR was introduced. As previously mentioned, this drop in FTCL was expected based on

the structured nature of teaching social routines and the teacher’s increased involvement

in directing the interaction.

Teaching social routines. Mr. Parker’s performance during baseline was highly

variable, with an average percentage of intervals coded for TSR of 17%, ranging from

0% to 50%. The data began at a low, stable level around 0%; however, they began to

increase in the therapeutic direction, reaching a peak at about 50%. The data then began

trending in the non-therapeutic direction, reaching a level that was similar to the

beginning of baseline. However, they then began trending in the therapeutic direction again

before intervention began. When intervention was introduced, his percentage of intervals

coded for TSR increased from 36% to 91% and remained above the 80% criterion level

for all but one session. His average percentage of intervals coded for TSR during

intervention was 90% with a range of 83-99%. Of note, there was a large gap in data

collection between Sessions 24 and 25; this triggered the need for a booster session where

the researcher met with the teacher to review the intervention procedures and to allow

him to practice these procedures once again. Similarly, this skill remained above the

80% criterion when SLP was introduced with an average accuracy score of 94% and a

range of 90-99%. This change in data level suggests that the intervention was responsible

for Mr. Parker’s increase in accuracy of implementation.

System of least prompts. Mr. Parker demonstrated an average percentage of

prompts used correctly for SLP of 0% during baseline. Baseline data were flat and stable

and remained at the 0% level throughout. When intervention was introduced, his score

increased from 0% to 80%. With the exception of one data point, he remained at or above the

80% criterion throughout intervention, with an average percentage of prompts used

correctly for SLP of 80% (range: 50-100%). However, due to the fact that there was a

break in between the final baseline session and SLP training, these results must be

interpreted with caution. Although a generalization baseline point was conducted prior to

BST for SLP, inference of a functional relation between BST and MT implementation for

Mr. Parker is limited.

Generalization. During the generalization probe of FTCL during the intervention

condition, Mr. Parker demonstrated 81% of intervals coded for FTCL, suggesting that his

skills had generalized to a new set of materials. However, this decreased to 25%, and then

increased to 63%. Although his scores were somewhat variable, his FTCL skills

generalized to a new set of materials; however, without baseline generalization sessions,

interpretations about whether BST contributed to generalization of FTCL are limited.

Regarding TSR, his baseline generalization performance was 0%. This increased to 99%

during the intervention condition, and further increased to 100%. Thus, he was able to

generalize his TSR skills to a new set of materials. Finally, during the two SLP baseline

generalization probes, Mr. Parker performed at 10% and 0%. In contrast, during the SLP

intervention generalization probe, his performance improved to 100%, indicating that he

was able to generalize the SLP technique to a new set of materials.

Maintenance. Regarding FTCL, Mr. Parker’s performance did not remain above

the 80% criterion, with values of 46% and 65%, at the two- and four-week follow up

probes, respectively. In contrast, he maintained his scores for TSR at the two- and four-

week follow ups, with values of 100% at both maintenance probes. Finally, his SLP skills

also maintained at the two- and four-week follow-up maintenance probes, with scores of

100% at both probes.

Descriptive Analysis of Formal Measures

Communication. Although child outcomes were not targeted in this study and

therefore not measured directly, parents completed a CDI for each child both before and

after the study. Prior to the study, Sarah’s parents reported that she understood 28/28

phrases and 307/396 words on the checklist. She reportedly could not produce a single

word on the list. Following the intervention, parents reported that she could understand

27/28 phrases and 370/396 words, a decrease of one phrase and an increase of 63 words

understood. They also reported that she could now produce a total of 16/396 words, an

increase of 16 words from the beginning of intervention. Josh also showed gains on this

measure from pre to post intervention. Before intervention parents reported that he could

understand 18/28 phrases and 380/396 words on the checklist. He could produce a total

of 166/396 words. Following intervention, he reportedly understood 22/28 phrases and

392/396 words, an increase of 4 phrases and 12 words. Josh could also reportedly

produce 201/396 words, an increase of 37 words. Though interesting, these gains are

merely descriptive, as a causal relationship cannot be concluded based on the study

design.

Social Validity. At the conclusion of the study, teachers were asked to complete

an acceptability rating scale to help determine whether they found the intervention

acceptable, helpful, and practical (adapted from Hendrickson et al., 1993). The

questionnaire was broken down into four main parts: Research, Intervention Effects,

Social Validity, and Training. Ms. Smith and Mr. Parker agreed that research is important

in schools, can improve staff teaching, and is important for better teaching all children.

Teachers also agreed that the intervention was helpful for them as teachers as well as for

the students with whom they worked. In terms of the social validity section, both teachers

reported that their knowledge, skills, and confidence in implementing techniques

improved; they believed they could incorporate techniques into daily classroom routines;

and that the intervention techniques were feasible to implement in the classroom.

However, Mr. Parker disagreed that the intervention techniques could be easily

incorporated into the classroom whereas Ms. Smith agreed. Both teachers strongly agreed

that they would participate in a similar project in the future.

Finally, regarding training session two, which featured in-situ feedback with the

child participants, both teachers agreed that they were comfortable during their training

sessions, the sessions were tailored to their experience levels, they felt comfortable

implementing techniques after their training sessions, felt comfortable asking questions,

and would recommend the sessions to their colleagues. In addition to the questionnaire,

Mr. Parker informed the researcher that he was grateful for having learned the techniques

and began implementing them with a new student with social communication difficulties

with whom he was working.

Discussion

The results of the present study indicate that BST can be effectively used to train

teachers to implement several core techniques of MT: FTCL, TSR, and SLP. Overall,

for Ms. Smith, the data showed one demonstration of effect and two replications of

effect, indicating evidence of a functional relation between BST and MT implementation

fidelity. For Mr. Parker, the data showed one demonstration of effect and one replication

of effect, with limited evidence of a second replication. Ms. Smith maintained all skills at

the two- and four-week follow up probes, with the exception of SLP at the four-week

follow up. Mr. Parker maintained TSR and SLP at the two- and four-week follow-up

probes but did not maintain FTCL. Ms. Smith initially generalized outcomes to a new set

of toys during intervention generalization sessions but struggled to maintain

generalization above the 80% criterion during additional generalization probes. Mr.

Parker struggled to maintain generalization criterion for FTCL but maintained

performance at the 80% criterion for TSR and SLP once these intervention techniques

were introduced. Similar to teacher gains in levels of implementation of the MT

techniques, post intervention gains were also observed in children’s word production and

comprehension according to parent-report.

These results are most similar to those reported by Kaiser et al. (1993), who found

that through implementing training techniques similar to those used in BST, they were

able to successfully teach educators environmental arrangement and MT techniques to be

implemented in the classroom. This study also expands upon similar studies which

trained parents to implement some form of MT techniques, using training techniques

similar to BST (Kaiser et al., 1995). In addition, similar to Aktas and Ciftcitekinarslan

(2018), this study demonstrated the importance of intensive training in order for teachers

to correctly implement the MT techniques. BST provided a solid framework from which

to draw information and planning for teacher training sessions.

Implications

This study has several implications for both practice and for research. Data show

that teachers can be successfully taught to implement MT techniques when BST is used

as a training package. Teachers spend a great deal of time with children and if we can

train them to implement early social communication techniques designed to improve

language ability, then children have a chance to benefit from a much greater dosage of

intervention. Rather than seeing a therapist once a week for an hour, children could have

daily access to evidence-based interventions for multiple hours a day through trained

school staff. This would allow for an increase in dosage and, in theory, an increase in

language gains for each child exposed to the intervention in the classroom. Such practices

have the potential to make a substantial impact in the lives of children with social

communication deficits.

Based on the findings of the current study, future research should focus on

incorporating MT intervention techniques into regular classroom routines. The current

study allowed for an increase in dosage in intervention techniques, utilizing the teachers

as the interventionists. However, dosage could be increased even further and intervention

could reach more children if these techniques were integrated into regular classroom

routines and incorporated as part of normal classroom instruction. Similarly, given that

the present study has demonstrated that BST can be used to effectively train teachers to

implement MT in the classroom, other naturalistic interventions should be explored. BST

may be used to teach a variety of naturalistic interventions to not only teachers, but to

parents and other important figures in children’s lives. Finally, future research could

expand the results of the current study by directly measuring child outcomes to determine

whether the intervention examined has a direct effect on children’s language and other

targeted behaviors.

Limitations

Despite the positive outcomes of this study, there were several significant

limitations worth noting. First, regarding generalization, a generalization probe was not

conducted during FTCL baseline for Ms. Smith or Mr. Parker, making it impossible to

compare generalization after the introduction of the technique to baseline performance.

Thus, we cannot say for sure that generalization occurred for FTCL as performance may

have been equally high during baseline as during intervention. The remaining two tiers

had generalization probes conducted during baseline and during intervention so that this

comparison could be made. However, teacher performance was variable, with

generalization maintaining for some skills, but not for others.

Second, for Ms. Smith, a bike was introduced during Session 11 during FTCL

intervention. Ms. Smith’s data spiked during this session to above the 80% criterion line.

However, her data then began to fall below the 80% criterion and decreased for several

sessions, making it unlikely that the bike had a large impact on the data at that time. Yet

it is possible that the bike, introduced around the same time as intervention, could have

had an impact on the intervention data rather than the intervention itself, making it more

difficult to infer a causal relation between BST and improvement in FTCL data for Ms.

Smith. Similarly, a ball was introduced during Session 7 for Mr. Parker, was absent for

Session 8, and then present again for Session 9 when FTCL intervention began (it was

then present for the remainder of the sessions with the exception of one session).

However, given Mr. Parker’s performance during baseline and the continuing upward

trend from Session 7 to Session 8 where the ball was present then absent, it is unlikely

that this had a major impact on the data. Yet it is still possible given that when the ball

was missing during Session 15, Mr. Parker’s performance dropped dramatically and

began to climb once the ball was re-introduced. Therefore, similar to Ms. Smith, the

introduction of a new toy for Mr. Parker right around the introduction of FTCL

intervention calls into question whether the toy or the intervention was responsible for the

change in data. This again makes it more difficult to infer a causal effect of BST on

improvement in FTCL implementation.

Similarly, an interval timer that the teachers could keep on their person and that

vibrated every 2 min was introduced partway through SLP intervention to make it

less cumbersome for teachers to determine when the 2-min interval had expired. It

was introduced during Session 31 for Ms. Smith and Session 32 for Mr. Parker during

SLP intervention. Upon introduction, Ms. Smith’s performance continued to decline until

Session 33. In contrast, Mr. Parker’s data increased to above the 80% criterion once the

timer was introduced and he immediately met criteria. Therefore, it is possible that the

interval timer had an impact on teacher performance in the SLP intervention condition.

Third, mean fidelity for BST of MT techniques was somewhat low with a wide

range for Ms. Smith’s and Mr. Parker’s training sessions. This was likely due to a number

of reasons. In order to remove redundancies from the training, several changes were

made to the coaching process during individual sessions, resulting in lower procedural

fidelity. Specifically, the second training session was made more informal to increase

teacher comfort and remove redundancies as coaching sessions always immediately

followed initial training sessions (both occurred on the same day). Rather than follow the

procedural fidelity sheet strictly, the researcher allowed the teachers to practice the skills

using a more open and informal style. The researcher provided training as the teacher

practiced the skills with the child, modeling when necessary and giving

immediate feedback that focused on both positive performance and areas that needed

improvement. This allowed the coaching session to flow more comfortably and naturally

and adhered more to naturalistic teaching strategies rather than following the

predetermined criteria strictly. Ultimately, this resulted in lower procedural fidelity, but all

principles and pieces of behavioral skills training (instruction, modeling, rehearsal, and

in-situ feedback) were still utilized and followed accordingly during both training

sessions.

Fourth, the frequency of data collection sessions was not consistent. In the

beginning of the study, data were collected two to three times per week. However, as the

study progressed, toward the end of TSR and throughout SLP data collection, data were

often collected once a week and never less often than every other week. At times, two sessions

were conducted in one day. This variation was due to conflicts with teacher, child, and

researcher availability and school holidays. Adjustments were made as necessary in order

to accommodate everyone’s schedules.

Finally, each baseline tier ended with the collection of generalization data, with

the exception of FTCL. Therefore, there was no additional baseline session prior to

implementing intervention for each of the remaining tiers (TSR and SLP). This prevented

the researcher from determining whether baseline data continued in the same direction,

level, and trend after a brief interruption (generalization session). This is problematic

because, without knowing where baseline data are immediately prior to implementing

intervention, it is more difficult to determine whether there were changes in the

data when intervention was introduced. This in turn makes it more difficult to infer a

causal relation as having an additional baseline data session following generalization

would have increased confidence in the presence of a functional relation. Although this is

a limitation, data were collected close enough together that it is not likely a drastic threat

to internal validity, with one exception. There was a two-week break in between

generalization and SLP training for Mr. Parker and no additional baseline probes were

obtained before SLP training. Given the addition of extra time in between the last

baseline session and the first intervention session for SLP, these results must be

interpreted with caution.

Conclusion

Despite several limitations, this study demonstrated a causal relation between

BST and teacher-implemented MT for one teacher and a possible causal relation for a

second teacher, indicating that BST can be effectively used to train teachers to implement

the following MT techniques: FTCL, TSR, and SLP. Through the processes of

instruction, modeling, rehearsal, and feedback, we demonstrated that teachers can

effectively learn MT techniques to be implemented in the school setting and that these

skills largely generalized to new sets of toys and maintained across time. This study also

showed that a teacher with no experience in training or instruction could learn the

techniques just as well as a teacher with years of experience teaching and some

experience with training. This is important to note because such a finding suggests the

intervention techniques targeted in this study are accessible to teachers of all experience

levels, which could in turn make the intervention more widely available to larger groups

of students. The current study also expanded the current literature base by demonstrating

that BST can be effectively used to train natural implementers to use NDBI components.

Thus, BST as an intervention package has been extended into yet

another area of interventions, further increasing its utility and reach.

Table 3.1

Teacher Participant Demographics

Teacher                    Ms. Smith                           Mr. Parker
Age                        47                                  20
Race/Ethnicity             White                               White
Gender                     Female                              Male
Current Grade Teaching     3-4                                 3-4
Years of Experience        15                                  0
Highest Degree Earned      Associate’s Degree                  High School Diploma
Area(s) of Certification   Paraprofessional; Certified Tutor   None

Table 3.2

Child Participant Demographics

                                           Pre-Intervention      Post-Intervention
                                           Sarah      Josh       Sarah      Josh
Age                                        8          8
Race/Ethnicity                             White      Black
Gender                                     Female     Male
SB-5 AB Full Scale IQ (Percentile)         <0.1       <0.1       N/A        N/A
SB-5 AB Verbal IQ (Percentile)             <0.1       <0.1       N/A        N/A
SB-5 AB Nonverbal IQ (Percentile)          <0.1       <0.1       N/A        N/A
CDI: # of Phrases Understood (Raw Score)   28/28      18/28      27/28      22/28
CDI: # of Words Understood (Raw Score)     307/396    380/396    370/396    392/396
CDI: # of Words Produced (Raw Score)       0/396      166/396    16/396     201/396
CDI: # of Early Gestures (Raw Score)       16/18      6/18       17/18      6/18
CDI: # of Later Gestures (Raw Score)       43/45      27/45      42/45      27/45
CDI: Total # of Gestures (Raw Score)       59/63      33/63      59/63      33/63
Note. GPP = Growth & Performance Plan; SB-5 AB = Stanford-Binet Intelligence

Scales, Fifth Edition, Abbreviated Battery; CDI = MacArthur-Bates Communicative

Development Inventory.
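
The parent-reported gains in Table 3.2 can be checked with simple arithmetic. The sketch below is illustrative only: it copies the CDI raw scores for words understood and words produced from the table and computes the post-minus-pre change. The dictionary layout and the names `CDI_SCORES`, `raw_gain`, and `summarize` are assumptions made for this example and were not part of the study's analysis.

```python
# CDI raw scores copied from Table 3.2 as (pre, post) pairs out of 396 items.
# The data layout and helper names are illustrative assumptions.
CDI_SCORES = {
    "Sarah": {"words_understood": (307, 370), "words_produced": (0, 16)},
    "Josh": {"words_understood": (380, 392), "words_produced": (166, 201)},
}


def raw_gain(pre: int, post: int) -> int:
    """Return the change in raw score from pre- to post-intervention."""
    return post - pre


def summarize(scores):
    """Map each child to the raw-score gain on every listed CDI scale."""
    return {
        child: {scale: raw_gain(pre, post) for scale, (pre, post) in scales.items()}
        for child, scales in scores.items()
    }


if __name__ == "__main__":
    for child, gains in summarize(CDI_SCORES).items():
        for scale, gain in gains.items():
            print(f"{child}: {scale} changed by {gain:+d} items")
```

These values are descriptive differences in parent-reported raw scores; they do not by themselves establish an intervention effect on child language.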

Table 3.3

IOA Agreement by Tier and Condition

                                    FTCL           TSR              SLP
T1 Baseline Average (Range)         67% (56-78)    93% (83-100)     100% (97-100)
T1 Intervention Average (Range)     85% (79-93)    92% (88-95)      89% (81-96)
T1 Generalization Average (Range)   81% (66-90)    78% (64-86)      99% (97-100)
T2 Baseline Average (Range)         85% (85-85)    100% (100-100)   100% (97-100)
T2 Intervention Average (Range)     82% (63-93)    89% (86-91)      86% (80-91)
T2 Generalization Average (Range)   81% (73-86)    100% (99-100)    96% (91-100)
Note. T1 = Ms. Smith; T2 = Mr. Parker; FTCL = Following the Child’s Lead; TSR =

Teaching Social Routines; SLP = System of Least Prompts.

Figure 3.1

Accurate Use of the Prompting Hierarchy

[Flowchart. Recoverable node labels, in order: (1) Response Requested (Time Delay); (2) Vocal Mand Prompt: "What do you want?"; (3) Vocal Mand Model Prompt: "X. You want X. Say X."; (4) Move on (praise any attempt to say the target word). Branch labels: Correct Response (leading to Praise!, then Repeat) and No/Incorrect Response (leading to the next prompt); after the model prompt the branches are Attempt Response and No/Incorrect Response. Both final branches end in Repeat.]
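
The prompting sequence in Figure 3.1 can be sketched as a small decision procedure. This is a minimal illustration, not the study's protocol: the names `Response`, `PROMPT_LEVELS`, and `run_trial` are hypothetical, and the handling of an attempt at the first two prompt levels (treated here like no/incorrect response) is an assumption, because the figure only shows the attempt branch after the model prompt.

```python
from enum import Enum


class Response(Enum):
    CORRECT = "correct"
    ATTEMPT = "attempt"
    NONE_OR_INCORRECT = "none or incorrect"


# Prompt levels in the order shown in Figure 3.1, least to most supportive.
PROMPT_LEVELS = [
    "time delay (response requested)",
    'vocal mand prompt: "What do you want?"',
    'vocal mand model prompt: "X. You want X. Say X."',
]


def run_trial(responses):
    """Return (prompt, teacher_action) pairs for one pass through the hierarchy.

    `responses` holds the child's response at each prompt level delivered.
    Assumption: an attempt earns praise only at the final (model) level and
    is treated like a no/incorrect response at earlier levels.
    """
    steps = []
    final_level = len(PROMPT_LEVELS) - 1
    for level, (prompt, response) in enumerate(zip(PROMPT_LEVELS, responses)):
        praised = response is Response.CORRECT or (
            level == final_level and response is Response.ATTEMPT
        )
        if praised:
            steps.append((prompt, "praise"))
            break
        if level == final_level:
            # Hierarchy exhausted without a correct response or attempt.
            steps.append((prompt, "move on"))
            break
        steps.append((prompt, "deliver next prompt"))
    return steps
```

For example, `run_trial([Response.NONE_OR_INCORRECT, Response.CORRECT])` escalates once from the time delay to the vocal mand prompt and then ends in praise.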

Figure 3.2

Fidelity of Implementation Across Behaviors for Ms. Smith

[Three-panel graph. Phases: Baseline, Milieu Teaching, Maintenance. Y-axes (0-100): Percentage of Intervals with FTCL; Percentage of Intervals with TSR; Percentage of SLP Used Correctly. X-axis: Session (0-40). Annotations: Bike Introduced; Booster Session; Timer introduced; breaks in data collection labeled 2-week break, 5-week break, and 2-week break.]

Note. Circles represent the percentage of intervals with FTCL and TSR implemented

correctly and the percentage of SLP prompts implemented correctly. Triangles represent

generalization data and filled in squares represent maintenance data. Slashes on the graph

represent breaks in data collection of more than one week. FTCL = following the child’s

lead; TSR = teaching social routines; SLP = system of least prompts.

Figure 3.3

Fidelity of Implementation Across Behaviors for Mr. Parker

[Three-panel graph. Phases: Baseline, Milieu Teaching, Maintenance. Y-axes (0-100): Percentage of Intervals with FTCL; Percentage of Intervals with TSR; Percentage of SLP Used Correctly. X-axis: Session (0-40). Annotations: Ball Introduced; Ball Missing; Booster Session (two labels); Timer introduced; breaks in data collection labeled 2-week break, 5-week break, and 2-week break.]

Note. Circles represent the percentage of intervals with FTCL and TSR implemented

correctly and the percentage of SLP prompts implemented correctly. Triangles represent

generalization data and filled in squares represent maintenance data. Slashes on the graph

represent breaks in data collection of more than one week. FTCL = following the child’s

lead; TSR = teaching social routines; SLP = system of least prompts.

CHAPTER 4

GENERAL DISCUSSION

The prevalence of children demonstrating language delays is quite high, with

community prevalence estimates of 7% to 17% (King et al., 2005). However, prevalence has

been shown to be even higher among children with disabilities, with approximately 70%

of three- to five-year-olds with disabilities also having a language impairment

(Wetherby & Prizant, 1992). Furthermore, research indicates that speech and language

impairments often co-occur with neurodevelopmental disabilities such as autism

spectrum disorder (ASD; Rosenbaum & Simon, 2016). Language impairments have also

been shown to be associated with issues in cognitive, academic, and language

development as children grow (Johnson et al., 1999). Children with ASD are one group

of children who often exhibit language delays. Many children with ASD benefitted from

the use of discrete trial teaching (DTT) when it was first introduced by Lovaas (1987),

but Schreibman et al. (2015) identified several weaknesses of DTT that may make it

more difficult for some children, such as those with ASD, to benefit from DTT. Thus,

researchers began to look for other means to teach children with ASD language skills,

and to help close the language gap between them and their peers. This search led

researchers to a different, more naturalistic intervention type known as naturalistic

developmental behavioral interventions (NDBIs; Schreibman et al., 2015). These

interventions encouraged a more naturalistic approach to language intervention and also

encouraged the use of natural implementers in the natural environment, including

teachers (Peterson, 2004). Peterson (2004) indicated that teachers spend a great deal of

time with children throughout the school day and the school year. It stands to reason

that if they could be taught to implement various naturalistic language interventions, the

dosage of such interventions for children experiencing language delays could increase

tremendously, giving these children more opportunity to benefit from intervention. As

such, the overall goal of the two studies presented was to provide a systematic literature

review of the use of BST to train natural implementers including teachers and/or other

professional staff and to provide an empirical investigation into the use of BST to train

teachers to implement NDBIs.

The purpose of the systematic literature review was to analyze and synthesize the

current literature on the use of BST by teachers and other professional staff to implement

a variety of interventions. It sought to guide research and clinical decision making

regarding whether BST was the appropriate training package regarding type of

intervention, outcome variables, population, context, and quality of published studies. A

total of 19 studies from 17 articles were reviewed; however, only seven studies showed

sufficient quality and rigor for their results to be interpreted with confidence. Similarly,

fewer than half of the studies measured generalization, maintenance, or BST fidelity, or

described participant characteristics sufficiently, demonstrating several weaknesses in the

reviewed literature. However, several strengths were also demonstrated in the reporting

of social/ecological validity, the description of dependent variables and conditions, and

strong reliability data. Similarly, all 19 studies reported using all components of BST,

including instructions, modeling, rehearsal, and feedback, indicating that although BST

fidelity was not consistently reported, all BST components were included in each of the reviewed

studies. In addition, very few studies examined BST and NDBIs. Overall, the systematic

review revealed a need for more BST research in which BST fidelity is measured,

participant characteristics are described, and the use of BST with NDBIs is examined.

The purpose of the empirical investigation was to help fill some of these gaps in

the BST literature by examining the use of BST to train natural implementers (teachers)

to implement the components of milieu teaching (MT). This was accomplished through

the use of a multiple baseline design across behaviors. In this study, two teachers working

in a language classroom in a school for children with disabilities were taught to

implement several MT techniques, including following the child’s lead (FTCL), teaching

social routines (TSR), and the system of least prompts (SLP). Each of these behaviors

was taught using the four components of BST. For the first teacher, a functional relation

was shown between BST and the improvement in performance in the fidelity of

implementation of MT techniques. These results were replicated across the second

teacher for two of the three behaviors (FTCL and TSR), but not for SLP. Thus, these

results showed that BST can be used to effectively train teachers to implement

components of MT with fidelity. In addition, these effects generalized to a new set of

toys and maintained across time at two- and four-week follow-ups.

Taken together, these studies show that BST can be used to train teachers and

other professionals to implement a variety of interventions, including NDBIs. However,

more research is needed in the area of training teachers to implement NDBIs using BST.

It is promising indeed that teachers can be taught to implement NDBIs in the classroom;

however, even in the current study, a pull-out system was used where the teacher and

child were separated from the general classroom, and teachers were taught to implement

the intervention in 1:1 settings. In order for the intervention to truly have increased reach

and dosage, more research is needed to determine how these interventions can be

incorporated into daily classroom routines. As a field, we must make them feasible

enough for teachers to use them without detracting from their regular classroom duties

and teaching. Only then will we truly have the chance to see the impacts of teachers as

natural implementers on a day-to-day basis.

References

Adams, R. A., Piercy, F. P., Jurich, J. A., & Lewis, R. A. (1992). Components of a model

adolescent AIDS/drug abuse prevention program: A Delphi study. Family Relations,

41, 312–317.

Aherne, C. M., & Beaulieu, L. (2019). Assessing long-term maintenance of staff

performance following behavior skills training in a home-based setting. Behavioral

Interventions, 34(1), 79–88. https://doi.org/10.1002/bin.1642

Aktas, B., & Ciftcitekinarslan, I. (2018). The effectiveness of parent training a mothers of

children with Autism use of mand model techniques. International Journal of Early

Childhood Special Education, 10(2), 106–120.

https://doi.org/10.20489/INTJECSE.512378

Alden, L., Safran, J., & Weideman, R. (1978). Comparison of cognitive and skills

training strategies in the treatment of unassertive clients. Behavior Therapy, 9(5),

843–846. https://doi.org/10.1016/S0005-7894(78)80015-X

Bolzani Dinehart, L. H., Yale Kaiser, M., & Hughes, C. R. (2009). Language delay and

the effect of milieu teaching on children born cocaine exposed: A pilot study.

Journal of Developmental and Physical Disabilities, 21(1), 9–22.

https://doi.org/10.1007/s10882-008-9122-8

Boyer, C. B., & Kegeles, S. M. (1991). AIDS risk and prevention among adolescents.

Social Science & Medicine, 33(1), 11–23.

https://doi.org/10.1016/0277-9536(91)90446-J

Bromberg, D. S., & Johnson, B. T. (1997). Behavioral versus traditional approaches to

prevention of child abduction. School Psychology Review, 26(4), 1–13.

Chazin, K. T., Barton, E. E., Ledford, J. R., & Pokorski, E. A. (2018). Implementation

and Intervention Practices to Facilitate Communication Skills for a Child With

Complex Communication Needs. Journal of Early Intervention, 40(2), 138–157.

https://doi.org/10.1177/1053815118771397

Christensen-Sandfort, R. J., & Whinnery, S. B. (2013). Impact of milieu teaching on

communication skills of young children with autism spectrum disorder. Topics in

Early Childhood Special Education, 34(4), 211–222.

https://doi.org/10.1177/0271121411404930

Davenport, C. A., Alber-Morgan, S. R., & Konrad, M. (2019). Effects of behavioral skills

training on teacher implementation of a reading racetrack intervention. Education

and Treatment of Children, 42(3), 385–407. https://doi.org/10.1353/etc.2019.0018

DiGennaro Reed, F. D., Blackman, A. L., Erath, T. G., Brand, D., & Novak, M. D.

(2018). Guidelines for Using Behavioral Skills Training to Provide Teacher Support.

TEACHING Exceptional Children, 50(6), 373–380.

https://doi.org/10.1177/0040059918777241

Dogan, R. K., King, M. L., Fischetti, A. T., Lake, C. M., Mathews, T. L., & Warzak, W.

J. (2017). Parent-implemented behavioral skills training of social skills. Journal of

Applied Behavior Analysis, 50(4), 805–818. https://doi.org/10.1002/jaba.411

Dubin, A., & Lieberman-Betz, R. (2020). Naturalistic interventions to improve

prelinguistic communication for children with autism spectrum disorder: A

systematic review. Review Journal of Autism and Developmental Disorders, 7, 151–167.

https://doi.org/10.1007/s40489-019-00184-9

Dunn, L. M., & Dunn, L. M. (1997). Peabody Picture Vocabulary Test--Third Edition

examiner’s manual (3rd ed.). Circle Pines: American Guidance Service.

Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, J. S., & Bates, E. (2007).

MacArthur-Bates Communicative Development Inventories: User’s guide and

Technical Manual (2nd ed.). Baltimore, MD: Paul H. Brookes Publishing Co., Inc.

Fetherston, A. M., & Sturmey, P. (2014). The effects of behavioral skills training on

instructor and learner behavior across responses and skill sets. Research in

Developmental Disabilities, 35(2), 541–562.

https://doi.org/10.1016/j.ridd.2013.11.006

Fey, M. E. (2008). Milieu communication teaching intervention manual. Department of

Hearing and Speech.

Fey, M. E., Warren, S. F., Bredin-Oja, S. L., & Yoder, P. J. (2017). Responsivity

education/prelinguistic milieu teaching. In R. J. McCauley, M. E. Fey, & R. B.

Gillam (Eds.), Treatment of Language Disorders in Children (2nd ed., pp. 57–85).

Baltimore, MD: Paul H. Brookes Publishing Co., Inc.

Gianoumis, S., Seiverling, L., & Sturmey, P. (2012). The effects of behavior skills

training on correct teacher implementation of natural language paradigm teaching

skills and child behavior. Behavioral Interventions, 27, 57–74.

https://doi.org/10.1002/bin.1334

Giles, A., Swain, S., Quinn, L., & Weifenbach, B. (2018). Teacher-Implemented

Response Interruption and Redirection: Training, Evaluation, and Descriptive

Analysis of Treatment Integrity. Behavior Modification, 42(1), 148–169.

https://doi.org/10.1177/0145445517731061

Glasgow, R. E., & Lichtenstein, E. (1987). Long-term effects of behavioral smoking

cessation interventions. Behavior Therapy, 18(4), 297–324.

https://doi.org/10.1016/S0005-7894(87)80002-3

Hassan, M., Simpson, A., Danaher, K., Haesen, J., Makela, T., & Thomson, K. (2018).

An evaluation of behavioral skills training for teaching caregivers how to support

social skill development in their child with autism spectrum disorder. Journal of

Autism and Developmental Disorders, 48(6), 1957–1970.

https://doi.org/10.1007/s10803-017-3455-z

Hassan, M., Thomson, K. M., Khan, M., Burnham Riosa, P., & Weiss, J. A. (2017).

Behavioral skills training for graduate students providing cognitive behavior therapy

to children with autism spectrum disorder. Behavior Analysis: Research and

Practice, 17(2), 155–165. https://doi.org/10.1037/bar0000078

Hemmeter, M. L., & Kaiser, A. P. (1994). Enhanced milieu teaching: Effects of parent-

implemented language intervention. Journal of Early Intervention, 18(3), 269–289.

https://doi.org/10.1177/105381519401800303

Hendrickson, J. M., Gardner, N., Kaiser, A., & Riley, A. (1993). Evaluation of a social

interaction coaching program in an integrated day-care setting. Journal of Applied

Behavior Analysis, 26(2), 1297740. https://doi.org/10.1901/jaba.1993.26-213

Himle, M. B., & Miltenberger, R. G. (2004). Preventing unintentional firearm injury in

children: The need for behavioral skills training. Education and Treatment of

Children, 27(2), 161–177. Retrieved from https://www.jstor.org/stable/42899794

Hogan, A., Knez, N., & Kahng, S. W. (2015). Evaluating the Use of Behavioral Skills

Training to Improve School Staffs’ Implementation of Behavior Intervention Plans.

Journal of Behavioral Education, 24(2), 242–254.

https://doi.org/10.1007/s10864-014-9213-9

Homlitas, C., Rosales, R., & Candel, L. (2014). A further evaluation of behavioral skills

training for implementation of the picture exchange communication system. Journal

of Applied Behavior Analysis, 47(1), 198–203. https://doi.org/10.1002/jaba.99

Ingersoll, B., Meyer, K., Bonter, N., & Jelinek, S. (2012). A comparison of

developmental social-pragmatic and naturalistic behavioral interventions on

language use and social engagement in children with autism. Journal of Speech,

Language, and Hearing Research, 55, 1301–1313.

Iwata, B. A., Wallace, M. D., Kahng, S. W., Lindberg, J. S., Roscoe, E. M., Conners, J.,

… Worsdell, A. S. (2000a). Skill acquisition in the implementation of functional

analysis methodology. Journal of Applied Behavior Analysis, 33(2), 181–194.

Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/14702451

Iwata, B. A., Wallace, M. D., Kahng, S. W., Lindberg, J. S., Roscoe, E. M., Conners, J.,

… Worsdell, A. S. (2000b). Skill acquisition in the implementation of functional

analysis methodology. Journal of Applied Behavior Analysis, 33(2), 181–194.

https://doi.org/10.1901/jaba.2000.33-181

Janzen, H. L., Obrzut, J. E., & Marusiak, C. W. (2004). Test Review: Roid, G. H. (2003).

Stanford-Binet Intelligence Scales, Fifth Edition (SB:V). Itasca, IL: Riverside

Publishing. Canadian Journal of School Psychology, 19(1), 235–244.

https://doi.org/10.1177/082957350401900113

Jimenez-Gomez, C., McGarry, K., Crochet, E., & Chong, I. M. (2019). Training

behavioral technicians to implement naturalistic behavioral interventions using

behavioral skills training. Behavioral Interventions, 34(3), 396–404.

https://doi.org/10.1002/bin.1666

Johnson, B. M., Miltenberger, R. G., Egemo-Helm, K., Jostad, C. M., Flessner, C., &

Gatheridge, B. (2005). Evaluation of behavioral skills training for teaching

abduction-prevention skills to young children. Journal of Applied Behavior

Analysis, 38(1), 67–78. https://doi.org/10.1901/jaba.2005.26-04

Johnson, B. M., Miltenberger, R. G., Knudson, P., Egemo-Helm, K., Kelso, P., Jostad,

C., & Langley, L. (2006). A preliminary evaluation of two behavioral skills training

procedures for teaching abduction-prevention skills to school children. Journal of

Applied Behavior Analysis, 39(1), 25–34. https://doi.org/10.1901/jaba.2006.167-04

Johnson, C. J., Beitchman, J. H., Young, A., Escobar, M., Atkinson, L., Wilson, B., …

Wang, M. (1999). Fourteen-year follow-up of children with and without

speech/language impairments: Speech/language stability and outcomes. Journal of

Speech, Language, and Hearing Research, 42(3), 744–760.

https://doi.org/10.1044/jslhr.4203.744

Jones, E. A., Carr, E. G., & Feeley, K. M. (2006). Multiple effects of joint attention

intervention for children with autism. Behavior Modification, 30(6), 782–834.

https://doi.org/10.1177/0145445506289392

Jull, S., & Mirenda, P. (2016). Effects of a staff training program on community

instructors’ ability to teach swimming skills to children with autism. Journal of

Positive Behavior Interventions, 18(1), 29–40.

https://doi.org/10.1177/1098300715576797

Kaale, A., Smith, L., & Sponheim, E. (2012). A randomized controlled trial of preschool-

based joint attention intervention for children with autism. Journal of Child

Psychology and Psychiatry and Allied Disciplines, 53(1), 97–105.

https://doi.org/10.1111/j.1469-7610.2011.02450

Kaiser, A. P., Hester, P. P., Alpert, C. L., & Whiteman, B. C. (1995). Preparing parent

trainers: An experimental analysis of effects on trainers, parents, and children.

Topics in Early Childhood Special Education, 15(4), 385–414.

https://doi.org/10.1177/027112149501500401

Kaiser, A. P., Ostrosky, M. M., & Alpert, C. L. (1993). Training Teachers to Use

Environmental Arrangement and Milieu Teaching with Nonvocal Preschool

Children. Research and Practice for Persons with Severe Disabilities, 18(3), 188–

199. https://doi.org/10.1177/154079699301800305

King, T. M., Rosenberg, L. A., Fuddy, L., McFarlane, E., Sia, C., & Duggan, A. K.

(2005). Prevalence and early identification of language delays among at-risk three

year olds. Journal of Developmental and Behavioral Pediatrics, 26(4), 293–303.

https://doi.org/10.1097/00004703-200508000-00006

Koegel, R. L., Russo, D. C., & Rincover, A. (1977). Assessing and training teachers in

the generalized use of behavior modification with autistic children. Journal of

Applied Behavior Analysis, 10(2), 197–205.

https://doi.org/10.1901/jaba.1977.10-197

Kolko, D., Watson, S., & Faust, J. (1991). Fire safety prevention skills training to reduce

involvement with fire in young psychiatric inpatients: Preliminary findings.

Behavior Therapy, 22, 269–284. https://doi.org/10.1016/S0005-7894(05)80182-0

Kornacki, L. T., Ringdahl, J. E., Sjostrom, A., & Nuernberger, J. E. (2013). A component

analysis of a behavioral skills training package used to teach conversation skills to

young adults with autism spectrum and other developmental disorders. Research in

Autism Spectrum Disorders, 7(11), 1370–1376.

https://doi.org/10.1016/j.rasd.2013.07.012

Krumhus, K. M., & Malott, R. W. (1980). The effects of modeling and immediate and

delayed feedback in staff training. Journal of Organizational Behavior

Management, 2(4), 279–293. https://doi.org/10.1300/J075v02n04_05

Lane, J. D., & Ledford, J. R. (2014). Using interval-based systems to measure behavior in

early childhood special education and early intervention. Topics in Early Childhood

Special Education, 34(2), 83–93. https://doi.org/10.1177/0271121414524063

Law, J., & Roy, P. (2008). Parental report of infant language skills: A review of the

development and application of the communicative development inventories. Child

and Adolescent Mental Health, 13(4), 198–206. https://doi.org/10.1111/j.1475-

3588.2008.00503.x

Lawrence, J. S., Brasfield, T. L., Jefferson, K. W., Alleyne, E., O’Bannon, R. E., &

Shirley, A. (1995). Cognitive-behavioral intervention to reduce African American

adolescents’ risk for HIV infection. Journal of Consulting and Clinical Psychology,

63(2), 221–237. Retrieved from

http://onlinelibrary.wiley.com/o/cochrane/clcentral/articles/124/CN-

00114124/frame.html

Ledford, J. R., Lane, J. D., Zimmerman, K. N., Chazin, K. T., & Ayres, K. A. (2016).

Single case analysis and review framework (SCARF). Retrieved from

http://vkc.mc.vanderbilt.edu/ebip/scarf/

Lovaas, O. I. (1987). Behavioral treatment and normal educational and intellectual

functioning in young autistic children. Journal of Consulting and Clinical

Psychology, 55(1), 3–9. https://doi.org/10.1037/0022-006x.55.1.3

Madzharova, M. S., & Sturmey, P. (2018). Using in-vivo modeling and feedback to teach

classroom staff to implement a complex behavior intervention plan. Journal of

Developmental and Physical Disabilities, 30, 329–337.

https://doi.org/10.1007/s10882-018-9588-y

McCathren, R. B. (2000). Teacher-implemented prelinguistic communication

intervention. Focus on Autism and Other Developmental Disabilities, 15(1), 21–29.

https://doi.org/10.1177/108835760001500103

Miles, N. I., & Wilder, D. A. (2009). The effects of behavioral skills training on caregiver

implementation of guided compliance. Journal of Applied Behavior Analysis, 42(2),

405–410. https://doi.org/10.1901/jaba.2009.42-405

Miltenberger, R. G., & Thiesse-Duffy, E. (1988). Evaluation of home‐based programs for

teaching personal safety skills to children. Journal of Applied Behavior Analysis,

21(1), 81–87. https://doi.org/10.1901/jaba.1988.21-81

Miltenberger, R., Gross, A., Knudson, P., Bosch, A., Jostad, C., & Breitwieser, C. B.

(2009). Evaluating behavioral skills training with and without simulated in situ

training for teaching safety skills to children. Education and Treatment of Children,

32(1), 63–75. https://doi.org/10.1353/etc.0.0049

Miltenberger, R. G., Flessner, C., Gatheridge, B., Johnson, B., Satterlund, M., & Egemo,

K. (2004). Evaluation of behavioral skills training to prevent gun play in children.

Journal of Applied Behavior Analysis, 37(4), 513–516.

https://doi.org/10.1901/jaba.2004.37-513

Miltenberger, R. G., Gatheridge, B. J., Satterlund, M., Egemo-Helm, K. R., Johnson, B.

M., Jostad, C., … Flessner, C. A. (2005). Teaching safety skills to children to

prevent gun play: An evaluation of in situ training. Journal of Applied Behavior

Analysis, 38(3), 395–398. https://doi.org/10.1901/jaba.2005.130-04

Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items

for systematic reviews and meta-analyses: The PRISMA statement (reprinted from

Annals of Internal Medicine). Physical Therapy, 89(9), 873–880.

https://doi.org/10.1371/journal.pmed.1000097

Nabeyama, B., & Sturmey, P. (2010). Using behavioral skills training to promote safe

and correct staff guarding and ambulation distance of students with multiple

physical disabilities. Journal of Applied Behavior Analysis, 43(2), 341–345.

https://doi.org/10.1901/jaba.2010.43-341

Nigro-Bruzzi, D., & Sturmey, P. (2010). The effects of behavioral skills training on mand

training by staff and unprompted vocal mands by children. Journal of Applied

Behavior Analysis, 43(4), 757–761. https://doi.org/10.1901/jaba.2010.43-757

Nuernberger, J. E., Ringdahl, J. E., Vargo, K. K., Crumpecker, A. C., & Gunnarsson, K.

F. (2013). Using a behavioral skills training package to teach conversation skills to

young adults with autism spectrum disorders. Research in Autism Spectrum

Disorders, 7(2), 411–417. https://doi.org/10.1016/j.rasd.2012.09.004

Ogletree, B. T., Davis, P., Hambrecht, G., & Phillips, E. W. (2012). Using milieu training

to promote photograph exchange for a young child with autism. Focus on Autism

and Other Developmental Disabilities, 27(2), 93–101.

https://doi.org/10.1177/1088357612441968

Olive, M. L., De La Cruz, B., Davis, T. N., Chan, J. M., Lang, R. B., O’Reilly, M. F., &

Dickson, S. M. (2007). The effects of enhanced milieu teaching and a voice output

communication aid on the requesting of three children with autism. Journal of

Autism and Developmental Disorders, 37(8), 1505–1513.

https://doi.org/10.1007/s10803-006-0243-6

Palmen, A., & Didden, R. (2012). Task engagement in young adults with high-

functioning autism spectrum disorders: Generalization effects of behavioral skills

training. Research in Autism Spectrum Disorders, 6, 1377–1388.

https://doi.org/10.1016/j.rasd.2012.05.010

Palmen, A., Didden, R., & Korzilius, H. (2010). Effectiveness of behavioral skills

training on staff performance in a job setting for high-functioning adolescents with

autism spectrum disorders. Research in Autism Spectrum Disorders, 4, 731–740.

https://doi.org/10.1016/j.rasd.2010.01.012

Pan-Skadden, J., Wilder, D. A., Sparling, J., Severtson, E., Donaldson, J., Postma, N., …

Neidert, P. (2009). The use of behavioral skills training and in-situ training to teach

children to solicit help when lost: A preliminary investigation. Education &

Treatment of Children, 32(3), 359–370. https://doi.org/10.1353/etc.0.0063

Peterson, P. (2004). Naturalistic language teaching procedures for children at risk for

language delays. The Behavior Analyst Today, 5(4), 404–424.

https://doi.org/10.1037/h0100047

Reynell, J. K., & Gruber, C. P. (1990). Reynell Developmental Language Scales. Los

Angeles: Western Psychological Services.

Roid, G. H. (2003). Stanford-Binet Intelligence Scales (5th ed.). Itasca, IL: Riverside

Publishing.

Rosales, R., Stone, K., & Rehfeldt, R. A. (2009). The effects of behavioral skills training

on implementation of the picture exchange communication system. Journal of

Applied Behavior Analysis, 42(3), 541–549. https://doi.org/10.1901/jaba.2009.42-

541

Rosenbaum, S., & Simon, P. (2016). Speech and language disorders in children:

Implications for the Social Security Administration’s Supplemental Security Income

Program. National Academies Press.

https://doi.org/10.17226/21872

Sarokoff, R. A., & Sturmey, P. (2008). The effects of instructions, rehearsal, modeling,

and feedback on acquisition and generalization of staff use of discrete trial teaching

and student correct responses. Research in Autism Spectrum Disorders, 2(1), 125–

136. https://doi.org/10.1016/j.rasd.2007.04.002

Sarokoff, R. A., & Sturmey, P. (2004). The effects of behavioral skills training on staff

implementation of discrete-trial teaching. Journal of Applied Behavior Analysis,

37(4), 535–538. https://doi.org/10.1901/jaba.2004.37-535

Schreibman, L., Dawson, G., Stahmer, A. C., Landa, R., Rogers, S. J., McGee, G. G., …

Halladay, A. (2015). Naturalistic developmental behavioral interventions:

Empirically validated treatments for autism spectrum disorder. Journal of Autism &

Developmental Disorders, 45, 2411–2428. https://doi.org/10.1007/s10803-015-

2407-8

Seiverling, L., Pantelides, M., Ruiz, H. H., & Sturmey, P. (2010). The effect of

behavioral skills training with general case training on staff chaining of child

vocalizations within natural language paradigm. Behavioral Interventions, 25, 53–

75. https://doi.org/10.1002/bin.293

Seiverling, L., Williams, K., Sturmey, P., & Hart, S. (2012). Effects of behavioral

skills training on parental treatment of children’s food selectivity. Journal of Applied

Behavior Analysis, 45(1), 197–203. https://doi.org/10.1901/jaba.2012.45-197

Smith, A. E., & Camarata, S. (1999). Using teacher-implemented instruction to increase

language intelligibility of children with autism. Journal of Positive Behavior

Interventions, 1(3), 141–151. https://doi.org/10.1177/109830079900100302

Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). Stanford-Binet Intelligence Scale:

Fourth Edition (4th ed.). Chicago: Riverside Publishing.

Toelken, S., & Miltenberger, R. G. (2012). Increasing independence among children

diagnosed with autism using a brief embedded teaching strategy. Behavioral

Interventions, 27, 93–104. https://doi.org/10.1002/bin.337

Togram, B., & Erbas, D. (2010). The effectiveness of instruction on mand model - One of

the milieu teaching techniques. Egitim Arastirmalari - Eurasian Journal of

Educational Research, (38), 198–215.

Ward-Horner, J., & Sturmey, P. (2008). The effects of general-case training and

behavioral skills training on the generalization of parents’ use of discrete-trial

teaching, child correct responses, and child maladaptive behavior. Behavioral

Interventions, 23, 271–284. https://doi.org/10.1002/bin.268

Ward-Horner, J., & Sturmey, P. (2012). Component analysis of behavior skills training in

functional analysis. Behavioral Interventions, 27, 75–92.

https://doi.org/10.1002/bin.1339

Warren, S. F., & Gazdag, G. (1990). Facilitating early language development with milieu

intervention procedures. Journal of Early Intervention, 14(1), 62–86.

https://doi.org/10.1177/105381519001400106

Wechsler, D. (2005). Wechsler Individual Achievement Test 2nd Edition (WIAT II).

London: The Psychological Corp.

Wetherby, A. M., & Prizant, B. M. (1992). Profiling young children’s communicative

competence. In Causes and effects in communication and language intervention

(pp. 217–253). Baltimore, MD: Paul H. Brookes Publishing.

Wong, C. S. (2013). A play and joint attention intervention for teachers of young children

with autism: A randomized controlled pilot study. Autism, 17(3), 340–357.

https://doi.org/10.1177/1362361312474723

Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock-Johnson III Tests of

Achievement (3rd ed.). Itasca, IL: Riverside Publishing.

Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock-Johnson III Tests of

Cognitive Abilities (3rd ed.). Itasca, IL: Riverside Publishing.

Wright, C. A., & Kaiser, A. P. (2017). Teaching parents enhanced milieu teaching with

words and signs using the teach-model-coach-review model. Topics in Early

Childhood Special Education, 36(4), 192–204.

https://doi.org/10.1177/0271121415621027

Wurtele, S. K., & Owens, J. S. (1997). Teaching personal safety skills to young children: An

investigation of age and gender across five studies. Child Abuse and Neglect, 21(8),

805–814. https://doi.org/10.1016/S0145-2134(97)00040-9

Wurtele, S. K. (1990). Teaching personal safety skills to four-year-old children: A

behavioral approach. Behavior Therapy, 21(1), 25–32.

https://doi.org/10.1016/S0005-7894(05)80186-8

Wurtele, S. K., Saslawsky, D. A., Miller, C. L., Marrs, S. R., & Britcher, J. C. (1986).

Teaching personal safety skills for potential prevention of sexual abuse: A

comparison of treatments. Journal of Consulting and Clinical Psychology, 54(5),

688–692. https://doi.org/10.1037/0022-006X.54.5.688

Zimmerman, I. L., Steiner, V. G., & Pond, R. (1979). The Preschool Language Scale--

Revised. Columbus: Charles Merrill.

Appendix A: Outcome Coding Descriptions for Single Case Analysis and Review
Framework (SCARF)

Primary Outcomes
1. Which best characterizes the study's effects? This framework is designed for
analysis of SINGLE STUDIES. Articles may include multiple studies; these
should be evaluated separately. A study is a stand-alone single case design with
a single dependent variable. Studies may include a single or multiple
participants. For alternating treatments design (ATD) studies, assess each
condition in comparison to a single other condition, if these comparisons match
your research questions. Note:
Strong effects occur when consistent changes occur between conditions,
overlap is minimal and/or decreasing over time, and there is a clear change in
the expected direction in level, trend, and/or variability. Weak effects occur
when one or more of those characteristics is not present. Non-effects occur
when data do not reliably change when condition change occurs, or when data
patterns preclude decision-making. Contratherapeutic effects occur when data
change in an unexpected direction.
a. ATD Designs: Enter 0: data paths undifferentiated, approximately half or
more of data paths are overlapping (approximately the same values or with
some higher values in one condition and some higher values in another
condition). Enter 1: approximately half or more data are overlapping as
described above, but overlap decreases over time. Enter 2: less than half of
data points are overlapping but there is a decreasing or variable
differentiation between conditions (e.g., difference in values between
conditions decreases over time or is not consistent). Enter 3: less than half
of data points are overlapping and there is increasing differentiation over
time (e.g., difference in values between conditions increases over time).
Enter 4: minimal/no overlap occurs, consistent differentiation between
conditions.
b. Multiple Baseline/Multiple Probe (MB/MP) Design: Enter 0: >1 non-effect or
any contratherapeutic effect or if vertical analysis suggests changes in data
in one tier are associated with condition change in another tier. Enter 1: <3
demonstrations of effect, 1 non-effect. Enter 2: >=3 demonstrations, >=1
non-effect. Enter 3: >=3 demonstrations, >=1 weak effect, 0 non-effects.
Enter 4: >=3 demonstrations, 0 non-effects/weak effects.
c. Other Designs: Enter 0: >1 non-effect or any contratherapeutic effect.
Enter 1: <3 demonstrations of effect, 1 non-effect. Enter 2: >=3
demonstrations, >=1 non-effect. Enter 3: >=3 demonstrations, >=1 weak
effect, 0 non-effects. Enter 4: >=3 demonstrations, 0 non-effects/weak
effects.
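
The MB/MP and Other Designs rules above can be read as a small decision procedure. The following is a minimal Python sketch and is not part of SCARF itself. The function name `score_primary_outcome` and its parameters are hypothetical, and the sketch assumes the coder has already counted demonstrations of effect, weak effects, non-effects, and contratherapeutic effects through visual analysis. Because a score of 0 already covers more than one non-effect, scores 1 and 2 are read here as applying to exactly one non-effect. Combinations the rubric does not list (for example, fewer than three demonstrations with no non-effects) return None rather than a guessed score.

```python
from typing import Optional


def score_primary_outcome(
    demonstrations: int,
    weak_effects: int,
    non_effects: int,
    contratherapeutic: int = 0,
    cross_tier_change: bool = False,
) -> Optional[int]:
    """Return a 0-4 primary outcome score, or None if the rubric is silent.

    All counts are assumed to come from the coder's visual analysis.
    cross_tier_change applies to MB/MP designs only: True when vertical
    analysis suggests data in one tier changed with a condition change
    in another tier.
    """
    # Score 0: more than one non-effect, any contratherapeutic effect,
    # or a cross-tier change found by vertical analysis.
    if non_effects > 1 or contratherapeutic > 0 or cross_tier_change:
        return 0
    # Exactly one non-effect: score depends on the number of demonstrations.
    if non_effects == 1:
        return 2 if demonstrations >= 3 else 1
    # No non-effects and at least three demonstrations.
    if demonstrations >= 3:
        return 3 if weak_effects >= 1 else 4
    # Fewer than three demonstrations with no non-effects is not listed.
    return None
```

Returning None for unlisted combinations keeps the sketch from assigning scores the rubric does not define.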

Generalized Outcomes
1. Which best characterizes the generalization outcomes in the study?

a. Enter 0: no measurement of generalization outcomes. Enter 1: consistent
non-effects or contratherapeutic effects. Enter 2: inconsistent or weak
positive effects. Enter 3: consistent positive effects shown via post-test.
Enter 4: consistent positive effects shown via measurement in context of
design.

Maintained Outcomes
1. Which of the following best characterizes maintenance outcomes for the study?
a. Enter 0: maintenance was not assessed. Enter 1: maintenance data were
similar to pre-intervention/baseline data. Enter 2: maintenance data showed
outcomes that were deteriorating or less optimal than intervention or
criterion. Enter 3: maintenance data showed maintained outcomes similar to
intervention or criterion levels. Enter 4: maintenance data showed
maintained outcomes similar to intervention or criterion levels and on
multiple occasions (e.g., more than one data point).

Appendix B: Observation Data Collection Sheet

Appendix C: Intervention Fidelity Sheets
Date: ______ Workshop #: _______ Session #: _________

Teacher ID: __________ Observer: ___________

Workshop 1 Session 1: Following the Child’s Lead

Following the Child’s Lead


Overview /7
1. What does it mean to follow the child’s lead?
2. Choosing toys/materials
3. Imitating children’s actions
4. Parallel play
5. Commenting on the child’s play
6. Things to Avoid
a. Suggestions
b. Questions
c. Commands
7. Responding to child interactions
Video examples (at least 2) /2
How to structure the environment to create opportunities for following the child’s lead /1
Review /5
1. How to follow the child’s lead
2. Toy/Material Selection
3. Commenting
4. Parallel Play
5. Avoiding directive comments
How to structure the environment to create opportunities for following the child’s lead /1
Researcher Model Session
Researcher models allowing the child to select the toys of interest /1
Researcher models how to engage in parallel play /1
Researcher models how to comment on child’s play /1
Researcher models how to respond to child’s interactions /1
Teacher Practice Session
Teacher practices allowing the child to select the toys with the researcher /1
Teacher practices parallel play with the researcher /1
Teacher practices commenting on play with the researcher /1
Teacher practices responding to interactions with the researcher /1
Ending Workshop
At the end of the session the researcher asks the teacher how he/she felt the session went /1
Researcher summarizes how the teacher utilized following the child’s lead /1
Researcher asks the teacher whether he/she felt the session length was appropriate for learning /1
Total /27
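
A sheet like this one is commonly summarized as the percentage of points earned out of points possible. The sketch below is illustrative only: the function name `fidelity_percent`, the section labels used as dictionary keys, and the example scores are assumptions and are not data from the study. The maximum points per section are taken from the Workshop 1 Session 1 sheet.

```python
# Maximum points per section, copied from the Workshop 1 Session 1 sheet.
# The key names are hypothetical labels chosen for this sketch.
MAX_POINTS = {
    "overview": 7,
    "video_examples": 2,
    "structuring_environment": 1,
    "review": 5,
    "review_structuring_environment": 1,
    "researcher_model": 4,
    "teacher_practice": 4,
    "ending_workshop": 3,
}


def fidelity_percent(earned, max_points=MAX_POINTS):
    """Return points earned as a percentage of points possible."""
    for section, points in earned.items():
        if section not in max_points:
            raise KeyError(f"unknown section: {section}")
        if not 0 <= points <= max_points[section]:
            raise ValueError(
                f"{section}: {points} is outside 0..{max_points[section]}"
            )
    total_possible = sum(max_points.values())
    # Sections missing from `earned` are counted as zero points.
    total_earned = sum(earned.get(section, 0) for section in max_points)
    return 100.0 * total_earned / total_possible


# Hypothetical observation: one overview topic and two review topics omitted.
example = dict(MAX_POINTS, overview=6, review=3)
print(round(fidelity_percent(example), 1))
```

Validating each section against its maximum catches data-entry errors before they inflate the percentage.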

Date: _______ Workshop #: ______ Session #: ________

Teacher ID: ______________ Observer: __________

Workshop 1 Session 2: Following the Child’s Lead

Researcher Model Session- Live in Classroom


Researcher models choosing toys/materials with child /1
Researcher models imitating child’s actions /1
Researcher models parallel play with child /1
Researcher models commenting on child’s play /1
Teacher Practice Session- Live in Classroom
Teacher practices choosing toys/materials with child while researcher provides feedback (at least 3 times) /3
Teacher practices imitating child’s actions while researcher provides feedback (at least 3 times) /3
Teacher practices parallel play with child while researcher provides feedback (at least 3 times) /3
Teacher practices commenting on child’s play while researcher provides feedback (at least 3 times) /3
Proficiency Assessment- Live in Classroom
Teacher demonstrates 80% proficiency with choosing toys/materials (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with imitating child’s actions (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with parallel play (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with commenting on child’s play (4/5 consecutive trials) /1
Total /20
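
Each proficiency item above uses a criterion of 80% (4/5 consecutive trials). One way to check that criterion from a trial-by-trial record is sketched below. The reading used here, that the criterion is met once any five consecutive trials contain at least four correct, is an assumption about how the criterion was applied. The function name `meets_proficiency` and the example records are hypothetical.

```python
def meets_proficiency(trials, window=5, required=4):
    """Return True if any `window` consecutive trials hold >= `required` correct.

    `trials` is a list of booleans in the order the trials occurred,
    with True meaning the step was implemented correctly.
    """
    if len(trials) < window:
        # Not enough trials observed to evaluate the criterion.
        return False
    for start in range(len(trials) - window + 1):
        # Count correct trials inside the current block of consecutive trials.
        if sum(trials[start:start + window]) >= required:
            return True
    return False


# Hypothetical records for one skill (True = correct implementation).
print(meets_proficiency([True, True, False, True, True]))
print(meets_proficiency([True, False, True, False, True]))
```

The first record has four correct trials out of five and meets the criterion under this reading; the second has three and does not.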

Date: _______ Workshop #: ______ Session #: _________

Teacher ID: _______________ Observer: ___________

Workshop 2 Session 1: Teaching Social Routines

Teaching Social Routines


Overview /6
1. What is a social routine?
a. How to identify appropriate routines
2. Choosing toys/materials the child finds interesting
a. Toy selection rules
3. How to establish a routine
a. Imitating play
b. Imitating actions
c. Engage the child
4. How to use the routine to insert a turn
5. What is a shared interaction?
a. How to create a shared interaction
b. Commands
6. Responding to child interactions
Video examples (at least 2) /2
How to structure the environment to create opportunities for establishing social routines /1
Review
Overview /6
1. What is a social routine/how to identify routines?
2. Choosing toys/materials the child finds interesting
3. How to establish a routine
4. How to use the routine to insert a turn
5. What is a shared interaction/how to create one?
6. Responding to child interactions
How to structure the environment to create opportunities for establishing social routines /1
Researcher Model Session
Researcher models how to identify appropriate routines /1
Researcher models how to choose toys /1
Researcher models how to establish a routine /1
Researcher models how to use the routine to insert a turn /1
Researcher models how to respond to child interactions /1
Teacher Practice Session
Teacher practices how to identify appropriate routines with researcher /1
Teacher practices how to choose toys with researcher /1
Teacher practices how to establish a routine with researcher /1
Teacher practices how to use the routine to insert a turn with researcher /1
Teacher practices how to respond to child interactions with the researcher /1
Ending Workshop
At the end of the session the researcher asks the teacher how he/she felt the session went /1
Researcher summarizes how the teacher utilized establishing routines /1
Researcher asks the teacher whether he/she felt the session length was appropriate for learning /1
Total /29

Date: ______ Workshop #: _____ Session #: ________

Teacher ID: ______________ Observer: __________

Workshop 2 Session 2: Teaching Social Routines

Researcher Model Session- Live in Classroom


Researcher models how to identify appropriate routines with child /1
Researcher models how to choose toys with child /1
Researcher models how to establish a routine with child /1
Researcher models how to use the routine to insert a turn with child /1
Researcher models how to create a shared interaction with child /1
Researcher models how to respond to child interactions with child /1
Teacher Practice Session- Live in Classroom
Teacher practices how to identify appropriate routines with child while researcher provides feedback (at least 3 times) /3
Teacher practices how to choose toys with child while researcher provides feedback (at least 3 times) /3
Teacher practices how to establish a routine with child while researcher provides feedback (at least 3 times) /3
Teacher practices how to use the routine to insert a turn with child while researcher provides feedback (at least 3 times) /3
Teacher practices how to respond to child interactions with child while researcher provides feedback (at least 3 times) /3
Proficiency Assessment- Live in Classroom
Teacher demonstrates 80% proficiency with identifying appropriate routines (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with choosing appropriate toys (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with establishing routines (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with inserting a turn into the routine (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with creating a shared interaction (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with inserting a turn into the routine (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with responding to child interactions (4/5 consecutive trials) /1
Total /28

Date: _______ Workshop #: _____ Session #: _________

Teacher ID: _______________ Observer: _______

Workshop 3 Session 1: Systematic Use of Prompts

Systematic Use of Prompts


Time Delay /6
1. Overview of constant time delay
2. When to use it
3. At least 3 examples of time delay presented (3 points)
4. Video model
Linguistic Prompts /6
1. Overview of linguistic prompts
2. Variety of linguistic prompts presented
3. At least 3 examples of linguistic prompts (3 points)
4. Video model
Non-Linguistic Prompts /6
1. Overview of non-linguistic prompts
2. Variety of non-linguistic prompts presented (including model)
3. At least 3 examples of non-linguistic model prompts (3 points)
4. Video model
How to structure the environment to create opportunities for prompting /1
Review
Types of prompts: /3
1. Time Delay
2. Linguistic Prompts
3. Non-Linguistic Prompts
Structuring the environment /1
Video quiz /1
Researcher Model Session
Researcher models time delay for teacher /1
Researcher models linguistic prompts for teacher /1
Researcher models non-linguistic model prompts for teacher /1
Teacher Practice Session
Teacher practices time delay with researcher (at least 3 times) /3
Teacher practices linguistic prompts with researcher (at least 3 times) /3
Teacher practices non-linguistic model prompts with researcher (at least 3 times) /3
Ending Workshop
At the end of the session the researcher asks the teacher how he/she felt the session went /1
Researcher summarizes how the teacher utilized each prompt in the hierarchy /1
Researcher asks the teacher whether he/she felt the session length was appropriate for learning /1
Total /39

Date: _______ Workshop #: ______ Session #: ________

Teacher ID: ______________ Observer: ___________

Workshop 3 Session 2: Systematic Use of Prompts

Researcher Model Session- Live in Classroom


Researcher models time delay with child /1
Researcher models linguistic prompts with child /1
Researcher models non-linguistic model prompts with child /1
Teacher Practice Session- Live in Classroom
Teacher practices time delay with child while researcher provides feedback (at least 3 times) /3
Teacher practices linguistic prompts with child while researcher provides feedback (at least 3 times) /3
Teacher practices non-linguistic prompts with child while researcher provides feedback (at least 3 times) /3
Teacher practices moving through the prompting hierarchy with child while researcher provides feedback (at least 3 times) /3
Proficiency Assessment- Live in Classroom
Teacher demonstrates 80% proficiency with time delay (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with linguistic prompts (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with non-linguistic model prompts (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with non-linguistic physical prompts (4/5 consecutive trials) /1
Teacher demonstrates 80% proficiency with regard to moving through the prompting hierarchy (4/5 consecutive trials) /1
Total /20

Appendix D: Data Collection Sheets

Appendix E: Teacher Demographics Form

General Information
Pseudonym:
Age:
Race/Ethnicity:
Gender:
Current Grade Taught and length of time teaching this grade:

Previous Grades Taught and length of time teaching each grade (Grade / Time Taught):

Length of Time at The Bridge:

Types of Classrooms Taught (Check all that apply):
☐ Special Education
☐ General Education
☐ Other: Please specify _______________________

Please check all disabilities with which you have instructional experience:
☐ Autism Spectrum Disorder (ASD)
☐ Intellectual Disability
☐ Down syndrome
☐ Emotional Disturbance (ED)
☐ Physical Handicap
☐ Cerebral Palsy
☐ Specific Learning Disabilities
☐ Other Health Impairment (OHI)
☐ Speech or Language Impairment
☐ Visually Impaired (including blindness)
☐ Hearing Impaired
☐ Deafness
☐ Deaf-Blindness
☐ Traumatic Brain Injury
☐ Attention Deficit/Hyperactivity Disorder (ADHD)
☐ Social Communication Deficits

How many students are in your current classroom?

Of the students in your current class, how many have a diagnosed disability?

Of the students in your current class, how many have a Growth & Performance Plan?

What is your highest degree earned?
☐ High School Diploma
☐ Bachelor’s Degree
☐ Master’s Degree
☐ Professional Degree (PhD, MD, JD)

What was your area of study for your highest degree earned?

Please list your area(s) of teacher certification:

What certificate type do you currently hold?
☐ Pre-Service Certificate
☐ Certificate of Eligibility Certificate
☐ Clearance Certificate
☐ Induction Certificate
☐ Standard Professional Certificate
☐ Standard Performance Based Certificate
☐ Performance-Based Professional Educational Leadership Certificate
☐ Standard Professional Educational Leadership
☐ Advanced Professional Certificate
☐ Lead Professional Certificate
☐ Not Applicable
Please describe any previous professional development experiences in naturalistic social communication interventions for children with ASD:

Please describe any previous experiences with receiving coaching from an expert to support you in the classroom:

Describe any strategies you currently use in your classroom to support children’s communication:

Appendix F: Child Demographics Form

General Information
Pseudonym:
Date of Birth:
Gender:
Race/Ethnicity:

Diagnoses:

Has he/she received any intervention/services prior to attending the Bridge?
☐ Babies Can’t Wait
☐ Private Therapy Services
☐ Other:

Grade:
Number of Years at the Bridge:
Does he/she have a Growth & Performance Plan?
Has he/she received any diagnoses? Please list each one.

Current services received and frequency:
☐ Speech/Language:
☐ Occupational Therapy:
☐ ABA:
☐ Other:

Where does he/she receive services?
☐ The Bridge
☐ Private
☐ Other:

Are there social communication goals in his/her Growth & Performance Plan? Or is social communication a current focus in the classroom?

What are his/her preferred toys, objects, activities, etc.?

Appendix G: Social Validity Measure
Teacher Feedback Questionnaire
Adapted from: Hendrickson, J. M., Gardner, N., Kaiser, A., & Riley, A. (1993). Evaluation
of a social interaction coaching program in an integrated day-care setting. Journal of
Applied Behavior Analysis, 26(2), 213–225.

Directions: Please answer the following questions based on your experiences with the
research study. Be sure to answer every question and to only circle one response per
question:

Response scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Agree, 4 = Strongly Agree

Research
Research in schools is important to learn to better teach high-risk children/children with disabilities 1 2 3 4
Research in schools is important to learn to better teach all children 1 2 3 4
Research in schools can improve specific staff teaching skills 1 2 3 4
Research in schools can improve staff teaching in general 1 2 3 4

Intervention Effects
I believe the intervention techniques taught in this study can be used in most classrooms to help integrate high-risk children/children with disabilities 1 2 3 4
My participation in learning intervention techniques was worth my effort 1 2 3 4
I would share my intervention skills with other teachers 1 2 3 4
The social communication skills of the student(s) with whom I worked have improved after this intervention 1 2 3 4
This intervention was beneficial for the students with whom I worked 1 2 3 4
This intervention was beneficial for me as a teacher 1 2 3 4

Social Validity
Overall, I believe that my knowledge about social communication and intervention techniques has improved 1 2 3 4
I feel more knowledgeable and confident in my ability to help my students improve their social communication skills after participating in this intervention 1 2 3 4
I feel confident in my ability to set up my classroom and activities to help encourage opportunities for social communication 1 2 3 4
I feel confident in my ability to incorporate the intervention techniques I have learned into my daily classroom routines 1 2 3 4
I believe that the intervention techniques are feasible to implement in the classroom 1 2 3 4
I believe that the intervention techniques can be easily incorporated into the classroom 1 2 3 4
I feel confident in my ability to follow the child’s lead 1 2 3 4
I feel confident in my ability to establish and engage in social routines 1 2 3 4
I feel confident in my ability to use the system of prompts, including linguistic and non-linguistic prompts and time delay 1 2 3 4
I feel confident in my ability to identify opportunities to set up the environment to encourage social communication 1 2 3 4
I would participate in a similar project in the future 1 2 3 4

Training
I felt comfortable during the training sessions 1 2 3 4
The training sessions were tailored to my experience level 1 2 3 4
I felt comfortable implementing the intervention techniques after the training sessions were completed 1 2 3 4
I felt comfortable asking questions during the training sessions 1 2 3 4
I would recommend the training sessions to my colleagues 1 2 3 4
