
Author’s Accepted Manuscript
A Quantitative Synthesis of Functional Analytic Psychotherapy Single-Subject Research
R. Sonia Singh, William H. O’Brien
www.elsevier.com/locate/jcbs
PII: S2212-1447(17)30106-0
DOI: https://doi.org/10.1016/j.jcbs.2017.11.004
Reference: JCBS209
To appear in: Journal of Contextual Behavioral Science
Received date: 30 January 2017
Revised date: 23 October 2017
Accepted date: 8 November 2017
Cite this article as: R. Sonia Singh and William H. O’Brien, A Quantitative
Synthesis of Functional Analytic Psychotherapy Single-Subject Research,
Journal of Contextual Behavioral Science,
https://doi.org/10.1016/j.jcbs.2017.11.004
This is a PDF file of an unedited manuscript that has been accepted for
publication. As a service to our customers we are providing this early version of
the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting galley proof before it is published in its final citable form.
Please note that during the production process errors may be discovered which
could affect the content, and all legal disclaimers that apply to the journal pertain.
Running head: QUANTITATIVE SYNTHESIS OF FAP 1
A Quantitative Synthesis of Functional Analytic Psychotherapy Single-Subject Research
R. Sonia Singh, M.A.
William H. O’Brien, Ph.D.
Bowling Green State University
Corresponding author:
R. Sonia Singh
126 Psychology
Bowling Green State University
Bowling Green, OH 43403
rjsingh@bgsu.edu
(832) 530-9465
Highlights
• The current study attempts to summarize FAP single-subject research.
• There is great variation in demographics and treatment targets in FAP research.
• Effect sizes were “questionably – fairly effective,” “significant,” or “large.”
Functional Analytic Psychotherapy (FAP) is a contextual behavioral psychotherapy based
on the principles of behaviorism (e.g., Kohlenberg & Tsai, 1991; Tsai et al., 2009). FAP is
idiographic in nature, meaning that it often focuses on specific behaviors for individual clients
and proposes that the behaviors clients exhibit in sessions with a therapist are an index of
adaptive and problem behaviors that clients display in natural environments. These in-session
behaviors are referred to as Clinically Relevant Behaviors (CRBs). CRBs are divided into three
categories: CRB1s are problematic behaviors, CRB2s are adaptive behaviors, and CRB3s are the
client’s descriptions of the topography and function of his or her behaviors outside of the session.
Further, the FAP therapist acknowledges behavior outside of the context of therapy. For
example, outside of the therapy session problem behaviors (O1s) and outside of the therapy
session adaptive behaviors (O2s) can be targeted for change.
Well-established behavioral principles are used to promote in-session changes in CRB1s,
CRB2s, and CRB3s. This is accomplished by first carefully operationalizing CRB1s and CRB2s.
Following operationalization, the therapist uses reinforcement-based techniques (e.g., social
attention, verbal reinforcement, non-verbal reinforcement) to increase the frequency of CRB2s.
To decrease the frequency of CRB1s the therapist uses a combination of procedures such as
differential reinforcement of other/incompatible behavior, selective non-reinforcement/ignoring,
and verbal redirection (Tsai et al., 2009). In order to do this effectively, the FAP therapist is
trained to be (a) acutely aware of CRB occurrences and (b) consistent in providing in-session
experiences that promote the acquisition, shaping, and maintenance of adaptive changes in
CRB1s and CRB2s. FAP utilizes a system of five rules to guide the therapist: (1) watch for
CRBs; (2) evoke CRBs; (3) reinforce CRB2s; (4) assess therapist impact on client behavior; and
(5) evaluate and generalize (for a detailed description of these five rules see Tsai et al., 2009).
Efficacy of FAP
The empirical evidence regarding the efficacy of FAP is limited (Hayes, Masuda,
Bissett, Luoma, & Guerrero, 2005). Mangabeira, Kanter, and Del Prette (2012) conducted a
qualitative review of FAP publications from 1990 to 2010. The authors reported that the majority
of articles written about FAP were conceptual rather than empirical. Further, their analysis of
empirical studies indicated that a majority used single-subject data or were uncontrolled case
studies. The authors also noted that a number of these single-subject studies did not include
statistical analyses or graphical representation of treatment effects.
Öst (2008) conducted a meta-analysis of third-wave behavioral therapies including FAP,
Acceptance and Commitment Therapy, Dialectical Behavior Therapy, Cognitive Behavioral
Analysis System of Psychotherapy, and Integrative Behavioral Couple Therapy. The author
noted that FAP did not have any randomized control trials; therefore, he could not include it in
his analyses. Although this meta-analysis was generally unfavorable toward third-wave behavioral
therapies and has received criticism in the field, its observation that FAP lacked randomized
control trials was reasonable. Corrigan (2001) and García (2008) also noted
the lack of randomized control trials and criticized FAP for making claims of efficacy without
high quality empirical evidence such as randomized control trials.
Since the aforementioned reviews, one small-sample randomized control trial has been
published by Maitland et al. (2016b). In this study, 11 individuals received a FAP intervention
while another 11 individuals participated in a “watchful waiting” control condition. The
researchers reported that the FAP group, relative to the control group, reported significantly
increased interpersonal functioning (as measured by fear of intimacy) and lower psychological
symptomology. The researchers concluded this was a modest study and that more research is
needed (Maitland et al., 2016b).
In summary, FAP is a contextual behavioral therapy that uses well-established behavioral
principles occurring within the context of the therapeutic relationship to promote adaptive in-
session behavior change that is then intended to generalize to outside of session contexts. FAP
uses five rules to decrease problematic behavior and increase adaptive behavior. The published
empirical evidence examining FAP is limited to one randomized control trial and many single-subject
studies. Additionally, reviews of FAP research have used qualitative methods, and to date
there has been no quantitative synthesis of this literature.
Quantitative Synthesis of Single-Subject Research: Treatment Effect Indices
Single-subject research has been extensively used in several fields (e.g., Barlow, Nock, &
Hersen, 2009; Horner et al., 2005; Kazdin, 1982). Proponents of single-subject research suggest
it is one of the initial steps in identifying evidence-based practices (Horner et al., 2005). Further,
when designed appropriately, single-subject research can validly evaluate causal relationships
between interventions and outcomes (Haynes, O’Brien, & Kaholokula, 2011). Another
advantage of single-subject research is that it can be used to examine unique and/or rare
phenomena as well as idiosyncratic responses to interventions. Importantly, single-subject data
can be quantitatively synthesized (Manolov, Guilera, & Sierra, 2014; Shadish, Hedges, &
Pustejovsky, 2014).
In most applications, meta-analytic techniques are designed to aggregate and analyze
quantitative findings from multiple studies that generate group-based data and inferential
statistical testing. However, single-subject studies typically do not generate group-level data nor
group-level inferential statistical testing; single-subject studies may also include graphs without
numerical data. Therefore, a different approach is required.
Researchers have developed ways to generate quantitative data from single-subject
graphs using mapping and digitizing technology (e.g., Shadish & Sullivan, 2011). This allows
researchers to quantify published single-subject graphs that do not provide numerical values and
to aggregate the resulting effect sizes.
The results section of this paper will provide detailed information about the
methodological characteristics of FAP treatment outcome studies. However, here it is important
to briefly review the unique methodological features of FAP research that influence how these
studies can be quantified, aggregated, and interpreted. First, FAP has been almost exclusively
evaluated using an A/B (baseline/treatment) single-subject design. Second, FAP, like most
psychotherapies, is focused on treatment outcomes. Thus, trends and session-by-session changes
in client behavior are less relevant than measures of end-state functioning (i.e., level of
functioning at the conclusion of treatment relative to baseline). Third, FAP data are often
presented in graphs without numerical values for data points. Finally, a majority of FAP studies
have a small number of data points.
The aforementioned characteristics of FAP research suggest that a quantitative synthesis
should: (a) generate and synthesize A/B or pre-post effect sizes, (b) use mapping technologies
that can reliably and accurately generate quantitative information from graphs that do not
provide data values, and (c) use effect sizes that do not require a large number of data points.
Four well-established single-subject effect size indices are well suited for a FAP quantitative
synthesis: Percentage of non-overlapping data (PND: Scruggs & Mastropieri, 1998), Split
Middle Trend Estimation (SMTE: White, 1974), Swanson’s dsw (Swanson, Hoskyn & Lee, 1999)
and the Reliable Change Index (RCI: Jacobson & Truax, 1991). Each of these metrics provides
some unique information that may be important to assess differences in FAP outcomes.
The PND (Scruggs & Mastropieri, 1998) is a common and well-established method for
evaluating single-subject effects. This metric is often considered one of the standard ways to
assess and aggregate single-subject design research. The SMTE (White, 1974) was developed to
address the challenge of outliers and trends in single-subject data that adversely affect PND
calculations. This metric provides unique information because it accounts for trends and outliers
in data. Additionally, a nonparametric binomial test of significance can be used to determine
likelihood of effect.
It should be noted that PND and SMTE do not provide an index of magnitude of effect.
Therefore, researchers developed effect size indices that could be used to supplement the
information provided by the PND and SMTE. Swanson, Hoskyn and Lee (1999) developed dsw in
order to generate an effect size that provides an estimate of treatment outcomes based on end-
state functioning. That is, the effect size is based on the participant’s level of functioning at the
conclusion of treatment relative to baseline.
Each of these effect sizes provides different information based on graphed data: (1) PND
offers overall effect and it is often the standard in single-subject design meta-analysis, (2) SMTE
provides a way to assess for outliers and trend, and (3) dsw provides magnitude of effect. In
addition to the aforementioned methods that can be used to quantify graphed data, the Reliable
Change Index (RCI) can be used in single-subject studies when questionnaires are used to
evaluate treatment outcomes (Jacobson & Truax, 1991).
Summary and Aims of the Present Investigation
FAP is a contextual behavioral therapy that applies behavioral principles to the
interpersonal relationship between therapist and client. Given that FAP has primarily been
evaluated with single-subject studies and that current reviews of FAP are qualitative, there is a
need to better understand the effectiveness of this therapy using a quantitative approach.
An examination of single-subject quantitative synthesis techniques indicated that PND,
SMTE, Swanson’s dsw, and RCI are well suited for aggregating data from FAP studies. There
were three principal aims of the current study. The first aim was to conduct a methodological
review of FAP single-subject studies. This included reviewing the demographic characteristics of
participants, length of treatment, and type of FAP therapy provided. The second aim was to
evaluate and synthesize treatment effects using different quantitative indicators of outcomes. A
third aim was to examine the extent to which the effectiveness of FAP varied as a function of
participant characteristics, treatment characteristics, or target characteristics.
Method
Selection Criteria
Several databases were used to find articles for the current study including PsycINFO
(1872 to present), Psychology and Behavioral Sciences Collection (1930s to present), and ERIC
(1966 to present). The following search terms were used: Functional Analytic Psychotherapy,
FAP treatment, FAP single-subject design, FAP single-case design, and FAP case study. The
search for relevant articles and studies occurred between February 2015 and January 2017.
To be included in this study, articles had to meet the following criteria: (a) the study used
single-subject methods, (b) a FAP based treatment was provided for individual clients, (c) the
study contained data that could be coded (graphs, pre- and post-measures), and (d) the study was
published in a peer-reviewed journal or as a doctoral dissertation. The first author also posted
requests on listservs related to FAP to obtain unpublished manuscripts that were appropriate for the
current study. Studies that met these inclusion criteria were then reviewed for the quantitative synthesis.
Article Coding
Each qualifying article was examined and coded for the following information: number
of participants, age, gender, ethnicity, FAP-related self-report measures, self-report measures of
psychological wellness (e.g., Beck Depression Inventory), number of treatment sessions, length
of treatment, relevant treatment statistics (pre- and post-assessments), and graphical
representation of treatment. Very few studies provided information on CRB3s; for this reason,
only CRB1s and CRB2s were coded. Additionally, O1s and O2s were coded and included in the
current study when they were reported instead of CRB1s and CRB2s.
Given that the current study includes a mixture of CRBs and Os, in the following sections
we use the term Target Behavior 1 (TB1) to refer to problem behaviors (CRB1s and O1s) and
Target Behavior 2 (TB2) to refer to adaptive behaviors (CRB2s and O2s). If a participant had
several TB1s or TB2s reported in a single study, the individual TB1s and TB2s were averaged so
that each participant contributed only one TB1 and/or TB2 to the quantitative synthesis across
studies.
The first author and trained assistants independently coded articles using a coding form
that included article information (e.g., authors, year published, title, journal of publication,
affiliation of authors), participant information (e.g., number of participants per study, participant
demographics, participant diagnoses), methodology information (e.g., type of single-subject
design, number of baseline and treatment sessions, modality of therapy used), and effect sizes.
Disagreements that occurred between raters were resolved through consensus. The
methodological characteristics of all studies were double coded by the first author and a trained
assistant. Inter-rater reliability for overall methodological coding was excellent (κ = .92). Inter-rater
reliability varied from 0.77 to 1.0 for different portions of coding (e.g., title, design,
participants, diagnoses).
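To illustrate how an agreement statistic such as κ is obtained from two raters' categorical codes, the following Python sketch computes Cohen's kappa. It is offered only as an illustration: the function name and the example codes are hypothetical, and the source does not state which software was used for this calculation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical design codes assigned by two raters to six studies.
coder_1 = ["A/B", "A/B", "case", "case", "reversal", "A/B"]
coder_2 = ["A/B", "A/B", "case", "A/B", "reversal", "A/B"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why it is usually lower than the simple proportion of matching codes.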
A subset of studies (70%) was double coded for PND and SMTE. A high degree of
reliability was found for both PND and SMTE measurements. The average ICC for PND
was 0.98 with a 95% confidence interval that ranged from 0.97 to 0.99 (F(32,47) = 171.75,
p < .001). The average ICC for SMTE was 0.95 with a 95% confidence interval that ranged from
0.91 to 0.98 (F(23,34) = 39.53, p < .001). RCI and dsw were not double coded because Excel
formulas were used to calculate these values.
Regarding the assessment of overall methodology, the authors used the Single-Case
Reporting Guideline in BEhavioural Interventions (SCRIBE) developed by Tate et al. (2016).
This system is a 26-item review checklist that assesses the title, abstract, introduction, methods,
results, discussion, and documentation of single-subject design research. Tate et al. (2016)
suggest this method be used for the development, replication, and evaluation of single-subject design
research.
Using the search terms “Functional Analytic Psychotherapy”, “FAP treatment”, “FAP
single-subject design”, “FAP single-case design”, and “FAP case study,” 179 studies were
initially identified using a review of titles. These 179 studies represented a mixture of narrative
case studies, theoretical articles, literature reviews, and empirical studies. The authors reviewed
the abstracts of all 179 studies to determine eligibility for the current study.
From the abstract review of the 179 studies, the authors excluded 138 because these
articles were conceptual reviews, theoretical articles, or studies about measurement of FAP. The
remaining 41 studies were fully reviewed. Upon full text review, 23 studies were excluded
because they were narrative reviews, did not contain data that could be used to calculate any
effect size, or included data not specifically related to the current study (e.g., therapist training
outcomes, group based outcomes). If data were reported but not in a manner in which the authors
could calculate effect sizes (e.g., mean scores), the authors contacted the authors of the
publications. No additional studies were added using this approach because the authors of the
studies in question did not respond or reported that the original data were unavailable
(destroyed or no longer in their possession). This left a total of 18 studies for quantitative
synthesis.
The authors also used the invisible college approach and placed requests on FAP listservs
and social media pages for articles relevant to this quantitative synthesis. Two additional
studies were collected using this strategy. The authors used an ancestry approach and examined
citations from the FAP articles to identify any studies that might have been missed in the search
strategy. No additional articles were located using this method. Finally, the descendency
approach was used to identify any additional articles that referenced the original FAP book by
Kohlenberg and Tsai (1991). No additional articles were located with these approaches. Thus, 20
qualifying studies were located using all of the aforementioned search and selection strategies.
Figure 1 provides visual representation of the article inclusion process.
Effect Size Calculation
Graph Digitization and Data Reduction. WebPlotDigitizer was used to generate data
from graphs without raw values. WebPlotDigitizer is a program that can be used to upload
graphs for mapping and obtaining values based on points on the graph. Once a graph was
uploaded, the researchers assigned a 10-point value to the x-axis and y-axis. After this, the
researchers identified each baseline and treatment point in each graph. After identifying each
point, WebPlotDigitizer generated X and Y coordinate values. To ensure that values were
accurately extracted by WebPlotDigitizer, the researchers entered the values from the
program into Plotly, a web-based graphing program. The researchers then compared the Plotly
graphs with the original graphs to ensure that digitization was accurate.
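The general idea behind recovering values from a published graph can be sketched as a linear axis calibration: two reference positions with known values on an axis define the mapping from image position to data value. The Python sketch below illustrates that idea only. It is not WebPlotDigitizer's implementation, and the function name and pixel positions are hypothetical.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function that converts a pixel position to a data value.

    Assumes a linear (non-logarithmic) axis calibrated by two reference
    positions whose data values are known.
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration positions must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        return value_a + scale * (pixel - pixel_a)

    return to_value


# Hypothetical calibration: on the y-axis, pixel row 400 is labelled 0 and
# pixel row 100 is labelled 10 (image rows grow downward).
to_y = make_axis_mapper(400, 0.0, 100, 10.0)
# On the x-axis, pixel column 50 is session 0 and column 550 is session 10.
to_x = make_axis_mapper(50, 0.0, 550, 10.0)

# A data point clicked at column 300, row 250 maps to about (5, 5).
point = (to_x(300), to_y(250))
```

Re-plotting the recovered values and comparing them with the original figure, as described above, guards against calibration mistakes such as swapped reference points.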
Percentage of Non-Overlapping Data (PND). When a graph presented behaviors
targeted for reduction during treatment (i.e., TB1), the researcher identified the lowest data point
that occurred during the baseline phase of the study. Next, the researcher determined how many
data points in the treatment phase fell below the lowest baseline point. Finally, the researcher
calculated the PND by dividing the total number of points below the lowest baseline point by the
total number of intervention points and multiplying the result by 100.
When a graph presented behaviors targeted for increases during treatment (i.e., TB2), the
researcher identified the highest data point that occurred during the baseline phase of the study.
Next, the researcher determined how many data points in the treatment phase fell above the
highest baseline point. Then, the researcher calculated the PND by dividing the total number of
points above the highest baseline point by the total number of intervention points and
multiplying the result by 100.
PND values can vary from 0% to 100%. Scruggs and Mastropieri (1998) recommend that
PND scores be classified as follows: PND < 50%: unreliable treatment; PND 50% – 70%:
questionable effectiveness; PND 70% – 90%: fairly effective; and PND > 90%: highly effective.
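The PND procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' spreadsheet; the function names and the example data are hypothetical. Because the published categories share their boundary values, the sketch assumes that a score exactly at 70% or 90% falls into the lower category.

```python
def pnd(baseline, treatment, goal):
    """Percentage of non-overlapping data for one A/B graph.

    goal="decrease" (TB1): count treatment points below the lowest
    baseline point. goal="increase" (TB2): count treatment points
    above the highest baseline point.
    """
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    if goal == "decrease":
        cutoff = min(baseline)
        non_overlapping = sum(1 for y in treatment if y < cutoff)
    elif goal == "increase":
        cutoff = max(baseline)
        non_overlapping = sum(1 for y in treatment if y > cutoff)
    else:
        raise ValueError("goal must be 'decrease' or 'increase'")
    return 100.0 * non_overlapping / len(treatment)


def classify_pnd(value):
    """Label a PND score; boundary handling at 70 and 90 is an assumption."""
    if value < 50:
        return "unreliable treatment"
    if value <= 70:
        return "questionable effectiveness"
    if value <= 90:
        return "fairly effective"
    return "highly effective"


# Hypothetical TB1 graph: the lowest baseline point is 4, and four of the
# five treatment points fall below it.
score = pnd([5, 6, 4, 7], [3, 2, 5, 1, 3], goal="decrease")  # 80.0
label = classify_pnd(score)  # "fairly effective"
```

Because the cutoff is a single extreme baseline point, one unusually low (or high) baseline observation can pull the PND toward zero, which is the sensitivity to outliers that motivates the SMTE.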
Split Middle Trend Estimation (SMTE). Split-middle trend estimation was
calculated in four steps: (a) the baseline phase was divided into halves, (b) the median point on
the y-axis in each half of the baseline phase was identified, (c) a straight line (the celeration
line) connecting the median points of each baseline half was drawn and extended into the treatment
phase, and (d) the number of data points in the treatment phase that fell above or below this line
was counted and divided by the total number of data points in the treatment phase. If the treatment
target was a TB1, the points falling below the line were counted; if the treatment target was a
TB2, the points falling above the line were counted. The proportion of data points that fell
above or below the celeration line was converted into a percentage so that it could be
compared to the PND. Finally, a binomial calculation was performed to evaluate the probability
of obtaining each outcome.
A minimum of four baseline points is required to adequately create a celeration line.
Therefore, SMTE was not calculated for any graph with fewer than four baseline data points.
Given that the null hypothesis for SMTE would be less than 50%, the authors applied the PND
classification recommendations to SMTE: SMTE < 50%: unreliable treatment; SMTE 50% – 70%:
questionable effectiveness; SMTE 70% – 90%: fairly effective; and SMTE > 90%: highly effective.
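The four SMTE steps and the binomial calculation can be sketched in Python as follows. Several details are assumptions made for illustration, because the text does not specify them: sessions are treated as equally spaced (x = 0, 1, 2, ...), the middle baseline point is dropped when the number of baseline points is odd, points lying exactly on the line are not counted as improved, and the binomial probability is one-sided with a chance level of .5. The function name and example data are hypothetical.

```python
from math import comb
from statistics import median


def smte(baseline, treatment, goal):
    """Return (percentage of treatment points on the improving side of the
    split-middle line, one-sided binomial probability of that many or more)."""
    if goal not in ("decrease", "increase"):
        raise ValueError("goal must be 'decrease' or 'increase'")
    n = len(baseline)
    if n < 4:
        raise ValueError("at least four baseline points are required")
    if not treatment:
        raise ValueError("treatment phase is empty")

    # Steps (a) and (b): split the baseline and take the median of each half.
    half = n // 2
    x1, y1 = median(range(half)), median(baseline[:half])
    x2, y2 = median(range(n - half, n)), median(baseline[n - half:])

    # Step (c): line through the two median points, extended into treatment.
    slope = (y2 - y1) / (x2 - x1)

    # Step (d): count treatment points on the improving side of the line.
    hits = 0
    for i, y in enumerate(treatment):
        projected = y1 + slope * (n + i - x1)
        if (goal == "decrease" and y < projected) or (
            goal == "increase" and y > projected
        ):
            hits += 1

    m = len(treatment)
    # Probability of `hits` or more successes in m trials when p = .5.
    p_value = sum(comb(m, k) for k in range(hits, m + 1)) / 2 ** m
    return 100.0 * hits / m, p_value


# Hypothetical TB2 graph: three of four treatment points exceed the line.
percentage, p = smte([4, 6, 5, 7], [9, 10, 7, 11], goal="increase")
# percentage == 75.0, p == 0.3125
```

In this example the baseline already trends upward, so the extended line rises through the treatment phase and the third treatment point (7) no longer counts as an improvement, even though it equals the highest baseline value.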
Swanson’s dsw. Swanson’s dsw was calculated by: (a) forming a baseline mean using the
last three data points during the baseline phase, (b) forming a treatment outcome mean using the
last three data points in the treatment phase, (c) computing a difference score by subtracting the
baseline mean from the treatment mean, and (d) dividing the difference score by the pooled
standard deviation corrected for correlation (the correlation was between the last three baseline
data points and the last three treatment data points). If there were fewer than three points in the
baseline phase or treatment phase, then only two points in baseline and two points in treatment
were used to calculate this statistic. The formula for calculating Swanson’s dsw is as follows:
\[ d_{sw} = \frac{M_t - M_b}{SD / \sqrt{2(1 - r)}} \]
As is evident in this formula, higher dsw values indicate larger treatment effects. For
example, when dsw = 1, it indicates that the level of functioning at the conclusion of treatment is one
standard deviation higher than the level of functioning at the conclusion of baseline, and when
dsw = 2, it indicates that the level of functioning at the conclusion of treatment is two standard
deviations higher than the level of functioning at the conclusion of baseline. Swanson and
Sachse-Lee (2000) argued that dsw should be interpreted in the same way that Cohen’s d is
interpreted in the psychotherapy meta-analytic literature. Specifically, Cohen (1988) suggested
that effect sizes using d could be classified as “small” (.20), “medium” (.50) and “large” (.80).
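A minimal Python sketch of the dsw calculation is given below. It follows the four steps described above, dividing the difference between the end-of-phase means by SD / √(2(1 − r)). Two details are assumptions: the pooled standard deviation is taken as the square root of the average of the two phase variances, and r is the Pearson correlation between the end-of-baseline and end-of-treatment points paired in session order. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def swanson_dsw(baseline, treatment):
    """Swanson's d_sw from the last three points of each phase
    (the last two when either phase has fewer than three points)."""
    k = 3 if min(len(baseline), len(treatment)) >= 3 else 2
    b, t = baseline[-k:], treatment[-k:]
    if len(b) < 2 or len(t) < 2:
        raise ValueError("each phase needs at least two data points")

    m_b, m_t = mean(b), mean(t)
    ss_b = sum((x - m_b) ** 2 for x in b)
    ss_t = sum((y - m_t) ** 2 for y in t)
    if ss_b == 0 or ss_t == 0:
        raise ValueError("correlation is undefined when a phase has no variance")

    # Pearson correlation between the paired end-of-phase points.
    r = sum((x - m_b) * (y - m_t) for x, y in zip(b, t)) / sqrt(ss_b * ss_t)
    if r >= 1:
        raise ValueError("d_sw is undefined when r equals 1")

    # Pooled SD for equal-sized groups (an assumption of this sketch).
    sd_pooled = sqrt((variance(b) + variance(t)) / 2)
    return (m_t - m_b) / (sd_pooled / sqrt(2 * (1 - r)))


# Hypothetical TB2 graph: end-of-baseline mean 4, end-of-treatment mean 9,
# pooled SD 1, r = .5, so d_sw = 5 / (1 / sqrt(1)) = 5.0.
d = swanson_dsw([1.0, 2.0, 3.0, 5.0, 4.0], [6.0, 7.0, 8.0, 9.0, 10.0])
```

Note that when only two points per phase are available, the Pearson correlation is always +1 or −1, so the statistic is undefined whenever the two pairs move in the same direction; the sketch raises an error in that case rather than returning a value.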
Reliable Change Index. The RCI is a standardized score used to assess change in an
individual’s score on a survey measure and uses the participant’s pre- and post-treatment scores,
standard deviation, and reliability coefficients. Given that the studies had small sample sizes, the
standard deviations and reliability coefficients from large sample validation studies were used to
calculate RCIs. The formulas for calculating RCI and the standard error of the difference are
provided below.
\[ RC = \frac{X_2 - X_1}{S_{diff}}, \qquad S_{diff} = \sqrt{2(SE)^2}, \qquad SE = SD\sqrt{1 - r} \]
In the above formulas, X2 is the post-treatment score, and X1 is pre-treatment score. The
standard error of the difference (Sdiff) is the square root of the standard error of measurement
(SE) squared and multiplied by two. The standard error of measurement is the standard deviation
multiplied by the square root of 1 minus the reliability coefficient for a particular measure.
The RCI can range from -∞ to +∞ with no pre-treatment to post-treatment difference
being zero. Larger RCIs indicate greater treatment effects. When the absolute value of the RCI is
greater than 1.96 (the bound of the 95% confidence interval around the null of zero), it is labelled
“statistically reliable” because the magnitude of difference is greater than what would be expected
to occur by chance or passage of time. When the absolute value of the RCI is less than 1.96, it is
labelled “not statistically reliable.” Individual RCI scores were aggregated by calculating an overall mean RCI for all
single-subject studies included in the current study. The RCI was only calculated for studies in
which self-report inventory data were provided.
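The RCI calculation can be illustrated with a short Python sketch. The function names are hypothetical, and the scores, standard deviation, and reliability coefficient in the example are invented for illustration; they do not come from any study in the synthesis.

```python
from math import sqrt


def reliable_change_index(pre, post, sd, reliability):
    """RCI = (post - pre) / S_diff, following Jacobson and Truax (1991).

    sd and reliability come from a normative or validation sample.
    """
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    se = sd * sqrt(1 - reliability)   # standard error of measurement
    s_diff = sqrt(2 * se ** 2)        # standard error of the difference
    return (post - pre) / s_diff


def is_statistically_reliable(rci, critical=1.96):
    """True when the change exceeds the 95% bound in either direction."""
    return abs(rci) > critical


# Hypothetical symptom measure: score drops from 30 to 15, with SD = 10 and
# reliability = .91 taken from a (made-up) validation sample.
rci = reliable_change_index(pre=30, post=15, sd=10, reliability=0.91)
reliable = is_statistically_reliable(rci)  # True; rci is about -3.54
```

The sign of the raw RCI depends on whether lower or higher scores indicate improvement, so symptom measures and quality-of-life measures need to be aligned before their RCIs are averaged.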
Effect Size Aggregation and Analysis

Graph data for each TB1 and TB2 were converted to PND, SMTE, and Swanson’s dsw. An
RCI score was calculated for each self-report inventory for each participant where pre- and post-
treatment data were provided. In studies where multiple TB1, TB2, or questionnaires were
collected on a single participant, an average PND, SMTE, dsw, and RCI were calculated for that
participant. Additionally, when examining studies with multiple phases, only the first baseline
and treatment phase were used to calculate the effect size.
The overall mean PND, SMTE, dsw, and RCI were calculated across studies for each
outcome variable. The mean was used because it was not possible to use the Hedges-
Pustejovsky-Shadish (Shadish et al., 2014) combined effect size calculation which requires three
or more participants per study (only three studies had three or more participants). Follow-up
analyses were then conducted to determine whether there were significant differences in
outcomes due to participant characteristics, treatment characteristics, or targets of treatment.
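The two-stage aggregation described above (averaging within each participant, then across participants) can be sketched in Python as follows. The confidence interval uses a normal approximation (mean ± 1.96 × SD / √n); this is an assumption, because the text does not state how the intervals were computed. The function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def per_participant_means(effects_by_participant):
    """Collapse multiple effect sizes per participant to a single average."""
    return [mean(values) for values in effects_by_participant.values()]


def summarize(effects, z=1.96):
    """Mean, SD, and normal-approximation 95% CI across participants."""
    n = len(effects)
    if n < 2:
        raise ValueError("at least two participants are required")
    m, sd = mean(effects), stdev(effects)
    half_width = z * sd / sqrt(n)
    return {"n": n, "mean": m, "sd": sd, "ci": (m - half_width, m + half_width)}


# Hypothetical PND values: P1 and P3 each had several target behaviors.
pnd_by_participant = {
    "P1": [80.0, 100.0],
    "P2": [50.0],
    "P3": [20.0, 40.0, 60.0],
}
summary = summarize(per_participant_means(pnd_by_participant))
# Participant averages are 90, 50, and 40, so the overall mean is 60.0.
```

Averaging within participants first keeps a participant with many target behaviors from carrying more weight in the overall mean than a participant with one.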
Results
Methodological Characteristics of Studies
Publication Characteristics. Out of the 20 studies located, 18 studies were published
between 1994 and 2017. The two remaining studies are currently unpublished. These articles
were produced by 43 authors. The articles were published in 9 different journals, with most
appearing in the International Journal of Behavioral Consultation and Therapy, Clinical
Case Studies, and The Psychological Record. The methodological characteristics of studies are
reported in Tables 1 and 2.
Participant Characteristics. There were a total of 37 participants across all studies (19
males, 15 females, and 3 participants whose gender was not reported). The average number of
participants per article was 2 (Range 1 – 5). However, it should be noted that two studies utilized
data from the same participant resulting in the final count of 36 participants (Busch et al., 2009;
Kanter et al., 2006). This participant was a 24/25-year-old African American female and her data
were combined in reporting participant characteristics and effect size calculation for TB1.
Participants varied in age from 7 to 72 (M = 28.69, SD = 14.3) and the age of three participants
was unknown. Information regarding ethnicity was not provided for 21 participants. For the 14
participants whose ethnicities were reported, 10 were Caucasian, two were biracial, one was
African American, and one was Latin.
Diagnoses were provided for 18 participants. The most common diagnoses were: mood
disorders (n = 6), personality disorders (n = 3), co-morbid mood and personality disorders (n =
3), co-morbid mood, personality and substance disorders (n = 2), co-morbid mood, posttraumatic
stress disorder, and substance use disorder (n = 1), co-morbid mood, anxiety, and personality (n
= 1), co-morbid mood and psychotic disorder (n = 1), and co-morbid personality and psychotic
disorder (n = 1).
Targets of Treatment. A variety of TB1s and TB2s were targeted for treatment (see
Table 2). For review in the current study, TB1s and TB2s were categorized based on subscales of
the Functional Idiographic Assessment Template-Questionnaire (Callaghan, 2006). As detailed
in Table 2, the most common TB1s were: problematic disclosure, problematic emotional
expression, and conflict. The most common TB2s were: effective disclosure, adaptive emotional
expression, and bidirectional communication.
Intervention Characteristics and Study Design. The average number of observations
during baseline was 6 (Range 2 – 12) and the average number of observations during treatment
was 11 (Range 4 – 25). Most studies provided FAP-Alone (n = 14). In other instances, FAP was
combined with Acceptance and Commitment Therapy (n = 1), Cognitive-Behavior Therapy (n = 1),
Behavioral Activation (n = 2), Cognitive Therapy (n = 1), and Child Behavior Analytic Therapy
(n = 1). Based on this information, two categories of studies were formed: (a) “FAP-Alone”
treatment and (b) “FAP-Enhanced” treatment. Seven studies utilized case study approaches, 10
utilized A/B or A/A+B design, two used a multiple baseline design, and one used a reversal
design.
None of the studies met all 26 criteria of the SCRIBE checklist. SCRIBE scores ranged
from 2 to 17, with a mean score of 11.95. There was variability in scoring given that the
studies included case studies, A/B designs, and more sophisticated single-subject designs.
The overall grade for each study is presented in Table 1.
Analysis of FAP Effects
Tables 3 and 4 provide a summary of PND, SMTE, and Swanson’s dsw effect sizes for TB1s
and TB2s. Table 5 provides a summary of RCI scores. None of the effect size metrics showed
evidence of significant skew or kurtosis when the standard error of skewness and the standard
error of kurtosis were used for significance testing. Therefore, it was appropriate to average and
conduct statistical testing on measures when relevant and applicable. The mean PNDs for all TB1s and
TB2s were, respectively, 58.70% (n = 21, SD = 40.76, 95% CI = 41.31 – 76.07) and 79.39% (n =
24, SD = 31.71, 95% CI = 67.67 – 92.51). Using Scruggs and Mastropieri’s (1998)
classification, the mean PND for TB1s was classified as “questionably effective” with the 95%
confidence interval ranging from “ineffective” to “fairly effective.” The mean PND for TB2s
was classified as “fairly effective” with a 95% confidence interval ranging from “questionably
effective” to “highly effective.”

The overall mean SMTEs for TB1s and TB2s were, respectively, 69.43% (n = 16, SD =
36.28, 95% CI = 50.70 – 88.17) and 80.66% (n = 18, SD = 30.29, 95% CI = 69.27 – 96.85). The
mean SMTE for TB1s fell into the upper range of the “questionably effective” classification with
the 95% confidence interval ranging from “questionably effective” to the upper end of “fairly
effective.” Of the 15 SMTE analyses that could be conducted for TB1s for all participants, 7
(47%) were significant. The mean SMTE for TB2s fell into the “fairly effective” classification
with a 95% confidence interval ranging from the upper end of “questionably effective” to
“highly effective.” Out of the 18 SMTE analyses that could be conducted for TB2s, 15 (83%)
were statistically significant.
The overall mean Swanson’s dsw values for TB1s and TB2s were, respectively, 1.33 (n = 21, SD =
0.87, 95% CI = 0.95 – 1.71) and 1.85 (n = 24, SD = 0.97, 95% CI = 1.49 – 2.25). Of the 21
Swanson’s dsw values for TB1s, 71% (n = 15) were classified as large, 10% (n = 2) as medium, and
19% (n = 4) as small. Of the 24 Swanson’s dsw values for TB2s, 83% (n = 20) were large, 4% (n = 1)
was medium, and 13% (n = 3) were small. Taken together, these results indicated that both TB1s
and TB2s reliably changed from pre-treatment to post-treatment (none of the 95% confidence
intervals contained zero) and that for a majority of studies, the effect sizes were large. Finally,
the average effect size for TB2s was higher than the average effect size for TB1s.
RCI scores were divided into symptom-based RCI scores and quality of life-based RCI
scores. The symptom-based RCIs are analogous to TB1s and were expected to show a decrease
with FAP. Self-report survey data from seven participants reported in seven studies were used to
calculate the average symptom-based RCI. The quality of life-based RCIs are analogous to TB2s
and were expected to increase with FAP. Self-report survey data from ten participants reported in
eight studies were used to calculate the overall quality of life-based RCI. The means for
symptom-based RCIs and quality of life-based RCIs were respectively 5.36 (n = 7, SD = 3.57,
95% CI = 3.09 – 7.61) and 2.93 (n = 10, SD = 3.16, 95% CI = 1.36 – 4.51). This indicated that
both sets of RCIs were large and positive. Further, these RCIs were statistically reliable
given that the 95% confidence intervals did not include zero.
Analysis of Variation in FAP Effects
Given the variation in the metrics across studies, analyses were conducted to evaluate the
extent to which outcomes differed as a function of gender, ethnicity, and age. There were no
significant relationships observed between any of these demographic characteristics and any
outcome measure using PND, SMTE, dsw, or RCI.
A second set of analyses examined whether outcomes varied as a function of number of
sessions and whether FAP-Alone outcomes differed from FAP-Enhanced outcomes. Results
indicated that number of sessions was not significantly associated with any outcome measure
using PND, SMTE, dsw, or RCI. In order to compare FAP-Alone and FAP-Enhanced outcomes,
independent t-tests were conducted using the PNDs, SMTEs, and dsw as dependent variables.
Results indicated that the mean FAP-Alone PND for TB1s (M = 71.98, SD = 37.68) was
significantly higher (t(20) = 2.62, p < .05) than the mean FAP-Enhanced PND (M = 30.24,
SD = 33.42). Cohen’s d was 1.17, indicating a large effect. All other t-tests were non-significant.
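The reported Cohen’s d can be approximated from the means and standard deviations given above. The sketch below assumes that the two standard deviations are pooled with equal weight, because the group sizes for the FAP-Alone and FAP-Enhanced comparison are not reported here; with unequal group sizes, a weighted pooled standard deviation would give a slightly different value. The function name is hypothetical.

```python
import math


def cohens_d_equal_weight(mean_1: float, sd_1: float,
                          mean_2: float, sd_2: float) -> float:
    """Cohen's d using an equally weighted pooled SD (an assumption; see lead-in)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Means and SDs as reported for the TB1 PND comparison.
d = cohens_d_equal_weight(71.98, 37.68, 30.24, 33.42)
print(round(d, 2))  # prints 1.17
```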
Failsafe Calculations. Noting that journals are biased toward publishing significant
findings, Rosenthal (1979) developed what has been termed a “failsafe number.” Rosenthal’s
failsafe number is the number of non-significant studies (or hypothesis tests) stored in file
drawers that would be needed to raise the overall p-value in a conventional meta-analysis that
aggregates group-level data to > .05. Because the current study is aggregating single-subject
data, Rosenthal’s failsafe calculation cannot be used. However, Orwin (1983) and Wolf (1986)
developed a failsafe calculation that is based on d as follows:
Nfs = No(do – dc) / dc.
In this formula, Nfs is the failsafe number, No is the number of observed effect sizes, do is
the average d observed across studies, and dc is the criterion. Orwin (1983) and Wolf (1986)
recommend using Cohen’s (1988) effect size classification scheme for small or medium effects
(small d = .2, medium d = .5) to set the value of dc. Their argument is that a criterion effect
size of .2 or .5 represents minimal to modest responsiveness to a treatment, which would be thought
of as “clinically non-significant.”
The average Swanson’s dsw obtained in this study is analogous to do in Orwin’s (1983) and
Wolf’s (1986) failsafe equation. As such, their equation can be used as a heuristic technique to
estimate failsafe numbers for the current synthesis of single-subject data. For TB1s, the average dsw
was 1.33 based on 21 effect sizes. Thus, the failsafe numbers compared against hypothetical small
or medium file drawer effect sizes are:
Small: Nfs = 21(1.33 - .2)/.2 = 118.65 and
Medium: Nfs = 21(1.33 - .5)/.5 = 34.86.
For TB2s, the average dsw was 1.85 based on 24 effect sizes. Thus, the failsafe numbers compared
against hypothetical small and medium file drawer effect sizes are:
Small: Nfs = 24(1.85 - .2)/.2 = 198 and
Medium: Nfs = 24(1.85 - .5)/.5 = 64.8.
These failsafe numbers address the following question: “How many unpublished
FAP single-subject treatment outcome studies demonstrating no improvement from baseline to
post-treatment are needed to reduce the overall Swanson’s dsw to the small (.2) or medium (.5)
classification level?” As is evident in these calculations, the failsafe findings suggest that the
FAP outcomes found in this quantitative synthesis are quite robust when contrasted against the .2
and .5 criteria.
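The failsafe logic can be made concrete with a short sketch. It assumes, as in the question posed above, that every unpublished study contributes an effect size of zero, so that adding Nfs such studies dilutes the observed mean to No × do / (No + Nfs); setting this diluted mean equal to dc and solving for Nfs gives the Orwin (1983) failsafe expression. The function names below are hypothetical, and the inputs are the mean dsw values and counts reported in this section.

```python
def orwin_failsafe_n(n_observed: int, d_observed: float, d_criterion: float) -> float:
    """Number of zero-effect studies needed to pull the mean effect down to d_criterion."""
    if d_criterion <= 0:
        raise ValueError("d_criterion must be positive")
    return n_observed * (d_observed - d_criterion) / d_criterion


def diluted_mean(n_observed: int, d_observed: float, n_added: float) -> float:
    """Mean effect size after adding n_added studies that each have d = 0."""
    return n_observed * d_observed / (n_observed + n_added)


# (label, number of effect sizes, mean dsw) as reported in this section.
reported = [("TB1", 21, 1.33), ("TB2", 24, 1.85)]
for label, n, d in reported:
    for name, criterion in (("small", 0.2), ("medium", 0.5)):
        nfs = orwin_failsafe_n(n, d, criterion)
        # Sanity check: adding nfs null studies should return the criterion value.
        check = diluted_mean(n, d, nfs)
        print(f"{label} {name}: Nfs = {nfs:.2f}, diluted mean = {check:.2f}")
```

The check on the diluted mean shows why the failsafe number grows quickly as the criterion shrinks: a smaller dc requires proportionally more null results to offset the same observed effects.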
Discussion
The current study assessed the methodology of the FAP outcome studies by examining
and reporting the methods of all studies included. In order to better understand the effects of
FAP, overall effect sizes were calculated using PND, SMTE, dsw, and RCIs. Variation in effect
sizes was also evaluated.
The current review located 18 published FAP studies with outcome data and two
unpublished studies with outcome data for a total of 20 studies reviewed. A majority of the
studies used an A/B design and one used a reversal design. Participants varied in age, ranging
from childhood to late adulthood. A majority of participants were male. A majority of
participants were Caucasian. However, 13 out of the 20 studies did not report ethnicity. Few
studies reported DSM diagnoses.
The SCRIBE method (Tate et al., 2016) was used to evaluate the methodological rigor of
FAP studies. None of the studies met all 26 criteria designed by Tate et al. (2016) to assess the
methodological rigor of single-subject studies. This indicated that the FAP literature can be
improved with more rigorous design and treatment outcome evaluation methods. A majority of
the studies provided basic information related to the SCRIBE criteria (e.g., background
information, aims, study design, and description of intervention). However, some of the more
sophisticated criteria were not met by the studies in the current review (e.g., statement of adverse
events, availability of study protocol, or explicit statement of whether or not any funding sources
were provided for the study).

In summary, the FAP literature has some positive methodological features. First, there is
variation in targets of treatment, with measures covering depression, anxiety, personality
disorders, and several other interpersonal issues. Second, there is good variation in age, gender,
and ethnicity. Finally, articles came from a diversity of researchers from several different
continents including North America, South America, and Europe.
Despite the above-mentioned methodological strengths, most FAP outcome studies are
limited by the use of designs which provide only weak causal inference, uncertain construct
validity, and questionable generalizability. First, the extensive use of A/B designs provides weak
evidence of causality. Several other well-known causal inference threats could produce A/B
changes in behavior, such as regression to the mean, maturation, and reactivity to
observation/repeated measurement. Second, the absence of placebo comparison conditions
(e.g., non-contingent but affirming therapist responding during sessions) hampers the construct
validity of FAP studies because one cannot infer that it is FAP techniques per se that are
responsible for A/B changes. Finally, the near exclusive use of single-subject investigations
constrains generalizability to other persons and contexts.
Based on our quantitative analysis of outcomes, there is evidence that there were reliable
differences from pre-treatment to post-treatment. Further, the magnitude of observed differences
varied as a function of outcome metric used, target of treatment, and treatment type. For PND
and SMTE analyses the overall mean effect sizes fell into the “questionably effective” to “fairly
effective” classification. Alternatively, the Swanson’s dsw analyses indicated that pre-treatment
to post-treatment differences were consistently large and reliable. Similarly, the mean RCIs were
also large and consistently classified as “statistically reliable.” In terms of treatment targets, the
pre-treatment to post-treatment differences tended to be larger for TB2s relative to TB1s. Finally,
outcomes from FAP-Alone interventions were more favorable than FAP-Enhanced interventions
for TB1s.
The quantitative outcomes, combined with methodological considerations, can be used to
address three important questions about the FAP literature. These questions are: (a) is there
evidence of statistically reliable and clinically significant pre-treatment to post-treatment
differences; (b) are the pre-treatment to post-treatment differences greater than what would have
been expected to have occurred by chance or the passage of time; and (c) can the pre-treatment
to post-treatment differences be unambiguously attributed to FAP?
Regarding the first question, the results of this quantitative synthesis indicate that across
studies, metrics, and targets of treatment, there is evidence that TB1s reliably declined from
pre-treatment to post-treatment and that TB2s reliably increased from pre-treatment to
post-treatment. The clinical significance of treatment effects varied as a function of the metric used to
quantify outcomes. Specifically, PND and SMTE analyses yielded more conservative estimates
of FAP effects relative to Swanson’s dsw.
The differences between PND, SMTE, and dsw can be attributed to an important difference
in how these metrics are calculated. The PND and SMTE are derived from comparisons between
baseline and measures collected in the early, middle, and end points of therapy whereas dsw is
derived from comparisons between baseline and the end-of-treatment measurement. As such, the
PND and SMTE include data points that were collected before the intervention was completed.
This would inevitably reduce estimates of effectiveness given that behavior change is expected
to occur gradually as a function of therapist reinforcement and shaping.
PND and SMTE have been used in many single-subject meta-analytic reviews. The
popularity of these metrics likely arises from their ease of calculation (simply counting data
points in graphs that fall above and below some reference) and need for visual inspection only.
Alternatively, Swanson’s dsw requires that the study author provide numerical data for each point
on a graph (which very rarely occurs) or the use of digitizing and mapping technology in order to
generate values for data points on a graph.
Given that PND and SMTE are derived from less relevant data in the FAP literature and
that tools for mapping and digitizing data are now more readily available, we recommend that future
single-subject quantitative syntheses use dsw metrics for evaluating treatment outcomes where (a) end-state
functioning is the focus of treatment and (b) behavior change is expected to occur gradually
across time.
For the second question, some of the results of this review indicate that the pre-treatment
to post-treatment differences are greater than what would have occurred due to the passage of
time. This position is primarily based on RCI, dsw, and SMTE data. Specifically, the RCI
calculation takes into account the standard error of measurement which is an index of the amount
of change in a score that would be expected to occur by chance and/or with repeated
administration across time. Most of the RCIs in our analyses exceeded the 95% confidence
interval by a large margin. The dsw findings also support this conclusion. Correcting the
standardized pre-treatment-post-treatment difference for correlation reduces the influence of
serial dependency and trends in the dsw calculation. This is important because serial dependency
and trends would be the principal mechanism through which non-treatment related factors (e.g.,
regression, maturation, etc.) would produce pre-treatment-post-treatment differences in TB1s and
TB2s. Finally, the SMTEs corrected for pre-treatment trends. Taken together, the RCI, dsw, and
SMTE outcomes support an argument that the pre-treatment to post-treatment changes were not
exclusively due to random variation, passage of time, or trends.
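A minimal sketch of the RCI logic described above follows, assuming the Jacobson and Truax (1991) formulation in which the pre-post difference is divided by the standard error of the difference, which is itself derived from the standard error of measurement. The scale standard deviation, reliability coefficient, and scores are hypothetical values chosen for illustration, and the sign convention (post minus pre) is an assumption; the RCIs reported in this synthesis appear to be coded so that improvement is positive.

```python
import math


def reliable_change_index(pre: float, post: float,
                          sd_norm: float, reliability: float) -> float:
    """RCI = (post - pre) / S_diff, with S_diff built from the standard error of measurement."""
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    # Standard error of measurement: score variability attributable to unreliability.
    sem = sd_norm * math.sqrt(1 - reliability)
    # Standard error of the difference between two administrations of the measure.
    s_diff = math.sqrt(2 * sem ** 2)
    return (post - pre) / s_diff


def is_reliable_change(rci: float, critical_value: float = 1.96) -> bool:
    """Change exceeds what measurement error alone would produce (two-tailed, p < .05)."""
    return abs(rci) > critical_value


# Hypothetical symptom scale: SD = 10, reliability = .90, score drops from 30 to 15.
rci = reliable_change_index(pre=30, post=15, sd_norm=10, reliability=0.90)
print(round(rci, 2), is_reliable_change(rci))
```

A less reliable measure inflates the standard error of measurement, so the same raw change yields a smaller RCI.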

Regarding the final question, it is unlikely the differences can be unambiguously
attributed to FAP. As noted earlier, the absence of placebo comparison conditions (e.g., an
ABAB design varying contingent and non-contingent therapist responding or random
assignment to a placebo control group using a group design) makes it impossible to attribute the
pre-treatment to post-treatment differences to FAP. Several other factors may have promoted
changes in participant behavior (e.g., therapist attention and empathy) that were not explicitly
part of the FAP intervention. Finally, a number of the classic causal inference threats could
account for some of the pre-post differences. The more salient of these threats would be: history,
maturation, regression to the mean, and reactivity to observation/measurement.
In examination of the potential moderators, one interesting finding was that FAP-Alone
outperformed FAP-Enhanced interventions based on the PND effect sizes for TB1s. This finding
is logical if one considers the nature of FAP sessions and therapist-client interactions.
Specifically, in FAP-Alone, the therapist will engage in actions that are systematically,
consistently, and directly targeting specific behaviors. It is possible that in the FAP-Enhanced
interventions, the therapist was providing more didactic material and possibly focused on other
behaviors that were not directly related to interpersonal interactions. Thus, in any given session,
there would be fewer opportunities for the client to emit target behaviors. Similarly, the therapist
would have fewer opportunities to provide systematic consequences for the targeted behaviors.
Another consistent finding was that there were larger effects observed for TB2s relative
to TB1s. This finding is congruent with FAP principles and learning theory. Specifically, FAP
emphasizes the use of reinforcement to promote acquisition of adaptive TB2s in session. While
extinction and punishment can be used to suppress TB1s, these techniques are not preferred
because they may have an adverse impact on the therapeutic relationship. Instead, the therapist
aims to increase TB2s with the notion that this increase in adaptive behavior will simultaneously
reduce TB1s as in differential reinforcement of other/incompatible behavior. As such, it would
be plausible to argue that the direct and more frequent reinforcement of TB2s would yield a
larger treatment effect. Further, it may be that it is more challenging for therapists to address
TB1s (e.g., re-direction, selective ignoring, blocking) than it is to reinforce TB2s. Previous
research has provided a component-process analysis of reinforcing TB2s (Haworth et al., 2015), and
further research may benefit from exploring this process for blocking TB1s.
Limitations
One major consideration of the current findings is the “file drawer” problem. The failsafe
question for this literature is: “How many times have FAP researchers initiated a single-subject
treatment study but failed to report or publish the result because the client did not respond to
treatment?” The failsafe analyses in this paper indicate that a substantial number of single-
subject studies with nonresponsive clients would be needed to reduce the average Swanson’s dsw
to small or medium effect size classifications. However, given that there is ample evidence of a
bias toward publishing significant results relative to non-significant results in psychological
research, it is reasonable to argue that there are at least some studies with nonresponsive clients
in the file drawers of FAP researchers. Adding data from these studies would reduce the
magnitude and reliability of the overall effect sizes reported in this paper. Thus, it is likely that
the findings reported in this quantitative synthesis overestimate the effectiveness of FAP to some
extent. However, the precise amount of overestimation cannot be calculated.
Conclusion
The current study is a quantitative analysis of the existing FAP single-subject design
treatment literature. It provides an estimate of the efficacy of FAP based on the currently
available single-subject studies. These results indicate that FAP may be associated with reliable
treatment effects for a variety of behaviors based on pre-post comparisons. As such, FAP may be
a promising approach that is based on an innovative application of behavioral principles to the
therapeutic relationship. However, it is difficult to assess if changes in the participants in this
study are due solely to FAP due to the limitations of the research designs used. It is also difficult
to assess how many unpublished, failed trials exist that may nullify the results presented in this
paper.
There remains a clear need for more systematic and methodologically rigorous FAP
research. Most importantly, the authors recommend that researchers conduct more randomized
control trials so that stronger causal statements can be made about FAP effectiveness. Although
the turn-by-turn coding of the FAP rating system (FAPRS: Callaghan & Follette, 2008) may be
cumbersome for randomized control trials, the recent development of treatment adherence
measures (e.g., Maitland et al., 2016a; Maitland et al., 2016b) and self-report measures targeting
FAP-specific constructs (e.g., Darrow, Callaghan, Bonow, & Follette, 2014; Leonard et al.,
2014) may assist with the development of randomized control trials.
If FAP researchers continue to utilize single-subject design studies, then it may also be
beneficial to use guidelines established for strong methodological rigor (e.g., Tate et al., 2016).
Moreover, the use of multiple baseline designs, A/B/C designs, or reversal/withdrawal designs
could more fully assess the causal effects of FAP. In addition to stronger methodology, the
authors recommend that researchers place a greater emphasis on collecting and reporting
participant demographic characteristics in order to more clearly understand the efficacy of FAP
and the different populations with which it may be effective. Finally, the use of placebo
comparison conditions would permit a better evaluation of the specific effects of FAP
techniques relative to nonspecific supportive listening and responding.

References
Barkham, M., Margison, F., Leach, C., Lucock, M., Mellor-Clark, J., Evans, C., . . . McGrath, G.
(2001). Service profiling and outcomes benchmarking using the CORE-OM: Toward
practice-based evidence in the psychological therapies. Journal of Consulting and
Clinical Psychology, 69(2), 184-196. doi:10.1037/0022-006X.69.2.184
Barlow, D. H., Nock, M. K., & Hersen, M. (2009). Single case experimental designs: Strategies
for studying behavior change (3rd ed.). Boston, MA: Allyn and Bacon.
Barraca, J. (2004). Spanish Adaptation of the Acceptance and Action Questionnaire
(AAQ). International Journal of Psychology and Psychological Therapy, 4, 505-
515.
Baruch, D. E., Kanter, J. W., Busch, A. B., & Juskiewicz, K. (2009). Enhancing the therapy
relationship in Acceptance and Commitment Therapy for psychotic symptoms. Clinical
Case Studies, 8, 241-257.
Bond, F. W., Hayes, S. C., Baer, R. A., Carpenter, K. M., Guenole, N., Orcutt, H. K., ... & Zettle,
R. D. (2011). Preliminary psychometric properties of the Acceptance and Action
Questionnaire–II: A revised measure of psychological inflexibility and experiential
avoidance. Behavior Therapy, 42(4), 676-688.
Busch, A. M., Kanter, J. W., Callaghan, G. M., Baruch, D. E., Weeks, C. E., & Berlin, K. S.
(2009). A micro-process analysis of functional analytic psychotherapy's mechanism of
change. Behavior Therapy, 40(3), 280-290. doi:10.1016/j.beth.2008.07.003
Callaghan, G. M. (2006). The Functional Idiographic Assessment Template (FIAT) System: For
use with interpersonally-based interventions including Functional Analytic
Psychotherapy (FAP) and FAP-Enhanced treatments. The Behavior Analyst Today, 7,
357-398. doi:10.1037/h0100160
Callaghan, G. M., & Follette, W. C. (2008). Coding Manual for the Functional Analytic
Psychotherapy Rating Scale (FAPRS). The Behavior Analyst Today, 9, 57-97.
doi:10.1037/h0100649
Callaghan, G. M., Summers, C. J., & Weidman, M. (2003). The treatment of histrionic and
narcissistic personality disorder behaviors: A single-subject demonstration of clinical
improvement using functional analytic psychotherapy. Journal of Contemporary
Psychotherapy, 33(4), 321-339. doi:10.1023/B:JOCP.0000004502.55597.81
Cattivelli, R., Tirelli, V., Berardo, F., & Perini, S. (2012). Promoting appropriate behavior in
daily life contexts using functional analytic psychotherapy in early-adolescent children.
International Journal of Behavioral Consultation and Therapy, 7(2-3), 25-32.
doi:10.1037/h0100933
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ:
Lawrence Erlbaum Associates.
Corrigan, P. W. (2001). Getting ahead of the data: A threat to some behavior therapies. The
Behavior Therapist, 24(9), 189-193.
Darrow, S. M., Callaghan, G. M., Bonow, J. T., & Follette, W. C. (2014). The Functional
Idiographic Assessment Template-Questionnaire (FIAT-Q): Initial psychometric
properties. Journal of Contextual Behavioral Science, 3(2), 124–135.
Ferro-Garcia, R., Lopez-Bermudez, M. A., & Valero-Aguayo, L. (2012). Treatment of a disorder
of self through functional analytic psychotherapy. International Journal of Behavioral
Consultation and Therapy, 7(2-3), 45-51. doi:10.1037/h0100936

García, R. F. (2008). Recent Studies in Functional Analytic Psychotherapy. International
Journal of Behavioral Consultation and Therapy, 4(2), 239-249. doi:10.1037/h0100846
Haynes, S. N., O’Brien, W. H., & Kaholokula, J. K. (2011). Behavioral Assessment and Case
Formulation. Hoboken, N.J.: John Wiley & Sons.
Hayes, S. C., Masuda, A., Bissett, R., Luoma, J., & Guerrero, L. F. (2005). DBT, FAP, and ACT:
How empirically oriented are the new behavior therapy technologies? Behavior
Therapy, 35, 35–54. doi:10.1016/S0005-7894(04)80003-0
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S. L., & Wolery, M. (2005). The use of
single-subject research to identify evidence-based practice in special education.
Exceptional Children, 71, 165–179.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining
meaningful change in psychotherapy research. Journal of Consulting and Clinical
Psychology, 59, 12-19.
Kanter, J. W., Rusch, L. C., Busch, A. M., & Sedivy, S. K. (2009). Validation of the Behavioral
Activation for Depression Scale (BADS) in a community sample with elevated
depressive symptoms. Journal of Psychopathology and Behavioral Assessment, 31(1),
36-42.
Kanter, J., Landes, S., Busch, A., Rusch, L., Brown, K., Baruch, D., & Holman, G. (2006). The
effect of contingent reinforcement on target variables in outpatient psychotherapy for
depression: A successful and unsuccessful case using functional analytic psychotherapy.
Journal of Applied Behavior Analysis, 39(4), 463-467. doi:10.1901/jaba.2006.21-06
Kanter, J. W., Parker, C. & Kohlenberg, R. J. (2001). Finding the self: A behavioral measure and
its clinical implications. Psychotherapy: Theory, Research and Practice, 38, 198-211.
Kazdin, A. E. (1982). Single-case research designs: Methods for clinical and applied settings.
New York: Oxford University Press, Inc.
Kohlenberg, R. J., & Tsai, M. (1991). Functional Analytic Psychotherapy: A guide for creating
intense and curative therapeutic relationships. New York: Plenum.
Kohlenberg, R., & Tsai, M. (1994). Improving cognitive therapy for depression with functional
analytic-psychotherapy – theory and case study. Behavior Analyst, 17(2), 305-319.
Kroenke, K., Spitzer, R. L., & Williams, J. B. W. (2001). The PHQ-9: Validity of a brief
depression severity measure. Journal of General Internal Medicine, 16(9), 606-613.
doi:10.1046/j.1525-1497.2001.016009606.x
Lambert, M. J., Burlingame, G. M., Umphress, V., Hansen, N. B., Vermeersch, D. A., Clouse, G.
C., & Yanchar, S. C. (1996). The reliability and validity of the outcome questionnaire.
Clinical Psychology & Psychotherapy, 3(4), 249-258.
Landes, S. J., Kanter, J. W., Weeks, C. E., & Busch, A. M. (2013). The impact of the active
components of Functional Analytic Psychotherapy on idiographic target behaviors.
Journal of Contextual Behavioral Science, 2(1), 49-57. doi:10.1016/j.jcbs.2013.03.004
Leonard, R. C., Knott, L. E., Lee, E. B., Singh, S., Smith, A. H., Kanter, J., … Wetterneck, C.
T. (2014). The development of the functional analytic psychotherapy intimacy scale. The
Psychological Record, 64(4), 647-657.
Lizarazo, N. E., Muñoz-Martínez, A. M., Santos, M. M., & Kanter, J. W. (2015). A within-
subjects evaluation of the effects of Functional Analytic Psychotherapy on in-session and
out-of-session client behavior. The Psychological Record, 65(3), 463-474.
doi:10.1007/s40732-015-0122-7
Lopez, F. J. C. (2002) Jealousy: A case of application of functional analytic psychotherapy.
Apuntes de Psicologia, 20(3), 347-368.
Maitland, D. W. M., Kanter, J. W., Tsai, M., Kuczynski, A. M., Manbeck, K. E., & Kohlenberg,
R. J. (2016b). Preliminary findings on the effects of online Functional Analytic
Psychotherapy training on therapist competency. The Psychological Record, 66(4), 627-
637. doi:10.1007/s40732-016-0198-8
Maitland, D. W. M., Petts, R. A., Knott, L. E., Briggs, C. A., Moore, J. A., & Gaynor, S. T.
(2016a). A randomized controlled trial of Functional Analytic Psychotherapy versus
watchful waiting: Enhancing social connectedness and reducing anxiety and avoidance.
Behavior Analysis: Research and Practice, 16(3), 103-122. doi:10.1037/bar0000051
Manolov, R., Guilera, G., & Sierra, V. (2014). Weighting strategies in the meta-analysis of
single-case studies. Behavior Research Methods, 46, 1152-1166.
Manduchi, K., & Schoendorff, B. (2012). First steps in FAP: Experiences of beginning
functional analytic psychotherapy therapist with an obsessive-compulsive personality
disorder client. International Journal of Behavioral Consultation and Therapy, 7(2-3),
72-77. doi:10.1037/h0100940
Mangabeira, V., Kanter, J., & Del Prette, G. (2012). Functional analytic psychotherapy (FAP): A
review of publications from 1990 to 2010. International Journal of Behavioral
Consultation and Therapy, 7(2-3), 78-89. doi:10.1037/h0100941
Manos, R. C., Kanter, J. W., Rusch, L. C., Turner, L. B., Roberts, N. A., & Busch, A. M. (2009).
Integrating functional analytic psychotherapy and behavioral activation for the treatment
of relationship distress. Clinical Case Studies, 8(2), 122-138.
doi:10.1177/1534650109332484
McCarthy-Larzelere, M., Diefenbach, G. J., Williamson, D. A., Netemeyer, R. G., Bentz, B. G.,
& Manguno-Mire, G. M. (2001). Psychometric properties and factor structure of the
worry domains questionnaire. Assessment, 8(2), 177-191.
doi:10.1177/10731911010080020
McClafferty, C. (2012). Expanding the cognitive behavioural therapy traditions: An application
of functional analytic psychotherapy treatment in a case study of depression.
doi:10.1037/h0100942
Meyer, T. J., Miller, M. L., Metzger, R. L., & Borkovec, T. D. (1990). Development and
validation of the Penn state worry questionnaire. Behaviour Research and Therapy,
28(6), 487-495. doi:10.1016/0005-7967(90)90135-6
Mundt, J. C., Marks, I. M., Shear, M. K., & Greist, J. M. (2002). The Work and Social
Adjustment Scale: A simple measure of impairment in functioning. The British Journal
of Psychiatry, 180(5), 461-464.
Orwin, R. G. (1983). A failsafe N for effect size in meta-analysis. Journal of Educational
Statistics, 8, 157-159. doi:10.2307/1164923
Oshiro, C. K. B., Kanter, J., & Meyer, S. B. (2012). A single-case experimental demonstration of
functional analytic psychotherapy with two clients with severe interpersonal problems.
doi:10.1037/h0100945
Öst, L. G. (2008). Efficacy of the third wave of behavioral therapies: a systematic review and
meta-analysis. Behaviour Research and Therapy, 46(3), 296-321.
doi:10.1016/j.brat.2007.12.005
Pedersen, E. R., Callaghan, G. M., Prins, A., Nguyen, H. V., & Tsai, M. (2012). Functional
analytic psychotherapy as an adjunct to cognitive-behavioral treatments for posttraumatic
stress disorder: Theory and application in a single case design. International Journal of
Behavioral Consultation and Therapy, 7(2-3), 125-134. doi:10.1037/h0100947
Rosenthal, R. (1979). The “file drawer problem” and tolerance of null results. Psychological
Bulletin, 86, 638-641. doi:10.1037/0033-2909.86.3.638
Scruggs, T. E., & Mastropieri, M. A. (1998). Summarizing single-subject research: Issues and
applications. Behavior Modification, 22, 221-242.
Shadish, W. R., Hedges, L. V., & Pustejovsky, J. E. (2014). Analysis and meta-analysis of
single-case designs with a standardized mean difference statistic: A primer and
applications. Journal of School Psychology, 52(2), 123. doi:10.1016/j.jsp.2013.11.005
Shadish, W. R., & Sullivan, K.J. (2011). Characteristics of single-case designs used to assess
intervention effects in 2008. Behavior Research Methods, 43, 971-980.
Singh, S., & O’Brien, W. H. (2016). Functional analytic psychotherapy for nursing home
residents: A single-subject investigation of session-by-session changes. Journal of
Contemporary Psychotherapy, doi:10.1007/s10879-016-9352-5
Spanier, G. B. (1976). Measuring dyadic adjustment: New scales for assessing the quality of
marriage and similar dyads. Journal of Marriage and the Family, 15-28.
Spitzer, R. L., Kroenke, K., Williams, J. B. W., & Löwe, B. (2006). A brief measure for
assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine,
166(10), 1092.
Steer, R. A., Ball, R., Ranieri, W. F., & Beck, A. T. (1999). Dimensions of the beck depression
inventory‐II in clinically depressed outpatients. Journal of Clinical Psychology, 55(1),
117-128. doi:10.1002/(SICI)1097-4679(199901)55:1
Swanson, H. L., Hoskyn, M., & Lee, C. (1999). Interventions for students with learning
disabilities: A meta-analysis of treatment outcomes. New York: Guilford.
Tate, R. L., Perdices, M., Rosenkoetter, U., McDonald, S., Togher, L., Shadish, W., ... &
Sampson, M. (2016). The Single-Case Reporting Guideline In BEhavioural Interventions
(SCRIBE) 2016: Explanation and elaboration. Archives of Scientific Psychology, 4(1), 1-
9.
Tsai, M., Kohlenberg, R. J., Kanter, J. W., Kohlenberg, B., Follette, W. C., & Callaghan, G. M.
(2009). A guide to functional analytic psychotherapy: Awareness, courage, love, and
behaviorism. New York, NY US: Springer Science + Business Media.
Villas-Bôas, A., Meyer, S. B., & Kanter, J. W. (2016). The effects of analyses of contingencies
on clinically relevant behaviors and out-of-session changes in functional analytic
psychotherapy. The Psychological Record, 66(4), 599-609. doi:10.1007/s40732-016-
0195-y
Virella, B., Arbona, C., & Novy, D. M. (1994). Psychometric properties and factor structure of
the spanish version of the state-trait anxiety inventory. Journal of Personality
Assessment, 63(3), 401-412. doi:10.1207/s15327752jpa6303_1

White, O. R. (1974). The “split middle” a “quickie” method of trend estimation. University of
Washington, Experimental Education Unit, Child Development and Mental Retardation
Center.
Wolf, F. M. (1986). Meta-analysis: Quantitative methods for research synthesis. London: Sage.
Xavier, R. N., Kanter, J. W., & Meyer, S. B. (2012). Transitional probability analysis of two
child behavior analytic therapy cases. International Journal of Behavioral Consultation
and Therapy, 7(2-3), 182-188. doi:10.1037/h0100954

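Table 1
Brief description of studies used in quantitative synthesis

| Author | Participant Description (Age) | Study Design | Treatment | Baseline Sessions | Treatment Sessions | Total Sessions | SCRIBE Score |
|---|---|---|---|---|---|---|---|
| Baruch, Kanter, Busch, & Juskiewicz (2009) | 1 Male (21) | Case Study | FAP-Enhanced ACT | | | 37 | 9 |
| Busch et al. (2009) | 1 Female (25) | A/B Design | FAP-Enhanced CBT | 5 | 15 | 20 | 15 |
| Callaghan, Summers, & Weidman (2003) | 1 Male (30) | A/B Design | FAP | | | 23 | 13 |
| Cattivelli (Unpublished) | 1 Unknown | Multiple Baseline | FAP | 2 | 6 | 8 | 2 |
| Cattivelli (Unpublished) | 2 Unknown | Multiple Baseline | FAP | 2 | 6 | 8 | 2 |
| Cattivelli (Unpublished) | 3 Unknown | Multiple Baseline | FAP | 2 | 8 | 10 | 2 |
| Cattivelli, Tirelli, Berardo, & Perini (2012) | 1 Male (12) | Multiple Baseline | FAP | 4 | 11 | 15 | 14 |
| Cattivelli, Tirelli, Berardo, & Perini (2012) | 2 Male (11) | Multiple Baseline | FAP | 7 | 9 | 16 | 14 |
| Cattivelli, Tirelli, Berardo, & Perini (2012) | 3 Male (12) | Multiple Baseline | FAP | 7 | 8 | 15 | 14 |
| Cattivelli, Tirelli, Berardo, & Perini (2012) | 4 Male (13) | Multiple Baseline | FAP | 6 | 14 | 20 | 14 |
| Cattivelli, Tirelli, Berardo, & Perini (2012) | 5 Male (15) | Multiple Baseline | FAP | 8 | 18 | 26 | 14 |
| Ferro-Garcia, Lopez-Bermudez, & Valero-Aguayo (2012) | 1 Female (24) | Case Study | FAP | 7 | 16 | 23 | 9 |
| Kanter et al. (2006) | 1 Female (24) | A/B Design | FAP | 12 | 8 | 20 | 13 |
| Kanter et al. (2006) | 2 Male (42) | A/B Design | FAP | 8 | 4 | 12 | 13 |
| Kohlenberg & Tsai (1994) | 1 Male (35) | A/B Design | FAP-Enhanced CT | 8 | 7 | 15 | 12 |
| Landes, Kanter, Weeks, & Busch (2013) | 1 Female (44) | A/A+B Design | FAP | 6 | 4 | 10 | 17 |
| Landes, Kanter, Weeks, & Busch (2013) | 2 Female (20) | A/A+B Design | FAP | 6 | 7 | 13 | 17 |
| Landes, Kanter, Weeks, & Busch (2013) | 3 Male (28) | A/A+B Design | FAP | 10 | 4 | 14 | 17 |
| Landes, Kanter, Weeks, & Busch (2013) | 4 Male (26) | A/A+B Design | FAP | 7 | 7 | 14 | 17 |
| Lizarazo, Muñoz-Martínez, Santos, & Kanter (2015) | 1 Male (25) | A/A+B Design | FAP | 4 | 10 | 14 | 18 |
| Lizarazo, Muñoz-Martínez, Santos, & Kanter (2015) | 2 Female (47) | A/A+B Design | FAP | 5 | 14 | 19 | 18 |
| Lizarazo, Muñoz-Martínez, Santos, & Kanter (2015) | 3 Female (21) | A/A+B Design | FAP | 6 | 10 | 16 | 18 |
| Lopez (2002) | 1 Male (31) | A/B Design | FAP | 3 | 32 | 35 | 12 |
| Manduchi & Schoendorff (2012) | 1 Female (36) | Case Study | FAP | | | 52 | 10 |
| Manos et al. (2009) | 1 Female (22) | Case Study | FAP-Enhanced BA | | | 8 | 13 |
| McClafferty (2012) | 1 Male (35) | Case Study | FAP-Enhanced BA | | | 30 | 9 |
| McCluskey (Unpublished) | 1 Male (25) | A/B Design | FAP-Enhanced BA | 10 | 10 | 20 | 3 |
| Oshiro, Kanter, & Meyer (2012) | 1 Female (46) | Reversal Design | FAP | 11 | 9 | 20 | 16 |
| Oshiro, Kanter, & Meyer (2012) | 2 Male (18) | Reversal Design | FAP | 12 | 8 | 20 | 16 |
| Pedersen, Callaghan, Prins, Nguyen, & Tsai (2012) | 1 Female (41) | Case Study | FAP | | | | 14 |
| Singh & O'Brien (2016) | 1 Male (72) | A/B Design | FAP | 2 | 4 | 6 | 15 |
| Singh & O'Brien (2016) | 2 Male (52) | A/B Design | FAP | 2 | 4 | 6 | 15 |
| Singh & O'Brien (2016) | 3 Female (31) | A/B Design | FAP | 2 | 4 | 6 | 15 |
| Villas-Bôas, Meyer, & Kanter (2016) | 1 Female (38) | A/B/BC/B2/BC2 | FAP | 5 | 33 | 38 | 14 |
| Villas-Bôas, Meyer, & Kanter (2016) | 2 Female (32) | A/B/BC/B2/BC2 | FAP | 5 | 28 | 33 | 14 |
| Xavier, Kanter, & Meyer (2012) | 1 Female (10) | Case Study | FAP-Enhanced Child Therapy | | | 17 | 11 |
| Xavier, Kanter, & Meyer (2012) | 2 Male (7) | Case Study | FAP-Enhanced Child Therapy | | | 31 | 11 |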
Table 2
Description of Diagnoses, TB1s, and TB2s per study
Authors P DSM Disorder CRB1 Dimension CRB2 Dimension
Disclosure Disclosure
Dysthymia
Baruch et al. (2009) 1 Emotional Emotional
Psychotic Symptoms
Expression Expression
Disclosure
Major Depressive Disorder
Busch et al. (2009) 1 Conflict Emotional
Histrionic Personality Disorder
Expression
Assertion of Needs Assertion of Needs
Bidirectional Bidirectional
Narcissistic Personality Disorder Communication Communication
Callaghan et al. (2003) 1
Histrionic Personality Disorder Disclosure Disclosure
Emotional Emotional
Cattivelli et al. (2012) 1 No description
Disclosure
Ferro-Garcia et al. (2012) 1 Major Depressive Disorder Disclosure Emotional
Expression
Major Depressive Disorder Conflict
Kanter et al. (2006) 1 Disclosure
Histrionic Personality Disorder Disclosure
Major Depressive Disorder Bidirectional
Bidirectional
2 Personality Disorder, NOS Communication
Communication
Past polysubstance dependence Conflict
Kohlenberg & Tsai (1994) 1 Depression No description
Landes et al. (2013) 1 Generalized Anxiety Disorder Assertion of Needs Assertion of Needs
Depressive Personality Disorder
Avoidant Personality Disorder Bidirectional Bidirectional
2
Obsessive Compulsive Personality Disorder Communication Communication
Past alcohol abuse Emotional Emotional
3
Avoidant, Depressive, and Borderline Expression Expression
Personality Disorder
Emotional Emotional
4 Expression Expression
Assertion of Needs Assertion of Needs
Lizarazo et al. (2015) 1 Borderline Personality Disorder Disclosure Disclosure
Bidirectional Bidirectional
2
Communication Communication
3 Disclosure Disclosure
Bidirectional
Lopez (2002) 1 Communication Disclosure
Conflict
Bidirectional
Communication Disclosure
Manduchi & Schoendorff Obsessive Compulsive Personality Disorder
1 Disclosure Emotional
(2012) Borderline Personality Disorder
Emotional Expression
Expression
Conflict
Disclosure
Disclosure
Manos et al. (2009) 1 Emotional
Emotional
Expression
Expression
McClafferty (2012) 2 Depression Emotional Emotional
Borderline Personality Disorder

Oshiro et al. (2012) 1 No description
Schizophrenia
Posttraumatic Stress Disorder Disclosure Disclosure
Pedersen et al. (2012) 1 Alcohol Dependence Emotional Emotional
Dysthymia Expression Expression
Singh & O'Brien (2016) 1 Major Depressive Disorder Emotional Emotional
2 Major Depressive Disorder Disclosure Disclosure
Emotional Emotional
3 Major Depressive Disorder
Villas-Bôas et al. (2016) 1 Conflict Conflict
2 Assertion of Needs Assertion of Needs
Xavier et al. (2012) 1 No description
Cattivelli (Unpublished) 1 No description

McCluskey (Unpublished) 1 No description
Table 3
Effect Size Calculation per Participant for Graphical Data for Target Behaviors-1

| Author | Participant Description (Age) | Baseline Points for Calculation | Treatment Points for Calculation | Swanson’s d | PND | SMTE |
|---|---|---|---|---|---|---|
| Cattivelli (Unpublished) | 1 Unknown | 2 | 6 | 1.63 | 83.33% | |
| Cattivelli (Unpublished) | 2 Unknown | 2 | 6 | 1.70 | 100% | |
| Cattivelli (Unpublished) | 3 Unknown | 4 | 6 | 1.56 | 100% | 100%* |
| Busch et al. (2009) | 1 Female (24/25) | 6/9 | 9/14 | 0.84 | 0% | 88.89%* |
| Kanter et al. (2006) | 2 Male (42) | 6 | 3 | 0.14 | 0% | 100% |
| Kohlenberg & Tsai (1994) | 1 Male (35) | 8 | 5 | 1.19 | 60% | 20% |
| Landes et al. (2013) | 1 Female (20) | 6 | 7 | 2.76 | 100% | 100%* |
| Lizarazo et al. (2015) | 1 Male (25) | 4 | 10 | 0.26 | 60% | 20% |
| Lizarazo et al. (2015) | 2 Female (47) | 5 | 14 | 2.99 | 92.31% | 100%*† |
| Lizarazo et al. (2015) | 3 Female (21) | 6 | 10 | 0.48 | 0% | 60% |
| Lopez (2002) | 1 Male (31) | 3 | 32 | 1.22 | 16.67% | 79.16%* |
| Oshiro et al. (2012) | 1 Female (46) | 4 | 5 | 2.04 | 100% | 100%* |
| Oshiro et al. (2012) | 2 Male (18) | 4 | 4 | 1.56 | 100% | 100%† |
| Pedersen et al. (2012) | 1 Female (41) | 3 | 4 | 0.40 | 75% | |
| Singh & O'Brien (2016) | 1 Male (72) | 2 | 4 | 1.30 | 100% | |
| Singh & O'Brien (2016) | 2 Male (52) | 2 | 4 | 1.56 | 25% | |
| Singh & O'Brien (2016) | 3 Female (31) | 2 | 4 | 1.47 | 100% | |
| Villas-Bôas et al. (2016) | 1 Female (38) | 5 | 8 | 0.18 | 16.67% | 16.67% |
| Villas-Bôas et al. (2016) | 2 Female (32) | 5 | 7 | 2.09 | 85.71% | 100%* |
| Xavier et al. (2012) | 1 Female (10) | 4 | 5 | 2.47 | 60% | 40% |
| Xavier et al. (2012) | 2 Male (7) | 4 | 6 | 0.05 | 16.67% | 16.67% |
| Total Participants | | | | 21 | 21 | 16 |
| Means | | | | 1.33 | 58.70% | 69.43% |
| Confidence Intervals | | | | (0.95 – 1.71) | (41.31 – 76.07%) | (50.70 – 88.17%) |

Note. * p < .05, † p < .01
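The PND values in the table can be illustrated with a minimal Python sketch. It assumes the conventional definition of PND: the share of treatment-phase points that fall beyond the most extreme baseline point in the expected direction of change, multiplied by 100. The function name, the `expect_increase` flag, and the session values are hypothetical and are not taken from any reviewed study.

```python
def pnd(baseline, treatment, expect_increase=True):
    """Percentage of non-overlapping data for one participant.

    baseline, treatment: lists of session values for each phase.
    expect_increase: True when the behavior should rise with treatment
    (e.g., an adaptive behavior), False when it should fall.
    """
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    if expect_increase:
        threshold = max(baseline)  # highest baseline point
        non_overlapping = sum(1 for value in treatment if value > threshold)
    else:
        threshold = min(baseline)  # lowest baseline point
        non_overlapping = sum(1 for value in treatment if value < threshold)
    return 100.0 * non_overlapping / len(treatment)


# Hypothetical data: a problem behavior expected to decrease.
example = pnd([5, 6], [4, 3, 6, 2, 1, 3], expect_increase=False)
print(round(example, 2))  # 83.33
```

With only two baseline points, a single extreme baseline value determines the threshold, which is one reason PND is best read alongside the other indices.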
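Table 4
Effect Size Calculation per Participant for Graphical Data for Target Behaviors-2

| Author | Participant Description (Age) | Baseline Points for Calculation | Treatment Points for Calculation | Swanson’s d | PND | SMTE |
|---|---|---|---|---|---|---|
| Busch et al. (2009) | 1 Female (25) | 6 | 9 | 0.62 | 77.78% | 100%* |
| Cattivelli (Unpublished) | 1 Gender/Age Unknown | 2 | 6 | 1.71 | 100% | |
| Cattivelli (Unpublished) | 2 Gender/Age Unknown | 2 | 6 | 1.73 | 100% | |
| Cattivelli (Unpublished) | 3 Gender/Age Unknown | 4 | 6 | 0.08 | 100% | |
| Cattivelli et al. (2012) | 1 Male (12) | 4 | 11 | 1.70 | 100% | 100%* |
| Cattivelli et al. (2012) | 2 Male (11) | 7 | 9 | 2.06 | 100% | 100%* |
| Cattivelli et al. (2012) | 3 Male (12) | 7 | 8 | 1.95 | 100% | 100%* |
| Cattivelli et al. (2012) | 4 Male (13) | 6 | 14 | 2.43 | 100% | 100%* |
| Cattivelli et al. (2012) | 5 Male (15) | 8 | 18 | 3.33 | 100% | 100%* |
| Landes et al. (2013) | 1 Female (44) | 6 | 4 | 1.99 | 62.50% | 100%† |
| Landes et al. (2013) | 2 Male (28) | 10 | 4 | 2.50 | 0% | 100%† |
| Landes et al. (2013) | 3 Male (26) | 7 | 7 | 2.30 | 100% | 100%* |
| Lizarazo et al. (2015) | 1 Male (25) | 4 | 10 | 2.77 | 80% | 80%* |
| Lizarazo et al. (2015) | 2 Female (47) | 5 | 14 | 1.70 | 76.92% | 84.60%* |
| Lizarazo et al. (2015) | 3 Female (21) | 6 | 10 | 0.07 | 20% | 20% |
| Oshiro et al. (2012) | 1 Female (46) | 4 | 5 | 2.17 | 100% | 100%* |
| Oshiro et al. (2012) | 2 Male (18) | 4 | 4 | 1.54 | 100% | 100%† |
| Singh & O'Brien (2016) | 1 Male (72) | 2 | 4 | 1.72 | 100% | |
| Singh & O'Brien (2016) | 2 Male (52) | 2 | 4 | 2.47 | 25% | |
| Singh & O'Brien (2016) | 3 Female (31) | 2 | 4 | 1.17 | 100% | |
| Villas-Bôas et al. (2016) | 1 Female (38) | 5 | 8 | 2.77 | 66.67% | 100%* |
| Villas-Bôas et al. (2016) | 2 Female (32) | 5 | 7 | 3.28 | 100% | 57.14% |
| Xavier et al. (2012) | 1 Female (10) | 4 | 5 | 2.61 | 80% | 20% |
| Xavier et al. (2012) | 2 Male (7) | 4 | 6 | 0.07 | 33.33% | 33.33% |
| Total Participants | | | | 24 | 24 | 18 |
| Means | | | | 1.85 | 79.39% | 80.66% |
| Confidence Intervals | | | | (1.49 – 2.24) | (67.67 – 92.51%) | (69.27 – 96.85%) |

Note. * p < .05, † p < .01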
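The SMTE percentages and their significance markers can be approximated with the following Python sketch. It is a simplified reading of the split-middle method (White, 1974), not the procedure used to produce the table, and it rests on several assumptions: the trend line is drawn through the median point of each baseline half, the middle point is dropped when the baseline has an odd number of points, the line is not further shifted to balance baseline points above and below it, and significance is a one-sided binomial test with a chance probability of 0.5. All names and data are hypothetical.

```python
from math import comb
from statistics import median


def split_middle_line(baseline):
    """Return (slope, intercept) of a simplified split-middle trend line."""
    n = len(baseline)
    if n < 4:
        raise ValueError("need at least 4 baseline points to split the phase")
    half = n // 2
    sessions = list(range(1, n + 1))
    # Median session and median value within each half of the baseline.
    x1, y1 = median(sessions[:half]), median(baseline[:half])
    x2, y2 = median(sessions[n - half:]), median(baseline[n - half:])
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def smte(baseline, treatment, expect_increase=True):
    """Percent of treatment points beyond the projected baseline trend,
    with a one-sided binomial p-value (assumed chance level 0.5)."""
    slope, intercept = split_middle_line(baseline)
    first_session = len(baseline) + 1
    hits = 0
    for offset, value in enumerate(treatment):
        projected = slope * (first_session + offset) + intercept
        beyond = value > projected if expect_increase else value < projected
        hits += beyond
    n = len(treatment)
    # P(X >= hits) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(hits, n + 1)) / 2 ** n
    return 100.0 * hits / n, p_value


# Hypothetical data: a flat baseline followed by clear improvement.
percent, p = smte([2, 3, 2, 3], [5, 6, 7, 6, 8])
print(percent, p)  # 100.0 0.03125
```

The four-point minimum in this sketch is consistent with the pattern in the tables, where participants with only two or three baseline points have no SMTE value.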
Table 5
Reliable Change Scores

| Author | P | Measure | RCI Score |
|---|---|---|---|
| Symptom-Based Measures | | | |
| Baruch et al. (2009) | 1 | Beck Depression Inventory-II | 2.33* |
| Busch et al. (2009) | 1 | Beck Depression Inventory-II | 3.96* |
| Callaghan et al. (2003) | 1 | Beck Depression Inventory-II | 0.93 |
| Lopez (2002) | 1 | State-Trait Anxiety Inventory | 5.62* |
| Lopez (2002) | 1 | Penn State Worry Questionnaire | 6.21* |
| Manduchi & Schoendorff (2012) | 1 | Beck Depression Inventory-II | 2.57* |
| Manduchi & Schoendorff (2012) | 1 | Worry Domains Questionnaire | 4.39* |
| McClafferty (2012) | 1 | Patient Health Questionnaire-9 | 11.00* |
| McClafferty (2012) | 1 | Generalized Anxiety Disorder-7 | 11.88* |
| McCluskey (Unpublished) | 1 | Beck Depression Inventory-II | 4.66* |
| Total Participants | 7 | | |
| Mean | | | 5.36* |
| Confidence Interval | | | (3.09 – 7.62) |
| Quality of Life Measures | | | |
| Baruch et al. (2009) | 1 | Outcome Questionnaire – 45 | 2.58* |
| Ferro-Garcia et al. (2012) | 1 | Experience of Self Scale | 6.00* |
| Ferro-Garcia et al. (2012) | 1 | Acceptance & Action Questionnaire-Spanish | 1.98* |
| Lopez (2002) | 1 | Acceptance & Action Questionnaire-II | 5.13* |
| Manduchi & Schoendorff (2012) | 1 | Acceptance & Action Questionnaire-II | 3.84* |
| Manos et al. (2009) | 1 | Behavioral Activation for Depression Scale | -0.60 |
| Manos et al. (2009) | 1 | Dyadic Adjustment Scale | -0.37 |
| McClafferty (2012) | 1 | Work and Social Adjustment Scale | 7.16* |
| McClafferty (2012) | 1 | CORE Outcome Measure | 9.01* |
| Singh & O’Brien (2016) | 1 | Functional Idiographic Assessment Template – Short Form | 2.79* |
| Singh & O’Brien (2016) | 1 | Acceptance & Action Questionnaire-II | 1.54 |
| Singh & O’Brien (2016) | 2 | Functional Idiographic Assessment Template – Short Form | -2.79* |
| Singh & O’Brien (2016) | 2 | Acceptance & Action Questionnaire-II | -1.02 |
| Singh & O’Brien (2016) | 3 | Functional Idiographic Assessment Template – Short Form | 5.09* |
| Singh & O’Brien (2016) | 3 | Acceptance & Action Questionnaire-II | 3.33* |
| McCluskey (Unpublished) | 1 | Acceptance & Action Questionnaire-II | 3.33* |
| Total Participants | 10 | | |
| Mean | | | 2.94* |
| Confidence Interval | | | (1.35 – 4.51) |

Note. * denotes statistically reliable RCI score.
Table 6
Reliability and Standard Deviations Used for Reliable Change Index Calculation

| Measure | Citation | Cronbach’s α | SD |
|---|---|---|---|
| BDI-II | Steer, Ball, Ranieri, & Beck (1999) | 0.93 | 11.46 |
| STAI-State | Virella, Arbona, & Novy (1994) | 0.91 | 9.46 |
| STAI-Trait | Virella, Arbona, & Novy (1994) | 0.86 | 8.88 |
| PSWQ | Meyer, Miller, Metzger, & Borkovec (1990) | 0.97 | 13.80 |
| Worry Domains Questionnaire | McCarthy-Larzelere et al. (2001) | 0.94 | 19.52 |
| PHQ-9 | Kroenke, Spitzer, & Williams (2001) | 0.89 | 6.10 |
| GAD-7 | Spitzer, Kroenke, Williams, & Löwe (2006) | 0.89 | 3.41 |
| Outcome Questionnaire – 45 | Lambert et al. (1996) | 0.93 | 24.14 |
| Experience of Self Scale | Kanter, Parker, & Kohlenberg (2001) | 0.91 | 1.33 |
| AAQ-Spanish | Barraca (2004) | 0.74 | 8.42 |
| AAQ–II | Bond et al. (2011) | 0.88 | 7.97 |
| BADS | Kanter, Rusch, Busch, & Sedivy (2009) | 0.92 | 20.15 |
| Dyadic Adjustment Scale | Spanier (1976) | 0.96 | 28.30 |
| Work and Social Adjustment Scale | Mundt, Marks, Shear, & Greist (2002) | 0.80 | 6.40 |
| CORE Outcome Measure | Barkham et al. (2001) | 0.94 | 0.75 |
| FIAT-Q-SF | Darrow, Callaghan, Bonow, & Follette (2014) | 0.85 | 18.31 |

Note. BDI-II: Beck Depression Inventory-II
STAI: State-Trait Anxiety Inventory
PSWQ: Penn State Worry Questionnaire
PHQ-9: Patient Health Questionnaire-9
GAD-7: Generalized Anxiety Disorder Assessment-7
AAQ: Acceptance & Action Questionnaire
FIAT-Q-SF: Functional Idiographic Assessment Template-Questionnaire-Short Form
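As an illustration of how the reliability and SD values in this table feed into a reliable change index, the Python sketch below follows the Jacobson and Truax (1991) formulation: the standard error of measurement is SE = SD√(1 − r), the standard error of the difference is Sdiff = √(2·SE²), and the RCI is the pre-to-post difference divided by Sdiff. The function names and the pre/post scores are hypothetical; only the BDI-II reliability (0.93) and SD (11.46) come from the table. The sign convention (post minus pre) is an assumption, so a drop on a symptom measure comes out negative here.

```python
from math import sqrt


def reliable_change_index(pre, post, sd, reliability):
    """RCI = (post - pre) / Sdiff, following Jacobson & Truax (1991).

    sd: standard deviation of the measure in a reference sample.
    reliability: reliability coefficient of the measure (0 <= r < 1).
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    standard_error = sd * sqrt(1 - reliability)   # SE of measurement
    s_diff = sqrt(2 * standard_error ** 2)        # SE of the difference
    return (post - pre) / s_diff


def is_reliable(rci, critical_value=1.96):
    """A change is conventionally treated as reliable when |RCI| > 1.96."""
    return abs(rci) > critical_value


# Hypothetical BDI-II scores: 30 before treatment, 20 after.
rci = reliable_change_index(pre=30, post=20, sd=11.46, reliability=0.93)
print(round(rci, 2), is_reliable(rci))  # -2.33 True
```

Because Sdiff shrinks as reliability rises, the same raw change yields a larger RCI on a more reliable measure, so the choice of reference reliability and SD matters for interpretation.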
References retrieved from initial article search (n = 179)
    Ineligible: Conceptual reviews (n = 75); Theoretical articles (n = 56); Measurement articles (n = 7)
Assessed for eligibility (n = 41)
    Ineligible: Narrative reviews (n = 16); Therapist-focused outcomes (n = 3); Group-based outcomes (n = 2); Data unavailable (n = 2)
Published studies meeting criteria for review (n = 18)
Unpublished data located (n = 2)
Studies included in quantitative synthesis (n = 20)

Figure 1. Flow Chart of Study Selection
