
journal club (6)

Journal club 6: single subject designs


Jennifer Reid's series aims to help you access the speech and language therapy literature, assess its credibility and decide how to act on your findings. Each instalment takes the mystery out of critically appraising a different type of journal article. Here, she looks at single subject designs.
Have you ever felt uneasy about speech and language therapy clients being lumped together for group intervention studies? Aren't our client groups simply too heterogeneous to expect that one intervention will be effective for them all? How do you know if a complex intervention will be right for your clients if it has been tested with participants and clinicians whose characteristics are described only in very broad terms? After all, speech and language therapy interventions need to be moulded to meet a client's individual needs and circumstances to be successful, don't they? It feels against the grain to allocate clients randomly to different intervention groups so that all their individual differences are washed out!

Perhaps such misgivings about group intervention methods are one of the reasons that single case designs remain popular in speech and language therapy research, despite their low ranking in the evidence hierarchy. We are not alone in this: for example, psychologists working in the field of acquired brain injury also continue to employ single subject designs, valuing them in particular for their flexibility and sensitivity to individual differences. Thanks to their work, there is a practical and validated appraisal tool we can use for these sorts of studies: the Single-Case Experimental Design (SCED) Scale (Tate et al., 2008).

Note the word 'experimental' in the name. We are not talking here about case studies in which clinicians simply describe clients and their care pathway. In order to contribute to the evidence base, single subject studies need to be of good quality, and that means robust methods, pre-planned interventions and accurate, reliable measurement. Not unlike group studies, then. Do not be misled by the name, though, as 'single subject design' does NOT mean that the studies necessarily involve only one participant. You still need to work out which method has been used, group or single subject design, to choose the right appraisal framework for an intervention study.

READ THIS SERIES IF YOU WANT TO
• BE MORE EVIDENCE-BASED IN YOUR PRACTICE
• FEEL MOTIVATED TO READ JOURNAL ARTICLES
• INFLUENCE DEVELOPMENT OF YOUR SERVICE

So when a study presents results from a number of participants, how do I know if this is a group study or a single subject one? If participants are allocated to groups receiving different intervention regimes, and group results, rather than individual, are reported, then you are most likely to be dealing with a group intervention study. Your appraisal tool of choice will be one for randomised controlled trials (RCTs) and other group intervention studies, such as the one I presented in Journal club 4 (Speech & Language Therapy in Practice, Summer 2011).

'Single case design', 'single subject design', 'n-of-1 trial': if any of these terms are used in the title or abstract, then you are probably dealing with a single subject design with several participants. Single subject experimental studies involve repeated measures from individual participant(s). The wording you will often find is that participants 'served as their own control', which means that the study used repeated measures over time of the individual's performance in an area not being treated as the comparison for measures in the treated area.

The quality essentials of single subject experimental designs are: Performance is measured repeatedly to ensure that any intervention effects are sustained over time. Repeated measures designs show how performance varies over time in a way that is usually not possible with group designs reporting group averages.

Sometimes it is hard to tell from the title or abstract whether a single subject or group design has been used; indeed, sometimes researchers report both group results and repeated measures. In any case, the SCED Scale will allow you to appraise how well a study using repeated measures was conducted. Even if you are reading a simple (anecdotal) case report, such as those often seen in Speech & Language Therapy in Practice or the Bulletin of the Royal College of Speech & Language Therapists, the domains will help you think about the reasons why the author may be barking up the wrong tree in their conclusions, especially if causal relationships are being implied.

Appraisal

The SCED Scale is available as a single sheet pdf file to download from the PsycBITE website (http://www.psycbite.com/docs/The_SCED_Scale.pdf). It has 11 domains for appraisal, which I have converted into questions and explained. You may, however, wish to start your appraisal with the general questions we usually ask about a study, including whether the question being asked is one that is important for your practice or for your service, and whether a single subject design was a sensible choice of method to answer that question. You should also consider whether the intervention is described in enough detail for you to implement it yourself.


SPEECH & LANGUAGE THERAPY IN PRACTICE WINTER 2011



Question 1: Was the participant's clinical history adequately described?

One of the main advantages of a single subject design study is its flexibility; it can provide a lot of scope for individualisation of the intervention. Consequently, you may see a more direct application to your own context if there is enough information on the participant(s) to make a reasoned judgement on how similar they are to one or more of your own clients. The SCED Scale suggests age, sex, aetiology and severity must be reported but you may want to know about other issues, for example, response to any previous speech and language therapy intervention. Here is an extract from a helpful description of a participant in a study of an intensive group intervention for young adults with a stammer:

'TM was a male, mono-lingual English speaker of African ethnic background, aged 18;0 at the beginning of the study. He had no history of identified speech, language, communication or other difficulties. There was a family history of persistent stuttering, with both TM's father and one brother stuttering into adulthood. TM was reported to have started stuttering at 11 years of age ... Limited referral information identified that TM had been known to his local speech and language therapy service for several years and had periodically received both individual and group therapy since the age of 13. He had not attended therapy in the 12 months prior to the start of the study' (Fry et al., 2009, p.13).

Question 2: Does the study identify measures that can be used to evaluate intervention success?

The intervention goals need to be precise and properly defined so that they can be measured accurately and reliably. The first thing to do is to work out which aspect of the client's functioning is being addressed in the intervention. Then check whether the intervention goals have been defined for the purposes of the intervention (an operational definition) in such a way that allows change to be observed. It might help here to apply your knowledge of so-called SMART targets (specific, measurable, attainable, relevant, time-framed). Are the measures likely to be reliable across different raters or contexts?

The intervention programme in the Fry et al. (2009) study included fluency management and cognitive behaviour therapy techniques to target both overt and covert stammering symptoms. Appropriate measures were selected to measure change in both these targeted areas, and therefore to assess intervention success. Measures of overt stammering included relatively objective, quantitative measures, which are precisely described and replicable (percentage stammered syllables and mean of the three longest stammered syllables from the first 500 syllables of 5-minute video recordings made by the participant at home while talking to a family member or friend). Covert symptoms are assessed via three externally validated self-report measures. So the study gets a tick for this question too.

Question 3: Is the design good enough to provide evidence of an intervention effect?

We are in the realms of causality here, and, as discussed in Journal Club 5 on observational designs (Speech & Language Therapy in Practice, Autumn 2011), water-tight evidence of cause-and-effect can be elusive even when we are using reasonably robust research methods. A single subject design study never provides definitive evidence of intervention efficacy but if participants show large amounts of specific changes, the results from such a study can be pretty compelling, providing useful preliminary evidence of an intervention effect and therefore of approaches that look promising.

The SCED Scale specifies as the minimum for acceptability a 3-phase design, which should be either:
• A reversal or withdrawal design (A-B-A) in which baseline performance is established before treatment is given, performance measured during treatment and then again after treatment has been withdrawn (or switched to another goal), or
• A multiple baselines method across different behaviours where only one behaviour is being treated at a time.

These designs introduce essential controls which allow you to see whether or not any changes in performance appear to be associated with the intervention. The association should show that change is specific to the intervention goals, and also linked in time with the phases of the study. Here is the description of the 4-phase design adopted by Millard et al. (2009, p.63) to enable any evidence of a treatment effect to be provided. The authors also display the phases and timeline in a helpful figure.

'This was a single subject design replicated across participants. There were four phases, each lasting 6 weeks. The length of the phases and the data collection points were arranged to coincide with the current delivery of the [therapy for children who stammer] program. The duration of the study (from the first week of phase A1 to the last week of phase A2) was matched to the time that families were on the waiting list for an assessment appointment, so that taking part in the study did not disadvantage those who did not receive therapy. This allowed us to establish a no treatment group. During each phase parents video recorded parent-child play sessions at home, once a week. Children who were allocated to the therapy condition completed all phases, while those who were allocated to the waiting list condition completed only the assessment phases (A1 and A2).'

Question 4: Was an adequate baseline established before intervention commenced?


Baseline assessment provides information on a participant's performance in the period before intervention begins. It is good practice to establish performance trends during the baseline period, such as whether performance is stable, fluctuating, deteriorating or improving. If this trend reverses or changes dramatically during the intervention, this is evidence to support an intervention effect. Trends can only be established if the baseline phase is long enough to allow sampling of performance over time. Here is Christina Samuelsson's (2011, pp.59-60) description of her baseline assessment from a multiple baseline study of prosodic intervention:

'The participating child was a boy of 4;6 years. Before the intervention was introduced, the child's prosody was assessed repeatedly (3 times over a period of 9 weeks) using the previously described assessment tool [which] covers production of prosody at word, phrase and discourse level. The baseline assessment was carried out every third week over the 9-week period. In addition, assessment was also made of other linguistic skills ... the boy had problems with prosodic production [which] were shown to be stable across baseline observations' [presented with a bar chart display of these data].

Question 5: Can a treatment response be distinguished from fluctuations resulting from other factors?

There are two issues at stake here. First, an adequate baseline will have captured information on the range of fluctuation present prior to intervention. Second, there needs to be sufficient sampling of performance during intervention to be able to differentiate changes that appear to go beyond the range of normal fluctuations seen in the baseline. In the Millard et al. (2009) study, percentage words stuttered for each participant is calculated from a weekly video-recording throughout the 6-week baseline phase. This allowed calculation of a mean percentage words stuttered for the baseline phase and then of a range for percentage words stuttered beyond which a change in frequency of stammering cannot be attributed to chance alone. Graphs of the results from individual participants provide compelling visual evidence of a treatment effect in some. However, at least one of the no treatment participants showed significant improvement during the second assessment phase so, as the authors point out, other factors must therefore have been operating for this child.

Question 6: Is data displayed to show variability?

Remember that one of the strengths of single subject designs is preservation of individual variation, so studies should employ good visual displays of variability data. Graphs or tables of raw, rather than converted, scores or data from pre-, during and post-intervention phases are usually recommended. So, in the data displays from the study being appraised, can you see at a glance how things vary over time? Millard et al.'s (2009) charts are a good example of appropriate visual displays of individual variation, both of within-phase fluctuations in individuals and in differences in trends across individuals.

Question 7: Are measures used reliable?

Remember that reliability is about getting consistency of results. You will want to be reassured that there is good agreement between different assessors in how they measure or rate the performance in question, otherwise systematic differences between assessors could skew the results (observer bias). If assessment was done by a single individual, is there evidence of intra-rater reliability? Inter- or intra-rater reliability will be particularly important for any aspects of functioning that are difficult to measure objectively, such as perceptual measures of voice quality. With regard to inter-rater reliability, Fry et al. (2009, p.643) report that, 'the transcriptions from one point in each phase of the study were randomly selected for blind analysis by a second rater. Percentage interrater agreement was based on point-by-point agreement for the presence of stuttering in each syllable (Hubbard & Yairi, 1988). Interrater agreement was calculated using the percentage agreement index (Suen & Ary, 1989): the number of agreements divided by the sum of the number of agreements and the number of disagreements, multiplied by 100. Interrater agreement was 96.9%.' Okay, no argument there then: not only careful consideration of the issue of interrater reliability, but also measurement using approaches supported by previous research. Tick!

Question 8: Were independent assessors used?

The design should control for undue influence on assessment from over-familiarity with the participants and the phase of the study (more observer bias). It is good practice for assessment data to be analysed blind to the participant and / or their study phase. Here is an example from a study of constraint-induced therapy for aphasia (Faroqi-Shah & Virion, 2009):

'All tests were independently scored for accuracy by both authors and a third research assistant who was blind to the treatment conditions. All discourse samples were transcribed by one of the authors, and 20% of randomly selected samples were transcribed by an independent research assistant who was blind to the treatment condition and time of testing for reliability purposes. Morphosyntactic codes were independently assigned by both authors. Of these samples, 20% were also coded by a research assistant for reliability purposes. Coding reliability exceeded 90%.'
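The percentage agreement index that Fry et al. (2009) describe is simple arithmetic, and working through it once can make reported reliability figures easier to judge. What follows is a minimal sketch in Python, not the authors' own procedure: the function name `percent_agreement` and the syllable codes are invented for illustration. It assumes each rater has marked every syllable as stuttered (True) or not (False), and compares the two sets of codes point by point.

```python
def percent_agreement(rater_a, rater_b):
    """Point-by-point percentage agreement between two raters.

    Each argument is a sequence of True/False codes, one per syllable
    (True = syllable judged as stuttered). Hypothetical helper, written
    only to illustrate the formula quoted in the article.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must code the same syllables")
    if len(rater_a) == 0:
        raise ValueError("At least one coded syllable is needed")
    # An agreement is any syllable that both raters coded the same way.
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    disagreements = len(rater_a) - agreements
    # Agreements divided by (agreements + disagreements), multiplied by 100.
    return 100 * agreements / (agreements + disagreements)


# Invented codes for ten syllables; not data from any study.
first_rater = [True, False, False, True, False, False, False, True, False, False]
second_rater = [True, False, False, False, False, False, False, True, False, False]

print(percent_agreement(first_rater, second_rater))
```

In this made-up sample the raters differ on one syllable in ten, so the index comes out at 90%. A real analysis would use the full transcriptions rather than a ten-syllable toy list.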



Question 9: Have the data been analysed statistically?

To evaluate a study on this SCED Scale domain, you simply have to find out whether any statistical analysis was used to demonstrate an intervention effect by comparing the results over the phases of the study. You don't have to know whether it was an appropriate statistical technique that was used. Phew! It appears that authors get Brownie points simply for trying to use inferential stats. Millard et al. (2009) show changes in stammering over the phases of their study using a statistical technique called cusum analysis which they report has been applied to naturally fluctuating data. That sounds appropriate, doesn't it? Moreover, the cusum charts of repeated measures from each participant have the added advantage of displaying the raw data on percentage words stuttered, the upper and lower limits supporting the statistical (cusum) analysis and the changes over the timescales of the phases of the study. Neat!

Question 10: Is there evidence that any intervention effect can be replicated?

To make use of this intervention in your own context, you need to be confident that the apparent response to intervention is not a one-off. It's not much use to know that something worked if it is limited to that particular individual in that particular context. Has the effect been demonstrated with other clients, different therapists or in other settings?

Question 11: Is there evidence for generalisation and carryover?

For much of what speech and language therapists do, the emphasis in the long term is on the client's self-management. It is therefore important to know whether the changes kick-started by the intervention are shown to impact on functioning in other areas. For example, if the intervention was impairment-based, has there been any carryover to functional communication? Ebert & Kohnert (2009) suggest that treatment of non-linguistic cognitive processing skills may facilitate change in some areas of language processing for children with primary language impairment, but they go no further than demonstrating changes in performance on standardised language tests; we are given no information on impact, if any, on everyday functioning. If the intervention targeted expressive communication using AAC, has there been any impact on the person's social participation? Beck et al. (2009) used group language stimulation to teach use of AAC techniques to seven participants with complex communication needs but they provide only anecdotal information on carryover to other settings of the participants' increased responsiveness and use of AAC.

Unwanted bias

In conclusion, when considering the merits of single-subject design studies, please remember that, however methodologically good they are, lack of randomisation does allow unwanted bias to creep in. It may be a lot easier to judge from the results of a single subject study how the intervention might impact on one of your own clients, but the bottom line is that the study cannot provide the answer to questions about the overall efficacy of this intervention for all potential recipients.

Critical appraisal for speech and language therapists (CASLT)
Download the SCED Scale from www.psycbite.com/docs/The_SCED_Scale.pdf, or get Jennifer's version with cartoons from www.speechmag.com/Members/CASLT. Use it yourself or with colleagues in a journal club and let us know how you get on.

Jennifer Reid is a consultant speech and language therapist with NHS Fife, email jenniferreid@nhs.net. Cartoons are by Fran, www.francartoons.co.uk.

References
Beck, A.R., Stoner, J.B. & Dennis, M.L. (2009) 'An investigation of aided language stimulation: Does it increase AAC use with adults with developmental disabilities and complex communication needs?', Augmentative and Alternative Communication 25(1), pp.42-54.
Ebert, K.D. & Kohnert, K. (2009) 'Non-linguistic cognitive treatment for primary language impairment', Clinical Linguistics & Phonetics 23(9), pp.647-664.
Faroqi-Shah, Y. & Virion, C. (2009) 'Constraint-induced language therapy for agrammatism: role of grammaticality constraints', Aphasiology 23(7-8), pp.977-988.
Fry, J., Botterill, W. & Pring, T. (2009) 'The effect of an intensive group therapy program for young adults who stutter: a single subject study', International Journal of Speech-Language Pathology 11(1), pp.12-19.
Millard, S.A., Edwards, S. & Cook, F.M. (2009) 'Parent-child interaction therapy: Adding to the evidence', International Journal of Speech-Language Pathology 11(1), pp.61-76.
Samuelsson, C. (2011) 'Prosody intervention: A single subject study of a Swedish boy with prosodic problems', Child Language Teaching and Therapy 27(1), pp.56-67.
Tate, R.L., McDonald, S., Perdices, M., Togher, L., Schultz, R. & Savage, S. (2008) 'Rating the methodological quality of single-subject designs and n-of-1 trials: Introducing the Single-Case Experimental Design (SCED) Scale', Neuropsychological Rehabilitation 18(4), pp.385-401.
