
Pain, 50 (1992) 281-285

© 1992 Elsevier Science Publishers B.V. All rights reserved 0304-3959/92/$05.00

PAIN 02095

Regression to the mean in treated versus untreated chronic pain *

Coralyn W. Whitney a and Michael Von Korff b

a Departments of Dental Public Health Sciences and Oral Medicine, SM-35, University of Washington, Seattle, WA 98195 (USA)
and b Center for Health Studies, Group Health Cooperative of Puget Sound, Seattle, WA (USA)
(Received 7 June 1991, revision received 12 February 1992, accepted 25 February 1992)

Summary The course of pain associated with temporomandibular disorders (TMD) and other chronic pain
conditions is typically episodic. Its expression may influence when a person seeks treatment, for example, when the
level of pain flares up or exceeds its characteristic severity. Improvement in pain status subsequent to entering
treatment may be due to: (1) specific effects of treatment; (2) non-specific effects of treatment ('placebo effects'); or
(3) regression to the mean. Due to regression to the mean, uncontrolled evaluation of treatment in persons
self-selected by a pain flare-up may lead to erroneous conclusions concerning effects of treatment by patients,
providers, and/or researchers. For this report, the magnitude of regression to the mean due to self-selection for
treatment is estimated by comparing subjects who sought treatment for TMD pain (n = 147) to a random sample of
subjects with TMD pain not seeking treatment (n = 95). Among subjects seeking treatment, a significant 14.7-point
reduction in VAS pain intensity was observed at 1-year follow-up. A control group of TMD subjects not seeking
treatment showed no mean reduction in pain intensity but reported lower pain intensity at baseline than the group
seeking care. When both groups of subjects were stratified on baseline VAS pain values, the reduction in pain
increased as the baseline pain level increased, but no differences between comparable treated and untreated cases
in the extent of improvement were observed. The before-after differences in both groups may be attributed to
regression to the mean. We conclude that before-after differences in pain intensity can be large and that such
improvement may be largely due to regression to the mean. This suggests the need for research which differentiates
change due to regression to the mean (due to homeostatic processes, random within-subject variation, or
measurement error) from change due to specific and non-specific effects of treatment. In clinical practice, the
personal experience of patients and clinicians who observe improvement after initiation of treatment should be
regarded as an unreliable guide to treatment efficacy due to regression to the mean. This phenomenon may
contribute to the proliferation and continued use of treatments of unproven efficacy for pain management and
suggests caution in the use of costly or risky pain treatments the efficacy of which is unknown.
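
The selection mechanism described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' data or method: every simulated person has a fixed "usual" pain level, nobody is treated, and the two readings differ only by random within-person variation. The function names, the uniform range of usual levels, and the noise standard deviation are all hypothetical choices made for illustration.

```python
import random

def simulate_untreated(n_people=20000, noise_sd=15.0, seed=0):
    """Return (baseline, follow_up) VAS pairs for people who receive no treatment."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_people):
        usual = rng.uniform(10.0, 50.0)  # hypothetical characteristic pain level
        # Two independent readings around the same usual level, clipped to 0-100.
        baseline = min(100.0, max(0.0, rng.gauss(usual, noise_sd)))
        follow_up = min(100.0, max(0.0, rng.gauss(usual, noise_sd)))
        pairs.append((baseline, follow_up))
    return pairs

def mean_reduction(pairs, threshold):
    """Mean (baseline - follow-up) among cases whose baseline exceeds threshold."""
    selected = [(b, f) for b, f in pairs if b > threshold]
    return sum(b - f for b, f in selected) / len(selected)

pairs = simulate_untreated()
reductions = {t: mean_reduction(pairs, t) for t in (0, 5, 10, 20, 30, 40, 50)}
for t, r in reductions.items():
    print(f"baseline > {t:2d}: mean reduction {r:5.1f} VAS points")
```

Because no treatment is simulated, any apparent improvement in the selected cases comes from selection on an elevated baseline alone, and it grows as the selection threshold rises.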

Key words: Regression to the mean

* Supported by NIDR Grant No. P01 DE08773 and AHCPR Grant No. R01 HS06168.

Correspondence to: Dr. Coralyn W. Whitney, Dept. of Dental Public Health Sciences, SM-35, University of Washington, Seattle, WA 98195, USA. Tel.: (206) 543-2034.

Introduction

"One characteristic of almost everything that persists for a long time is that it changes with time. It is changeability and variation, not stability, that is in fact the dominant characteristic of most long-lived conditions" (Sartwell and Merrell 1952).

Like many other pain conditions, the course of temporomandibular disorder (TMD) pain is episodic (Von Korff et al. 1988). Thus, the level of pain at any single point in time does not indicate where individuals are in relation to their characteristic level of pain (or across-time central tendency). The present status of a person in their cycle of pain, along with the variability and central tendency of their cycle, determines the potential magnitude of change in the level of pain between the present and some future point in time.

The likelihood of a change of a particular magnitude is determined by the proximity of the initial measurement to its characteristic level. The closer the level of pain is to an extreme of an individual's across-time distribution of pain, the greater the likelihood of the next measurement being less extreme. In statistical terms, this phenomenon is known as 'regression to the mean' (Galton 1886; Davis 1986). That is, a variable that is extreme when it is measured will tend, by chance alone, to be closer to its central tendency on a subsequent measure. Studies which base subject selection on the outcome variable of interest, in this case pain, are at risk of inducing this phenomenon (Davis 1976). As a consequence, if someone seeks treatment when their level of pain is at its peak, then pain can be expected to decrease towards its characteristic level whether or not treatments are given. Similarly, if someone is experiencing the lowest level of pain in their cycle, then an increase in pain would be expected. Pain cycles with little within-cycle variability will yield small changes over time and cycles with greater variability will yield greater changes.

The phenomenon of cyclic variation is significant in clinical settings. Subjects seeking treatment for pain do so when there is a flare-up in the level of pain or when the level of pain is no longer tolerable (Von Korff et al. 1991). If the level of pain and treatment-seeking behavior are related, the efficacy of pain treatment as it appears to the patient and to the clinician may exceed the actual reduction in pain due to treatment. In such situations treatment may appear efficacious due to the initiation of treatment at a point in the pain cycle when pain would naturally decrease (Turk and Rudy 1990). Without additional data to adjust for episodic variation, it is not possible to determine what part of pain reduction is attributable to treatment and what is due to cyclic variation.

While regression to the mean is a well-established and fundamental statistical principle, pain researchers have paid little attention to establishing its empirical validity in pain research or to demonstrating its potential for influencing interpretations of pain treatment outcome studies.

When a patient seeks treatment for a pain condition, subsequent improvement may be due to 3 types of effects: (1) specific effects of treatment, such as the biologic effects of a medicine or a procedure; (2) non-specific effects of treatment, sometime referred to as 'placebo effects'. Miller (1989) identified non-specific treatment effects as including: the provider's attitude toward the treatment; the provider's attitude toward the patient (warmth, interest, empathy); the faith of the patient in the treatment; the reputation, expense, or impressiveness of the procedure; suggestibility; and other psychological mechanisms. And, (3) regression to the mean due to measurement error, to random within-subject variation across time, or to homeostatic processes tending to bring an individual's status back to its characteristic level or set point. McDonald and Mazzuca (1983), based on a literature review comparing randomized controlled trials with and without multiple baseline assessments, have argued that regression to the mean has often been incorrectly attributed to non-specific, or placebo, effects of treatments. They suggest that the potency of the placebo effect may have been overstated due to its confusion with regression to the mean and advise exercising "caution in interpreting patient improvements as causal effects of our actions ... [we] should avoid the conceit of assuming that our personal presence has strong healing powers".

While regression to the mean has been studied in the context of epidemiologic and clinical research on medical conditions (Ederer 1972; Gardner and Heady 1973; Davis 1976; Shepherd 1981) it has not been extensively studied among pain patients. The present report uses empirical data from an on-going longitudinal study of pain associated with TMD to elucidate the extent to which statistical regression to the mean might contribute to the magnitude of apparent effect size in pain treatment outcome.

The method for calculating the amount of reduction in pain due to regression to the mean requires the assumption that the before- and after-treatment levels of pain follow a bivariate normal distribution and are correlated (James 1973; McDonald and Mazzuca 1983). Measurements of pain over time on the same individual are typically correlated but do not follow a bivariate normal distribution and are generally too skewed to be normalized. This situation arises, for example, when post-treatment measures reflect an appreciable number of subjects who report zero pain on a visual analog scale (VAS). The result is a non-normal, positively skewed distribution of post-treatment pain scores. Although the impact of regression to the mean cannot be calculated by established methods under these conditions, this does not mean the effect cannot be evaluated.

Estimation of the magnitude of regression to the mean due to cyclic variation is possible through the use of a control group. The purpose of control groups is to adjust for sources of variation other than the specific effects of treatment, yielding a less biased estimate of treatment effect. The sources of variation of interest here are the variability of pain over time and the self-selection for treatment when pain is higher than usual. However, most controlled studies cannot distinguish non-specific effects of treatment from regression to the mean. In addition, studies which classify treated subjects into responders and non-responders may confuse factors related to elevated pain at baseline with those predicting treatment response. The primary purpose of this paper is to estimate empirically the possible magnitude of regression to the mean by comparing persons seeking pain treatment to a control group with the same pain complaint identified by a population survey rather than by seeking care. This observational comparison may provide some insight into the magnitude of regression to the mean relative to the combined specific and non-specific effects of treatment.

Method

The data used are from a longitudinal study evaluating the clinical signs and symptoms associated with TMD (Dworkin et al. 1990). The treated cases and untreated controls were identified from the enrollees in the Group Health Cooperative of Puget Sound, a Health Maintenance Organization with over 320,000 enrollees at the time of this study. The case group consisted of a consecutive series of patients who sought treatment and were referred for evaluation of TMD pain. This group of enrollees will be referred to as clinic cases (CLCA, n = 147). The control group consists of all individuals identified in a random sample survey of Group Health Cooperative enrollees who reported TMD pain in the prior 6 months but who had not sought referral for treatment. This group is referred to as the community cases (COCA, n = 95). TMD pain was defined as "facial ache or pain in the jaw muscles, the joint in front of the ear or inside the ear (excluding infection)" in the 6 months prior to interview.

The intensity of pain was evaluated by three 100-mm VAS ratings: 'worst pain', 'average pain', and 'pain right now' at the time of interview. 'Pain right now', a measure commonly used to assess the level of pain for pre-treatment and post-treatment evaluations, is used in this paper. These pain ratings were obtained at entry into the study (baseline) and at 1-year follow-up. COCAs receiving treatment between baseline and 1-year follow-up were excluded. Some COCAs had a zero score on the 'pain right now' measure at baseline.

The overall comparison of pain intensity across time between CLCAs and COCAs, with the COCAs serving as a control group, was done by analysis of covariance adjusting for pain at baseline. The magnitude of regression to the mean is shown graphically by selecting subjects who, based on their level of pain at baseline, exceeded a series of VAS threshold values: pain greater than 0, 5, 10, 20, 30, 40, 50.

Results

Each case was classified into a series of groups based on the thresholds stated above. The mean level of pain was estimated for both COCAs and CLCAs at baseline and 1-year follow-up. For this comparison, the COCAs provide an estimate of the average change in pain in subjects not treated in the interim. The resulting mean VAS values and percent changes are shown in Table I.

The change in mean pain intensity between baseline and 1-year follow-up is illustrated for CLCAs (Fig. 1a) and COCAs (Fig. 1b) for the set of threshold values. Overall, the CLCAs (n = 147) showed a 14.7-point reduction in their pain intensity. This pre-post difference was statistically significant (t = 5.48, df = 146, P < 0.01). In contrast, the entire sample of COCAs showed no reduction in pain intensity rating from baseline to follow-up. Their mean intensity was 12.6 at baseline and 14.7 at 1-year follow-up. However, the COCAs showed the same pattern in pain reduction as the CLCAs when restricted to the same baseline threshold categories as the CLCAs. For the threshold value of VAS pain greater than 20, a 53% reduction in mean VAS pain from baseline was observed in the CLCAs, compared to the 59% reduction among COCAs (see Fig. 1). A similar pattern held for the higher

TABLE I
MEAN LEVEL OF PAIN AT BASELINE AND 1-YEAR FOLLOW-UP WITH MEAN AND PERCENT CHANGE ACROSS TIME FOR
CASES EXCEEDING BASELINE PAIN THRESHOLD

Pain                              Baseline          1 Year            Change
threshold  Group          n       Mean    S.D. *    Mean    S.D.      Mean      Percent
0          CLCA           147     33.70   28.10     19.00   25.20     -14.70    -44
           COCA            59     20.22   22.66     15.42   23.64      -4.80    -24
           All COCAs +     95     12.56   20.35     14.65   21.73       2.09     17
5          CLCA           116     42.10   25.74     22.29   26.65     -19.81    -47
           COCA            36     32.06   21.91     22.56   26.71      -9.50    -30
10         CLCA           103     46.44   24.03     24.83   27.20     -21.61    -47
           COCA            30     36.63   21.20     20.10   24.69     -16.53    -45
20         CLCA            83     53.94   20.57     25.40   26.78     -28.54    -53
           COCA            22     44.73   18.96     18.23   24.04     -26.50    -59
30         CLCA            70     59.21   17.49     26.09   27.90     -33.12    -56
           COCA            16     52.56   16.12     22.06   26.75     -30.50    -58
40         CLCA            58     64.60   14.69     27.74   28.60     -36.86    -57
           COCA            12     58.00   14.94     27.17   28.66     -30.83    -53
50         CLCA            46     69.35   12.66     31.93   29.93     -37.42    -54
           COCA             7     65.43   15.94     26.43   32.30     -39.00    -60

+ Includes COCAs with no pain at baseline.
* S.D. = standard deviation.
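
The change columns of Table I follow from the two means by simple arithmetic, which the sketch below reproduces. It is only a consistency check on the printed means, not a reanalysis of the study data. The helper name and tuple layout are illustrative assumptions, and the COCA baseline mean for threshold 0 is taken as 20.22, the value implied by the printed follow-up mean and change.

```python
# (threshold, group, baseline mean, 1-year mean, printed change, printed percent)
TABLE_I = [
    (0, "CLCA", 33.70, 19.00, -14.70, -44),
    (0, "COCA", 20.22, 15.42, -4.80, -24),   # baseline mean assumed, see lead-in
    (0, "All COCAs", 12.56, 14.65, 2.09, 17),
    (5, "CLCA", 42.10, 22.29, -19.81, -47),
    (5, "COCA", 32.06, 22.56, -9.50, -30),
    (10, "CLCA", 46.44, 24.83, -21.61, -47),
    (10, "COCA", 36.63, 20.10, -16.53, -45),
    (20, "CLCA", 53.94, 25.40, -28.54, -53),
    (20, "COCA", 44.73, 18.23, -26.50, -59),
    (30, "CLCA", 59.21, 26.09, -33.12, -56),
    (30, "COCA", 52.56, 22.06, -30.50, -58),
    (40, "CLCA", 64.60, 27.74, -36.86, -57),
    (40, "COCA", 58.00, 27.17, -30.83, -53),
    (50, "CLCA", 69.35, 31.93, -37.42, -54),
    (50, "COCA", 65.43, 26.43, -39.00, -60),
]

def change_and_percent(baseline_mean, follow_up_mean):
    """Mean change (follow-up minus baseline) and change as a percent of baseline."""
    change = follow_up_mean - baseline_mean
    return change, 100.0 * change / baseline_mean

all_consistent = True
for threshold, group, base, year1, printed_change, printed_pct in TABLE_I:
    change, pct = change_and_percent(base, year1)
    # Compare against the printed values, allowing for rounding to 2 decimals.
    ok = abs(change - printed_change) < 0.005 and round(pct) == printed_pct
    all_consistent = all_consistent and ok
    print(f"> {threshold:2d} {group:10s} change {change:7.2f} ({pct:6.1f}%) "
          f"{'ok' if ok else 'MISMATCH'}")
```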

[Figure: two panels, (a) Clinic Cases and (b) Community Cases; vertical axis Pain Intensity (0-80), horizontal axis Baseline to One Year; one line per baseline VAS threshold (>= 0, 5, 10, 20, 30, 40, 50).]
Fig. 1. Mean pain intensity at baseline and 1-year follow-up of cases exceeding the specified VAS baseline threshold: (a) clinic cases (CLCA);
(b) community cases (COCA).

thresholds. Thus, an appreciable reduction in pain intensity occurred in patients not seeking treatment when they were sampled on the same basis as patients entering treatment.

Finally, the baseline distributions of pain for the CLCAs and COCAs were different. Fig. 2 shows the relative frequency (%) distributions of VAS pain for both groups. In the CLCAs 56% of the subjects had a VAS rating greater than 20 and 31% had a VAS rating greater than 50. This compared to 22% and 7%, respectively, for the COCAs. For those subjects not seeking treatment the level of pain at baseline was skewed towards zero with far more cases at the lower levels of pain compared to the CLCAs. In this situation, an analysis of covariance might be used to adjust for baseline differences in pain intensity. Comparing the two groups using their respective mean changes in pre-post VAS scores (-14.7 for the CLCAs compared to 2.09 for the COCAs) is not valid unless it can be assumed that the CLCAs, with their higher baseline pain levels, would have had the same change in pain intensity as the COCAs over time in the absence of treatment (Anderson et al. 1980). Given self-selection and the cyclic nature of pain intensity, such an assumption is not tenable. The results of an analysis of covariance showed that the two groups of subjects (CLCAs and COCAs) did not differ in mean changes of pre-post VAS scores (F = 0.80, df = 1,239, P = 0.37) after controlling for pain level at baseline.

[Figure: Relative Frequency (%) (vertical axis, 0-40) by Baseline Pain category (0, 1-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100) for CLCA and COCA.]
Fig. 2. Relative frequency (%) distributions of baseline pain for clinic (CLCA) and community (COCA) cases. Corresponding frequencies (no.) for CLCAs are: 0, 44, 20, 13, 12, 12, 15, 12, 10, 6, 3; for COCAs: 36, 29, 8, 6, 3, 6, 4, 1, 0, 1, 1.

Discussion

As previously shown, subjects seeking treatment have higher levels of pain at baseline compared to persons with pain not seeking treatment. Under such circumstances, there is potential for pain levels to show regression to the mean. After stratifying on threshold levels for baseline pain intensity, it was apparent that reduction in pain at follow-up was greater as the initial level of pain increased for both the treatment and control groups. Since regression to the mean is observed for both groups, the greater reductions observed in the CLCAs are in all likelihood an artifact of self-selection. Apparently, effects of self-selection and regression to the mean can be large. In our sample we estimate, overall, that a 50% reduction in mean pain intensity from baseline to 1-year follow-up among cases entering treatment may be attributed to this effect. Thus, in the absence of any comparison group, it is not possible to interpret changes in pain intensity over time as due to treatment efficacy, given subjects self-selected for treatment based on heightened pain experience. Proper evaluation of treatment efficacy requires a control group. Thus, the trend toward evaluation of pain treatments using randomized, controlled designs, with the decreased frequency of uncontrolled designs in the literature, is appropriate.
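
The contrast between a raw comparison of mean changes and a comparison adjusted for baseline pain can be sketched with simulated data. This is a hedged illustration, not the analysis reported here: nobody in the simulation is treated, care seeking is assumed to depend only on baseline pain through a hypothetical logistic rule, and scores are left unclipped so that a linear adjustment is exact. The group coefficient of a regression of change on group and baseline is obtained by residualizing both on baseline (the Frisch-Waugh-Lovell identity), which gives the same point estimate as an analysis of covariance with one covariate.

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def simulate(n_people=60000, seed=1):
    rng = random.Random(seed)
    group, baseline, change = [], [], []
    for _ in range(n_people):
        usual = rng.gauss(25.0, 10.0)   # hypothetical characteristic pain level
        b = rng.gauss(usual, 15.0)      # baseline reading
        f = rng.gauss(usual, 15.0)      # follow-up reading, no treatment effect
        # Assumed rule: the chance of seeking care rises with baseline pain only.
        seeks_care = rng.random() < 1.0 / (1.0 + math.exp(-(b - 40.0) / 8.0))
        group.append(1.0 if seeks_care else 0.0)
        baseline.append(b)
        change.append(f - b)
    return group, baseline, change

group, baseline, change = simulate()
clinic = [c for g, c in zip(group, change) if g == 1.0]
community = [c for g, c in zip(group, change) if g == 0.0]
raw_difference = mean(clinic) - mean(community)

# Group effect on change after adjusting for baseline pain.
_, adjusted_difference = fit_line(residuals(baseline, group),
                                  residuals(baseline, change))
print(f"raw difference in mean change:      {raw_difference:6.2f}")
print(f"baseline-adjusted group difference: {adjusted_difference:6.2f}")
```

Under these assumptions the care-seeking group shows a much larger raw reduction although no one is treated, and the difference largely disappears once baseline pain is taken into account. If care seeking also depended on how far a person is above their own usual level, baseline adjustment alone would not remove the difference.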

Moreover, control groups are also susceptible to regression to the mean so it is important that the distribution of pain levels is comparable to the distribution in the treated group. Thus, the cyclic variability of pain and self-selection for treatment of pain patients requires that the assumption of baseline comparability of pain level be explicitly tested. When it is not possible to construct a completely comparable control group, an analysis of covariance can be used to adjust for imbalances. Controlled trials of pain treatments might benefit from measuring pain on multiple occasions prior to treatment intervention to reduce regression to the mean as a source of within-subject change, as is increasingly done in clinical trials in other areas of medicine (McDonald and Mazzuca 1983).

The phenomenon of regression to the mean among patients self-selecting treatment during a pain flare-up may be important in shaping clinicians' beliefs regarding treatment efficacy. The clinician who routinely observes improvement in patients following initiation of pain treatment may attribute the improvement to the treatment rather than to the natural history of the condition. Such faulty reasoning may lead clinicians to regard expensive or risky treatments of limited efficacy as being valuable in the management of the patients they see. The antidote for such faulty reasoning is insistence on evaluation of treatments by controlled trials.

The phenomenon of regression to the mean also has implications for patients' beliefs regarding treatment efficacy. Patient populations may embrace ineffective pain treatments that are expensive, risky, or both when their use is followed by a reduction in pain. In a study currently underway, 164 TMD pain patients were identified who said they had improved somewhat or a great deal 3-6 weeks after seeking care. Among these patients, 85% attributed some of their improvement to the recommended treatments, suggesting that this phenomenon is common. It has been observed (Shapiro 1960; Miller 1989) that useless or dangerous treatments have been prescribed by physicians throughout the history of medicine, seemingly without undermining patient beliefs in the efficacy. Regression to the mean may play a role, along with non-specific effects of treatment, in maintaining patient beliefs in the potency of medical management of chronic pain. If the phenomenon of regression to the mean plays a significant role in shaping treatment choices of patients, professionals who treat patients in pain need to consider the ethics of use of costly or risky treatments the worth of which has not been evaluated by controlled clinical trials.

Finally, the mechanisms of regression to the mean deserve increased research attention along side research evaluating specific and non-specific treatment effects. Possible explanations of regression to the mean may include measurement error, within-subject random variation in pain levels and homeostatic processes that tend to return pain to a characteristic level after a flare-up. Among patient groups in whom significant regression to the mean occurs, the natural process of improvement may provide opportunities to reinforce patient self-care behaviors and enhance patient beliefs in their own abilities to control pain, rather than reinforcing patient beliefs in the efficacy of medical care for chronic pain. If so, the phenomenon of regression to the mean may have the potential to enhance patient autonomy in managing chronic pain just as it may now contribute to the dependence of pain patients on health care providers.

References

Anderson, S., Auquier, A., Hauck, W.W., Oakes, D., Vandaele, W. and Weisberg, H.I., Statistical Methods for Comparative Studies, Wiley, New York, 1980.
Davis, C.E., The effect of regression to the mean in epidemiologic and clinical studies, Am. J. Epidemiol., 104 (1976) 493-498.
Davis, C.E., Regression to the mean. In: N.L. Johnson and S. Kotz (Eds.), Encyclopedia of Statistical Sciences, Vol. 7, Wiley, New York, 1986, pp. 706-708.
Dworkin, S.F., Huggins, K.H., LeResche, L., Von Korff, M., Howard, J., Truelove, E. and Sommers, E., Epidemiology of signs and symptoms in temporomandibular disorders. Clinical signs in cases and controls, J. Am. Dent. Assoc., 120 (1990) 273-281.
Ederer, F., Serum cholesterol changes: effects of diet and regression toward the mean, J. Chronic Dis., 25 (1972) 277-289.
Galton, F., Regression towards mediocrity in hereditary stature, J. Anthro. Inst., 15 (1886) 246-263.
Gardner, M.J. and Heady, J.A., Some effects of within-person variability in epidemiologic studies, J. Chronic Dis., 26 (1973) 781-795.
James, K.E., Regression toward the mean in uncontrolled clinical studies, Biometrics, 29 (1973) 121-130.
McDonald, C.J. and Mazzuca, S.A., How much of the placebo 'effect' is really statistical regression?, Stat. Med., 2 (1983) 417-427.
Miller, N.E., Placebo factors in treatment: views of a psychologist. In: M. Shepherd and N. Sartorius (Eds.), Non-Specific Aspects of Treatment, Hans Huber, Toronto, 1989, pp. 39-55.
Sartwell, P.E. and Merrell, M., Influence of the dynamic character of chronic disease on the interpretation of morbidity rates, Am. J. Public Hlth, 42 (1952) 579-584.
Shapiro, A.K., A contribution to a history of the placebo effect, Behav. Sci., 5 (1960) 109-135.
Shepherd, D.S., Reliability of blood pressure measurements: implications for designing and evaluating programs to control hypertension, J. Chronic Dis., 34 (1981) 191-209.
Turk, D.C. and Rudy, R.E., Neglected factors in chronic pain treatment outcome studies: referral patterns, failure to enter treatment, and attrition, Pain, 43 (1990) 7-25.
Von Korff, M., Dworkin, S.F., LeResche, L. and Kruger, A., An epidemiologic comparison of pain complaints, Pain, 32 (1988) 173-183.
Von Korff, M., Wagner, E.H., Dworkin, S.F. and Saunders, K.W., Chronic pain and use of ambulatory health care, Psychosom. Med., 53 (1991) 61-79.
