
Quality of Life Research, 2, pp. 221-226

Commentary

Interpretation of quality of life changes

E. Lydick* and R. S. Epstein


Merck Research Laboratories, Box 4, West Point, Pennsylvania, PA 19486, USA,
(E. Lydick and R. S. Epstein).

The clinical significance of quality of life changes has little to do with the QOL as a measure, but is rather a reflection on the novelty of the measures and researchers' inexperience in their use. This article discusses possible ways of evaluating quality of life measures in terms of patient-clinician interactions and how the clinician can assess the importance of these changes of QOL in terms of treatment and management of disease.

Key words: Clinical significance, epidemiological studies, outcomes research, quality of life.

Presented at Quality of Life Assessment, DIA Workshop, Nice, France, 23 February, 1993.

* To whom correspondence should be addressed.

© 1993 Rapid Communications of Oxford Ltd

Introduction

Small quality-of-life (QOL) changes may be statistically significant, especially with large studies, but the clinical relevance of these changes may be difficult to interpret. While numerous books and papers have been published pertaining to the correct calculation and expression of statistical significance, a clear understanding of the concept of clinical significance and its application to study results has not received the same degree of attention.

This necessity of defining clinical meaningfulness is certainly not unique to QOL research. However, the need for demonstrating clinical significance is intensified in QOL research for the following reasons. First, there is a general lack of familiarity with QOL measurement. Since practitioners have only recently been confronted with publications of QOL results, they have not yet had the time to acquire the same level of intuition about the relevance of changes in these measures that they have about the relevance of more common clinical measures such as blood pressure, haemoglobin or serum creatinine. Additionally, as QOL measures are perceptual and not physiological, they are often viewed skeptically as 'soft' or subjective, thus often expected to be less meaningful than are physiological measures.

In addition, QOL measures are really outcomes, not measures of pathology or underlying disease state. Confusion arises in that QOL results are compared to laboratory test results or other indications of progression of the underlying disease state. As outcomes, QOL responses are more comparable to the disease state itself. Thus, it has been argued that any change in a QOL measure should be considered as clinically significant, as any change represents a patient's perception of a diminished health outcome. No one questions whether a myocardial infarction is clinically significant. Yet a report of diminished quality of life due to a myocardial infarction is expected to be framed in terms of clinical relevance.

We present here a review of the QOL literature in which investigators have made an attempt to qualify or quantify clinical significance. Specifically, we were interested in how changes seen with QOL instruments can be translated into clinically meaningful terms.

Perspectives of clinical meaningfulness

Researchers have taken at least two different perspectives. Some frame their study results in terms of the impact on the individual, whereas others choose to look at the impact on a population. This paper will focus largely on the first perspective, that is, the problem in interpreting results in terms that are clinically meaningful to the patient-clinician interaction. However, a few

examples of the population perspective may prove useful and are provided in Table 1.

Generally, these interpretations revolve around three types of benchmarks. First, there is a measure of attributable risk where the differences seen in a study are applied to a much larger population, and the significance of the change in QOL is related to the number of individuals estimated to be affected in this larger hypothetical population. A second method is to derive the population-level clinical significance by relating changes in QOL to other population-level measures such as resource utilisation, thus employing the construct of medical care resources to benchmark the QOL change.

Finally, the clinical significance of a change in QOL can be measured by putting the change in the context of cost, and comparing this to other interventions. If it is assumed that each response of a health status question represents a discrete and different level of health status, one can estimate, on a population basis, the cost of moving a population of individuals from a lower health state to a higher one. This method, which results in cost per QOL change (or sometimes QALY), also interprets QOL changes in terms of a population perspective.

Definitions of clinical significance from the general medical literature

Let us first look at what is meant by clinical significance in general terms. Understanding how this awareness occurs in current clinical practice may help in understanding how clinical meaningfulness can be achieved in QOL studies. In their book on clinical epidemiology, Sackett et al.4 conclude that clinical significance refers to the importance of a difference in clinical outcomes between treated and control patients and is usually described in terms of the magnitude of a study result. Jacobson and Truax5 state that clinical significance refers to the ability of an intervention to meet standards of efficacy and is usually based on external standards provided by interested parties in the community. In our opinion, the most accurate description of a 'meaningful difference' as applied in clinical practice today comes from

Table 1. Population measures of clinical meaningfulness

Measure: Population attributable risk
Example: Twenty per cent of the patients receiving methyldopa and 15% of patients receiving propranolol would have remained stable or improved had they been treated with captopril. Given an average hypertensive clinical practice of size 500, this means that 100 individual patients receiving treatment with methyldopa could have maintained their quality of life on medication rather than experiencing a worsening. Of those 100, 45 could have been spared substantial worsening in their levels of positive well-being, vitality, depression and anxiety. This model extended to a population of 1 million hypertensive patients translates into 90 000 individuals spared substantial decreases in their general well-being.1

Measure: Resource utilisation
Example: Mental health status, as measured by the MHI, is a major predictor of the use of outpatient mental health services. The average patient scoring in the lowest tertile of the MHI score distribution spent over three times more per year for mental health care than the average person in the highest tertile ...2

Measure: Cost of moving 1 unit
Definition: Cost of moving the patient average of 1 unit; cost of moving a certain percentage of patients 1 or more units; cost of moving a certain percentage of patients 1 or more units on k or more dimensions.3
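
The attributable-risk example in Table 1 is a proportional scaling from a practice of 500 patients to a hypothetical population of 1 million. The following Python sketch reproduces that arithmetic under the assumption that the proportion observed in the sample carries over unchanged to the larger population. The function name `scale_to_population` is ours and purely illustrative; the counts (20% of 500, and 45 patients spared substantial worsening) are read from the Table 1 example.

```python
from fractions import Fraction


def scale_to_population(affected_in_sample: int, sample_size: int, population: int) -> int:
    """Scale a count observed in a sample to a larger hypothetical population."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Exact rational arithmetic avoids floating-point rounding surprises.
    return round(Fraction(affected_in_sample, sample_size) * population)


practice_size = 500

# 20% of a 500-patient practice on methyldopa could have stayed stable or improved.
stable_or_improved = round(Fraction(20, 100) * practice_size)

# Count of patients spared substantial worsening, as quoted in the Table 1 example.
spared_substantial_worsening = 45

spared_per_million = scale_to_population(
    spared_substantial_worsening, practice_size, 1_000_000
)
print(stable_or_improved, spared_per_million)
```

The same scaling could be applied to any study proportion, but the result is only as credible as the assumption that the study sample resembles the target population.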


Jaeschke et al.6 Clinicians familiar with the test in question are able to conclude that a specified change is clinically important, because they have observed a large number of patients and have seen changes in function and in clinical course that correspond to variations in the test results.

These definitions are all somewhat unsatisfying. In fact, they are reminiscent of the recent statement by a US Supreme Court Justice regarding pornography. While admittedly very difficult to define, he 'knew it when he saw it'. Thus, clinicians may 'know clinical significance when they see it', but they may not realize that their understanding of clinical significance of objective measures is based on their experience (or the experience of their teachers) with a large number of patients followed over time.

Defining clinical significance for an 'objective' test

When faced with a completely new test and new units, interpretation of 'objective' measures is no easier than interpretation of QOL results. Clinical trials of a new therapy for benign prostatic hyperplasia used urine flow as an outcome measure.7 Following 1 year of treatment, the urine flow in individuals who received the active treatment improved on average by 3 ml/s over those who received placebo. Faced with this outcome of a 3 ml/s improvement, clinicians found it very difficult to judge whether this change was 'clinically meaningful'. A subsequently published epidemiological study in men aged 40-74 years8 found that urine flow rates decline on average approximately 0.2-0.3 ml/s per year of life. A 3 ml/s improvement in urine flow is therefore equivalent to restoring an individual's urinary status to one of approximately 10-15 years earlier. Thus, the clinical trial results were not clear until put into context with the findings of the epidemiological study.

Clinical significance in QOL studies

What can we do to put results from QOL studies into context such that the relevance and impact of the changes seen can be understood by the clinician and the patient? Jaeschke et al.6 provide us with a wonderful definition of what they have termed 'minimal clinically important difference'. This is the smallest difference in a score, in a domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management. This definition incorporates several important concepts. Firstly, it focuses attention on the patient's perception of benefit from therapy. Also, it suggests that this benefit would prompt a change in management which emphasises the clinical aspect. Finally, it incorporates the risk/benefit equation of costs and side effects. While this is an excellent definition, it does not directly suggest an operational method for defining clinical meaningfulness.

Taxonomy of operational definitions

Through a literature search and discussions with colleagues, we have tried to list and catalogue in Tables 2 and 3 the various ways in which QOL researchers have operationally defined clinical meaningfulness. That is, how did they express their findings such that clinicians can understand the impact of the differences and can modify their approach to the patient based on these findings? We have chosen to divide their efforts into two broad categories which we have termed as either distribution-based or anchor-based interpretations. This is a taxonomy that has not been previously proposed, but which we hope has some heuristic value.

Distribution-based interpretations are those based on the statistical distributions of the results obtained from a given study. Most of these interpretations are permutations of the means and standard deviations of changes seen in a particular study or comparisons of study results to the means or standard deviations of some reference population. The most commonly cited of these measures is the effect size in which the importance of the change is scaled by comparing the magnitude of the change to the variability in stable subjects, for example on baseline or among untreated individuals. Guyatt et al.16 believe that this is more a measure of responsiveness and, at best, an underestimate of the minimal clinically important difference. Nevertheless, this measure has been used in a number of publications as an estimate or as evidence of clinical significance.9,17 Other similar definitions based on distribution statistics are listed in Table 2.

In contrast to these definitions, our second

Table 2. Distribution-based interpretations of clinical meaningfulness

Measure: Effect size
Definition: (Mean change with treatment) / (Variability in stable subjects)
Example: Effect-size benchmarks may serve as a set of guidelines for assessing the relative magnitude of changes

Measure: Statistical significance
Definition: Post-test result [relation symbol illegible in source] Pre-test result + 2 (within individual) SD
Example: A new therapy would happen by chance only 5% of the time and should alert the physician that the new therapy might be responsible for a negative impact on the quality-of-life of the patient.1

Measure: Reliable change index
Definition: $(\text{Post-test} - \text{Pre-test}) / (\sqrt{2}\, s_{\text{pre}} \sqrt{1 - r_{xx}})$, where $s_{\text{pre}}$ is the pre-test SD and $r_{xx}$ is the reliability of the measure
Example: This definition tells us whether the change reflects more than the fluctuations of an imprecise measuring instrument; that is, it measures whether the magnitude of change is statistically reliable.5

Measure: Proximity to mean
Definition: $(s_{\text{pre}} M_{\text{post}} + s_{\text{post}} M_{\text{pre}}) / (s_{\text{pre}} + s_{\text{post}})$
Example: The level of functioning subsequent to therapy places the individual closer to the mean of the functional population than it does to the mean of the dysfunctional population.5

Measure: 1 unit of change
Example: Change in scores associated with change from one of an ordered set of scenarios to a nearby scenario. This would represent one unit of effectiveness.3

Measure: Normative level of functioning
Definition: Test result [relation symbol illegible in source] Mean of normal population + 2 SD
Example: The level of functioning subsequent to therapy should fall within the range of the functional or normal population.6
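
Three of the distribution-based measures in Table 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the cited authors' code: the effect size follows the table's verbal definition, the reliable change index is written in the form published by Jacobson and Truax5 (the score difference divided by the square root of 2 times the standard error of measurement), and the cutoff uses the table's pre/post labels for the two reference distributions. All function names and all numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def effect_size(changes_treated, changes_stable):
    """Mean change with treatment divided by the SD of change in stable subjects."""
    return mean(changes_treated) / stdev(changes_stable)


def reliable_change_index(pre, post, sd_pre, reliability):
    """Score difference scaled by the standard error of the difference."""
    standard_error = sd_pre * math.sqrt(1.0 - reliability)
    return (post - pre) / (math.sqrt(2.0) * standard_error)


def proximity_cutoff(mean_pre, sd_pre, mean_post, sd_post):
    """SD-weighted point between two means; scores beyond it lie closer to the 'post' mean."""
    return (sd_pre * mean_post + sd_post * mean_pre) / (sd_pre + sd_post)


# Hypothetical change scores, invented for illustration only.
treated_changes = [4, 6, 5, 7, 3]
stable_changes = [1, -1, 2, -2, 0]

es = effect_size(treated_changes, stable_changes)
rc = reliable_change_index(pre=40, post=50, sd_pre=10, reliability=0.8)
cutoff = proximity_cutoff(mean_pre=40, sd_pre=10, mean_post=60, sd_post=5)
print(round(es, 2), round(rc, 2), round(cutoff, 2))
```

With these invented numbers the reliable change index falls below 1.96, the threshold conventionally used with this index, so a 10-point change on so noisy an instrument would not be counted as reliable even though the effect size looks large.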

broad taxonomic grouping is that of anchor-based interpretations of clinical meaningfulness (see Table 3). These definitions represent instances where the changes seen in quality of life measures were compared, or anchored, to other clinical changes or results. Similar to methods for establishing the validity of quality of life measures, these 'anchors' can be thought of in terms of construct, discriminative and predictive references.

The most commonly reported anchor-based interpretation is that suggested by Jaeschke et al.6 based on a patient and/or clinician global rating question. To use these authors' example, they looked at changes in a global assessment question over time and compared changes seen on their disease-specific questionnaire between four groups of patients defined as those having no change in their global health status, those having small changes in function (absolute changes of 1-3 response items on the global question), those having moderate changes in function (absolute changes of 4 or 5 response items) and those having large changes in function (absolute changes of 6 or 7 on the global question). Thus, the changes seen in a disease-specific questionnaire are 'anchored' to reported changes in overall health status. The clinically meaningful differences can be expressed as changes in the total questionnaire score or changes per item.

An interesting approach reported by Brook et al.11 in 1983 was to correlate observed life events seen in a population with changes on a concomitantly administered quality-of-life questionnaire. Other anchors may be time, as in the urine flow example above, or changes with therapy. For example, those patients who respond to digoxin therapy report an improvement as measured by the Chronic Heart Failure Questionnaire of 1.6 to 2.1 points.6 Thus changes seen with other therapies can be tied to the clinician's understanding of the efficacy of digoxin. Responses to questionnaires may also be anchored to diagnosis or the impact of living with a specific disease condition. Thus the responses of a patient with a less well-studied or less well-understood condition can be put in context with responses of patients with more familiar diseases or conditions.

Being able to predict future outcomes is an obvious anchor that has not been as commonly used as one would expect. Marder et al.14 describe levels at which changes on the Brief Psychiatric Rating Scale are shown to be predictive of a psychotic exacerbation within 4 weeks. Thus, a clinician may judge at which level of change to institute a change in the patient's management. Deyo and Inui15 correlated changes in the Sickness Impact Profile with changes in more traditional measures of functional status in patients with arthritis.

The interpretations of clinical significance listed


Table 3. Anchor-based interpretations of clinical meaningfulness

Measure: Global ratings
Example: The minimal clinically important difference on the CRQ/CHQ would be represented by changes in questionnaire score associated with global ratings of -3 to -1 and +1 to +3.6

Measure: Life events
Example: Measure the change in certain life events, such as family illness, divorce, loss of a job, having work-related problems, or death of a friend or spouse during the course of a clinical trial. These objective changes can then be correlated with changes in the QOL measures.10 For example, a three-point difference on standardized mental health scale = the impact of being fired or laid off from a job.11

Measure: Threshold effect
Example: As a score on the Brief Pain Inventory increases from 0 to 1 or from 1 to 2, none of seven interference questions are endorsed. However, a score of 3 on the BPI is related to interference in enjoyment, a score of 4 is related to enjoyment and work interference, while a score of 5 adds interference with sleep, activities and mood. Thus there is a real jump in interference when moving from 4 to 5 on this scale.12

Measure: Changes with time
Example: Disability in rheumatoid arthritis appears to increase by approximately 0.1 units on Health Assessment questionnaire each year for the first few years of disease and then rises more slowly, at a rate of approximately 0.02 units per year after that.13

Measure: Changes with therapy
Example: The heart failure score, as measured by Chronic Heart Failure Questionnaire, improved by a mean of 2.1 points, when patients received digoxin. In another study, the heart failure score improved 1.6 points when patients received digoxin.6

Measure: Disease conditions
Example: A 5-point difference on a standardised health perception scale = the effect of having been diagnosed as having hypertension. A 10-point difference on a standardised physical functioning scale = the effect of having chronic, mild osteoarthritis.11

Measure: Predictive: receiver operator characteristic curves
Example: True Positive = Questionnaire suggests prodrome at time of onset of psychotic exacerbation. False Positive = Questionnaire suggests prodrome that does not eventuate in a psychotic exacerbation. Numbers along the curve represent individual cutoff points.14

Measure: Predictive: correlation of changes
Example: Given a score improvement, how likely is this to be clinically correct? Using the predictive values of score changes, a 3-point SIP score change in most cases is equal or superior to the other scales. We have emphasised predictive value because it addresses the problem most often faced by a clinician: given a test result, how likely is it to be correct?15
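
Two of the anchors in Table 3 lend themselves to a short worked sketch in Python. The first function estimates a minimal clinically important difference as the mean questionnaire change among patients whose global rating changed by 1 to 3 points in either direction, following the global-rating idea described for Jaeschke et al.6; the patient records are invented and the function is our own illustrative construction, not those authors' procedure. The second function expresses a change as years of natural change, using the urine flow figures quoted in the text (a 3 ml/s improvement against a decline of 0.2-0.3 ml/s per year).

```python
from statistics import mean


def mcid_from_global_ratings(records, low=1, high=3):
    """Mean score change, aligned to rating direction, among small global changes.

    Each record is (global_rating_change, questionnaire_score_change).
    """
    small = [
        change if rating > 0 else -change  # align sign with the rating direction
        for rating, change in records
        if low <= abs(rating) <= high
    ]
    if not small:
        raise ValueError("no patients with a small global rating change")
    return mean(small)


def years_equivalent(change, annual_rate):
    """Express a change as the number of years of natural change it corresponds to."""
    if annual_rate <= 0:
        raise ValueError("annual_rate must be positive")
    return change / annual_rate


# Hypothetical (global rating change, score change per item) pairs.
records = [(0, 0.1), (2, 0.5), (-1, -0.4), (3, 0.6), (5, 1.2), (-2, -0.5)]
mcid = mcid_from_global_ratings(records)

# Urine flow example from the text: 3 ml/s gained, 0.2-0.3 ml/s lost per year.
fewest_years = years_equivalent(3.0, 0.3)
most_years = years_equivalent(3.0, 0.2)
print(round(mcid, 2), round(fewest_years), round(most_years))
```

Aligning each score change with the direction of the global rating, rather than taking absolute values, keeps discordant patients (rated better but scoring worse) from inflating the estimate.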

in Table 3 all relate the change in QOL not to the distribution of scores, but to some outside measure that is more clearly understood or familiar to their audience than the QOL scores themselves.

Recommendations

Ideally, an attempt should be made to describe changes in every QOL instrument in a way that will convey meaning to a clinician. Expressing changes in relation to an external anchor is of more value than the tautological reference of clinical importance back to statistical significance. It may even be that the relevant anchor, or the amount of change judged clinically significant, may change with different populations and that one may need to estimate clinical significance in different patient groups, just as one needs to validate the same questionnaire in each different patient group. This is particularly true as many scales are not completely linear. That is, a change of two units on the low end of the scale is rarely exactly equivalent to a change of two units on the high end of the scale.

In addition, the perspective of the researcher should be clear. If we are speaking about the impact of an intervention on a population, perhaps we should not talk of clinical significance, but of public health significance or economic significance. A change in a measure that has been calibrated to be meaningful in population terms would not be expected to have the same relevance on an individual basis.

Conclusions

In conclusion, we hope that by summarising efforts to date, we can provide a framework from which investigators can further refine means of assessing changes seen with QOL instruments. The need for defining clinically meaningful


changes is likely to receive increased attention from clinicians, patients, policy-makers, public health personnel and quality of life researchers in the future. This increased attention will spur investigators to develop new paradigms and methodologies to express quality of life changes in clinically meaningful terms.

In the meantime, we, as quality of life researchers, should be as clear as possible when speaking of clinically significant differences. We should not use the term loosely in the discussion without some framework for justifying why we believe our results are clinically significant. We should not confuse statistical significance with clinical significance. We should clarify our frame of reference. Finally, we should not apologise. Results from 'objective' tests are seen to be 'clinically meaningful' only because of historical context. The problem with defining clinical significance of quality of life changes has nothing to do with any innate inferiority of QOL as a type of measure. It is merely a reflection on the newness of these measures and our inexperience with them. As all parties involved gain increased familiarity with these measures, their clinical significance will become more obvious and less problematic.

References

1. Testa MA. Interpreting quality-of-life clinical trial data for use in the clinical practice of antihypertensive therapy. J Hypertens 1987; 5(Suppl 1): S9-S13.
2. Ware JE Jr, Manning WG Jr, Duan N, Wells KB, Newhouse JP. Health status and the use of outpatient mental health services. Am Psychol 1984; 39: 1090-1100.
3. Salsburg DS, Turner RS. Defining clinical meaningful units of change for health outcome research. Quality of Life Newsletter 1992; 3: 1.
4. Sackett DL, Haynes RB, Tugwell P. Clinical epidemiology: a basic science for clinical medicine. Boston/Toronto: Little, Brown & Co, 1985: 181-182.
5. Jacobson NS, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991; 59: 12-19.
6. Jaeschke R, Singer J, Guyatt GH. Measurement of health status: Ascertaining the minimal clinically important difference. Control Clin Trial 1989; 10: 407-415.
7. Gormley GJ, Stoner E, Bruskewitz RC, et al. The effect of finasteride in men with benign prostatic hyperplasia. N Engl J Med 1992; 327: 1185-1191.
8. Girman CJ, Panser LA, Chute CG, et al. Natural history of prostatism: Urinary flow rates in a community-based study. J Urology (in press).
9. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989; 27: S178-S189.
10. Testa MA, Lenderking WR. Interpreting pharmacoeconomic and quality-of-life clinical trial data for use in therapeutics. PharmacoEconomics 1992; 2: 107-117.
11. Brook RH, Ware JE Jr, Rogers WH, et al. Does free care improve adults' health? Results from a randomized controlled trial. N Engl J Med 1983; 309: 1426-1434.
12. Cleeland CS. Assessment of pain in cancer measurement issues. Adv Pain Res Ther 1990; 16: 47-55.
13. Spitz PW, Fries JF. The present and future comprehensive outcome measures for rheumatic diseases. Clin Rheumatol 1987; 6(Suppl 2): 105-11.
14. Marder SR, Mintz J, Van Putten R, Lebell M, Wirshing WC, Johnston-Cronk K. Early prediction of relapse in schizophrenia: an application of receiver operating characteristics (ROC) methods. Psychopharmacol Bull 1991; 27: 79-82.
15. Deyo RA, Inui TS. Toward clinical applications of health status measures: sensitivity of scales to clinical important changes. Health Svcs Res 1984; 19: 275-289.
16. Guyatt G, Walter S, Norman G. Measuring change over time: Assessing the usefulness of evaluative instruments. J Chron Dis 1987; 40: 171-178.
17. Fletcher A, Gore S, Jones D, Fitzpatrick R, Spiegelhalter D, Cox D. Quality of life measures in health care. II. Design, analysis, and interpretation. Br Med J 1992; 305: 1145-1148.

(Received 7 May 1993; accepted 2 June 1993)
