Lack of Efficacy of Acetaminophen in Treating Symptomatic Knee Osteoarthritis
A Randomized, Double-blind, Placebo-Controlled Comparison Trial With Diclofenac Sodium
John P. Case, MD; Algis J. Baliunas; Joel A. Block, MD

Background: Recommendations state that acetaminophen should be used in preference to nonsteroidal anti-inflammatory drugs in the initial treatment of symptomatic osteoarthritis (OA) of the hip or knee, because of lesser toxicity and the pervasive belief that acetaminophen is not only effective in treating OA pain but is of equal analgesic efficacy as nonsteroidal anti-inflammatory drugs.

Methods: This was a randomized, double-blind, placebo-controlled trial of diclofenac sodium, 75 mg twice daily, vs acetaminophen, 1000 mg 4 times daily, in 82 subjects with symptomatic OA of the medial knee. Osteoarthritis was quantitated radiographically, and subjects met stringent baseline pain criteria. The primary evaluation of efficacy used the Western Ontario and McMaster Universities Osteoarthritis Index, with evaluations at screening, baseline, and 2 and 12 weeks after treatment. Intention-to-treat analysis was used.

Results: Twenty-five subjects were randomized to diclofenac, 29 to acetaminophen, and 28 to placebo. The groups were closely matched for age, sex, body mass index, prior use of OA medications, baseline pain, and radiographic features. At 2 and 12 weeks, clinically and statistically significant (P<.001) improvements were seen in the diclofenac-treated group; however, no significant improvements were seen in the acetaminophen-treated group (P = .92 at 2 weeks and .19 at 12 weeks). Stratification of subjects according to baseline pain, prestudy OA medication, and radiographic grade showed no clear pattern of preferential response to diclofenac, and did not reveal a subset of subjects who responded to acetaminophen.

Conclusions: Diclofenac is effective in the symptomatic treatment of OA of the knee, but acetaminophen is not. A review of the literature reveals that there is scanty published evidence for a therapeutic effect of acetaminophen relative to placebo in patients with OA of the knee, because most published studies use active comparators (ie, nonsteroidal anti-inflammatory drugs) only. The advocacy of acetaminophen use in subjects with OA of the knee should be reconsidered pending further placebo-controlled studies.

Arch Intern Med. 2003;163:169-178

From the Section of Rheumatology, Rush Medical College, Rush University (Drs Case and Block), the Division of Rheumatology, Cook County Hospital (Dr Case), and Rush Medical College, Rush University (Mr Baliunas), Chicago, Ill.

Largely based on a single influential study by Bradley and colleagues,1,2 clinical guidelines for the pharmacological management of osteoarthritis (OA) of the hip and knee, emanating from both sides of the Atlantic Ocean, continue to emphasize the initial use of the pure analgesic, acetaminophen, over nonsteroidal anti-inflammatory drugs (NSAIDs).3-7 In the initial study by Bradley and colleagues,1 acetaminophen, 4000 mg/d, was equivalent to ibuprofen, 1200 or 2400 mg/d; the investigators subsequently reanalyzed their data to show that the magnitude of underlying knee pain did not alter the apparently equivalent pain relief afforded by the pure analgesic (acetaminophen) relative to analgesic/anti-inflammatory (NSAID) therapy.2 Until recently, only one other randomized double-blind trial comparing acetaminophen, 2600 mg/d, with an NSAID (naproxen, 750 mg/d) had been published in the English-language literature, a study by Williams and colleagues8 that also showed analgesic equivalence.

Despite the recommended practice guidelines and the underlying evidence from the randomized clinical trials, patients seem to prefer NSAIDs to acetaminophen in the symptomatic treatment of OA based on perceived increased efficacy, according to 2 studies9,10 that examined the issue. Moreover, recent data support the presence of a significant inflammatory component in the pathophysiology of OA, indirectly challenging the notion that the anti-inflammatory effect of an NSAID provides no added analgesic benefit relative to the pure analgesia of agents such as acetaminophen.11,12

Some studies8,10 suggest that tolerance, particularly gastrointestinal, of acetaminophen is greater than that of cyclooxygenase-nonselective NSAIDs. However, the advent and increasingly widespread use of cyclooxygenase 2–specific NSAIDs, such as celecoxib and rofecoxib, which seem to have a more favorable gastrointestinal tolerability profile,13,14 renders the toxicity profile of acetaminophen less compelling.

As a component of an investigation of the effect of pain relief on dynamic loading in patients with OA of the medial knee,15 we performed a randomized, double-blind, 12-week trial of diclofenac sodium, 75 mg twice daily; acetaminophen, 1000 mg 4 times daily; and placebo in patients with symptomatic radiographically defined OA of the medial knee across a broad spectrum of disease, similar to what would be seen in clinical practice.

PATIENT POPULATION

Subjects (aged 40-75 years) with unilaterally symptomatic idiopathic OA of the knee were drawn from the clinical patient population of the Section of Rheumatology, Rush-Presbyterian-St Luke’s Medical Center, Chicago. To be considered for the study, patients had to meet radiographic and clinical enrollment criteria. The radiographic criteria consisted of the presence of radiographic OA (modified Kellgren-Lawrence grade16,17 ≥1) in addition to evidence of medial compartment involvement, as evidenced by possible or definite medial joint space narrowing or osteophytes.18 The clinical criteria included the presence of preenrollment ambulatory pain (defined as a visual analog scale score of ≥30 mm on the 100-mm scale corresponding to question 1 of the Western Ontario and McMaster Universities Osteoarthritis Index19 [WOMAC] pain section [visual analog version 3.0]); moderate pain, by a 5-point Likert scale; or increased pain (defined as an increase of ≥10 mm by the visual analog scale or ≥1 by the Likert scale) during a 2-week washout period following discontinuation of preexisting analgesic and/or anti-inflammatory OA medications. In addition, patients had to be capable of independent ambulation without the aid of a cane or walker and had to fall within 1 SD of weight for their height and age according to the Metropolitan Life Insurance Company tables.20 Exclusion criteria included prior intolerance to either of the study medications or a history of an NSAID allergy or intolerance, functional class I or IV,21 history of peptic ulcer disease or of significant other gastrointestinal disease, significant hepatic abnormality (aspartate aminotransferase, alanine aminotransferase, or alkaline phosphatase level >20% above the upper limit of normal), renal insufficiency (creatinine level >1.2 mg/dL [>106 µmol/L]), hematologic disease (hemoglobin level <9 g/dL, white blood cell count <4.0 × 10³/µL, or platelet count <120 × 10³/µL), presence of joint disease other than OA, presence of joint replacements in the lower extremity, use of substances that might interfere with pain perception (tranquilizers, hypnotic agents, or excessive alcohol intake), and anticoagulation therapy. Subjects were not permitted the use of nonstudy pain medications during the trial.

The study was a randomized, double-blind, placebo-controlled trial consisting of a screening and enrollment visit, when all prestudy OA medications were discontinued (week −2); a randomization visit corresponding to baseline (and followed immediately by treatment allocation) (week 0); and 2 study visits while taking the assigned study medication (weeks 2 and 12). In addition, a telephone interview was performed at week 6 to document any medical events and to encourage follow-up for the final (12-week) study visit. Clinical assessments were performed at weeks −2 (for the purpose of enrollment and screening), 0, 2, and 12. The study was approved and monitored by the Rush-Presbyterian-St Luke’s Medical Center Human Investigation Committee (Institutional Review Board).

RADIOGRAPHIC ASSESSMENT

At baseline, standard anteroposterior weight-bearing x-ray films of the knees and the mechanical axis of the study extremity were obtained by standard clinical methods.22 X-ray films were interpreted in a random and blinded fashion by a rheumatologist (J.P.C.) experienced in the Kellgren-Lawrence scale,16 and assigned a grade according to the published method as modified for the knee by Felson et al,17 which weighs the presence of qualitative joint space narrowing equally with osteophytes in early (stage 1 or 2) OA.

Clinical assessment consisted of an interview and examination by a physician, and the supervised self-administration of the 2 disease-specific study instruments, the WOMAC19 (which is specific for OA of the hips or knees) and the Lequesne Algofunctional Index for the Knees (Lequesne index).23 These 2 validated indexes are the most widely used OA-specific instruments for outcome measurement in patients with OA and are recommended as core outcome measures by OMERACT (Outcome Measures in Rheumatoid Arthritis Clinical Trials).24 Subjects’ conditions were evaluated by the WOMAC and the Lequesne index at enrollment (week −2), baseline (week 0) (corresponding to randomization to treatment arm), and weeks 2 and 12. The WOMAC consists of 24 questions grouped in 3 subscores: pain (5 questions), stiffness (2 questions), and function (17 questions). The Lequesne index consists of 3 sections: pain or discomfort (which includes stiffness) (5 questions), walking (maximum capable distance and use of a walking aid, comprising 2 questions), and function (4 questions). The WOMAC was selected prospectively as the instrument of primary analysis; a recent study25 suggests that it has better responsiveness than the Lequesne index. In addition, subject and physician global assessment of arthritis activity was assessed at each visit by a 5-point Likert scale (ranging from “asymptomatic” to “very severe”). Assessment for the presence of ambulatory pain by a 5-point Likert scale was also an enrollment criterion, as previously described. Ancillary data included subject and physician global assessment of arthritis activity by a 5-point Likert scale.

TREATMENT ALLOCATION

Subjects were randomized to treatment with diclofenac sodium (Ciba-Geigy Corp, Summit, NJ), 75 mg twice daily; acetaminophen (McNeil Consumer Products Co, Fort Washington, Penn), 1000 mg (two 500-mg tablets) 4 times daily; or either or both of matching diclofenac and placebo capsules. The placebos were supplied by the respective manufacturers of the active medications and were identical in appearance to the active medications. All subjects, hence, took diclofenac or placebo, 1 tablet twice daily, and acetaminophen or placebo, 2 tablets 4 times daily (a total of 10 tablets daily). Compliance was assessed by pill count at each study visit.

STATISTICAL ANALYSES

By using the total WOMAC score as the primary outcome variable, it was estimated, based on pilot data, that to achieve 95% power to detect significant between-group differences with a

Table 1. Baseline Characteristics by Treatment Group*

Treatment Group

Diclofenac Sodium Acetaminophen Placebo

Characteristic (n = 25) (n = 29) (n = 28)
Age, y 62.9 ± 7.6 62.1 ± 11.4 61.7 ± 9.0
Male-female ratio 10:15 14:15 17:11
Height, m 1.68 ± 0.10 1.70 ± 0.11 1.73 ± 0.12
Weight, kg 76.9 ± 11.5 76.8 ± 13.0 81.3 ± 12.7
BMI 27.0 ± 2.6 26.4 ± 3.7 27.0 ± 2.8
Study knee, right-left ratio 13:12 17:12 12:16
Prestudy pain medications (week −2)†
None 7 (28) 7 (24) 10 (36)
NSAID or high-dose aspirin‡ 17 (68) 18 (62) 17 (61)
Acetaminophen 0 3 (10) 0
NSAID or high-dose aspirin‡ and acetaminophen 1 (4) 1 (3) 1 (4)
Prestudy WOMAC variables (week −2)
Pain 185.5 ± 87.8 158.9 ± 60.1 172.9 ± 108.5
Stiffness 95.1 ± 47.7 82.8 ± 55.5 95.3 ± 51.8
Function 643.0 ± 297.8 536.0 ± 260.3 655.2 ± 385.6
Total 923.6 ± 388.1 777.7 ± 350.6 923.4 ± 530.2
Radiographic features§
Modified Kellgren-Lawrence grade (0-4) 2.3 ± 0.8 2.0 ± 1.0 2.2 ± 0.8
Medial joint space narrowing (0-3) 1.8 ± 0.7 1.5 ± 0.8 1.5 ± 0.7
Medial osteophytes (0-3)
Femoral 0.9 ± 0.9 0.8 ± 1.1 0.8 ± 0.9
Tibial 0.7 ± 0.7 0.7 ± 0.6 0.9 ± 0.7
Lateral joint space narrowing (0-3) 0.5 ± 0.9 0.2 ± 0.4 0.4 ± 0.7
Lateral osteophytes (0-3)
Femoral 0.9 ± 1.0 0.8 ± 1.1 0.9 ± 1.1
Tibial 0.7 ± 0.7 0.5 ± 0.7 0.7 ± 1.0
Mechanical axis, degrees varus 4.4 ± 7.0 4.8 ± 5.3 4.1 ± 5.6

Abbreviations: BMI, body mass index (calculated as weight in kilograms divided by the square of height in meters); NSAID, nonsteroidal anti-inflammatory
drug; WOMAC, Western Ontario and McMaster Universities Osteoarthritis Index.
*Data are given as mean ± SD unless otherwise indicated. There were no significant between-group differences for any variable (P>.05).
†Data are given as number (percentage) of patients. Percentages may not total 100 because of rounding.
‡Defined as greater than 325 mg/d.
§Numbers in parentheses are ranges.

significance level of .05 would require approximately 23 subjects in each treatment group, defining a clinically significant difference as 20% improvement in the total WOMAC scale.26 Data analysis was performed on an intention-to-treat basis, with imputation of missing data points as suggested in the WOMAC guidelines27 and by using last-observation-carried-forward methods.28 A 1-way analysis of variance (ANOVA) was used to compare differences in treatment groups at 0 (baseline), 2, and 12 weeks, and groupwise changes between weeks 0 and 2 and weeks 0 and 12. All post hoc analyses were performed assuming equal variances using the Bonferroni method. Paired-sample t testing (2 tailed) was used to compare changes in the WOMAC and Lequesne index and their subsections between weeks 0 and 2 and weeks 0 and 12.

Software used for analysis was Statistical Product and Service Solutions, version 10 (SPSS Inc, Chicago, Ill). Statistical significance was defined as P≤.05. Clinical significance was interpreted as an improvement of at least 20% in the study variable.26

BASELINE SUBJECT CHARACTERISTICS AND STUDY FLOW

Eighty-two subjects met the clinical and radiographic criteria at enrollment (week −2) and were randomized (week 0): 25 to diclofenac, 29 to acetaminophen, and 28 to placebo. The characteristics of the subjects at enrollment and randomization are given in Table 1. There were no significant differences between the groups in any of the indicated variables. Overall, more subjects took NSAIDs or high-dose (>325 mg/d) aspirin (55 subjects [67%]) than acetaminophen (3 subjects [4%]) or no prestudy OA medications (24 subjects [29%]); subjects taking NSAIDs or high-dose aspirin and acetaminophen (n = 3) were included in the NSAID/high-dose aspirin group in the percentile analysis.

Seven subjects withdrew from the trial between weeks 0 and 2 (3 in the diclofenac-treated group and 2 each in the acetaminophen-treated and placebo groups). Fourteen subjects withdrew from the trial between weeks 2 and 12 (2 in the diclofenac-treated group, 5 in the acetaminophen-treated group, and 7 in the placebo group). Overall, during the 12-week trial, 5 subjects withdrew from the diclofenac-treated group, 7 from the acetaminophen-treated group, and 9 from the placebo group. Adverse effects were the leading cause of withdrawal from the diclofenac-treated group (3 of 5 subjects) and inefficacy from the acetaminophen-treated group (5 of 7 subjects). The 9 withdrawals from the placebo group were due to inefficacy (n=4), nonknee pain (n=2), and other reasons (n=3). None of the groupwise differences reached the level of statistical significance (P=.11, by ANOVA).

Table 2. Clinical Response at Week 2 and Week 12 Compared With Baseline (Week 0), by Treatment Group

Variable | Week 0 Score* | Week 2 Score* | Week 2 − Week 0 Difference* | P Value | % Improvement† | Week 12 Score* | Week 12 − Week 0 Difference* | P Value | % Improvement†
Diclofenac Sodium–Treated Patients (n = 25)
WO Pain | 199.8 ± 101.5 | 139.6 ± 105.2 | −60.2 ± 68.3 | <.001 | 30.1 | 146.0 ± 101.2 | −53.9 ± 79.3 | .002 | 27.0
WO Stiffness | 97.1 ± 50.4 | 66.1 ± 49.5 | −31.0 ± 31.8 | <.001 | 31.9 | 66.4 ± 48.6 | −30.7 ± 39.4 | .001 | 31.6
WO Function | 669.3 ± 371.6 | 499.8 ± 395.6 | −169.6 ± 185.0 | <.001 | 25.3 | 506.3 ± 383.2 | −163.0 ± 201.5 | <.001 | 24.4
WO Sum | 966.2 ± 498.7 | 705.5 ± 535.8 | −260.7 ± 268.0 | <.001 | 27.0 | 718.6 ± 515.7 | −247.6 ± 294.1 | <.001 | 25.6
LeQ Pain | 4.7 ± 1.9 | 3.6 ± 2.3 | −1.1 ± 2.0 | .01 | 23.4 | 4.0 ± 2.0 | −0.7 ± 2.0 | .11 | ...
LeQ Walking | 2.0 ± 1.6 | 1.7 ± 1.5 | −0.2 ± 1.0 | .23 | ... | 1.8 ± 1.6 | −0.2 ± 1.7 | .57 | ...
LeQ Function | 3.6 ± 1.2 | 1.7 ± 1.5 | −1.9 ± 1.7 | <.001 | 52.8 | 3.3 ± 1.3 | −0.4 ± 1.2 | .15 | ...
LeQ Sum | 10.3 ± 3.5 | 8.6 ± 4.3 | −1.7 ± 3.0 | .009 | 16.5 | 9.0 ± 4.0 | −1.2 ± 3.0 | .05 | 11.7
Acetaminophen-Treated Patients (n = 29)
WO Pain | 210.8 ± 86.3 | 206.1 ± 101.2 | −4.7 ± 58.4 | .67 | ... | 186.9 ± 121.5 | −23.8 ± 83.2 | .13 | ...
WO Stiffness | 95.7 ± 57.2 | 88.9 ± 57.2 | −6.9 ± 30.9 | .24 | ... | 86.8 ± 59.3 | −8.9 ± 24.2 | .06 | ...
WO Function | 657.0 ± 262.5 | 664.8 ± 315.3 | 7.8 ± 123.1 | .74 | ... | 615.2 ± 360.2 | −41.8 ± 205.6 | .28 | ...
WO Sum | 963.5 ± 374.8 | 959.8 ± 452.5 | −3.8 ± 197.1 | .92 | ... | 889.0 ± 520.4 | −74.6 ± 300.0 | .19 | ...
LeQ Pain | 4.9 ± 2.0 | 4.7 ± 2.3 | −0.3 ± 1.9 | .45 | ... | 4.3 ± 2.6 | −0.8 ± 2.3 | .08 | ...
LeQ Walking | 1.9 ± 1.6 | 1.9 ± 1.6 | −0.1 ± 1.3 | .78 | ... | 2.1 ± 1.5 | 0.2 ± 1.8 | .52 | ...
LeQ Function | 3.7 ± 0.9 | 1.9 ± 1.6 | −1.7 ± 1.8 | <.001 | 45.9 | 3.4 ± 1.5 | −0.3 ± 1.5 | .28 | ...
LeQ Sum | 10.4 ± 3.3 | 9.9 ± 4.0 | −0.6 ± 3.4 | .36 | ... | 9.6 ± 4.7 | −1.0 ± 4.0 | .22 | ...
Placebo-Treated Patients (n = 28)
WO Pain | 198.6 ± 110.9 | 197.1 ± 118.8 | −1.5 ± 52.3 | .88 | ... | 183.4 ± 122.9 | −15.3 ± 98.7 | .42 | ...
WO Stiffness | 97.8 ± 49.3 | 88.1 ± 50.4 | −9.6 ± 27.6 | .08 | ... | 80.6 ± 50.9 | −17.1 ± 41.4 | .04 | 17.5
WO Function | 697.1 ± 375.2 | 661.5 ± 359.4 | −35.6 ± 129.9 | .16 | ... | 611.5 ± 365.4 | −85.6 ± 223.2 | .05 | ...
WO Sum | 993.5 ± 519.7 | 946.8 ± 516.3 | −46.8 ± 197.3 | .22 | ... | 875.5 ± 520.5 | −118.0 ± 348.9 | .08 | ...
LeQ Pain | 5.0 ± 2.0 | 4.9 ± 2.3 | −0.1 ± 2.2 | .79 | ... | 4.7 ± 2.4 | −0.3 ± 2.3 | .77 | ...
LeQ Walking | 1.9 ± 1.6 | 1.9 ± 1.8 | 0.0 ± 1.1 | .86 | ... | 1.8 ± 1.9 | −0.1 ± 1.3 | .88 | ...
LeQ Function | 3.0 ± 1.6 | 1.9 ± 1.8 | −1.1 ± 1.7 | .002 | 36.7 | 3.1 ± 1.7 | 0.1 ± 1.1 | .73 | ...
LeQ Sum | 9.9 ± 4.1 | 10.0 ± 4.7 | −0.1 ± 2.7 | .89 | ... | 9.6 ± 4.9 | −0.3 ± 3.5 | .66 | ...

Abbreviations: LeQ, Lequesne Algofunctional Index for the Knees; WO, Western Ontario and McMaster Universities Osteoarthritis Index.
*Data are given as mean ± SD.
†Improvement from baseline; tabulated only for statistically significant results (P≤.05).
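
The "% Improvement" column in Table 2 appears to be the mean change divided by the week-0 mean, and the article defines clinical significance as an improvement of at least 20%. The sketch below checks that reading against two total-WOMAC rows of Table 2. The function names are illustrative, and dividing group means (rather than averaging per-subject percentages) is an assumption that happens to match the tabulated values.

```python
def percent_improvement(baseline_mean: float, mean_change: float) -> float:
    """Percent improvement from baseline, rounded to one decimal.

    A negative change means a lower (better) score, so the sign is flipped
    to report improvement as a positive percentage.
    """
    return round(-mean_change / baseline_mean * 100, 1)


def clinically_significant(pct: float, threshold: float = 20.0) -> bool:
    """Apply the 20% improvement criterion used in the article."""
    return pct >= threshold


# Total WOMAC, week 2 vs week 0, values taken from Table 2.
diclofenac = percent_improvement(966.2, -260.7)
acetaminophen = percent_improvement(963.5, -3.8)

print(diclofenac, clinically_significant(diclofenac))
print(acetaminophen, clinically_significant(acetaminophen))
```

The table reports a percentage only where the change was statistically significant, so the acetaminophen value computed here has no tabulated counterpart.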

Pill counts performed at each study visit demonstrated greater than 90% compliance, and there was no difference in compliance between the 2- and 4-times-daily medications (data not shown).

CLINICAL RESPONSE

Using the primary end point, the WOMAC, only the diclofenac-treated group was significantly improved at 2 and 12 weeks compared with baseline (Table 2). This was true for the 3 WOMAC subscales and the total WOMAC. At a significance level of P<.001, clinically significant improvement (≥20%) was seen at 2 weeks in the pain, stiffness, and function subscales and in the total WOMAC. Similar improvement was seen at 12 weeks in the diclofenac-treated group in the pain, stiffness, and function subscales and in the total WOMAC, although the level of significance was somewhat less for pain and stiffness than for function and the total WOMAC. Although the total Lequesne index was significantly improved in the diclofenac-treated group at 2 weeks and marginally improved at 12 weeks, there was variability in the significance level of the Lequesne index subscales. The Lequesne index pain and function subscales were statistically significantly improved at 2 weeks, but not at 12 weeks, and the Lequesne index distance-walked subscale was not statistically significantly improved at either time point. Clinically, greater than 20% improvement was seen in the Lequesne index pain and function subscales at 2 weeks only. Twenty percent clinical improvement in the total Lequesne index was not seen at either time point.

In marked contrast, in the acetaminophen-treated group, the results by WOMAC (subscales and sum) at 2 and 12 weeks were statistically and clinically insignificant. These results were indistinguishable from those for placebo. At 2 weeks, the Lequesne index function subscale improved in the acetaminophen-treated group and in the placebo group. The Lequesne index pain, distance-walked, and sum variables were never statistically or clinically significantly improved from baseline, mirroring the results in the placebo group. These results are summarized in Table 2.

Figure. Mean ± SD Western Ontario and McMaster Universities Osteoarthritis Index pain (A), stiffness (B), and function (C) subscale scores and total score (D), by treatment group over time. P values are shown where significant. [Four bar-chart panels; x-axis: treatment group (diclofenac sodium, acetaminophen, placebo); y-axis: score; series: week 0 (baseline), week 2, and week 12.]

The groupwise responses over time in the WOMAC pain, stiffness, and function subscales and in the total WOMAC are shown graphically in the Figure.

There were no significant between-group changes in subject or physician global assessment of arthritis activity as analyzed by ANOVA (data not shown), with the exception of a difference in the physician global assessment at week 12 relative to week 0 that favored the acetaminophen-treated group (a mean ± SD improvement of 2.7 ± 0.5) relative to the diclofenac-treated group (a mean ± SD improvement of 2.4 ± 0.5), a difference of borderline statistical significance (P = .04). The mean ± SD change in the placebo group (2.4 ± 0.5) at 12 weeks was not significantly different from the change in the diclofenac- and acetaminophen-treated groups (P = .81 and .19, respectively). There were no between-group differences in the subject or physician global assessments at 2 weeks. To determine whether the response to acetaminophen or to diclofenac was in part dependent on the degree of underlying OA pain,2 subjects were stratified into tertiles of baseline pain according to the WOMAC (where the maximum pain section score is 500). The tertiles corresponded to 144.9 or less for the lowest pain tertile (27 subjects in the 3 treatment groups), 145.0 to 230.9 for the middle pain tertile (28 subjects), and 231.0 or greater for the highest pain tertile (27 subjects).

For the diclofenac-treated group, in the lowest pain tertile (n = 10), there was significant improvement by t test between weeks 0 and 2 in the WOMAC pain (52.7% improvement, P = .002), stiffness (44.7% improvement, P = .02), and function (40.6% improvement, P = .02) subscales and in the total WOMAC (43.4% improvement, P = .007). At 12 weeks, persistent marginally significant improvement was seen in the pain (29.4%, P = .04) and stiffness (24.1%, P = .05) subscales, but not in the function subscale (P = .08) or in the total WOMAC (P = .06). In the middle pain tertile, statistically significant improvement at 2 weeks was seen in the WOMAC stiffness (29.7% improvement, P = .03) and function (24.7% improvement, P = .02) subscales and in the total WOMAC (23.4% improvement, P = .02). At 12 weeks, statistically significant improvement was seen in the WOMAC function subscale only (26.3% improvement, P = .03). Finally, in the highest pain tertile, statistically significant improvement was seen at 2 weeks in the pain (29.4% improvement, P = .04) and stiffness (24.1% improvement, P = .05) subscales and at 12 weeks in the WOMAC pain (29.0% improvement, P = .03) and stiffness (24.4% improvement, P = .04) subscales and in the total WOMAC (18.8% improvement, P = .05). Responses at 2 weeks are shown by P value in Table 3.

In the acetaminophen-treated and placebo groups, in contrast, there were no statistically significant changes in the WOMAC pain, stiffness, or function subscales or in the total WOMAC at 2 or at 12 weeks for any of the 3 pain tertiles (data not shown).

Application of ANOVA according to pain tertile yielded results similar to the t test: diclofenac efficacy was inconsistent and not clearly related to the degree of baseline pain, shown in Table 4, at 2 weeks. Acetaminophen was never superior to placebo (data not shown). A

Table 3. Stratification by Baseline Variables of WOMAC Response (Using the t Test)
at Week 2 in the Diclofenac Sodium–Treated Group

P Value†

Baseline Variable No. of Subjects* Pain Stiffness Function Total

Pain tertile
Lowest 10 .002 .02 .02 .007
Middle 8 .11 .03 .02 .02
Highest 7 .04 .05 .08 .06
Prestudy OA medications
Non-NSAIDs‡ 7 .12 .08 .05 .05
NSAIDs 17 .001 .001 .002 .001
Kellgren-Lawrence grade
1 or 2 12 .009 .006 .007 .005
3 or 4 7 .03 .02 .01 .01
Medial joint space narrowing
0 or 1 7 .04 .02 .005 .009
2 or 3 12 .004 .008 .03 .02

Abbreviations: NSAID, nonsteroidal anti-inflammatory drug; OA, osteoarthritis; WOMAC, Western Ontario and McMaster Universities Osteoarthritis Index.
*In the diclofenac sodium–treated group.
†For statistically significant (boldface) results (P≤.05).
‡Subjects taking no prestudy OA medications (n = 24) or acetaminophen (n = 3).
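
The pain-tertile rows in Table 3 rely on the baseline WOMAC pain cut points given in the text (144.9 or less, 145.0 to 230.9, and 231.0 or greater, on a 0-500 scale). A minimal sketch of that grouping follows. The function name is hypothetical, and scores are assumed to be recorded to one decimal place so that no value falls between adjacent bounds.

```python
def pain_tertile(womac_pain: float) -> str:
    """Assign a baseline WOMAC pain score (0-500) to the article's tertiles."""
    if not 0 <= womac_pain <= 500:
        raise ValueError("WOMAC pain section score must be between 0 and 500")
    if womac_pain <= 144.9:
        return "lowest"
    if womac_pain <= 230.9:
        return "middle"
    return "highest"


# Hypothetical baseline scores, not taken from any study subject.
for score in (120.0, 199.8, 310.5):
    print(score, pain_tertile(score))
```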

Table 4. Stratification by Baseline Variables of WOMAC Response (Using ANOVA) at Week 2 in the Diclofenac Sodium–Treated Group

P Value†

Baseline Variable Comparison* Pain Stiffness Function Total

Pain tertile
Lowest (n = 27) Diclofenac with acetaminophen .05 .16 .054 .03
Diclofenac with placebo .04 .50 .06 .05
Middle (n = 28) Diclofenac with acetaminophen .49 .11 .02 .06
Diclofenac with placebo .23 .19 .05 .07
Highest (n = 27) Diclofenac with acetaminophen .11 .96 .09 .11
Diclofenac with placebo .09 .71 .64 .37
Prestudy OA medications
None or acetaminophen* (n = 27‡) Diclofenac with acetaminophen .15 .23 .007 .01
Diclofenac with placebo .80 .99 .02 .08
NSAIDs (n = 52) Diclofenac with acetaminophen .01 .10 .004 .004
Diclofenac with placebo .004 .08 .80 .03
Kellgren-Lawrence grade
1 or 2 (n = 47) Diclofenac with acetaminophen .002 <.001 <.001 <.001
Diclofenac with placebo .003 .002 .002 .001
3 or 4 (n = 24) Diclofenac with acetaminophen .85 >.999 .28 .42
Diclofenac with placebo .37 >.999 .48 .46
Medial joint space narrowing
0 or 1 (n = 34) Diclofenac with acetaminophen .005 <.001 <.001 <.001
Diclofenac with placebo .002 <.001 <.001 <.001
2 or 3 (n = 34) Diclofenac with acetaminophen .63 .99 .41 .47
Diclofenac with placebo .29 .99 .70 .56

Abbreviations: ANOVA, analysis of variance; NSAID, nonsteroidal anti-inflammatory drug; OA, osteoarthritis; WOMAC, Western Ontario and McMaster
Universities Osteoarthritis Index.
*Acetaminophen was never significantly different from placebo.
†For statistically significant (boldface) results (P≤.05).
‡Three patients took no prestudy OA medications.
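
The post hoc power validation reported in the Results takes 20% of the baseline acetaminophen total WOMAC (963.5) as the difference to detect, an SD of 268.0, and a group size of 29. The article does not state which formula was used. The sketch below assumes a 1-sided, one-sample normal (z) approximation at a significance level of .05, and the function name is illustrative. Under that assumption the result is consistent with the reported power of greater than 98%.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_power(delta: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a 1-sided z test to detect a mean difference.

    delta: difference to detect; sd: SD of the difference; n: group size.
    Assumption: normal approximation, not the exact noncentral t.
    """
    z = NormalDist()
    noncentrality = delta / sd * sqrt(n)
    critical = z.inv_cdf(1 - alpha)
    return z.cdf(noncentrality - critical)


delta = 0.20 * 963.5  # 20% of the baseline acetaminophen total WOMAC
power = one_sided_power(delta, sd=268.0, n=29)
print(f"difference = {delta:.1f}, power = {power:.3f}")
```

An exact t-based calculation would give a slightly lower figure, so this should be read as a plausibility check rather than a reproduction of the authors' computation.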

sample size limitation may have compromised the power tion, n=24). Because of the few individuals who took ace-
to detect a difference in the diclofenac-treated groups by taminophen before the study, these subjects were grouped
t test and ANOVA. along with the no prestudy OA medication group to form
Subjects were stratified according to prestudy OA a non–NSAID-pretreated group (n=27). By t test, there
medication (NSAID, n = 52; acetaminophen, n =3; or no were no significant differences at 2 or 12 weeks for the
analgesic or anti-inflammatory prestudy OA medica- non–NSAID-pretreated subjects in the diclofenac- or ace-

taminophen-treated groups (data not shown). In the COMMENT
NSAID-pretreated subgroup, in contrast, a statistically sig-
nificant response was seen in treatment with diclofenac, This study is in agreement with prior studies30-32 in
but not with acetaminophen (shown for diclofenac at week showing that diclofenac, 75 mg twice daily, effectively
2 in Table 3). An ANOVA in general was similar, with a relieved knee pain in patients with OA at 2 and 12
more consistent response to diclofenac relative to ace- weeks and resulted in improvements in stiffness and
taminophen or placebo seen in the NSAID-pretreated sub- function; in contrast, acetaminophen, 1 g 4 times daily,
group compared with the non–NSAID-pretreated group did not differ from placebo in any of these variables.
(shown at 2 weeks in Table 4). Differences as a function More subjects withdrew from the diclofenac-treated
of the prestudy OA medication regimen may, however, group because of intolerance to the medication rather
represent an artifact of sample size, there being roughly than inefficacy; the opposite was true in the acetamino-
twice as many subjects in the NSAID-pretreated than the phen-treated group.
non–NSAID-pretreated groups. Consistent with an apparent patient preference for
Grouping by Kellgren-Lawrence grade into ques- NSAID therapy relative to acetaminophen therapy in those
tionable or mild radiographic OA (grades 1 or 2 [n=47]) with OA,9,10 2 recently published studies demonstrate
and moderate or severe radiographic OA (grades 3 or 4 that greater efficacy is obtained with NSAIDs than with
[n=24]) showed significant improvement in both group- acetaminophen. Pincus et al33 showed that diclofenac (ad-
ings at 2 and 12 weeks in the diclofenac-treated group ministered in a regimen identical to that used in the pres-
only (t test; shown for 2 weeks for diclofenac in Table ent study) plus misoprostol was superior to acetamino-
3). The relative efficacy of diclofenac to acetaminophen phen in OA of the hip or knee in a randomized crossover
and to placebo was reflected similarly by ANOVA at 2 study, at a cost of greater adverse effects. Similarly, Geba
weeks, but not at 12 weeks (data shown for 2 weeks in Table 4).

Finally, stratification by degree of qualitative medial joint space narrowing18 as 0 or 1 (n = 34) or 2 or 3 (n = 34) yielded results by t test at 2 weeks similar to those demonstrated in subgroup stratification by Kellgren-Lawrence grade: diclofenac was effective regardless of the degree of joint space narrowing (Table 3), and acetaminophen was conversely ineffective (data not shown). The efficacy of diclofenac was not seen, however, at 12 weeks (data not shown). The ANOVA by degree of joint space narrowing showed diclofenac superior to acetaminophen and to placebo at 2 weeks for grade 0 or 1 joint space narrowing, but not for greater degrees of qualitative joint space narrowing (Table 4). The efficacy of diclofenac relative to acetaminophen and placebo in this subanalysis was not seen at 12 weeks. Sample size limitation may explain the lack of statistically significant efficacy at 12 weeks or, alternately, waning of therapeutic efficacy with time.29

POST HOC POWER VALIDATION

To determine the power of the study to reject the hypothesis of no difference between the acetaminophen and placebo groups, a post hoc analysis based on the data of Table 2 was performed: 20% improvement26 in the initial mean total WOMAC for the acetaminophen-treated group (963.5) represents a difference at 2 weeks of 192.7. The SD of this hypothetical difference was taken as 268.0, representing the SD of the difference in the total WOMAC for the diclofenac-treated group at 2 weeks. This SD is larger than that of the observed SD of the difference at 2 weeks in the acetaminophen-treated group (197.1) and, hence, is more conservative. For a size of 29 subjects in the acetaminophen-treated group, these estimates yield a calculated power to reject the 1-sided null hypothesis of no difference between the acetaminophen-treated and placebo groups of greater than 98%.

et al,34 in a randomized double-blind trial of acetaminophen and 2 cyclooxygenase 2–specific NSAIDs, celecoxib and rofecoxib, demonstrated that rofecoxib, 25 mg/d, was superior to acetaminophen and had a similar tolerability profile.

On the other hand, the 2 older studies cited earlier, by Bradley et al1 and Williams et al,8 reached the apparently opposite conclusion, namely, that NSAIDs (ibuprofen and naproxen, respectively) are of no greater efficacy in OA of the knee than is acetaminophen. Reconciling the seemingly disparate results of the 2 recently published studies and this study with the older and widely cited earlier investigations3-7 may be approached by careful consideration of the differing underlying methods (summarized in Table 5). Systematically, these may be broadly classified as differences in (1) study design and analysis, (2) patient and disease selection, and (3) outcome measurement and interpretation.

Regarding study design and analysis, to our knowledge, this study represents only the second published, randomized, double-blind, placebo-controlled trial of acetaminophen in patients with OA (MEDLINE search, keywords osteoarthritis or osteoarthrosis and acetaminophen or paracetamol), following a study in 25 patients with OA of the knee by Amadio and Cummings35 that showed efficacy of acetaminophen at a dose of 1 g four times daily (the dose used in the present study). The studies by Bradley,1 Williams,8 and Geba34 and colleagues were randomized double-blind studies without a placebo, and the study by Pincus et al33 was a randomized, double-blind, crossover study (also without a placebo). Hence, all comparison studies of NSAIDs vs acetaminophen subsequent to that of Amadio and Cummings are grounded in the fundamental belief of the validity of that 1 small study.

Second, research practice dictates that primary analysis be undertaken with intention to treat, generally using last-observation-carried-forward methods for data imputation.36,37 The older studies either did not use8 or did not indicate the use1,35 of such methods (which had not yet been widely accepted as the standard); more re-
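The post hoc power estimate reported under Post Hoc Power Validation can be approximately reproduced from the quoted figures. The sketch below is only a rough check, not the authors' method: it assumes a 1-sample test on the within-group change score, a 1-sided significance level of .05, and a normal approximation to the test statistic, none of which is stated in the text. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures quoted in the text
baseline_mean = 963.5          # initial mean total WOMAC, acetaminophen group
improvement_fraction = 0.20    # hypothesized 20% improvement
sd_diff = 268.0                # SD of the 2-week difference (diclofenac group)
n = 29                         # subjects in the acetaminophen-treated group

# Assumption (not stated in the text): 1-sided alpha of .05
alpha = 0.05

# Hypothesized difference at 2 weeks (the text reports 192.7)
delta = improvement_fraction * baseline_mean

# Standardized effect size and its noncentrality for a 1-sample design
effect_size = delta / sd_diff
noncentrality = effect_size * sqrt(n)

# Normal-approximation power for a 1-sided test
z_alpha = NormalDist().inv_cdf(1 - alpha)
power = NormalDist().cdf(noncentrality - z_alpha)

print(f"delta = {delta:.1f}, effect size = {effect_size:.3f}, power = {power:.3f}")
```

Under these assumptions the computed power exceeds 0.98, in line with the "greater than 98%" stated in the text. An exact calculation based on the noncentral t distribution would give a somewhat lower value than the normal approximation.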

Table 5. Published Trials of Acetaminophen and Acetaminophen/NSAIDs

| Source | No. of Subjects | Design | Placebo | ITT/LOCF Analysis | OA Severity: X-ray Film | OA Severity: Clinical | Site | Duration, wk | Outcome Measure | Improvement*: Acetaminophen | Improvement*: NSAID | NSAID Superior† |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Amadio and Cummings,35 1983 | 25 | RDB | Yes | ? | Bilateral OPs and JSN | Pain, tenderness, and swelling or warmth | Knee | 6 | ? | + | NA | NA |
| Bradley et al,1 1991 | 184 | RDB | No | ? | K-L 2 or 3 | “Knee pain” | Knee | 4 | HAQ pain (Likert scale) | + | + | − |
| | | | | | | | | | Walking pain (Likert scale) | − | + | + |
| | | | | | | | | | Rest pain (Likert scale) | − | + | − |
| Williams et al,8 1993 | 178 | RDB | No | No/No | OPs | “Knee pain” | Knee | 6 | Rest VAS | − | + | + |
| | | | | | | | | | Walking VAS | + | + | − |
| Pincus et al,33 2001 | 227 | RDB crossover | No | ? | K-L 2-4 | VAS pain score, 30 of | Knee and hip | 6 | Targeted WOMAC | NA | NA | + |
| Geba et al,34 2002 | 382 | RDB | No | Yes/No | No | ACR clinical criteria, degree of pain varied by prestudy | Knee | 6 | WOMAC subscales (not the whole instrument) | + | + | + |
| Present study | 82 | RDB | Yes | Yes/Yes | K-L 1-4 | VAS and Likert | Knee | 12 | WOMAC > LeQ | − | + | + |

Abbreviations: ACR, American College of Rheumatology; HAQ, Health Assessment Questionnaire; ITT, intention to treat; JSN, joint space narrowing; K-L, Kellgren-Lawrence grade; LeQ, Lequesne Algofunctional Index for the Knees; LOCF, last observation carried forward; minus sign, absence; NA, data not applicable; NSAID, nonsteroidal anti-inflammatory drug; OA, osteoarthritis; OPs, osteophytes; plus sign, presence; question mark, unknown; RDB, randomized double-blind trial; VAS, visual analog scale.
*The presence or absence of a statistically significant difference from baseline.
†The presence or absence of a statistically significant superiority of an NSAID to acetaminophen.

cently, Pincus et al33 did not indicate the use of intention to treat/last observation carried forward and Geba et al34 used a modified intention-to-treat protocol that did not use last observation carried forward for the primary (WOMAC) data. Finally, the efficacy of pharmacological therapy in patients with OA may wane with time29; however, the studies cited were all of roughly the same medium-term duration. The present study (at 12 weeks) is longer. Nevertheless, there seemed to be some diminution in the efficacy of diclofenac at 12 weeks relative to 2 weeks (Table 2 and the Figure).

Concerning patient and disease selection, these differed substantially in the NSAID/acetaminophen studies cited. All but the study by Geba et al34 used radiographic criteria of various degrees of rigor, and all required the presence of knee pain (or, in the case of the study by Amadio and Cummings,35 tenderness, swelling, or warmth). The studies by Geba34 and Pincus33 and colleagues and the present study required a quantitated minimum degree of knee pain. The pain requirement, however, differed in the study by Geba et al, depending on whether the prestudy OA medication was acetaminophen or an NSAID, which may have potentially affected the differential pain responses in the researchers’ noncrossover design. All 5 published studies and the present one were of OA of the knee; OA of the hip was the disease in a few (22%) of the subjects in the study by Pincus et al. Most (55 [67%]) of the subjects in the present study were taking NSAIDs or high-dose aspirin before the study, which may reflect relatively severe manifestations of disease and/or the rheumatology clinic population out of which the patients were recruited.38 Of the studies in Table 5, only the study by Geba et al describes the nature of the subjects’ prestudy OA medications: 77% were taking NSAIDs in that study, possibly reflecting disease more severe than that in the present study. Prior treatment with diclofenac did not abrogate an apparent clinical response on subsequent treatment with acetaminophen in the study by Pincus et al.

Last, in outcome measurement and, above all, in interpretation, there were differences between the present and prior studies. Numerous investigations25,39-41 have demonstrated the superiority in patients with OA of disease-specific instruments such as the Lequesne index and, especially, the WOMAC. The older studies1,8,35 in Table 5 appeared before the widespread acceptance of these instruments and perforce relied on less accurate measures, such as a single-question visual analog scale or Likert scale scores. The reliance on the joint-specific targeted WOMAC (directed toward a single joint rather than toward the whole lower extremities in the conventional WOMAC application19), as used by Pincus et al,33 is unique among the recent studies, but has been used successfully in the evaluation of surgical outcome in patients with OA of the hip.42

The most compelling explanation for the disparate results of these studies (ie, whether acetaminophen is as effective as NSAIDs or, indeed, is an effective

