ABSTRACT
The Oxford Hip Score (OHS) and Oxford Knee Score (OKS) are validated, reliable and reproducible
outcome measures; however, their retrospective use has not been examined. The aim of this
prospective cohort study was to examine how accurately and reliably patients can recall their
pre-operative OHS and OKS retrospectively. A total of 137 patients undergoing primary hip (40) or
primary knee (97) arthroplasty with a mean age of 70.8 years (range, 47–88) and a mean time to
follow up of 27.2 months (range, 6–46) were included in the study. The mean retrospective OHS and
OKS decreased compared to the pre-operative score (OHS = 1.6 SD, p = 0.36; OKS = 4.7 SD, p < 0.001).
There was only a weak positive relationship between the actual pre-operative scores and the
retrospective scores (OHS: r² = 0.30, OKS: r² = 0.19). Bland–Altman analysis demonstrated 95% limits
of agreement between scores of −19.9 to 23.1 for the OHS and −15.3 to 24.8 for the OKS. This study
shows that patients are poor at retrospectively recalling their pre-operative OHS and OKS and
therefore these scores should not be used in a retrospective manner.
¶ Correspondence to: Travis M. Falconer, Perth Orthopaedic & Sports Medicine, Center, Sir Charles Gairdner Hospital,
31 Outram Street, West Perth, WA 6008, Australia.
T. M. Falconer et al.
Retrospective Recall of OHS and OKS
day before their joint replacement. They would have to remember their symptoms prior to their
hip or knee replacement as opposed to filling the survey out based on current symptoms.
A power calculation was performed based upon a similar previous study looking at the
retrospective use of the OSS.10 Wilson et al. assumed that a clinically significant change was 5
points across the OSS. They calculated the minimum number of participants to be 10, giving a
power of 90% to detect a five-point difference in the OSS. In the present study, we are examining
two sets of data and have focused upon the level
and retrospective scores, and linear regression analysis was used to examine the linear
relationship between the two scores. To examine the influence of time since surgery on recall
accuracy, each patient group was stratified with annual increments and one-factor analysis of
variance was used to test for the effect of time on recall accuracy. For all analyses, the
criterion for statistical significance was set at p < 0.05.
RESULTS
A total of 42 of 70 (60%) patients successfully
J. Musculoskelet. Res. 2015.18. Downloaded from www.worldscientific.com
Fig. 1
Fig. 2
and retrospective (37.7) mean OKS scores (p < 0.001). This difference is consistent with the
MCID for the OHS and OKS proposed by Murray et al.,8 who suggested that clinically significant
differences in the OHS and OKS are likely to be between 3 and 5 points. Table 1 displays the
pre-operative and post-operative means and associated standard deviations. It is important to
note that the post-operative standard deviations are much larger than those of the actual
pre-operative scores, showing the considerable variability between the two sets of scores in
both the OHS and OKS.
When using regression analysis to determine the strength of the relationship between the
pre-operative and retrospective scores for both groups, Pearson's correlation coefficient shows
only a weakly positive relationship in both the OHS (r² = 0.30) and OKS (r² = 0.19), as displayed
in Fig. 3. This shows that the retrospective estimates explain only 30% and 19% of the variance
in the actual pre-operative scores for the OHS and OKS, respectively, thus making the
retrospective score a poor predictor of the actual pre-operative score. Adjusting for age, gender
and health (SF-12, mental and physical scores) did not improve the model fit.
The level of agreement between the pre-operative and retrospective scores, which is arguably
the most important factor in this analysis, is best displayed using the Bland–Altman plot, first
described in 1986.2 This technique plots the difference between the two scores against the mean
of the two scores, as shown in Fig. 4. While the mean differences for both the OHS and OKS were
small (1.6 and 4.7, respectively), the 95% limits of agreement are large in both cases (−19.9 to
23.1 for the OHS, and −15.3 to 24.8 for the OKS). This suggests that while the level of agreement
across the group is good, the limits of agreement are too large to permit accurate estimation of
individual scores.
The data were also analyzed for the influence of time, with patients grouped according to their
time between surgery and recall. The sub-groups
by GRAND VALLEY STATE UNIV on 09/16/16. For personal use only.
Fig. 3
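The r² values plotted in Fig. 3 are squared Pearson correlations between the actual and the recalled scores. The following is a minimal sketch of that calculation, assuming nothing about the authors' software: the function name `r_squared` is hypothetical and the score lists are synthetic values invented for illustration, not study data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must not be constant")
    return sxy ** 2 / (sxx * syy)

# Synthetic paired scores (assumption: invented for illustration only).
pre_op = [40, 35, 42, 38, 45, 33, 41, 37]
recalled = [36, 39, 30, 41, 35, 34, 44, 28]
print(f"r^2 = {r_squared(pre_op, recalled):.2f}")
```

In a simple linear regression with one predictor, this quantity equals the share of variance in one score explained by the other, which is how the 30% and 19% figures in the text are read.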
were 0–12 months, 13–24 months, 25–36 months and 37–48 months for both the OHS and OKS as
shown in Figs. 5 and 6. This was done using one-factor ANOVA comparing the means in the
subgroups over the four time periods. It is important to note that the number of patients in
each time interval is not equal in both analyses, and each OHS sub-group is small. There was no
significant difference between means across all 4 periods for both the OHS and OKS (p > 0.05).
When examining the OKS group more closely, there was a tendency for the spread of scores to be
less in the 0–12 month group, as indicated by a lower standard deviation compared with the other
time periods, suggesting a more accurate recall during the first year.
DISCUSSION
The OHS and OKS underwent extensive assessment of validity, reliability and responsiveness in
many prospective trials as described by Wilson et al.10 However, their retrospective use has not
yet been addressed in the literature. Knowing the strengths and limitations of a scoring system
is integral to the overall applicability as well as the validity and reliability.9 The simple,
responsive and patient-centered nature of the OHS and OKS
Fig. 4
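The limits of agreement shown in Fig. 4 follow the usual Bland–Altman construction: the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below is an illustration under stated assumptions, not the authors' analysis code: the paired scores are synthetic, both function names are hypothetical, and the difference is taken as actual minus recalled. The second helper checks the reported limits, read with negative lower bounds, against the reported mean differences.

```python
from statistics import mean, stdev

def limits_of_agreement(actual, recalled):
    """Return (bias, lower, upper) for paired scores.

    bias is the mean of (actual - recalled); the limits are
    bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(actual) != len(recalled) or len(actual) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - r for a, r in zip(actual, recalled)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width

def midpoint_and_half_width(lower, upper):
    """Recover the implied bias and half-width from a pair of limits."""
    return (lower + upper) / 2, (upper - lower) / 2

# Synthetic paired scores (assumption: invented for illustration only).
actual = [40, 35, 42, 38, 45, 33, 41, 37]
recalled = [36, 39, 30, 41, 35, 34, 44, 28]
bias, lower, upper = limits_of_agreement(actual, recalled)
print(f"bias={bias:.1f}, limits=({lower:.1f}, {upper:.1f})")

# Limits as reported in the text, with the lower bounds read as negative.
for label, lo, hi in [("OHS", -19.9, 23.1), ("OKS", -15.3, 24.8)]:
    mid, half = midpoint_and_half_width(lo, hi)
    print(f"{label}: midpoint={mid:.2f}, half-width={half:.2f}")
```

For the reported limits the midpoints come out at 1.6 (OHS) and 4.75 (OKS), which agree to rounding with the mean differences of 1.6 and 4.7, and the half-widths are 21.5 and 20.05 points, in line with the "about 20 points" quoted in the Discussion.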
Fig. 5
makes them ideal for retrospective application, especially in the setting of trauma. Since they
require no clinician contact, no radiology measurements and can be easily applied by mail,
investigating the validity in this retrospective manner is important for overall clinical practice.
Selection bias was minimized in this study by including all patients who had undergone a
primary hip or knee replacement over a three-year period and had no documented cognitive
impairment. This allowed patients from a number of different surgeons, using different approaches
Fig. 6
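The time-stratified comparison in Figs. 5 and 6 rests on a one-factor ANOVA across the four follow-up intervals. The sketch below shows how the F statistic for such a comparison is formed from between-group and within-group sums of squares. It is an illustration only: the function name `one_way_f` is hypothetical, the recall differences are synthetic, and the group sizes are deliberately unequal, as the text notes they were in the study.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k or any(len(g) == 0 for g in groups):
        raise ValueError("need at least 2 non-empty groups and n > k")
    grand = sum(sum(g) for g in groups) / n
    # Variation of the group means about the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations about their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Synthetic recall differences per interval (assumption: invented values).
strata = {
    "0-12 months": [2, -3, 5, 1],
    "13-24 months": [8, -10, 4, 15, -6],
    "25-36 months": [12, -14, 3, 9, -7, 20],
    "37-48 months": [-11, 16, 6],
}
f_stat, df_b, df_w = one_way_f(list(strata.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The sketch stops at the F statistic because converting it to a p-value needs the F distribution, which the standard library does not provide; a routine such as `scipy.stats.f_oneway` returns both. A non-significant F says nothing about the spread within each interval, which is why the standard deviations per interval are discussed separately in the text.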
and different implants to be included. As the participants were invited to take part by mail, it
could be assumed that only those who were mentally capable of understanding the study were
included, with an overall response rate of 60.6%.
In the present study, we found no significant influence of time since surgery on accuracy of
recall when comparing across different post-surgical time increments. When comparing the four
time intervals, it is apparent that there is lower variability in the difference between
pre-operative and retrospective scores in the first 12 months following surgery, which can likely
be attributed to better recall. This conclusion is supported by Wilson et al.,10 who examined
retrospective recall of the OSS. With a recall period of between 6 and 8 weeks they reported a
strong correlation between the two scores (r² = 0.81). Further, the limits of agreement for
individual scores were ±8.8 points, which is less than half that recorded in the present study
where the mean time since surgery was 27 months. This suggests that the most suitable time to
perform retrospective analysis of surgical outcome using the OHS and OKS is within the first 12
months after surgery.
When examining the scores overall, it was interesting to note that there was a trend in both
groups for patients to underestimate their pre-operative scores. There was a statistically
significant difference in the OKS group, with a mean decrease of 4.7 points, as compared to the
OHS group, which had a mean decrease of only 1.6 points. This is in contrast to the study
examining the OSS by Wilson et al.,10 who found that patients tended to overestimate their
pre-operative shoulder symptoms when asked to recall them. While the mean difference for the OKS
falls within the range considered clinically significant (3–5 points),8 the spread of scores
about this mean is large, making this difference unreliable. Similarly, the Bland–Altman analysis
showed 95% limits of agreement of about 20 points for both OHS and OKS. These results suggest
that while the OHS and OKS may be used retrospectively to evaluate surgical outcomes of groups of
patients, this approach would not be sufficiently accurate for estimation of the pre-operative
status of individual patients.
This is again reflected in the Bland–Altman plots examining the level of agreement between the
two scores. While the mean differences are low for both the OHS and OKS, the 95% limits of
agreement are wide, ranging from −19.9 to 23.1 for the OHS and from −15.3 to 24.8 for the OKS.
Therefore, an individual's retrospective estimate can be expected, in about 95% of cases, to lie
anywhere within approximately 20 points above or below their actual pre-operative score,
displaying the inaccurate nature of individuals' ability to recall their symptoms. This finding
is also consistent with those made by Wilson et al. when examining the OSS.10
Some limitations of this study should be considered in the interpretation of the results. While
the number of patients would provide sufficient power to address the research question, the issue
of short-term recall could have been addressed more specifically if the number of patients
providing recall within the first year following surgery had been larger. The current level of
pain and physical function were not measured at the time the recall questionnaire was
administered. Consequently, the association between current functional status and recall accuracy
could not be examined. Finally, this study examined the issue of recall accuracy using only one
type of outcome questionnaire. Accuracy of recall may vary according to the nature of the
questionnaire, and inclusion of another function or quality-of-life questionnaire may have helped
to address the research objective more broadly.
These findings suggest that individuals are unreliable at accurately recalling their pre-operative
symptoms by way of the OHS and OKS. Patients tend to underestimate their pre-operative symptoms
according to these outcome measures and there is considerable variability between individuals'
ability to recall. While the overall mean difference between the pre-operative and retrospective
recalled scores is low, and there is a tendency for patients to have more reliable recall within
the first 12 months, it is our recommendation that the OHS and the OKS are not used as
retrospective tools.
We represent that this submission is original work, and is not under consideration for
publication with any other journal.
3. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of
clinical measurement. Lancet 1(8476): 307–310, 1986.
4. Dawson J, Fitzpatrick R, Murray D, Carr A. Comparison of measures to assess outcomes in total
hip replacement surgery. Qual Health Care 5(2): 81–88, 1996.
5. Dawson J, Fitzpatrick R, Murray D, Carr A. Questionnaire on the perceptions of patients about
total knee replacement. J Bone Joint Surg Br 80(1): 63–69, 1998.
6. Impellizzeri FM, Mannion AF, Leunig M, Bizzini M, Naal FD. Comparison of the reliability,
responsiveness, and construct validity of 4 different questionnaires for evaluating outcomes
after total knee arthroplasty. J Arthroplasty 26(6): 861–869, 2011.
7. McMurray R, Heaton J, Sloper P, Nettleton S. Measurement of patient perceptions of pain and
disability in relation to total hip replacement: The place of the Ox-