Long Beach (n = 200)   59   97   66   74   62   135
Hungary (n = 425)      48   71   43   38   63   132
Basel (n = 85)         55   86   73   a5   69   139
Zurich (n = 58)        54   93   74   74   40   119
Cleveland (n = 303)    54   68   53   46   60   132
* Disease is defined as >50% diameter narrowing; † MVD = multivessel disease, defined by >50% diameter narrowing in >1 vessel; ‡ SBP = mean systolic blood pressure.
diabetes, family history, electrocardiogram at rest, serum cholesterol and fasting blood sugar. The exercise variables included medications at the time of the exercise test, duration of the exercise test, peak achieved heart rate, heart rate at rest, peak exercise systolic blood pressure, exercise-induced angina or hypotension or both, exercise-induced ST depression relative to rest, exercise-induced ST slope, exercise-induced R-wave change, radionuclide ejection fraction and wall motion abnormalities at rest and during exercise, and exercise thallium results (as seen earlier). Coronary angiograms were considered abnormal if there was >50% luminal narrowing of any major epicardial vessel. Histories, physical examinations and all noninvasive tests were performed within 6 weeks before the date of the coronary angiogram. A description of the individual test groups follows.
Long Beach Veterans Administration Medical Center: This group was drawn from all consecutive subjects undergoing cardiac catheterization at the Veterans Administration Medical Center in Long Beach between 1984 and 1987. After excluding those with prior infarction, valvular disease and prior catheterization, there were 200 test group subjects.
Hungarian Institute of Cardiology: This group was drawn from all patients undergoing catheterization at the Hungarian Institute of Cardiology in Budapest between 1983 and 1987. Patients with prior infarction or valvular disease were excluded. The remaining 425 subjects made up the Hungarian test group.
Swiss universities: This group was drawn from all subjects undergoing cardiac catheterization at the university hospitals in Zurich and Basel, Switzerland, in 1985. The aforementioned exclusion criteria were applied. Of the 143 Swiss patients, 58 underwent catheterization in Zurich and 85 in Basel.
The clinical characteristics of the subjects in the 3 test groups and the rationalization for combining the groups from the Swiss universities are given in the Results section. The observed sensitivities and specificities of the exercise electrocardiogram and exercise thallium scintigraphy also are given in the Results section.
Evaluation of probabilities: The reliability of a probability estimate reflects its numerical proximity to the actual disease prevalence in subjects with similar clinical and test data. If the estimates are reliable, the mean disease probability in a test group will be the same as the disease prevalence in the group. Thus, by subtracting the prevalence of disease from the mean or expected probabilities, and dividing this difference by the standard deviation, we get an overestimation index that is a measure of how much a model over- or underpredicts disease probability. Because this is a mean of an assumedly normally distributed difference divided by its standard deviation, comparison of models is simplified. This can be done using the Student t test with a standard deviation of 1.0.
Finer detail can be obtained by sorting the probability estimates in ascending order and then dividing them into quintiles of probability.12 The expected probabilities in each quintile are compared with the prevalences in that quintile and the differences computed. These differences are a measure of the overestimation per quintile, and reflect the reliability of low, intermediate and high estimates.
A probability estimate will be clinically useful if it accurately classifies patients as diseased or not diseased. A probability algorithm will be useful if its probability estimates are clinically useful over an appropriate range of probability thresholds.
We agreed that the most relevant probability thresholds for making decisions concerning angiography or therapy lie between 0.20 and 0.80 for subjects with chest pain syndromes. Therefore, the percentage of correct classifications was calculated over this range (0.2 < p < 0.8) for both algorithms in the 3 test groups.
Subjects whose clinical and test data are concordant will generally have very high or very low probability estimates from any algorithm. We agreed that a clinician would probably not need a probability estimate for clinical decision making in these cases. Such estimates would instead be most useful for cases where patient data are discordant. These patients would have intermediate probability estimates by most algorithms; therefore, the percent correct classification rate was recalculated ignoring all subjects for whom probability estimates from both the discriminant function and CADENZA were out of the range (0.2 < p < 0.8).
To compare the correct classification rates for the 2 algorithms, we used McNemar's test.

RESULTS
Table I lists the demographic and clinical characteristics of the various patient study groups. Table II lists the sensitivities and specificities of 1 mm = 0.1 mV exercise-induced ST depression, exercise-induced angina pectoris and an abnormal thallium scintigram (fixed or reversible defect or both).
The 2 Swiss groups are very similar with respect to age, sex and symptoms. Because of their small size and their similarity, they were combined into a single group.
Reliability: Overestimation indexes for the probability estimates were significantly higher for CADENZA than for the discriminant function in the American test group (6.1 vs 2.0, p <0.001) and in the Hungarian group (10.4 vs 5.6, p <0.001). In the Swiss group, CADENZA slightly overestimated disease probability, whereas the discriminant function underestimated it. Figure 1A, B and C shows that both models tend to overestimate intermediate disease probabilities (second, third and fourth quintiles), with CADENZA causing the most overestimation. At the Swiss universities, where disease prevalence was highest, both models underestimated low probabilities. In all but the second quintile of this group, the probability estimates derived by CADENZA had a larger absolute error than those obtained using the discriminant function.
Clinical utility: The percentage of patients who are correctly classified will depend on the accuracy of the model and on the overall disease prevalence in the test group. Models that overestimate disease will cause more erroneous diagnoses at low prevalence, but will classify patients correctly at high prevalence. Figure 2A, B and C bears this out. The Cleveland discriminant function more accurately classified patients at the Hungarian Institute, where there was a low disease prevalence. These differences were statistically significant between probability thresholds of 0.4 and 0.7 (p <0.05). In the American group, where disease prevalence was higher, the discriminant function also classified patients more correctly than did CADENZA, but the difference was not significant except at thresholds around 0.6. In the Swiss group, which had the highest disease prevalence, CADENZA resulted in a significantly higher rate of correct classification at thresholds of at least 0.70 (p <0.05).
It is interesting to compare the percentage of correct classifications for the 2 algorithms in the 3 groups at specific thresholds. Table III is such a comparison. Thresholds of 0.4, 0.5 and 0.6 were used because we thought them to be appropriate for many clinical decisions. The table gives the percentage of correct classifications by the 2 algorithms (1) when all patients are included; and (2) when those for whom both algorithms produced very low (≤0.2) or very high (≥0.8) probabilities are excluded. Excluding these latter subjects may be appropriate because this exclusion leaves primarily those patients with discordant results for whom clinical decisions are more difficult and the use of a probability algorithm is more relevant. Both algorithms performed less well when these exclusions were made. The discriminant function performed moderately better than CADENZA, both with and without the exclusion of subjects with extreme probabilities.
[Figure 1, panels A to C. Vertical axis: overestimation of probability. Horizontal axis: quintile of probability. Series: Cleveland D. F. and CADENZA.]
FIGURE 1. Overestimation of probability by quintiles for the American (A), Hungarian (B) and Swiss (C) test groups. Bar heights are calculated by subtracting disease prevalence from the average estimated probabilities in each quintile.
[Figure 2, panels A to C. Vertical axis: percentage of patients correctly classified. Horizontal axis: probability threshold, 0.2 to 0.8. Series: Cleveland D. F. and CADENZA.]
FIGURE 2. Percentage of patients correctly classified versus probability threshold in the American (A), Hungarian (B) and Swiss (C) test groups. Vertical distances between the curves are significant only for thresholds near 0.6 (American), 0.5 to 0.7 (Hungarian) and near 0.7 (Swiss).
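The two reliability measures used in this article, the overall overestimation index and the per-quintile overestimation plotted in Figure 1, can be sketched as below. This is a hedged illustration only. The text does not specify which standard deviation divides the mean difference; the sketch assumes the standard error of the mean of the per-subject differences (estimate minus disease status), so that the index behaves like a z score. The function names are hypothetical, and the quintile split by position in the sorted list is likewise an assumption.

```python
from math import sqrt
from statistics import mean, stdev

def overestimation_index(probs, diseased):
    """(mean estimated probability - prevalence) divided by its standard error.
    Assumption: the standard error is stdev(p_i - y_i) / sqrt(n)."""
    diffs = [p - d for p, d in zip(probs, diseased)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error

def quintile_overestimation(probs, diseased):
    """Sort subjects by estimated probability, split them into 5 groups, and
    return mean estimate minus observed prevalence for each group.
    Requires at least 5 subjects so that no group is empty."""
    pairs = sorted(zip(probs, diseased))
    n = len(pairs)
    result = []
    for q in range(5):
        chunk = pairs[q * n // 5:(q + 1) * n // 5]
        mean_estimate = mean([p for p, _ in chunk])
        prevalence = mean([d for _, d in chunk])
        result.append(mean_estimate - prevalence)
    return result

# Made-up illustration data (NOT from the study): 1 = diseased, 0 = not diseased.
estimates = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
status    = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

print(overestimation_index(estimates, status))
print(quintile_overestimation(estimates, status))
```

A positive value in either measure indicates that the model predicts more disease than was observed, and a negative value indicates underprediction.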
12. Cornfield J. Joint dependence of risk of coronary heart disease on serum cholesterol and systolic blood pressure: a discriminant function analysis. Fed Proc 1962;21 (suppl II):Il.
13. Cornfield J, Dunn RA, Batchlor CD, Pipberger H. Multigroup diagnosis of electrocardiograms. Comput Biomed Res 1973;6:97-120.
14. Diamond GA. Monkey business. Am J Cardiol 1986;57:471-475.
15. Detrano R, Janosi A, Lyons KP, Marcondes G, Abbassi N, Froelicher V. Factors affecting the sensitivity and specificity of a diagnostic test: the exercise thallium scintigram. Am J Med 1988;84:699-710.
16. Gianrossi R, Detrano R, Mulvihill D, Lehmann K, Dubach P, Colombo A, McArthur D, Froelicher V. Exercise-induced ST depression in the diagnosis of coronary artery disease: a meta-analysis. Circulation 1989, in press.
17. Detrano R, Froelicher V. A logical approach to screening for coronary artery disease. Ann Intern Med 1987;106:846-852.
18. Detrano R, Nourijah P, Froelicher V. Screening for coronary artery disease. Ann Intern Med 1987;107:594-595.
19. Zir LM, Miller SW, Dinsmore RE, Gilbert JP, Harthorne JW. Interobserver variability in coronary angiography. Circulation 1976;53:627-632.
20. Detrano R, Guppy K, Abbassi N, Janosi A, Sandhu S, Froelicher V. Reliability of Bayesian probability analysis for predicting coronary artery disease in a veterans hospital. J Clin Epidemiol 1988;41:599-605.
21. Russek E, Kronmal RA, Fisher LD. The effect of assuming independence in applying Bayes' theorem to risk estimation and classification in diagnosis. Comput Biomed Res 1983;16:537-552.