Comparison of sensitivity and specificity between two diagnostic tests, each measured on the same patient, when the same reference standard is used

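For this situation, we want to test whether the two diagnostic tests perform equally against a common reference standard. For Test A and Test B, the hypothesis test for a comparison of sensitivity can be stated as H0: SeA = SeB versus H1: SeA ≠ SeB, where the sensitivities SeA and SeB are defined from the following tables.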

                                   Test A
Reference Standard    Positive (1)    Negative (0)    Total
Present (1)           n11A            n10A            r1A
Absent (0)            n01A            n00A            r0A
Total                 c1A             c0A

where n = cell count, r = row total, c = column total; subscripts represent score (1 = present or positive, 0 = absent or negative) and test label (A)

Sensitivity (SeA) = { true positives } / {all patients with disease} = n11A / r1A

                                   Test B
Reference Standard    Positive (1)    Negative (0)    Total
Present (1)           n11B            n10B            r1B
Absent (0)            n01B            n00B            r0B
Total                 c1B             c0B

Sensitivity (SeB) = { true positives } / {all patients with disease} = n11B / r1B

_________________

Source: Stoddard GJ. Biostatistics and Epidemiology Using Stata: A Course Manual [unpublished manuscript] University of Utah School of Medicine, 2010.


We see that all information for sensitivity for each test is contained in the first row, where the

first row of each table is the true presence of disease as identified by the common reference

standard. For a paired comparison of sensitivity, then, all we need are the cell counts in these

rows, combined into a paired crosstabulation table.

                              Test A
Test B          Positive (1)    Negative (0)    Total
Positive (1)    m11             m10             n11B
Negative (0)    m01             m00             n10B
Total           n11A            n10A

where the cell counts, the m's, are simply filled in by the crosstabulation procedure.
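As a minimal sketch of how the m's fill in, the following Stata lines build a small frequency-weighted dataset and crosstabulate the two tests within each reference-standard row. The variable names ref, testA, testB, and freq, and all of the counts, are made up for illustration and do not come from any course dataset.

```stata
* Hypothetical paired data: one row per (ref, testA, testB) pattern,
* with freq = number of patients showing that pattern (made-up counts)
clear
input ref testA testB freq
1 1 1 40
1 0 1  6
1 1 0  3
1 0 0 11
0 1 1  2
0 0 1  5
0 1 0  4
0 0 0 29
end

* Paired table for sensitivity: reference standard present only.
* Rows are Test B and columns are Test A.
tabulate testB testA if ref == 1 [fweight = freq]

* Paired table for specificity: reference standard absent only
tabulate testB testA if ref == 0 [fweight = freq]
```

Note that tabulate lists the categories in ascending order (0 before 1), so the cells appear in the reverse order of a layout that lists Positive (1) first.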

Since the data are not independent, being repeated measures on the same patient (both tests are done on the same patient), we must apply a comparison of paired proportions. To compare sensitivity, we simply apply the McNemar test, which is the standard way to compare two paired binary variables expressed in this paired data layout (Lachenbruch and Lynch, 1998; Zhou et al, 2002, pp.166-169).


The McNemar test is commonly referred to as the McNemar change test, as it only uses

information from the discordant pairs (the cells where the two diagnostic tests are different).

It is simply a chi-square test (Siegel and Castellan, 1988, p.76) expressed as,

$$\chi^2_{df=1} = \frac{(m_{10} - m_{01})^2}{m_{10} + m_{01}}$$

The chi-square test requires a sufficiently large sample size to provide an accurate p value. The rule of thumb for the McNemar version of the chi-square test is that when (m10 + m01) < 10, the exact form of the test should be used (Siegel and Castellan, 1988, p.79). Since the data are paired, Fisher's exact test is not appropriate, and so the binomial test is used. In Stata, this binomial test is labeled Exact McNemar.
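The two versions of the test can be sketched by hand in Stata from the discordant cells alone. The counts below (m10 = 7, m01 = 2) are hypothetical, chosen so that m10 + m01 < 10, and the scalar names are illustrative. The exact p value is computed here as a two-sided binomial test against 0.5, which is an assumption about how the exact test is formed rather than a copy of Stata's internal code.

```stata
* Hypothetical discordant cell counts (not from any dataset in this chapter)
scalar m10 = 7
scalar m01 = 2

* McNemar chi-square with 1 degree of freedom and its p value
scalar x2 = (m10 - m01)^2 / (m10 + m01)
display "McNemar chi2(1) = " %6.3f x2 "   p = " %6.4f chi2tail(1, x2)

* Exact version: two-sided binomial test of the larger discordant count
* against a probability of 0.5, with ndisc = number of discordant pairs
scalar ndisc = m10 + m01
scalar kmax  = max(m10, m01)
display "Exact McNemar p = " %6.4f min(1, 2*binomialtail(ndisc, kmax, 0.5))
```

With only 9 discordant pairs, the rule of thumb says to report the exact p value, which here works out to 92/512, or about 0.18.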

Specificity

                                   Test A
Gold Standard         Positive (1)    Negative (0)    Total
Present (1)           n11A            n10A            r1A
Absent (0)            n01A            n00A            r0A
Total                 c1A             c0A

where n = cell count, r = row total, c = column total; subscripts represent score (1 = present or positive, 0 = absent or negative) and test label (A)

Specificity (SpA) = { true negatives } / {all patients without disease} = n00A / r0A

                                   Test B
Gold Standard         Positive (1)    Negative (0)    Total
Present (1)           n11B            n10B            r1B
Absent (0)            n01B            n00B            r0B
Total                 c1B             c0B

Specificity (SpB) = { true negatives } / {all patients without disease} = n00B / r0B

We see that all information for specificity for each test is contained in the second row, where the

second row of each table is the true absence of disease as identified by the common reference

standard. For a paired comparison of specificity, then, all we need are the cell counts in these

rows, combined into a paired crosstabulation table.

                              Test A
Test B          Positive (1)    Negative (0)    Total
Positive (1)    m11             m10             n01B
Negative (0)    m01             m00             n00B
Total           n01A            n00A

where the cell counts, the m's, are simply filled in by the crosstabulation procedure.

Protocol Suggestion

For comparison of sensitivity and specificity between two diagnostic tests, you could describe

the statistical method as:

Within the same patients, both Test A and Test B will be compared to a common Test C gold standard and test characteristics will be calculated. The sensitivities of Test A and Test B will be compared using a McNemar test, or exact McNemar test, as appropriate [Lachenbruch and Lynch, 1998]. The specificities will be compared in the same way.

Example

We will use the CASS dataset (see Appendix 1 for references). These data come from the Coronary Artery Surgery Study (CASS). In a cohort study of N=1465 men undergoing coronary arteriography (the gold standard) for suspected or probable coronary heart disease, both an exercise stress test (EST) and chest pain history (CPH) were recorded. The data are coded as

est   exercise stress test (diagnostic test for CAD), 1 = positive, 0 = negative
cph   chest pain history (diagnostic test for CAD), 1 = positive, 0 = negative

File
    Open
        Find the directory where you copied the course CD
        Find the subdirectory datasets & do-files
        Single click on cass.dta
        Open

use "...\Biostats & Epi With Stata\datasets & do-files\cass.dta", clear

or

cd "Biostats & Epi With Stata\datasets & do-files"
use cass.dta, clear

To obtain the sensitivity and specificity for est, we use the diagt command, which is not available

from the Stata menu bar.

If you have not already updated your Stata to include it, then while connected to the internet, use

findit diagt

(help diagt if installed) . . . . . . . . . . P. T. Seed and A. Tobias

Q4/04 SJ 4(4):490

new options added to diagt

Click on the sbe36_2 link, or a later version if one appears, to install the diagt command.

diagt cad est

  Coronary |
    artery | Exercise Stress test
   disease |      Pos.       Neg. |     Total
-----------+----------------------+----------
  Abnormal |       815        208 |     1,023
    Normal |       115        327 |       442
-----------+----------------------+----------
     Total |       930        535 |     1,465

                                              [95% Confidence Interval]
---------------------------------------------------------------------------
Prevalence            Pr(A)          70%          67%        72.2%
---------------------------------------------------------------------------
Sensitivity           Pr(+|A)      79.7%        77.1%        82.1%
Specificity           Pr(-|N)        74%        69.6%          78%
---------------------------------------------------------------------------

diagt cad cph

  Coronary |
    artery | Chest pain history
   disease |      Pos.       Neg. |     Total
-----------+----------------------+----------
  Abnormal |       969         54 |     1,023
    Normal |       245        197 |       442
-----------+----------------------+----------
     Total |     1,214        251 |     1,465

---------------------------------------------------------------------------
Prevalence            Pr(A)          70%          67%        72.2%
---------------------------------------------------------------------------
Sensitivity           Pr(+|A)      94.7%        93.2%          96%
Specificity           Pr(-|N)      44.6%        39.9%        49.3%
---------------------------------------------------------------------------
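As a quick check on the diagt output, the point estimates can be reproduced by hand from the cell counts in the two tables. This is only a sketch using display as a calculator; it does not reproduce the confidence intervals.

```stata
* Cell counts copied from the two diagt tables
* Exercise stress test (EST)
display "EST sensitivity = " %5.1f 100*815/1023 "%"
display "EST specificity = " %5.1f 100*327/442 "%"

* Chest pain history (CPH)
display "CPH sensitivity = " %5.1f 100*969/1023 "%"
display "CPH specificity = " %5.1f 100*197/442 "%"
```

Each ratio is simply the true positives (or true negatives) divided by the row total for disease present (or absent), as in the definitions of Se and Sp given earlier.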

To compute the McNemar test for the sensitivity comparison between the two diagnostic tests, we restrict the data to the disease present rows, using an if qualifier:

mcc cph est if cad==1

                 | Controls               |
Cases            |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
         Exposed |       786         183  |        969
       Unexposed |        29          25  |         54
-----------------+------------------------+------------
           Total |       815         208  |       1023

Exact McNemar significance probability = 0.0000

We see that the sum of the discordant pairs, 183 + 29 = 212, is greater than 10, so the sample size is large enough to provide an accurate chi-square test p value. Therefore, we report the chi-square version of McNemar's test (p < 0.001). If, however, the discordant pairs had summed to a number < 10, we would report the exact McNemar test (p < 0.001).
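The same test can be sketched without the dataset in memory, using the immediate form of mcc and the four cell counts from the table above. The by-hand line is included only to show that the statistic depends on the discordant cells alone.

```stata
* Immediate form of mcc; cells are entered in the order
*   both positive, CPH+ only, EST+ only, both negative
mcci 786 183 29 25

* The same chi-square statistic from the discordant cells alone
display "McNemar chi2(1) = " %8.2f (183 - 29)^2 / (183 + 29)
display "p = " chi2tail(1, (183 - 29)^2 / (183 + 29))
```

The statistic is 154^2 / 212, or about 111.87 on 1 degree of freedom, which is why the p value prints as 0.0000.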

Unfortunately, the variables are labeled cases and controls, which is rather confusing. They are labeled this way because the McNemar test is part of the epitab suite of commands (the epidemiology statistical procedures). To verify which variable represents cases, and which represents controls, we can use,

tab cph est if cad==1

   history |    0. neg     1. pos |     Total
-----------+----------------------+----------
    0. neg |        25         29 |        54
    1. pos |       183        786 |       969
-----------+----------------------+----------
     Total |       208        815 |     1,023

This output has the row and column variables consistent with the mcc command, but displays them in ascending sort order.

To compute the McNemar test for the specificity comparison between the two diagnostic tests,

we restrict the data to the disease absent rows, using an if qualifier

mcc cph est if cad==0

                 | Controls               |
Cases            |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
         Exposed |        69         176  |        245
       Unexposed |        46         151  |        197
-----------------+------------------------+------------
           Total |       115         327  |        442

Exact McNemar significance probability = 0.0000

Again, to verify the row and column variables, we can use,

tab cph est if cad==0

   history |    0. neg     1. pos |     Total
-----------+----------------------+----------
    0. neg |       151         46 |       197
    1. pos |       176         69 |       245
-----------+----------------------+----------
     Total |       327        115 |       442
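For the specificity comparison, the discordant cells also give the size of the difference, not just the p value. The sketch below uses the counts from the disease absent table; the scalar names are illustrative only.

```stata
* Discordant cells from the disease-absent table
scalar cphpos_estneg = 176
scalar cphneg_estpos = 46
scalar nabsent = 442

* McNemar chi-square with 1 degree of freedom and its p value
scalar x2 = (cphpos_estneg - cphneg_estpos)^2 / (cphpos_estneg + cphneg_estpos)
display "McNemar chi2(1) = " %6.2f x2 "   p = " %6.4f chi2tail(1, x2)

* Paired difference in specificity (EST minus CPH), in percentage points
display "Difference = " %5.1f 100*(cphpos_estneg - cphneg_estpos)/nabsent

* Cross-check with the immediate command
mcci 69 176 46 151
```

The difference, 130/442 or about 29.4 percentage points, agrees with the gap between the two specificities reported by diagt (74% versus 44.6%).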

Comparing ROCs

In Stata, the method for comparing two ROCs, as programmed in the roccomp command, is

described by DeLong et al (1988). You could describe this in your protocol as,

The areas under the receiver operating characteristic (ROC) curves were computed. For comparisons of the ROC areas from different prediction rules, or prognostic models, using a common reference standard, the method of DeLong et al (1988) was used.

----

DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837-845.
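A minimal sketch of the roccomp syntax follows, using the CASS variables only to show the form of the command. Rebuilding the patient-level data from the eight cell counts in the mcc tables is an assumption made here so the lines run without cass.dta; with the dataset already in memory, only the last command is needed. Because est and cph are binary, each ROC curve has a single cut point, so this illustrates the command rather than a typical use with continuous prediction scores.

```stata
* Rebuild patient-level data from the eight cell counts shown in the
* mcc tables (cad = 1 is disease present on arteriography)
clear
input cad cph est freq
1 1 1 786
1 1 0 183
1 0 1  29
1 0 0  25
0 1 1  69
0 1 0 176
0 0 1  46
0 0 0 151
end

* One observation per patient, 1,465 in total
expand freq
drop freq

* Compare the two correlated ROC areas against the common reference
roccomp cad est cph
```

The reference variable comes first, followed by the classification variables whose ROC areas are to be compared.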

References

DeLong ER, DeLong DM, Clarke-Pearson DL. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44(3):837-845.

Lachenbruch PA, Lynch C. (1998). Assessing screening tests: extensions of McNemar's test. Statist Med 17:2207-2217.

Siegel S, Castellan NJ Jr. (1988). Nonparametric Statistics for the Behavioral Sciences. 2nd ed. New York, McGraw Hill.

Zhou X-H, Obuchowski NA, McClish DK. (2002). Statistical Methods in Diagnostic Medicine. New York, John Wiley & Sons.

