Partini Pudjiastuti, Child Health Department, Faculty of Medicine, University of Indonesia, Jakarta
Diagnostic Tests
Tests as diagnostic aids and screening tools are a key element of clinical medicine and public health.
Examples: electrocardiogram and cardiac enzymes for diagnosis of myocardial infarction; Murphy's sign (right upper abdominal tenderness on inspiration) in diagnosis of acute cholecystitis; Pap smear for detection of cervical cancer.
Tests are also essential in many epidemiologic studies, where diagnostic criteria and/or tests are used to establish exposure and outcome status.
Diagnostic tests include lab tests, radiologic tests, tissue diagnosis, and history and physical examination maneuvers.
An accurate diagnostic test can determine whether a disease or condition is truly present.
Evaluating a diagnostic test means looking at the relationship between the test result and the true diagnosis.
One compares the result of a diagnostic test to the reference or gold standard for the diagnosis.
The gold standard may be expensive, inappropriate (e.g. autopsy-based) or unsuitable (e.g. clinical follow-up when an immediate decision is required).
There may be no gold standard test, in which case one may need to use the long-term outcome of patients with suspected disease, or autopsy.
The characteristics of a test cannot be determined without this comparison.
[Table: new test result (+ / -) cross-classified against the gold standard (disease / no disease)]
Phase I questions: Do patients with the target disorder have different test results from normal individuals?
Phase II questions: Are patients with certain test results more likely to have the target disorder than patients with other test results?
Phase III questions: Among patients in whom it is clinically sensible to suspect the target disorder, does the test result distinguish those with and without the target disorder?
Phase IV questions: Do patients who undergo the diagnostic test fare better (in their ultimate health outcomes) than similar patients who do not?
An ideal test gives (+) results in everyone with the disease and (-) results in everyone else.
The test should be studied in the same way it would be used in the clinical setting.
Sensitivity, specificity
Prevalence, prior probability, predictive values
Likelihood ratios
Dichotomous scale, cutoff points (continuous scale)
Positive (true and false), negative (true and false)
ROC (receiver operating characteristic) curve
                 Disease present      Disease absent       Total
Test positive    True positive (a)    False positive (b)   a+b
Test negative    False negative (c)   True negative (d)    c+d
Total            a+c                  b+d                  a+b+c+d
Goal of clinical studies of diagnostic tests: provide information on all four cells in the 2x2 table.
Include information on negative tests.
Include information on test results in the nondiseased (especially if the test is to be used for screening).
One must know the population in which the test was studied, as this affects the properties of the test; we want to know results in patients across a broad clinical spectrum.
Sensitivity
            Present   Absent   Total
Positive    a         b        a+b
Negative    c         d        c+d
Total       a+c       b+d      a+b+c+d
Sensitivity is the proportion of people with the disease who have a positive test.
The test is usually positive in the presence of disease.
A sensitive test will rarely miss disease in those who have it.
Sensitive tests rule out disease when negative (SnNout).
Sn = a / (a + c)
Specificity
            Present   Absent   Total
Positive    a         b        a+b
Negative    c         d        c+d
Total       a+c       b+d      a+b+c+d
Specificity is the proportion of patients without the disease who have a negative test.
The test is rarely positive in the absence of disease.
A specific test will rarely identify disease in someone who does not have it.
Specific tests rule in disease when positive (SpPin).
Sp = d / (b + d)
Example: A researcher develops a new saliva pregnancy test. She collects samples from 100 women known to be pregnant by blood test (the gold standard) and 100 women known not to be pregnant, also based on the same blood test. The saliva test is positive in 95 of the pregnant women. It is also positive in 15 of the nonpregnant women. What are the sensitivity and specificity?
               Saliva +   Saliva -   Total
Pregnant          95          5       100
Nonpregnant       15         85       100
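The question in the saliva test example can be worked through in a few lines. This is a minimal sketch: the function names are illustrative, and the count of 5 false negatives and 85 true negatives follows from the 100 women in each group.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of those with the condition who test positive: a / (a + c)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of those without the condition who test negative: d / (b + d)."""
    return tn / (tn + fp)


# Counts from the saliva pregnancy test example.
tp, fn = 95, 5    # pregnant women: 95 test positive, so 5 test negative
fp, tn = 15, 85   # nonpregnant women: 15 test positive, so 85 test negative

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity = {sens:.2f}")  # 0.95
print(f"Specificity = {spec:.2f}")  # 0.85
```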
Is it more important that a test be sensitive or specific? It depends on its purpose. A cheap mass screening test should be sensitive (few cases missed). A test designed to confirm the presence of disease should be specific (few cases wrongly diagnosed).
Note that sensitivity and specificity are two distinct properties. Where classification is based on a cutpoint along a continuum, there is a tradeoff between the two.
Example: The saliva pregnancy test detects progesterone. A refined version is developed. Suppose you add a drop of indicator solution to the saliva sample. It can stay clear (0 reaction) or turn green (1+), red (2+), or black (3+). (For purposes of discussion we will ignore overlapping colors)
The researcher conducts a validation study and finds the following:

             Pregnant   Nonpregnant   Totals
Saliva 3+       85           5           90
Saliva 2+       10          10           20
Saliva 1+        3          17           20
Saliva 0         2          68           70
Totals         100         100          200
The sensitivity and specificity of the saliva test will depend on the definition of positive and negative used.
The choice of cutpoint depends on the relative adverse consequences of false negatives vs. false positives. If it is most important not to miss anyone (FN), choose a cutpoint with higher sensitivity and lower specificity.
If it is most important that people not be erroneously labeled as having the condition (FP), choose a cutpoint with lower sensitivity and higher specificity.
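The tradeoff can be made concrete with the validation counts for the refined saliva test. The sketch below treats a reaction at or above the chosen cutpoint as positive; the function name and data layout are illustrative assumptions.

```python
# Validation counts by reaction strength, strongest first:
# (reaction, pregnant, nonpregnant)
ROWS = [("3+", 85, 5), ("2+", 10, 10), ("1+", 3, 17), ("0", 2, 68)]


def sens_spec_by_cutpoint(rows):
    """Return {cutpoint: (sensitivity, specificity)}.

    A result at or above the cutpoint counts as a positive test.
    """
    n_preg = sum(preg for _, preg, _ in rows)
    n_nonpreg = sum(nonpreg for _, _, nonpreg in rows)
    results = {}
    tp = fp = 0
    # The weakest level ("0") is skipped: using it as the cutpoint
    # would label every result positive.
    for reaction, preg, nonpreg in rows[:-1]:
        tp += preg
        fp += nonpreg
        results[reaction] = (tp / n_preg, (n_nonpreg - fp) / n_nonpreg)
    return results


cutpoints = sens_spec_by_cutpoint(ROWS)
for reaction, (sens, spec) in cutpoints.items():
    print(f"positive if >= {reaction}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Lowering the cutpoint from 3+ to 1+ raises sensitivity from 0.85 to 0.98 while specificity falls from 0.95 to 0.68.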
What you need to know is: given a positive or negative result, what is the chance the patient has the disease?
NOT: given that the patient does or does not have the disease, what is the chance of a positive test in those with disease (sensitivity) or a negative test in those without (specificity)?
Positive predictive value is the probability of disease in a patient with a positive (abnormal) test (i.e. that a positive test is a true positive).
Highly specific diagnostic tests have high PPV.
PPV = a / (a + b)
Negative predictive value is the probability that a patient with a negative (normal) test does not have disease.
More sensitive tests have higher NPV.
NPV = d / (c + d)
Predictive values
Predictive value is influenced by prevalence and is therefore not independent of the situation in which the test is used.
Positive tests in a low-prevalence population are likely to be false positives.
As prevalence approaches 0, the PPV also approaches 0.
In order to know the PPV in an individual patient, one needs to know or estimate the prevalence (likelihood of disease) in such a patient.
Sensitivity and specificity are fixed characteristics of a diagnostic test. They do not depend on the prevalence of disease in a population.
Predictive values do depend on prevalence, e.g. a highly sensitive test applied in a high-prevalence population will have a greater PPV than the same test in a low-prevalence population.
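The dependence on prevalence can be sketched by holding sensitivity and specificity fixed and varying only the prevalence. The values below (sensitivity 0.95, specificity 0.85, and the three prevalences) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def predictive_values(sens: float, spec: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Works with expected proportions of the four 2x2 cells.
    """
    tp = sens * prevalence                 # true positives
    fn = (1 - sens) * prevalence           # false negatives
    fp = (1 - spec) * (1 - prevalence)     # false positives
    tn = spec * (1 - prevalence)           # true negatives
    return tp / (tp + fp), tn / (tn + fn)


SENS, SPEC = 0.95, 0.85  # illustrative values, held constant
results = {}
for prevalence in (0.50, 0.10, 0.01):
    ppv, npv = predictive_values(SENS, SPEC, prevalence)
    results[prevalence] = (ppv, npv)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With the test unchanged, the PPV falls steeply as prevalence drops, while the NPV rises.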
Likelihood Ratios
Ratio of the chance of finding a given test result in a patient with disease to the chance of that same result in a patient without disease.
LR = (probability of an individual with the condition having the test result) / (probability of an individual without the condition having the test result)
A way to incorporate the sensitivity and specificity of a test into a single measure
Expresses how many times more (or less) likely a test result is found in diseased versus nondiseased people.
Likelihood ratio = probability of test result in diseased / probability of test result in nondiseased
Positive LR = [a / (a + c)] / [b / (b + d)]
LR > 10 or < 0.1 cause large changes in likelihood.
LR 5 to 10 or 0.1 to 0.2 cause moderate changes.
LR 2 to 5 or 0.2 to 0.5 cause small changes.
LR between 0.5 and 2 cause little or no change.
Likelihood ratios allow us to summarize information from studies over a range of values, i.e. the degree of abnormality, not just its presence or absence.
They allow us to apply results to patients with varying pretest probabilities of disease.
Likelihood ratios use the same information from diagnostic test studies.
They are used to determine how much a particular test result changes the probability that a patient has a particular disease (how much it changes the posttest probability compared to the pretest probability).
Whether a test result convinces you to go ahead and treat a patient depends on how much it changes the probability that the patient has the disease.
Use the LR to convert pretest probability to posttest probability:
First convert pretest probability to pretest odds.
Then multiply pretest odds by the likelihood ratio to get posttest odds.
Convert posttest odds back to posttest probability.
Odds = probability of event / (1 - probability of event)
Probability = odds / (1 + odds)
LR+ = sensitivity / (1 - specificity) = [a/(a+c)] / [b/(b+d)]
LR- = (1 - sensitivity) / specificity = [c/(a+c)] / [d/(b+d)]
Pretest odds = pretest probability / (1 - pretest probability)
Posttest odds = pretest odds * LR
Posttest probability = posttest odds / (posttest odds + 1)
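The probability-to-odds conversion is easy to get wrong by hand, so here is a minimal sketch of the steps. The function names are illustrative, and the inputs (pretest probability 0.40, sensitivity and specificity both 0.75) are assumed values for demonstration.

```python
def prob_to_odds(p: float) -> float:
    """Odds = probability / (1 - probability)."""
    return p / (1 - p)


def odds_to_prob(odds: float) -> float:
    """Probability = odds / (1 + odds)."""
    return odds / (1 + odds)


def posttest_probability(pretest_prob: float, lr: float) -> float:
    """Pretest probability -> pretest odds -> posttest odds -> posttest probability."""
    return odds_to_prob(prob_to_odds(pretest_prob) * lr)


# Assumed inputs for illustration only.
sens, spec, pretest = 0.75, 0.75, 0.40
lr_pos = sens / (1 - spec)      # 3.0
lr_neg = (1 - sens) / spec      # about 0.33

after_positive = posttest_probability(pretest, lr_pos)
after_negative = posttest_probability(pretest, lr_neg)
print(f"After a positive test: {after_positive:.2f}")  # 0.67
print(f"After a negative test: {after_negative:.2f}")  # 0.18
```

A positive result moves the probability from 0.40 to about 0.67, and a negative result moves it down to about 0.18.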
[Figure: probability of disease on a scale from 0% to 100%, divided by a test threshold and a treatment threshold into "do not test" and "test" regions. A pretest probability is converted to a posttest probability via pretest odds x LR, where likelihood ratio + = Se / (1 - Sp).]
[Table: diagnostic test result (serum ferritin) as predictor versus outcome. Row totals: 1001 (a+b) and 1578 (c+d); overall total: 2579 (a+b+c+d).]
Pretest probability -> likelihood ratio -> posttest probability
                 DHF Yes   DHF No   Total
Test positive      30        15       45
Test negative      10        45       55
Total              40        60      100

Prevalence = 40/100 = 0.40
Sensitivity = 30/40 = 0.75
Specificity = 45/60 = 0.75
PPV = 30/45 = 0.67
NPV = 45/55 = 0.82
LR+ = 0.75/0.25 = 3
LR- = 0.25/0.75 = 0.33
                 Disease Yes   Disease No   Total
Test positive         6            23         29
Test negative         2            69         71
Total                 8            92        100

Prevalence = 8/100 = 0.08
Sensitivity = 6/8 = 0.75
Specificity = 69/92 = 0.75
PPV = 6/29 = 0.21
NPV = 69/71 = 0.97
LR+ = 0.75/0.25 = 3
LR- = 0.25/0.75 = 0.33
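The two worked examples can be checked from their cell counts. This sketch uses a hypothetical helper, `two_by_two_metrics`, with the cells a, b, c, d inferred from the fractions quoted above (30, 15, 10, 45 at prevalence 0.40 and 6, 23, 2, 69 at prevalence 0.08).

```python
def two_by_two_metrics(a: int, b: int, c: int, d: int) -> dict:
    """Summary measures from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sens = a / (a + c)
    spec = d / (b + d)
    return {
        "prevalence": (a + c) / (a + b + c + d),
        "sensitivity": sens,
        "specificity": spec,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }


high = two_by_two_metrics(30, 15, 10, 45)  # prevalence 0.40
low = two_by_two_metrics(6, 23, 2, 69)     # prevalence 0.08

for name in ("prevalence", "sensitivity", "specificity",
             "ppv", "npv", "lr_pos", "lr_neg"):
    print(f"{name:12s} {high[name]:.2f}  {low[name]:.2f}")
```

Sensitivity, specificity and both likelihood ratios are identical in the two tables; only the predictive values move with prevalence.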
Not all tests are simply positive or negative.
For some there is a continuum of possible results, with a cutoff set at some level.
Generally, Sn and Sp are increased or decreased at each other's expense.
Where one sets the cutoff depends on the desired characteristic of the test (maximizing sensitivity or specificity).
This relationship can also be expressed with a receiver operating characteristic (ROC) curve.
ROC curves
Obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 - specificity) over a range of cutoff values.
Values on the axes run from 0 to 1.0.
Tests that discriminate well crowd the upper left corner of the plot.
Generally, the best cutoff is at the shoulder of the curve.
The accuracy of the test is summarized by the area under the curve (AUC).
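T4 value      Hypothyroid   Euthyroid
<= 5               18            1
> 5 to 7            7           17
> 7 to 9            4           36
> 9                 3           39
Totals             32           93

Dichotomizing at each cutoff point:

Cutoff 5       T4 <= 5   T4 > 5   Totals
Hypothyroid       18        14       32
Euthyroid          1        92       93

Cutoff 7       T4 <= 7   T4 > 7   Totals
Hypothyroid       25         7       32
Euthyroid         18        75       93

Cutoff 9       T4 <= 9   T4 > 9   Totals
Hypothyroid       29         3       32
Euthyroid         54        39       93

Cutoff point   Sens   Spec
5              0.56   0.99
7              0.78   0.81
9              0.91   0.42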
For tests or predictors with continuous results, cutoff points should be evaluated to choose the best value for distinguishing those with and without the target disorder.
Cutoff point   Sens (TP rate)   Spec   1 - Spec (FP rate)
5                  0.56         0.99         0.01
7                  0.78         0.81         0.19
9                  0.91         0.42         0.58
The accuracy of the test depends on how well the test separates the group being tested into those with and without the disease in question.
Accuracy is measured by the area under the ROC curve (AUC).
An area of 1 represents a perfect test; an area of 0.5 represents a worthless test.
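For the T4 example, the ROC points and an estimate of the area under the curve can be computed directly from the counts. This is a sketch under stated assumptions: the strata labels are inferred from the cutoffs 5, 7 and 9, a T4 at or below the cutoff is treated as a positive test for hypothyroidism, and the area is a trapezoidal estimate over the empirical points rather than a fitted curve.

```python
# (T4 stratum, hypothyroid count, euthyroid count), lowest T4 first.
STRATA = [("<=5", 18, 1), (">5 to 7", 7, 17), (">7 to 9", 4, 36), (">9", 3, 39)]

n_hypo = sum(h for _, h, _ in STRATA)  # 32
n_eu = sum(e for _, _, e in STRATA)    # 93

# Build ROC points as (false positive rate, true positive rate),
# moving the cutoff upward one stratum at a time.
points = [(0.0, 0.0)]
tp = fp = 0
for _, hypo, eu in STRATA:
    tp += hypo
    fp += eu
    points.append((fp / n_eu, tp / n_hypo))

# Trapezoidal area under the piecewise-linear ROC curve.
auc = sum((x1 - x0) * (y0 + y1) / 2
          for (x0, y0), (x1, y1) in zip(points, points[1:]))

for fpr, tpr in points:
    print(f"FP rate {fpr:.2f}, TP rate {tpr:.2f}")
print(f"AUC (trapezoidal) = {auc:.2f}")  # 0.85
```

The interior points reproduce the sensitivity and 1 - specificity values at cutoffs 5, 7 and 9.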
A rough guide grades the AUC from best to worst as: excellent (A), good (B), fair (C), poor (D), fail (F).
The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.
The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
The slope of the tangent line at a cutoff point gives the likelihood ratio (LR) for that value of the test.
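With grouped data, the counterpart of the tangent slope is the slope of each ROC segment, which equals the likelihood ratio for results falling in that stratum. The sketch below applies this to the T4 counts; the strata labels are inferred from the cutoffs 5, 7 and 9, and the variable names are illustrative.

```python
# (T4 stratum, hypothyroid count, euthyroid count), lowest T4 first.
STRATA = [("<=5", 18, 1), (">5 to 7", 7, 17), (">7 to 9", 4, 36), (">9", 3, 39)]

n_hypo = sum(h for _, h, _ in STRATA)
n_eu = sum(e for _, _, e in STRATA)

# Stratum-specific LR = P(result | hypothyroid) / P(result | euthyroid).
# This is the rise over run of the matching ROC segment.
stratum_lr = {
    label: (hypo / n_hypo) / (eu / n_eu)
    for label, hypo, eu in STRATA
}

for label, lr in stratum_lr.items():
    print(f"T4 {label}: LR = {lr:.2f}")
```

A very low T4 carries a large LR, a mid-range value an LR near 1, and higher values LRs well below 1, so the result is informative across its whole range and not only at a single cutoff.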
Summary 1
Think about the pretest probability before you order a test.
Compare the operating characteristics of tests before you select one.
Think about what you will do with the results of the test (implications).
Summary 2
If missing the disease leads to a serious outcome, you want a high sensitivity.
If the treatment is invasive or risky, you want a high specificity.
Predictive value is influenced by the underlying prevalence of disease.
Likelihood ratios are not influenced by the prevalence of disease.