
Diagnostic Test

Partini Pudjiastuti, Child Health Department, Faculty of Medicine, University of Indonesia, Jakarta

Diagnostic Tests
Tests as diagnostic aids and screening tools are a key element of clinical medicine and public health. Examples:
Electrocardiogram and cardiac enzymes for the diagnosis of myocardial infarction
Murphy's sign (right upper abdominal tenderness on inspiration) in the diagnosis of acute cholecystitis
Pap smear for the detection of cervical cancer
Tests are also essential in many epidemiologic studies, where diagnostic criteria and/or tests are used to establish exposure and outcome status.

Properties of diagnostic tests

Diagnostic tests include lab tests, radiologic tests, tissue diagnosis, and history and physical examination maneuvers
An accurate diagnostic test can determine if a disease or condition is truly present
Evaluating a diagnostic test means looking at the relationship between the test result and the true diagnosis

Reference (Gold) standard

The result of a diagnostic test is compared to the reference or gold standard for the diagnosis
The gold standard may be expensive, inappropriate (e.g. autopsy based) or unsuitable (e.g. clinical follow-up when an immediate decision is required)
When there is no gold standard test, the long-term outcome of patients with suspected disease, or autopsy, may have to serve as the reference
The characteristics of a test cannot be determined without this comparison

Designs of a diagnostic test

An observational study consisting of:
Predictor variable (test result)
Outcome variable (presence / absence of the disease)

Specifically, a cross-sectional design
Variable scale: binomial/dichotomous

[Figure: patients with and without disease undergo both the new test and the gold standard; the gold standard classifies each patient as disease or no disease]

Architecture of diagnostic research

The diagnostic research question to be answered has to be carefully formulated, and determines the appropriate research approach:

Phase I questions: Do patients with the target disorder have different test results from normal individuals?
Phase II questions: Are patients with certain test results more likely to have the target disorder than patients with other test results?
Phase III questions: Among patients in whom it is clinically sensible to suspect the target disorder, does the test result distinguish those with and without the target disorder?
Phase IV questions: Do patients who undergo the diagnostic test fare better (in their ultimate health outcomes) than similar patients who do not?

Basic principles (1)

Ideal diagnostic tests give the right answers:

(+) results in everyone with the disease and (-) results in everyone else

Usual clinical practice:

The test should be studied in the same way it would be used in the clinical setting

Basic principles (2)

Sensitivity, specificity
Prevalence, prior probability, predictive values
Likelihood ratios
Dichotomous scale, cutoff points (continuous scale)
Positive (true and false), negative (true and false)
ROC (receiver operating characteristic) curve

General Structure: 2 X 2 table

                  Condition (by gold standard)
                  Present               Absent                Total
Test positive     True positive (a)     False positive (b)    a + b
Test negative     False negative (c)    True negative (d)     c + d
Total             a + c                 b + d                 a + b + c + d

What a study should tell you

Goal of clinical studies of diagnostic tests: provide information on all four cells in the 2 x 2 table
Include information on negative tests
Include information on test results in the non-diseased (especially if the test is to be used for screening)
The population in which the test was studied must be known, as this affects the properties of the test
Results are needed in patients across a broad clinical spectrum

            present   absent
Positive       a         b       a+b
negative       c         d       c+d
              a+c       b+d      a+b+c+d

Sensitivity: the proportion of people with the disease who have a positive test
Usually positive in the presence of disease
A sensitive test will rarely miss disease in those who have it
A negative result on a highly sensitive test rules out disease ("SnOut")
Sn = a / (a + c)


Specificity: the proportion of patients without the disease who have a negative test
Rarely positive in the absence of disease
A specific test will rarely identify disease in someone who does not have it
A positive result on a highly specific test rules in disease ("SpIn")
Sp = d / (b + d)

Example: A researcher develops a new saliva pregnancy test. She collects samples from 100 women known to be pregnant by blood test (the gold standard) and 100 women known not to be pregnant, also based on the same blood test. The saliva test is positive in 95 of the pregnant women. It is also positive in 15 of the nonpregnant women. What are the sensitivity and specificity?

            Pregnant   Non-pregnant   Totals
Saliva +       95           15          110
Saliva -        5           85           90
Totals        100          100          200

Sensitivity = TP/(TP+FN) = 95/100 = 95%
Specificity = TN/(TN+FP) = 85/100 = 85%
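The same arithmetic can be written as a short Python sketch. The function name and argument names are illustrative choices, not part of the original slides; the counts are those of the saliva pregnancy test example.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)  # a / (a + c): positives among the diseased
    specificity = tn / (tn + fp)  # d / (b + d): negatives among the non-diseased
    return sensitivity, specificity


# Saliva pregnancy test: 95 true positives, 15 false positives,
# 5 false negatives, 85 true negatives
sens, spec = sensitivity_specificity(tp=95, fp=15, fn=5, tn=85)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Both values are computed within a column of the table (pregnant or non-pregnant), which is why they do not change when the ratio of pregnant to non-pregnant women in the sample changes.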

Is it more important that a test be sensitive or specific? It depends on its purpose. A cheap mass screening test should be sensitive (few cases missed). A test designed to confirm the presence of disease should be specific (few cases wrongly diagnosed).

Note that sensitivity and specificity are two distinct properties. Where classification is based on a cutpoint along a continuum, there is a tradeoff between the two.

Example: The saliva pregnancy test detects progesterone. A refined version is developed. Suppose you add a drop of indicator solution to the saliva sample. It can stay clear (0 reaction) or turn green (1+), red (2+), or black (3+). (For purposes of discussion we will ignore overlapping colors)

The researcher conducts a validation study and finds the following:

             Pregnant   Non-pregnant   Totals
Saliva 3+       85            5           90
Saliva 2+       10           10           20
Saliva 1+        3           17           20
Saliva 0         2           68           70

The sensitivity and specificity of the saliva test will depend on the definition of positive and negative used.

If positive = 1+ or more, sensitivity = (85+10+3)/100 = 98%, specificity = 68/100 = 68%

If positive = 2+ or more, sensitivity = (85+10)/100 = 95%, specificity = (68+17)/100 = 85%
If positive = 3+, sensitivity = 85/100 = 85%, specificity = (68+17+10)/100 = 95%
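A minimal Python sketch of how the cutpoint drives both values, using the validation counts from the refined saliva test. The list layout and the helper name `sens_spec_at_cutpoint` are assumptions made for illustration.

```python
# Validation counts per colour reaction, ordered from strongest to weakest
levels = ["3+", "2+", "1+", "0"]
pregnant = [85, 10, 3, 2]
non_pregnant = [5, 10, 17, 68]


def sens_spec_at_cutpoint(k):
    """Treat the k strongest levels as 'positive'; return (sensitivity, specificity)."""
    sens = sum(pregnant[:k]) / sum(pregnant)          # positives among the pregnant
    spec = sum(non_pregnant[k:]) / sum(non_pregnant)  # negatives among the non-pregnant
    return sens, spec


# Key = weakest reaction that still counts as a positive test
results = {levels[k - 1]: sens_spec_at_cutpoint(k) for k in range(1, len(levels))}
for level, (sens, spec) in results.items():
    print(f"positive if >= {level}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Moving the cutpoint toward weaker reactions raises sensitivity and lowers specificity; no cutpoint improves both at once.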

The choice of cutpoint depends on the relative adverse consequences of false-negatives vs. false-positives.
If it is most important not to miss anyone (FN), choose a cutpoint with higher sensitivity and lower specificity.

If it is most important that people not be erroneously labeled as having the condition (FP), choose a cutpoint with lower sensitivity and higher specificity.

Sensitivity and specificity alone don't help to make a diagnosis

What you need to know is: given a positive or negative result, what is the chance that the patient has the disease?
NOT: given that the patient has the disease, what is the chance of a positive test (sensitivity), or given that the patient does not have it, the chance of a negative test (specificity)?

Positive Predictive values

            present   absent
Positive       a         b       a+b

Positive predictive value: the probability of disease in a patient with a positive (abnormal) test (i.e. that a positive test is a true positive)
Highly specific diagnostic tests have high PPV
PPV = a / (a + b)

Negative Predictive values

            present   absent
negative       c         d       c+d

Negative predictive value: the probability that a patient with a negative (normal) test does not have the disease
More sensitive tests have higher NPV
NPV = d / (c + d)

Predictive values

Predictive value is influenced by prevalence and is therefore not independent of the situation in which the test is used
Positive tests in a low prevalence population are likely to be false positives
As prevalence approaches 0, the PPV also approaches 0
To know the PPV in an individual patient, one needs to know or estimate the prevalence (likelihood of disease) in such a patient

Predictive values

Sensitivity and specificity are fixed characteristics of a diagnostic test
They do not depend on the prevalence of disease in a population

PPV/NPV are useful for diagnosis

Probability of disease after a + or - test

Predictive values do depend on prevalence, e.g. a highly sensitive test applied in a high prevalence population will have a greater PPV than the same test in a low prevalence population
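The dependence on prevalence can be made concrete with a short Python sketch. It applies Bayes' theorem to a test with fixed sensitivity and specificity; the values 0.90 and 0.85 and the three prevalences are illustrative assumptions, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    tp = sensitivity * prevalence              # expected fraction of true positives
    fp = (1 - specificity) * (1 - prevalence)  # expected fraction of false positives
    fn = (1 - sensitivity) * prevalence        # expected fraction of false negatives
    tn = specificity * (1 - prevalence)        # expected fraction of true negatives
    return tp / (tp + fp), tn / (tn + fn)


# Same test (assumed Sn = 0.90, Sp = 0.85) in three different populations
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.85, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

At 1% prevalence most positive results are false positives even though the test itself has not changed.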

Likelihood Ratios

Ratio of the chance of finding that test result in patient with disease compared to the chance of that same result in patients without disease

LR = (probability of an individual with the condition having the test result) / (probability of an individual without the condition having the test result)

A way to incorporate the sensitivity and specificity of a test into a single measure

Likelihood ratios

Expresses how many times more (or less) likely a test result is found in diseased versus nondiseased people
Likelihood ratio = probability of test result in diseased / probability of test result in nondiseased

Positive LR = sensitivity / (1 - specificity) = [a/(a+c)] / [b/(b+d)]
Negative LR = (1 - sensitivity) / specificity = [c/(a+c)] / [d/(b+d)]

Impact on Disease Likelihood

LR >10 or <0.1 cause large changes in likelihood
LR 5-10 or 0.1-0.2 cause moderate changes
LR 2-5 or 0.2-0.5 cause small changes
LR between 0.5 and 2 cause little or no change
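These rules of thumb can be sketched as a small Python lookup. The function name is hypothetical, and the treatment of values falling exactly on a boundary (5 or 2, and their reciprocals) is an assumption, since the bands above share their endpoints.

```python
def lr_impact(likelihood_ratio):
    """Classify how strongly a likelihood ratio shifts the probability of disease."""
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    # Fold LRs below 1 onto the same scale: 0.1 mirrors 10, 0.2 mirrors 5, 0.5 mirrors 2
    strength = max(likelihood_ratio, 1 / likelihood_ratio)
    if strength > 10:
        return "large"
    if strength >= 5:       # boundary assignment is an assumption
        return "moderate"
    if strength >= 2:       # boundary assignment is an assumption
        return "small"
    return "little or no change"


for lr in (20, 6, 3, 0.33, 1.2):
    print(lr, "->", lr_impact(lr))
```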

Likelihood ratios - why do we need them?

They allow us to summarize information from studies over a range of values, i.e. the degree of abnormality, not just presence or absence
They allow us to apply results to patients with varying pre-test probabilities of disease

Likelihood ratios

Use the same information from diagnostic test studies
Used to determine how much a particular test result changes the probability that a patient has a particular disease (how much it increases the posttest probability compared to the pretest probability)
Whether a test result convinces you to go ahead and treat a patient depends on how much it changes the probability that the patient has the disease

Likelihood ratios

Use the LR to convert pre-test probability to posttest probability
First convert pretest probability to pretest odds
Then multiply pretest odds by the likelihood ratio to get posttest odds
Convert posttest odds back to posttest probability
Odds = probability of event / (1 - probability of event)
Probability = odds / (1 + odds)

LR+ = sensitivity / (1 - specificity) = [a/(a+c)] / [b/(b+d)]
LR- = (1 - sensitivity) / specificity = [c/(a+c)] / [d/(b+d)]
Pre-test odds = pre-test probability / (1 - pre-test probability)
Post-test odds = pre-test odds * LR
Post-test probability = post-test odds / (post-test odds + 1)
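The conversion can be followed step by step in a short Python sketch. The function name and the input values (pretest probability 0.40, sensitivity and specificity both 0.75) are illustrative assumptions.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed test characteristics, chosen for illustration
sensitivity, specificity = 0.75, 0.75
lr_positive = sensitivity / (1 - specificity)   # 3
lr_negative = (1 - sensitivity) / specificity   # about 0.33

pre_test = 0.40
after_positive = post_test_probability(pre_test, lr_positive)
after_negative = post_test_probability(pre_test, lr_negative)
print(f"after a positive test: {after_positive:.2f}")
print(f"after a negative test: {after_negative:.2f}")
```

With these inputs a positive result raises the probability of disease from 0.40 to about 0.67, and a negative result lowers it to about 0.18. When the pretest probability equals the prevalence in the study population, the posttest probability after a positive test equals the PPV, and after a negative test it equals 1 - NPV.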

Test & Treatment Thresholds in Diagnosis

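[Figure: probability of disease on a scale from 0 to 1, divided by a test threshold and a treatment threshold. Below the test threshold: do not test, do not treat. Between the two thresholds: further testing. Above the treatment threshold: do not test, get on with treatment.]

Likelihood ratio + = Se/(1-Sp)
Likelihood ratio - = (1-Se)/Sp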


[Figure: pretest probability (A) is converted to posttest probability (B) via pretest odds x LR]

Iron deficiency anemia: diagnostic test result (serum ferritin)

                        Present       Absent        Total
(+) <65 mmol/L          731 (a)       270 (b)       1001 (a+b)
(-) >65 mmol/L           78 (c)      1500 (d)       1578 (c+d)
Total                   809 (a+c)    1770 (b+d)     2579 (a+b+c+d)

Sensitivity = a/(a+c) = 731/809 = 90%
Specificity = d/(b+d) = 1500/1770 = 85%
Pos predictive value = a/(a+b) = 731/1001 = 73%
Neg predictive value = d/(c+d) = 1500/1578 = 95%
LR+ = se/(1-sp) = 90/15 = 6
Prevalence = (a+c)/(a+b+c+d) = 809/2579 = 31%

[Figure: scales for pretest probability, likelihood ratio and posttest probability]

Diagnostic properties X-test for DHF

                          DHF Yes   DHF No   Total
X-test result positive       30       15       45
X-test result negative       10       45       55
Total                        40       60      100

Prevalence = 0.40
Sensitivity = 30/40 = 0.75
Specificity = 45/60 = 0.75
PPV = 30/45 = 0.67
NPV = 45/55 = 0.82
LR+ = 0.75/0.25 = 3
LR- = 0.25/0.75 = 0.33

Diagnostic properties X-test for DHF

                          DHF Yes   DHF No   Total
X-test result positive        6       23       29
X-test result negative        2       69       71
Total                         8       92      100

Prevalence = 0.08
Sensitivity = 6/8 = 0.75
Specificity = 69/92 = 0.75
PPV = 6/29 = 0.21
NPV = 69/71 = 0.97
LR+ = 0.75/0.25 = 3
LR- = 0.25/0.75 = 0.33
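The two X-test tables can be compared side by side with a Python sketch that derives every property from the four cell counts. The function and dictionary keys are illustrative names, not part of the slides.

```python
def diagnostic_properties(tp, fp, fn, tn):
    """Derive the usual test properties from the cells of a 2 x 2 table."""
    total = tp + fp + fn + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "prevalence": (tp + fn) / total,
        "sensitivity": sens,
        "specificity": spec,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "LR+": sens / (1 - spec),
        "LR-": (1 - sens) / spec,
    }


# X-test for DHF at prevalence 0.40 and at prevalence 0.08
high = diagnostic_properties(tp=30, fp=15, fn=10, tn=45)
low = diagnostic_properties(tp=6, fp=23, fn=2, tn=69)

for name in high:
    print(f"{name:12s} {high[name]:6.2f} {low[name]:6.2f}")
```

Sensitivity, specificity and both likelihood ratios are identical in the two populations; only the prevalence and the predictive values differ.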

Trade-offs between Sn and Sp

Not all tests are simply positive/negative
For some there is a continuum of possible results, with the cut-off set at some level
Generally, Sn and Sp are increased or decreased at each other's expense
Where one sets the cut-off depends on the desired characteristic of the test (maximal sensitivity or maximal specificity)
This relationship can also be expressed with a receiver operating characteristic curve

ROC curves

Obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 - specificity) over a range of cut-off values
Values on the axes run from 0 to 1.0
Tests that discriminate well crowd the upper left corner of the plot
Generally the best cut-off is at the shoulder of the curve
The accuracy of the test is summarized by the area under the curve (AUC)

Continuum Scale Data: example

T4 level in suspected hypothyroidism in children

T4 value      Hypothyroid   Euthyroid
5 or less         18             1
5.1 - 7.0          7            17
7.1 - 9.0          4            36
9 or more          3            39
Totals            32            93

Cutoff at 5:
T4 value      Hypothyroid   Euthyroid
≤5                18             1
>5                14            92

Cutoff at 7:
T4 value      Hypothyroid   Euthyroid
≤7                25            18
>7                 7            75
Totals            32            93

Cutoff at 9:
T4 value      Hypothyroid   Euthyroid
≤9                29            54
>9                 3            39
Totals            32            93

Cutoff point   Sensitivity   Specificity
5                 0.56          0.99
7                 0.78          0.81
9                 0.91          0.42

For tests or predictors with continuous results, a cutoff point must be determined that best distinguishes those with and without the target disorder



Cutoff point   Sens (TP)   1-Spec (FP)
5                0.56         0.01
7                0.78         0.19
9                0.91         0.58
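The ROC points for the T4 example, and a rough area under the curve, can be computed with the Python sketch below. The band counts are those of the T4 table; the helper names are hypothetical, and joining the points with straight lines (trapezoidal rule) is an assumption that only approximates the area for grouped data.

```python
# T4 bands from lowest to highest: 5 or less, 5.1-7.0, 7.1-9.0, 9 or more
hypothyroid = [18, 7, 4, 3]
euthyroid = [1, 17, 36, 39]


def roc_points(diseased, healthy):
    """Return (false positive rate, sensitivity) pairs as the cutoff rises band by band."""
    points = [(0.0, 0.0)]
    tp = fp = 0
    for d, h in zip(diseased, healthy):
        tp += d  # a low T4 counts as a positive test
        fp += h
        points.append((fp / sum(healthy), tp / sum(diseased)))
    return points


def trapezoid_auc(points):
    """Area under straight-line segments joining consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


points = roc_points(hypothyroid, euthyroid)
auc = trapezoid_auc(points)
for fpr, tpr in points:
    print(f"1 - specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
print(f"approximate AUC = {auc:.2f}")
```

The three interior points reproduce the cutoffs 5, 7 and 9 from the table; the curve is anchored at (0, 0) and (1, 1).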

Accuracy of the test

The accuracy of the test depends on how well the test separates the group being tested into those with and without the disease in question
Accuracy is measured by the area under the ROC curve (AUC)
An area of 1 represents a perfect test; an area of 0.5 represents a worthless test

0.90-1.00 = excellent (A)
0.80-0.90 = good (B)
0.70-0.80 = fair (C)
0.60-0.70 = poor (D)
0.50-0.60 = fail (F)

An ROC curve demonstrates several things:

It shows the tradeoff between sensitivity and specificity: any increase in sensitivity will be accompanied by a decrease in specificity
The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.
The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.
The slope of the tangent line at a cutoff point gives the likelihood ratio (LR) for that value of the test.

Summary- 1

What do you want to DO with a test?

Rule in disease? Rule out disease?

Think about the pre-test probability before you order a test
Compare the operating characteristics of tests before you select one
Think about what you will do with the results of the test (implications)

Summary- 2

If missing the disease has a serious outcome, you want a high sensitivity
If the treatment is invasive or risky, you want a high specificity
Predictive value is influenced by the underlying prevalence of disease
Likelihood ratios are not influenced by the prevalence of disease