
DIAGNOSTIC TEST IN FAMILY MEDICINE

LECTURE DELIVERED AT
THE 2008 WACP. FACULTY OF FAMILY MEDICINE
PART I REVISION COURSE.
BY
DR F A OLANIYAN

MBBS Ib., MSc. Epid. & Med Stat. Ib., FWACP, FMCGP
UNIVERSITY COLLEGE HOSPITAL IBADAN
INTRODUCTION

• Diagnostic tests have particular
importance in medicine, where early and
accurate diagnosis can decrease the
morbidity and mortality of disease.
• For many years, diagnostic performance
was reported as the accuracy of the test.
• Test accuracy is the percentage of
diagnostic decisions that turn out to be
correct; it is very important for both
quality of care and cost containment.
Diagnostic tests
• When looking at a paper about a
diagnostic test we ask ourselves
three questions.
• Is this test useful?
• Is it reliable?
• Is it valid?
Is this test useful?
• The test should have been
researched in a study population
relevant to the individual or
population in whom it is to be used.
Reliability
• Reliability refers to the repeatability
or reproducibility of a test.
• It can be assessed by repeating the
test using the same or different
observers.
Validity
• Validity relates to whether the test measures what it
purports to measure. Is the result true?
• For example, if you measure blood pressure in
an obese patient and use a cuff that is too small,
you are likely to get a falsely high reading. The
reading may be reliable (you get the same blood
pressure if you do it again) but it lacks validity.
Diagnostic Test Studies
How well does a test
• identify people with the disease?
• exclude people without the disease?

Compare test results on people with the disease
with test results on people without the disease.
• Need to know who has the disease.
Who has the disease?

True diagnosis.
• We can never be absolutely sure that the ‘true’ diagnosis
is correct.
• We decide to accept one method as ‘true’: this is the gold
standard or reference standard.
• It is often more invasive than the test, e.g. histopathology
compared to ultrasound image.
• It is always possible that the reference standard is wrong
for some subjects.
Statistics of diagnostic test studies
-Sensitivity
-Specificity
-False positive rate
-False negative rate
-Likelihood ratio (LR) for positive test
-Likelihood ratio (LR) for negative test
-Positive predictive value (PPV)
-Negative predictive value (NPV)
-Receiver operating characteristic (ROC) curve
-Kappa (when responses are binary)
TYPES OF DIAGNOSTIC TEST
Diagnostic Test Studies

Two designs
• Prospective or cohort design, or cross-sectional
design:
take a sample of subjects eligible for the test, test them
all and get the true diagnosis on them all.
• Retrospective or case-control design:
take a sample with true diagnosis established as
positive and another sample of controls. The
negative diagnosis may or may not have been
established on the controls.
TWO GAUSSIAN CURVES DESCRIBING
THE DISTRIBUTION OF A TEST RESULT
Statistical Vs Systematic ERRORS
• Statistical error: the difference between a measured
value and the true value that is caused by random,
inherently unpredictable fluctuations in
the measurement apparatus.
• Systematic error: the difference between a measured
value and the true value that is caused by
non-random fluctuations from an unknown
source and which, once identified, can usually
be eliminated.
TYPES OF ERRORS
STATISTICAL ERROR: Type I and Type II
• Type I error, also known as an "error of the
first kind", an α error, or a "false positive": the
error of rejecting a null hypothesis when it is
actually true.
• Type II error, also known as an "error of the
second kind", a β error, or a "false negative":
the error of failing to reject a null hypothesis when
the alternative hypothesis is the true state of
nature.
Interpretation of Diagnostic Tests
• The basic idea of diagnostic test interpretation is
to calculate the probability that a patient has a
disease under consideration given a certain
test result.
• A 2 by 2 table is employed for this purpose.
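The idea above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name and the input values (prevalence 0.10, sensitivity 0.80, specificity 0.70) are assumptions chosen only to show how the four cells of a 2 by 2 table give the probability of disease for a positive or a negative result.

```python
def post_test_probability(prevalence, sensitivity, specificity, positive=True):
    """Probability of disease given a test result, built from the 2 by 2 cell proportions."""
    tp = prevalence * sensitivity              # diseased, test positive
    fn = prevalence * (1 - sensitivity)        # diseased, test negative
    fp = (1 - prevalence) * (1 - specificity)  # not diseased, test positive
    tn = (1 - prevalence) * specificity        # not diseased, test negative
    if positive:
        return tp / (tp + fp)   # probability of disease after a positive result
    return fn / (fn + tn)       # probability of disease after a negative result

# Illustrative (assumed) inputs, not data from the lecture.
p_pos = post_test_probability(0.10, 0.80, 0.70, positive=True)
p_neg = post_test_probability(0.10, 0.80, 0.70, positive=False)
print(f"P(disease | positive) = {p_pos:.3f}")
print(f"P(disease | negative) = {p_neg:.3f}")
```

With these assumed inputs a positive result raises the probability of disease from 10% to about 23%, and a negative result lowers it to about 3%.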
LEGAL SYSTEM
Testing for guilty/not-guilty:

                          Actual condition
Test result               Guilty                       Not guilty
Verdict of 'guilty'       True Positive                False Positive
                                                       (i.e. guilt reported unfairly)
                                                       Type I error
Verdict of 'not guilty'   False Negative               True Negative
                          (i.e. guilt not detected)
                          Type II error
LEGAL SYSTEM
Testing for innocent/not innocent (reversed from previous
example):

                          Actual condition
Test result               Innocent                         Not innocent
Judged 'innocent'         True Positive                    False Positive
                                                           (i.e. guilty but not caught)
                                                           Type I error
Judged 'not innocent'     False Negative                   True Negative
                          (i.e. innocent but condemned)
                          Type II error
MEDICAL CONDITION

                  Actual condition
Test result       Present                        Absent
Positive          Condition present +            Condition absent +
                  Positive result                Positive result
                  = True Positive                = False Positive
                                                 Type I error
Negative          Condition present +            Condition absent +
                  Negative result                Negative result
                  = False (invalid) Negative     = True (accurate) Negative
                  Type II error
DISEASE

                              Actual condition
Test result                   Infected                        Not infected
Test shows 'infected'         True Positive                   False Positive
                                                              (i.e. infection reported
                                                              but not present)
                                                              Type I error
Test shows 'not infected'     False Negative                  True Negative
                              (i.e. infection not detected)
                              Type II error
BASIC CONCEPTS
A test is useful (informative) if
SENS + SPEC > 1
FALSE POSITIVE

The false positive rate is the proportion of negative instances that were erroneously
reported as being positive.
It is equal to 1 minus the specificity of the test. This is equivalent to saying it is equal to
the significance level.

In statistical hypothesis testing, this fraction is given the symbol α, and 1 − α is defined
as the specificity of the test.

Increasing the specificity of the test lowers the probability of type I
errors, but raises the probability of type II errors.
FALSE NEGATIVE RATE
The false negative rate is the proportion of positive instances that were erroneously
reported as negative.

It is equal to 1 minus the "power" of the test, i.e. 1 minus the sensitivity.

In statistical hypothesis testing, this fraction is given the symbol β.
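The two error rates defined above can be computed directly from the four cell counts of a 2 by 2 table. The sketch below is illustrative only: the function name and the counts are assumptions, not data from the lecture.

```python
def error_rates(tp, fp, fn, tn):
    """Return (false positive rate, false negative rate) from 2 by 2 cell counts."""
    fpr = fp / (fp + tn)  # share of truly negative subjects reported positive
    fnr = fn / (fn + tp)  # share of truly positive subjects reported negative
    return fpr, fnr

# Illustrative (assumed) counts.
tp, fp, fn, tn = 45, 10, 5, 90
fpr, fnr = error_rates(tp, fp, fn, tn)

specificity = tn / (tn + fp)
sensitivity = tp / (tp + fn)
print(f"FPR = {fpr:.2f} (1 - specificity = {1 - specificity:.2f})")
print(f"FNR = {fnr:.2f} (1 - sensitivity = {1 - sensitivity:.2f})")
```

The printed values show the two identities stated above: FPR equals 1 minus specificity, and FNR equals 1 minus sensitivity (power).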


RELATIONSHIP BETWEEN SENSITIVITY AND
SPECIFICITY
BASIC CONCEPTS CONT.
PPV
NPV

A test is useful (informative) if
PPV + NPV > 1
2 BY 2 TABLE
Example
• 100 children diagnosed with language
impairments (LI) and enrolled in language
intervention, and 100 same-age children with
no history of language impairment (LN), were
administered a new test of grammatical
morphology.
• 80 of the children with LI, and 30 of the
children with LN, scored in the disordered
range on the new measure.
                      Disorder Status (re: Gold Standard)
New Test Result       + Disorder (LI)        - Disorder (LN)
+ Disorder (LI)       a = 80                 b = 30
- Disorder (LN)       c = 20                 d = 70
Total                 100 with disorder      100 without disorder

Sens = a/(a+c) = 80/100 = .80
Spec = d/(b+d) = 70/100 = .70
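The calculation above can be checked with a short Python sketch using the cell counts from the example (a = 80, b = 30, c = 20, d = 70). The variable names are illustrative. The predictive values are included only to show the two "useful test" conditions from the basic concepts slides; note that PPV and NPV here reflect the 50% base rate of this particular sample.

```python
# Cell counts from the language impairment example.
a, b, c, d = 80, 30, 20, 70

sens = a / (a + c)   # proportion of children with LI who test positive
spec = d / (b + d)   # proportion of children with LN who test negative
ppv = a / (a + b)    # proportion of positive results that are truly LI
npv = d / (c + d)    # proportion of negative results that are truly LN

print(f"Sens = {sens:.2f}, Spec = {spec:.2f}")
print(f"PPV  = {ppv:.2f}, NPV  = {npv:.2f}")
print("SENS + SPEC > 1:", sens + spec > 1)
print("PPV + NPV > 1:  ", ppv + npv > 1)
```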
Why not just use sensitivity and specificity as
measures of accuracy?

• It’s their interrelationship that is most
important overall
• Sensitivity and specificity vary substantially
according to sample characteristics, including
N, base rate (prevalence), severity,
confusability
• Likelihood Ratios are not impervious to
sample characteristics, but are much less
affected than are sensitivity and specificity
LIKELIHOOD RATIO
• LR+ = sensitivity/(1 - specificity)
= {a/(a+c)}/{1 - [d/(b+d)]}

• LR- = (1 - sensitivity)/specificity
= [1 - {a/(a+c)}]/[d/(b+d)]
Calculating Likelihood Ratios
• Sens = .80
• Spec = .70
• LR+ = sens/(1 - spec) = .80/.30 = 2.67
• LR- = (1 - sens)/spec = .20/.70 = 0.29
• Several programs, some free on the web, are set
up to allow entry in 2x2 table format
• In addition to accuracy measures, they also
provide information on precision
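As a rough sketch of what such programs do, the Python below computes both likelihood ratios from the 2x2 cell counts of the example and adds 95% confidence intervals as a measure of precision. The interval uses the common log-based approximation for a ratio of proportions; that choice, the function name and the returned structure are assumptions of this sketch, not something specified in the lecture. It requires all four cells to be non-zero.

```python
import math

def likelihood_ratios(a, b, c, d, z=1.96):
    """LR+ and LR- with approximate 95% CIs (log method); all cells must be > 0."""
    sens = a / (a + c)
    spec = d / (b + d)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    # Standard errors of ln(LR+) and ln(LR-).
    se_pos = math.sqrt(1 / a - 1 / (a + c) + 1 / b - 1 / (b + d))
    se_neg = math.sqrt(1 / c - 1 / (a + c) + 1 / d - 1 / (b + d))
    ci_pos = (lr_pos * math.exp(-z * se_pos), lr_pos * math.exp(z * se_pos))
    ci_neg = (lr_neg * math.exp(-z * se_neg), lr_neg * math.exp(z * se_neg))
    return {"LR+": (lr_pos, ci_pos), "LR-": (lr_neg, ci_neg)}

# Cell counts from the language impairment example.
result = likelihood_ratios(80, 30, 20, 70)
for name, (value, (low, high)) in result.items():
    print(f"{name} = {value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A narrow interval indicates a precise estimate; with only 100 subjects per group the intervals here are fairly wide.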
RECEIVER OPERATING
CHARACTERISTIC CURVE
• ROC analysis is part of a field called "Signal
Detection Theory" developed during World
War II for the analysis of radar images.
• ROC analysis is now common in medicine and
healthcare, particularly in radiology, where it is
used to quantify the accuracy of diagnostic tests.
ROC Curves
ROC analysis
• Is the standard approach to evaluate the sensitivity and
specificity of diagnostic procedures.
• Estimates a curve, which describes the inherent tradeoff
between sensitivity and specificity of a diagnostic test.
Each point on the ROC curve
• Is associated with a specific diagnostic criterion.
• Will vary among observers (because their diagnostic
criteria will vary even when their ROC curves are the
same).
ROC CURVE
The plot shows the FP rate (1 - specificity) on the X axis and 1 - FN rate
(sensitivity) on the Y axis.
A ROC CURVE DEMONSTRATES SEVERAL THINGS

• Shows the tradeoff between sensitivity and specificity. (any
increase in sensitivity will be accompanied by a decrease in
specificity)
• The closer the curve follows the left-hand border and then the
top border of the ROC space, the more accurate the test.
• The closer the curve comes to the 45-degree diagonal of the
ROC space, the less accurate the test.
• The slope of the tangent line at a cut-point gives the likelihood
ratio (LR) for that value of the test.
• The area under the curve is a measure of test accuracy.
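The points above can be made concrete with a small Python sketch that builds an empirical ROC curve by sweeping the cut-point over a set of test scores and then estimates the area under the curve with the trapezoidal rule. The scores, labels and function names are invented for illustration and are not data from the lecture.

```python
def roc_points(scores, labels):
    """Return (FP rate, sensitivity) pairs, one per cut-point (score >= cut is positive)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # strictest criterion: nobody is called positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test scores; label 1 = disease present, 0 = disease absent.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points)
print(f"AUC = {auc:.2f}")
```

Lowering the cut-point moves along the curve towards the top right: sensitivity rises while the FP rate rises with it. An area of 0.5 corresponds to the 45-degree diagonal (an uninformative test) and 1.0 to a perfect test.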
Kappa measure of agreement
• Kappa is defined as the difference between observed
and expected agreement, expressed as a fraction of
the maximum possible difference. It ranges from -1 to 1.
Imperfect reference standard

                  R+        R-
New Test   T+     a         b         a+b
           T-     c         d         c+d
                  a+c       b+d       n = a+b+c+d

• k = (Io - Ie)/(1 - Ie), where Io = (a+d)/n and
Ie = ((a+c)(a+b) + (b+d)(c+d))/n^2
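The kappa formula above translates directly into code. In this sketch the function name and the cell counts are illustrative assumptions; only the formula itself comes from the slide.

```python
def kappa(a, b, c, d):
    """Cohen's kappa for a 2 by 2 agreement table with cells a, b, c, d."""
    n = a + b + c + d
    observed = (a + d) / n                                     # Io
    expected = ((a + c) * (a + b) + (b + d) * (c + d)) / n**2  # Ie
    return (observed - expected) / (1 - expected)

# Illustrative (assumed) counts: new test versus reference standard.
k = kappa(a=40, b=10, c=5, d=45)
print(f"kappa = {k:.2f}")
```

Here the observed agreement is 0.85 and the agreement expected by chance is 0.50, so kappa is 0.70.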
KAPPA TEST
Example
5000 women underwent a test for
blood glucose at 24 weeks following
a glucose load. 243 women were
found to have a blood glucose
greater than 6.8 mmol/L and were
referred for an OGTT. 186 were
found to have gestational diabetes.
Four women who initially had tested
negative were diagnosed as having
diabetes later in their pregnancy.
              Diabetes    No diabetes    Total
Positive      186         57             243
Negative      4           4753           4757
Total         190         4810           5000

ANSWER
Prevalence 190/5000

Sensitivity 186/190

Specificity 4753/4810

Positive predictive value 186/243

Negative predictive value 4753/4757

Likelihood ratio + test (186/190)/(57/4810)

Likelihood ratio - test (4/190)/(4753/4810)


Example
Prevalence 3.8%

Sensitivity 97.9%

Specificity 98.8%

Positive predictive value 76.5%

Negative predictive value 99.9%

Likelihood ratio + test 82.6

Likelihood ratio - test .02

Accuracy 98.8%
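The figures for the gestational diabetes example can be reproduced from the four cell counts of its 2 by 2 table. The sketch below is one way to do the arithmetic; the variable names are illustrative.

```python
# Cell counts from the gestational diabetes screening example.
tp, fp, fn, tn = 186, 57, 4, 4753
n = tp + fp + fn + tn

prevalence = (tp + fn) / n
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
lr_pos = sensitivity / (1 - specificity)
lr_neg = (1 - sensitivity) / specificity
accuracy = (tp + tn) / n

print(f"Prevalence                 {prevalence:.1%}")
print(f"Sensitivity                {sensitivity:.1%}")
print(f"Specificity                {specificity:.1%}")
print(f"Positive predictive value  {ppv:.1%}")
print(f"Negative predictive value  {npv:.1%}")
print(f"Likelihood ratio + test    {lr_pos:.1f}")
print(f"Likelihood ratio - test    {lr_neg:.2f}")
print(f"Accuracy                   {accuracy:.1%}")
```

Note how the positive predictive value (about 76.5%) is much lower than the sensitivity and specificity because the prevalence is only 3.8%.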
QUESTION
A test to screen for a disease is performed on a population. 900
respondents have the disease. 920 people in the population
tested positive. The sensitivity of the test is 0.90 and
the specificity of the test is 0.80.
1. How many persons do not have the disease?
2. What are the FPR, FNR, PPV, NPV, LR+ and LR-?
Greenhalgh guidelines
1: Is this test potentially relevant to my practice?
2: Has the test been compared with a true gold standard?
3: Did this validation study include an appropriate spectrum
of subjects?
4: Has workup bias been avoided? (Was the reference standard group
originally identified because they were positive on the test?)
5: Has expectation bias been avoided? (I.e. was the reference standard blind to
the test?)
6: Was the test shown to be reproducible?
7: What are the features of the test as derived from this validation study?
8: Were confidence intervals given?
9: Has a sensible ‘normal range’ been derived? (Only relevant for continuous
test variables.)
10: Has this test been placed in the context of other potential tests in the
diagnostic sequence?
Another guideline is the QUADAS (Quality Assessment of Diagnostic
Accuracy Studies) tool.
THANK YOU
FOR
LISTENING
