
Diagnostic and Screening Tests
Moustafa El Houssinie
Department of Community, Environmental and Occupational Medicine
October 2012

1
Introduction to screening/diagnostic tests
• A diagnostic test
• A screening test
• Screening is an attempt to diagnose the disease earlier, while it is still curable.
• Not all diagnostic tests can be used as screening tools.

2
Screening and Diagnosis
• Primary purpose of screening is to identify individuals
AT RISK for disease
• Screening tests sort out apparently well persons who
have a disease or a risk factor for disease from those
who do not
• Screening tests are not intended to be diagnostic

3
For a screening test to be of any value

The disease:
• Significant burden (morbidity and mortality)
• Acceptable, available and effective treatment
• A sufficiently long pre-symptomatic period
• A better outcome with early intervention

The proposed screening test:
• Valid = good sensitivity and specificity
• Precise = reliable = reproducible = repeatable
• Low cost, inexpensive
• Low risk
• Availability of confirmatory tests

The population to which the screening program will be applied:
• High prevalence of the disease
• High compliance with the test and with subsequent tests and treatments
4
Examples of screening tests:

• Mammography for breast cancer


• Occult fecal blood for colorectal cancers
• Cervical Pap smear for cervical cancer
• Specific screening tests for mental problems: e.g.,
addiction and depression
• Phenylketonuria, galactosemia, hypothyroidism,
hemoglobinopathies and hearing loss in newborns

5
Why are these diseases STILL unsuitable for screening?
• The Common Cold
• Lung Cancer
• Pancreatic Cancer
• Ovarian Cancer
• Multiple Sclerosis

6
Screening/Diagnostic tests
• Validity of a test is shown by how well the test actually measures what it is supposed to measure.
• Validity is determined by the sensitivity and specificity of the test.
• Reliability is based on how well the test performs in repeated use over time – its repeatability.

7
The “Gold Standard” :
• What is a Gold Standard ?
• Tissue analysis, radiological contrast procedures, prolonged follow up,
autopsies
• Almost always more costly, less feasible
• Lack of objective standards for some diseases (e.g., angina pectoris: the gold standard is careful history taking)
• Consequences of imperfect standards

8
Diagnostic Characteristics

Not a hypothesis testing situation

BUT
• How well does the test identify patients with a disease?
• How well does the test identify patients without a disease?

9
Evaluation of the Diagnostic Test – Categorical outcome

Cross-sectional design: give a group of people (with and without the disease) both tests (the candidate test and the "gold standard" test), then cross-classify the results and report the diagnostic characteristics of the test.

Case-control design: obtain a number of cases confirmed by the gold standard and a group of disease-free controls, and apply the test to both groups.

10
Sensitivity, Specificity and Predictive values of the test
Predictive values are only applicable with the cross-sectional design

Test Result           | Disease             | Free from disease   | Total
Positive by the test  | A (true positive)   | B (false positive)  | A+B (all positives)
Negative by the test  | C (false negative)  | D (true negative)   | C+D (all negatives)
Total                 | A+C (all cases)     | B+D (all healthy)   | Total screened

• Sensitivity is the ability of a screening procedure to correctly identify those who have the disease = TP / Cases = a/(a+c)
• Specificity is the ability of a screening procedure to correctly identify those who do not have the disease = TN / Healthy = d/(b+d)
• Predictive value of a positive result = proportion of diseased among those with a positive test = TP / All positives = a/(a+b)
• Predictive value of a negative result = proportion of disease-free among those with a negative test = TN / All negatives = d/(c+d)
11
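The four formulas above can be written as a tiny Python sketch (not from the original slides; the function name diagnostic_characteristics is illustrative). The example numbers are those used on the next slide.

# a = true positives, b = false positives, c = false negatives, d = true negatives
def diagnostic_characteristics(a, b, c, d):
    sensitivity = a / (a + c)   # TP / all cases
    specificity = d / (b + d)   # TN / all healthy
    ppv = a / (a + b)           # TP / all test positives
    npv = d / (c + d)           # TN / all test negatives
    return sensitivity, specificity, ppv, npv

print(diagnostic_characteristics(70, 90, 30, 810))
# -> (0.7, 0.9, 0.4375, 0.964...), i.e. the 70%, 90%, 44% and 96% of the next slide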
Example

General layout:
                      | Disease | Healthy | Total
Positive by the test  | TP      | FP      | Positives
Negative by the test  | FN      | TN      | Negatives
Total                 | Cases   | Healthy | ALL

Cross-sectional design:
                      | Disease | Healthy | Total
Positive by the test  | 70      | 90      | 160
Negative by the test  | 30      | 810     | 840
Total                 | 100     | 900     | 1000
Sensitivity = TP / cases = 70%; Specificity = TN / healthy = 90%
PVP = TP / All +ve = 70/160 ≈ 44%; PVN = TN / All −ve = 810/840 ≈ 96%

Case-control design:
                      | Disease | Healthy
Positive by the test  | 70      | 10
Negative by the test  | 30      | 90
Total                 | 100     | 100
Sensitivity = TP / cases = 70%; Specificity = TN / healthy = 90%
12
Is it more important that a test be sensitive
or specific?
• It depends on its purpose. A cheap mass screening test should be
sensitive (few cases missed). A test designed to confirm the presence of
disease should be specific (few cases wrongly diagnosed).
• Note that sensitivity and specificity are two distinct properties. Where classification is based on a cut-point along a continuum, there is a trade-off between the two.

13
A researcher studied a pregnancy test on saliva with the following results:

           | Pregnant | Non-pregnant | Total
Saliva 3+  | 85       | 5            | 90
Saliva 2+  | 10       | 10           | 20
Saliva 1+  | 3        | 17           | 20
Saliva 0   | 2        | 68           | 70

• If "positive" ≥ 1+, sensitivity = (85+10+3)/100 = 98%, specificity = 68/100 = 68%
• If "positive" ≥ 2+, sensitivity = (85+10)/100 = 95%, specificity = (68+17)/100 = 85%
• If "positive" = 3+, sensitivity = 85/100 = 85%, specificity = (68+17+10)/100 = 95%

14
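The three cut-points above can be checked programmatically. A minimal Python sketch (mine, not from the slides), with the table hard-coded as (grade, pregnant, non-pregnant) counts:

rows = [(3, 85, 5), (2, 10, 10), (1, 3, 17), (0, 2, 68)]   # (grade, pregnant, non-pregnant)
pregnant_total = sum(p for _, p, _ in rows)                # 100
nonpregnant_total = sum(n for _, _, n in rows)             # 100

for cutoff in (1, 2, 3):                                   # "positive" means grade >= cutoff
    tp = sum(p for g, p, _ in rows if g >= cutoff)
    tn = sum(n for g, _, n in rows if g < cutoff)
    print(cutoff, tp / pregnant_total, tn / nonpregnant_total)
# cutoff 1 -> 0.98, 0.68;  cutoff 2 -> 0.95, 0.85;  cutoff 3 -> 0.85, 0.95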
• The choice of cut-point depends on the relative adverse consequences of false negatives vs. false positives.

• If it is most important not to miss anyone, choose a cut-point with high sensitivity (accepting lower specificity).

• If it is most important that people not be erroneously labeled as having the condition, choose a cut-point with high specificity (accepting lower sensitivity).

15
Predictive values are dependent on the prevalence of the disease
Test sensitivity = 70% and specificity = 90%

Prevalence = 100/1000 = 10%
                 | Disease | Healthy | Total
Positive by test | 70      | 90      | 160
Negative by test | 30      | 810     | 840
Total            | 100     | 900     | 1000
PV+ = 70/160 = 44%; PV− = 810/840 = 96%

Prevalence = 200/1000 = 20%
                 | Disease | Healthy | Total
Positive by test | 140     | 80      | 220
Negative by test | 60      | 720     | 780
Total            | 200     | 800     | 1000
PV+ = 140/220 = 64%; PV− = 720/780 = 92%

As prevalence increases, PV+ increases and testing becomes more efficient (illustrated in the sketch after this slide).

16
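The same dependence follows directly from Bayes' theorem. A short Python sketch (illustrative, not part of the slides) reproduces the PV+ and PV− values above from sensitivity 0.70, specificity 0.90 and the two prevalences:

def predictive_values(sens, spec, prev):
    # Bayes' theorem: PV+ = P(disease | test positive), PV- = P(no disease | test negative)
    pv_pos = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    pv_neg = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return pv_pos, pv_neg

for prev in (0.10, 0.20):
    print(prev, predictive_values(0.70, 0.90, prev))
# prevalence 10% -> PV+ ~ 0.44, PV- ~ 0.96;  prevalence 20% -> PV+ ~ 0.64, PV- ~ 0.92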
• The following table illustrates how PVP is influenced more by specificity than by sensitivity at the same prevalence of the disease. This situation occurs when comparing two tests with binary outcomes but different sensitivity and specificity, or when moving the cut-off point of a test whose outcome is a continuous variable (Hb, blood sugar, cholesterol, etc.).
• The reason is apparent: there are more healthy individuals in the community, so a small drop in specificity adds far more false positives than a similar drop in sensitivity adds false negatives.
17
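Because the slide's table is not reproduced here, a small numerical illustration of my own (same 10% prevalence, hypothetical baseline of 90% sensitivity and 90% specificity) makes the point: a 5-point drop in specificity lowers PPV far more than a 5-point drop in sensitivity.

def ppv(sens, spec, prev=0.10):
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

print(ppv(0.90, 0.90))   # baseline                    -> 0.50
print(ppv(0.85, 0.90))   # sensitivity down 5 points   -> ~0.49
print(ppv(0.90, 0.85))   # specificity down 5 points   -> ~0.40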
Probability of being a case changes with the result of the screening test

                      | Disease | Healthy | Total
Positive by the test  | 70      | 90      | 160
Negative by the test  | 30      | 810     | 840
Total                 | 100     | 900     | 1000

• Probability of being a case in the screened population = 100/1000 = 10%
• Probability of being a case among those testing positive = 70/160 ≈ 44%
• Probability of being a case among those testing negative = 30/840 ≈ 4%

18
The basic idea
The probability changes after knowing the result of the test.
[Figure: probability scale from 0 to 1 for being diabetic. Pre-test probability = 0.1; after a negative test it falls to 0.04, after a positive test it rises to 0.44.]

19
Probability  event 
Relationship between Odds  event  = and
Odds & Probability 1-Probability  event 

Odds  event 
Probability  event  =
Probability =‫تماــية‬
‫اـحـ ل‬ 1+Odds  event 
Odds = ‫ارجـحية‬ Example:

proability=0.2

0.2
odds= =1:4=0.25
0.8
0.25
probability= =1:5=0.2
1.25
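The two conversions in a few lines of Python (illustrative only):

def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(odds):
    return odds / (1 + odds)

print(prob_to_odds(0.2))    # 0.25 (i.e. 1:4)
print(odds_to_prob(0.25))   # 0.2  (i.e. 1:5)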
Odds of being a case change with the result of the screening test

                      | Disease | Healthy | Total
Positive by the test  | 70      | 90      | 160
Negative by the test  | 30      | 810     | 840
Total                 | 100     | 900     | 1000

• Odds of being a case in the screened population = 100/900 = 1:9
• Odds of being a case among positive results = 70/90 = 7:9
• The odds of being a case increased 7-fold after the result came back positive.
• Odds of being a case among negative results = 30/810 = 1:27
• The odds of being a case decreased to one-third (×0.33) after the result came back negative.

21
Likelihood ratio positive

                      | Disease | Healthy | Total
Positive by the test  | 70      | 90      | 160
Negative by the test  | 30      | 810     | 840
Total                 | 100     | 900     | 1000

• Odds of being a case in the screened population = 100/900 = 1:9
• Odds of being a case among test positives = 70/90 = 7:9
• The likelihood of being a case increases 7 times by testing positive
• LR+ = 7
• LR+ = sensitivity / (1 − specificity), so it is fixed for a given test
• LR+ = true positive rate / false positive rate
22
Likelihood ratio negative

                      | Disease | Healthy | Total
Positive by the test  | 70      | 90      | 160
Negative by the test  | 30      | 810     | 840
Total                 | 100     | 900     | 1000

• Odds of being a case in the screened population = 100/900 = 1:9
• Odds of being a case among test negatives = 30/810 = 1:27
• The likelihood of being a case decreases 3-fold by testing negative
• LR− = 1/3 ≈ 0.33
• LR− = (1 − sensitivity) / specificity, so it is fixed for a given test
• LR− = false negative rate / true negative rate
23
Another view

LR+ = Probability of a positive test in cases / Probability of a positive test in non-cases
    = Sensitivity / (1 − Specificity)

LR− = Probability of a negative test in cases / Probability of a negative test in non-cases
    = (1 − Sensitivity) / Specificity
24
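Combining the definitions above, a minimal Python sketch (mine, not from the slides) recovers LR+ = 7 and LR− ≈ 0.33 for the running example with sensitivity 70% and specificity 90%:

def likelihood_ratios(sens, spec):
    lr_pos = sens / (1 - spec)    # true positive rate / false positive rate
    lr_neg = (1 - sens) / spec    # false negative rate / true negative rate
    return lr_pos, lr_neg

print(likelihood_ratios(0.70, 0.90))   # -> approximately (7.0, 0.33)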
Interpretation of LR
LR Interpretation
>10 Large and often conclusive increase in the likelihood of disease
10-5 Moderate increase in the likelihood of disease
5-2 Small increase in the likelihood of disease
2-1 Minimal increase in the likelihood of disease
1 No change in the likelihood of disease
0.5-1.0 Minimal decrease in the likelihood of disease
0.2-0.5 Small decrease in the likelihood of disease
0.1-0.2 Moderate decrease in the likelihood of disease
<0.1 Large and often conclusive decrease in the likelihood of disease

25
Source: CURRENT Medical Diagnosis & Treatment, Chapter e3, Diagnostic Testing & Medical Decision Making, Odds-Likelihood Ratios.
Table e3–6. Examples of likelihood ratios (LR).

Target Disease               | Test                                        | LR+  | LR–
Abscess                      | Abdominal CT scanning                       | 9.5  | 0.06
Coronary artery disease      | Exercise electrocardiogram (1 mm depression)| 3.5  | 0.45
Lung cancer                  | Chest radiograph                            | 15   | 0.42
Left ventricular hypertrophy | Echocardiography                            | 18.4 | 0.08
Myocardial infarction        | Troponin I                                  | 24   | 0.01
Prostate cancer              | Digital rectal examination                  | 21.3 | 0.37

26
Fagan Nomogram
• A nomogram that relates the pre-test probability of having the condition, P(D+), to the post-test probability, P(D+|T+) or P(D+|T−), given a positive or negative test result, using LR+ and LR−.
Examples:
• If the LR+ of a test is 20 and the prevalence of the disease P(D+) is 10%, the probability of disease given a positive test, P(D+|T+), is about 70%.
• If the LR− of a test is 0.1 and the prevalence of the disease P(D+) is 10%, the probability of disease given a negative test, P(D+|T−), is about 1%.
27
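The nomogram is simply the odds arithmetic from the earlier odds/probability slide. A minimal Python sketch (illustrative, not from the slides) reproduces both worked examples:

def post_test_probability(pre_test_prob, lr):
    pre_odds = pre_test_prob / (1 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * lr                        # multiply by the likelihood ratio
    return post_odds / (1 + post_odds)               # odds -> probability

print(post_test_probability(0.10, 20))    # LR+ = 20, prevalence 10%  -> ~0.69 (about 70%)
print(post_test_probability(0.10, 0.1))   # LR- = 0.1, prevalence 10% -> ~0.011 (about 1%)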
A continuous variable

• Overlapping of measurements
• As sensitivity rises, specificity declines
• The reference range:
  • Mean ± 2 SD
  • 2.5th to 97.5th percentiles

28
If more is worse, a higher cut-off point gives:
• Lower sensitivity (fewer true positives)
• Higher specificity (fewer false positives)
[Figure: overlapping blood-sugar distributions of diabetics and non-diabetics (scale 90–135). Diabetics below the cut-off are false negatives; non-diabetics above it are false positives.]


29
Diabetics vs Normal – ROC Curve

Blood sugar cut-off | Diabetics positive | TP rate | Normal positive | FP rate
500                 | 0                  | 0       | 0               | 0
400                 | 2                  | 0.02    | 0               | 0
300                 | 10                 | 0.1     | 0               | 0
250                 | 20                 | 0.2     | 1               | 0.01
200                 | 30                 | 0.3     | 5               | 0.05
150                 | 50                 | 0.5     | 10              | 0.1
100                 | 70                 | 0.7     | 20              | 0.2
95                  | 89                 | 0.89    | 60              | 0.6
90                  | 95                 | 0.95    | 85              | 0.85
85                  | 98                 | 0.98    | 95              | 0.95
80                  | 100                | 1       | 100             | 1

[ROC curve: sensitivity (true positive rate) on the y-axis plotted against 1 − specificity (false positive rate) on the x-axis, one point per cut-off.]
30
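From the (FP rate, TP rate) pairs in the table above, the area under the ROC curve can be approximated with the trapezoidal rule. A short Python sketch (mine, not from the slides):

# (FP rate, TP rate) pairs read from the table, from cut-off 500 down to 80
points = [(0, 0), (0, 0.02), (0, 0.1), (0.01, 0.2), (0.05, 0.3), (0.1, 0.5),
          (0.2, 0.7), (0.6, 0.89), (0.85, 0.95), (0.95, 0.98), (1, 1)]

auc = 0.0
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2    # area of the trapezoid between consecutive points

print(auc)    # about 0.79 for these data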
If less is worse, a higher cut-off point gives:
• Higher sensitivity (more true positives)
• Lower specificity (more false positives)
[Figure: overlapping T4 distributions of hypothyroid and normal-thyroid subjects (scale 1–9). Hypothyroid subjects above the cut-off are false negatives; normal-thyroid subjects below it are false positives.]
31
Hypothyroidism vs Euthyroidism – ROC Curve

T4 cut-off | Hypothyroid positive | TP rate | Euthyroid positive | FP rate
4          | 0                    | 0       | 0                  | 0
4.5        | 24                   | 0.24    | 2                  | 0.02
5          | 34                   | 0.34    | 5                  | 0.05
5.5        | 55                   | 0.55    | 14                 | 0.14
6          | 65                   | 0.65    | 15                 | 0.15
6.5        | 80                   | 0.8     | 24                 | 0.24
7          | 85                   | 0.85    | 36                 | 0.36
7.5        | 90                   | 0.9     | 45                 | 0.45
8          | 95                   | 0.95    | 80                 | 0.8
8.5        | 98                   | 0.98    | 95                 | 0.95
9          | 100                  | 1       | 100                | 1

[ROC curve: sensitivity (true positive rate) against 1 − specificity (false positive rate), one point per T4 cut-off.]

32
• The ROC curve plots the true positive rate against the false positive rate at the different cut-off points.
• TP rate = sensitivity
• FP rate = (1 − specificity)
• If the line is the diagonal, then at every cut-off point the TP rate equals the FP rate, the area under the curve is 50% of the area of the rectangle, and this quantitative test (variable) cannot be used to diagnose the disease.
33
• If the quantitative test (variable) has diagnostic power:
• As the cut-off point changes, the TP rate increases while the FP rate remains low, so the line runs nearly parallel to the vertical (y) axis.
• At the optimal point, where sensitivity + specificity is maximal, the line changes direction to run nearly parallel to the horizontal (x) axis, where small increases in TP are associated with large increases in FP.
• The ideal point is the upper-left corner of the rectangle, with TP = 100% and FP = 0%.
• The closer the optimal point is to the ideal point, the more valuable the test, and the area under the curve approaches 100%.
34
Area under the curve

Area under the curve | Value of the test
0.9 – 1.0            | Excellent (A)
0.8 – 0.9            | Good (B)
0.7 – 0.8            | Fair (C)
0.6 – 0.7            | Poor (D)
0.5 – 0.6            | Fail (F)

A final note of historical interest
• "Receiver Operating Characteristic" or ROC analysis is part of a field called "Signal Detection Theory", developed during World War II for the analysis of radar images.
• Radar operators had to decide whether a blip on the screen represented an enemy target, a friendly ship, or just noise.
• Signal detection theory measures the ability of radar receiver operators to make these important distinctions; this ability was called the Receiver Operating Characteristic. It was not until the 1970s that signal detection theory was recognized as useful for interpreting medical test results.

35
ROC curve IV
• After graphing the ROC curve you should:
• Test the significance of the test by testing the area under the curve (AUC) against H0: AUC = 50%.
• If the test is statistically significant, the optimal cut-off point should be calculated.
  • In general it is the point with the maximum combined sensitivity and specificity, i.e. the maximum Youden J index = Se + Sp − 1 (see the sketch after this slide).
  • You may instead choose a point with very high sensitivity or very high specificity, according to the nature of the disease (duration, prognosis, cost of the confirmatory test, etc.).

36
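As an illustration of the Youden J criterion, the sketch below (mine, not from the slides) applies it to the T4 table from slide 32:

# (cut-off, TP rate, FP rate) taken from the T4 table on slide 32
roc = [(4, 0.00, 0.00), (4.5, 0.24, 0.02), (5, 0.34, 0.05), (5.5, 0.55, 0.14),
       (6, 0.65, 0.15), (6.5, 0.80, 0.24), (7, 0.85, 0.36), (7.5, 0.90, 0.45),
       (8, 0.95, 0.80), (8.5, 0.98, 0.95), (9, 1.00, 1.00)]

# Youden J = sensitivity + specificity - 1 = TP rate - FP rate
best = max(roc, key=lambda row: row[1] - row[2])
print(best)    # (6.5, 0.8, 0.24): J = 0.56, the largest J in this table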
Biases in screening programs:

• Selection bias: people who seek screening are different from people who do not (compliance bias)
• Lead-time bias.
• Length time bias

37
Biases
Lead-time bias: early detection apparently increases the
survival period.
Both cases A and B had the same onset, outcome and true
survival period.
As case A was diagnosed earlier, it appeared as if its survival
was longer.

38
Length time bias
• The proportion of slow-growing lesions diagnosed during screening programs is greater than the proportion diagnosed during usual medical care.
• This excess of slower-growing tumors makes screening seem to improve survival.

39
True prevalence from apparent prevalence

True Prevalence = (AP + Sp − 1) / (Sn + Sp − 1)

where AP = apparent prevalence, Sn = sensitivity and Sp = specificity

• Suppose you have conducted a survey of white spot disease in a shrimp farm, using a test with sensitivity of 80% (0.8) and specificity of 100% (1.0). You have tested 150 shrimp, and 6 shrimp tested positive. What is the estimated true prevalence? The apparent prevalence is 6/150 = 0.04 or 4% (Wilson 95% CI: 1.8–8.5%).
• Therefore, true prevalence = (0.04 + 1 − 1)/(0.8 + 1 − 1) = 0.04/0.8 = 0.05 or 5% (95% CI: 1.1–8.9%).
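The same calculation in Python (an illustrative sketch; the confidence interval quoted on the slide is not recomputed here):

def true_prevalence(apparent_prevalence, sensitivity, specificity):
    # (AP + Sp - 1) / (Sn + Sp - 1), the correction used on this slide
    return (apparent_prevalence + specificity - 1) / (sensitivity + specificity - 1)

print(true_prevalence(6 / 150, 0.8, 1.0))    # -> 0.05, i.e. 5%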
Any questions?

41
