Validity-Reliability Screening Tests Rbs-Feltp

VALIDITY AND RELIABILITY
OF SCREENING TESTS
Rashida B Syed, Epidemiologist

Consultant Faculty
Field Epidemiology Training Program (FETP)-
Pakistan
7/11/2013 Validity and reliability of Tests 1

Objectives
• Calculate and interpret measures of the validity of a screening test:
    • Sensitivity
    • Specificity
• Understand the relationship between sensitivity and specificity.
• Calculate and interpret measures of the performance (yield) of a screening test:
    • Predictive value positive (PV+)
    • Predictive value negative (PV-)
• Understand the factors that influence PV+ and PV-.
• Recognize issues and sources of bias in evaluating screening programs.

Purpose of screening
• The early detection of disease in individuals who do not show any signs of disease.
• Aims to reduce morbidity and mortality from disease among persons being screened.
• Is the application of relatively simple, inexpensive tests, examinations, or other procedures to people.
• Is a means of identifying persons at increased risk for the presence of disease who warrant further evaluation.

Diagnosis ≠ Screening
• Screening tests can often also be used as diagnostic tests.
• Diagnosis involves confirming the presence or absence of disease in someone suspected of having, or at risk for, disease.
• Screening is generally done among individuals who are not suspected of having disease.

Requirements
• Is there a truly effective treatment available for the discovered disease?
• Is that treatment more effective in screened than in non-screened cases?
• What are the side effects of the screening process?
• How efficient is screening? Do we have the right threshold, i.e. how many people must be screened to find one case?

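Natural History of Disease
[Figure: timeline of the natural history of disease. Stages: Susceptible Host, Subclinical Disease, Clinical Disease, Stage of Recovery, Disability, or Death. Marked events: Point of Exposure, Screening, Onset of symptoms, Diagnosis sought. The phase labelled "Detectable sub-clinical disease" is also marked.]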
Examples of Screening Tests
• Questions
• Clinical Examinations
• Laboratory Tests
• Genetic Tests
• X-rays
Goel
Diseases for which screening has been recommended
• Cervical cancer
• Breast cancer
• Prostate cancer
• Colon cancer
• Diabetes
• Hypertension

Terminology
Validity is analogous to accuracy.
The validity of a screening test is how well the given screening test reflects another test of known greater accuracy.
Validity assumes that there is a gold standard to which a test can be compared.
Paneth

Three key measures of validity
• Sensitivity
• Specificity
• Predictive value

Sensitivity and Specificity
Sensitivity tells us how well a positive test detects disease.
It is defined as the ability of the test to identify correctly as diseased those who have the disease.

Specificity tells us how well a negative test detects non-disease.
It is defined as the ability of the test to identify correctly as test negative those who do not have the disease.

                        Disease
                        Present            Absent
Screening   Positive    True positives     False positives
Test        Negative    False negatives    True negatives

                        Disease
                        Present   Absent   Total
Screening   Positive    a         b        a+b
Test        Negative    c         d        c+d
            Total       a+c       b+d      N

Sensitivity
• Proportion of individuals who have the disease who test positive (true positive rate); tells us how well a “+” test picks up disease.

                 Disease
                 yes     no      Total
Screening   +    a       b       a+b
Test        -    c       d       c+d
            Total a+c    b+d     N

Sensitivity = a / (a + c)

Specificity
• Proportion of individuals who don’t have the disease who test negative (true negative rate); tells us how well a “-” test detects no disease.

                 Disease
                 yes     no      Total
Screening   +    a       b       a+b
Test        -    c       d       c+d
            Total a+c    b+d     N

Specificity = d / (b + d)

Predictive value
• Positive predictive value: the proportion of all those who test positive who have the condition.
• Negative predictive value: the proportion of all those who test negative who do not have the condition.

Positive Predictive Value
• Proportion of individuals who test positive who actually have the disease.

                 Disease
                 yes     no      Total
Screening   +    a       b       a+b
Test        -    c       d       c+d
            Total a+c    b+d     N

P.P.V. = a / (a + b)

Negative Predictive Value
• Proportion of individuals who test negative who don’t have the disease.

                 Disease
                 yes     no      Total
Screening   +    a       b       a+b
Test        -    c       d       c+d
            Total a+c    b+d     N

N.P.V. = d / (c + d)

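The four measures defined on the preceding slides all come straight from the cells of the 2×2 table. The following is a minimal Python sketch, not part of the original slides: the function name `screening_measures` and the dictionary keys are illustrative choices, and the counts in the usage lines are hypothetical.

```python
def screening_measures(a, b, c, d):
    """Return validity and predictive-value measures from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # diseased who test positive
        "specificity": d / (b + d),  # non-diseased who test negative
        "ppv": a / (a + b),          # test positives who are diseased
        "npv": d / (c + d),          # test negatives who are not diseased
    }


# Hypothetical counts, chosen only to illustrate the call.
measures = screening_measures(a=45, b=30, c=5, d=120)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Note that sensitivity and specificity divide by the column totals (disease status), while the predictive values divide by the row totals (test result).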
Determinants of predictive value
The predictive value of a test is determined by 3 factors:
    1. Sensitivity
    2. Specificity
    3. Prevalence of the disease in the population being tested

Effect of prevalence on PPV
• As prevalence decreases, the positive predictive value of a test also decreases.
• This explains why diagnostic tests that are developed in clinical populations (where the prevalence of the disease being tested for is often high) often perform poorly in general population settings (where disease prevalence tends to be lower).
• In our example: prove it.
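One way to check the claim is to hold sensitivity and specificity fixed and vary only the prevalence. The sketch below is an illustration rather than part of the original slides: it assumes a test with 90% sensitivity and 80% specificity (the values of the hypothyroidism example in this deck), and the function name `ppv` and the three prevalence values are illustrative choices.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from test characteristics and prevalence."""
    true_pos = sensitivity * prevalence               # diseased and test positive
    false_pos = (1 - specificity) * (1 - prevalence)  # healthy but test positive
    return true_pos / (true_pos + false_pos)


# Same test, three assumed prevalences.
for prevalence in (0.50, 0.10, 0.01):
    value = ppv(sensitivity=0.90, specificity=0.80, prevalence=prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {value:.1%}")
# prevalence 50%: PPV = 81.8%
# prevalence 10%: PPV = 33.3%
# prevalence 1%: PPV = 4.3%
```

With the test itself unchanged, the PPV falls from about 82% to about 4% as the disease becomes rarer, because false positives from the large healthy group swamp the true positives.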
Scenarios
• Tests with dichotomous results
    • Examples
        • Positive or negative
• Tests with continuous results
    • Examples
        • Systolic blood pressure (mm Hg)
        • Tuberculin reaction (induration diameter, mm)

Examples
• In a sample of 200 people, 100 have hypothyroidism and 100 do not.
• In the same sample, 110 people test positive for hypothyroidism using a new diagnostic test, and 90 test negative.
• Of the 110 people who test positive, 90 have the disease and 20 do not.
• Of the 90 people who test negative, 10 have the disease and 80 do not.
• What are the sensitivity and specificity?

Solution
• Sensitivity = TP / (TP + FN) = 90 / (90 + 10) = 90%
• Specificity = TN / (TN + FP) = 80 / (80 + 20) = 80%
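The slide asks only for sensitivity and specificity, but the same four counts (TP = 90, FP = 20, FN = 10, TN = 80) also give the predictive values. As a supplementary worked step, not part of the original solution:

```latex
\begin{align*}
\text{PV+} &= \frac{TP}{TP + FP} = \frac{90}{90 + 20} = \frac{90}{110} \approx 81.8\% \\
\text{PV-} &= \frac{TN}{TN + FN} = \frac{80}{80 + 10} = \frac{80}{90} \approx 88.9\%
\end{align*}
```

These values hold only for this sample, in which half of the people have the disease.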

A test is used in 50 people with disease and 50 people without. These are the results.

                        Disease
                        Present   Absent   Total
Screening   Positive    48        3        51
Test        Negative    2         47       49
            Total       50        50       100

Sensitivity = 48/50 = 96%
Specificity = 47/50 = 94%
Positive Predictive Value = 48/51 ≈ 94.1%
Negative Predictive Value = 47/49 ≈ 95.9%
Paneth
So… you understand the
accuracy of a screening test …
What is the next step?

Put screening to use in the
population

Sensitive vs. Specific tests
• A test with high sensitivity is usually positive when disease is present and has few false negatives. It is useful when it is important not to miss a diagnosis (e.g. if the disease is dangerous but has an effective treatment).
• A test with high specificity is usually negative when disease is absent and has few false positives. It is useful when a false positive diagnosis would be harmful (e.g. if it resulted in unnecessary treatment).

Balancing sensitivity vs. specificity
• A really good test would be both highly sensitive and highly specific.
• In practice, this is often not the case.
• Instead, there is often a trade-off between the sensitivity and the specificity of diagnostic tests.
• This occurs when the test result is expressed on a continuous scale (e.g. blood pressure, blood sugar levels).
• In such circumstances, a cut-point has to be chosen to define normal vs. abnormal.
• The choice of cut-point involves weighing the consequences of leaving cases undetected (false negatives) against those of erroneously classifying healthy persons as diseased (false positives).
• Refer to Gordis
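The trade-off can be seen by applying two different cut-points to the same set of measurements. The sketch below is illustrative only: the measurement values are hypothetical (they are not data from the slides or from Gordis), values at or above the cut-point are assumed to count as test positive, and the function name `sens_spec_at_cutoff` is an illustrative choice.

```python
def sens_spec_at_cutoff(diseased, healthy, cutoff):
    """Treat values >= cutoff as test positive; return (sensitivity, specificity)."""
    true_pos = sum(value >= cutoff for value in diseased)
    true_neg = sum(value < cutoff for value in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


# Hypothetical measurements on a continuous scale (illustration only).
diseased = [95, 120, 135, 150, 180]
healthy = [70, 85, 100, 115, 140]

for cutoff in (90, 130):
    sens, spec = sens_spec_at_cutoff(diseased, healthy, cutoff)
    print(f"cut-point {cutoff}: sensitivity = {sens:.0%}, specificity = {spec:.0%}")
# cut-point 90: sensitivity = 100%, specificity = 40%
# cut-point 130: sensitivity = 60%, specificity = 80%
```

Raising the cut-point removes false positives but adds false negatives, so specificity rises while sensitivity falls.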
NET SENSITIVITY AND SPECIFICITY
• Use of multiple tests
• Refer to Gordis
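The slide defers the details to Gordis. As a rough sketch of the standard textbook formulas, and assuming the two tests err independently of each other given disease status, net sensitivity and net specificity can be computed as below. The function names and the example values are hypothetical and are not taken from the slides.

```python
def sequential_net(sens1, spec1, sens2, spec2):
    """Two-stage testing: only those positive on test 1 receive test 2.

    A person counts as positive only if both tests are positive.
    Assumes the tests are independent given disease status.
    """
    net_sensitivity = sens1 * sens2
    net_specificity = spec1 + (1 - spec1) * spec2
    return net_sensitivity, net_specificity


def simultaneous_net(sens1, spec1, sens2, spec2):
    """Both tests given to everyone; positive on either test counts as positive.

    Assumes the tests are independent given disease status.
    """
    net_sensitivity = 1 - (1 - sens1) * (1 - sens2)
    net_specificity = spec1 * spec2
    return net_sensitivity, net_specificity


# Hypothetical test characteristics, for illustration only.
print(sequential_net(0.7, 0.8, 0.9, 0.9))
print(simultaneous_net(0.7, 0.8, 0.9, 0.9))
```

Under these assumptions, sequential testing trades sensitivity for specificity, and simultaneous testing does the reverse.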

Balancing sensitivity vs. specificity

Blood sugar level 2 hrs      Sensitivity (%)   Specificity (%)
after eating (mg/100 ml)
70                           98.6              8.8
90                           94.3              47.6
110                          85.7              84.1
130                          64.3              96.9
170                          42.9              100.0
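One simple way to summarise each row of this table is Youden's index, J = sensitivity + specificity − 1. This index is not mentioned in the slides; it is used here only as one possible convention, and it weights false negatives and false positives equally, which may not suit a given clinical setting. The variable names are illustrative.

```python
# (cut-off in mg/100 ml, sensitivity %, specificity %), copied from the table above
table = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]


def youden_j(sensitivity_pct, specificity_pct):
    """Youden's J = sensitivity + specificity - 1, with inputs in percent."""
    return (sensitivity_pct + specificity_pct - 100) / 100


for cutoff, sens, spec in table:
    print(f"cut-off {cutoff}: J = {youden_j(sens, spec):.3f}")

best = max(table, key=lambda row: youden_j(row[1], row[2]))
print("largest J at cut-off", best[0])
# largest J at cut-off 110
```

By this equal-weight criterion the 110 mg/100 ml row comes out highest; a programme that prioritises not missing cases would choose a lower cut-off.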

ROC curves
• One method for determining the best cut-off point is to construct a ROC curve.
• ROC = receiver operating characteristic, a term that comes from radar science.
• ROC curves are constructed by plotting the sensitivity (or true positive rate) against the false positive rate (1 - specificity).
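The construction can be sketched with the blood sugar table from this deck: each cut-off contributes one point (1 − specificity, sensitivity). The code below is a rough illustration, not the method used to draw any published curve. Adding the corner points (0, 0) and (1, 1) and estimating the area with the trapezoid rule are assumptions made here, and with only five cut-offs the area is a coarse estimate.

```python
# (cut-off in mg/100 ml, sensitivity %, specificity %), from the blood sugar table
table = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]

# Each ROC point is (false positive rate, true positive rate).
roc_points = sorted(((100 - spec) / 100, sens / 100) for _, sens, spec in table)

# Assumed end points: "nobody positive" (0, 0) and "everybody positive" (1, 1).
curve = [(0.0, 0.0)] + roc_points + [(1.0, 1.0)]

# Trapezoid rule over consecutive pairs of points.
auc = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(curve, curve[1:])
)

for fpr, tpr in roc_points:
    print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")
print(f"approximate area under the curve = {auc:.2f}")
# approximate area under the curve = 0.90
```

An area of 0.5 corresponds to the diagonal (a test no better than chance) and 1.0 to a perfect test.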

ROC curve for blood sugar readings
Source: Fletcher, Fletcher and Wagner, Clinical epidemiology: the essentials (3rd ed.)
• Shows the trade-off between sensitivity and specificity.
• The closer the curve lies to the left-hand and top borders, the more accurate the test.
• The slope of the tangent at a cut-point gives the Likelihood Ratio (LR) for that value of the test.
• The area under the curve is a measure of test accuracy.

The Area under an ROC Curve

• Good tests lie close to the upper left-hand corner of the graph, where sensitivity and specificity are both high.
• Generally the best cut-off point lies at or near the “shoulder” of the curve.*
• The overall accuracy of the test is represented by the area under the curve.
• Tests that plot close to the diagonal across the middle of the graph are least useful, as this is where the test is no better than chance.
• ROC curves can also be used to compare different tests.
*Unless there are clinical reasons for preferring a highly sensitive or highly specific test.
Sources of Bias in the Evaluation of Screening Programs
• Lead time bias
• Length bias
• Volunteer bias

Lead time bias
• Lead time: the interval between the diagnosis of a disease at screening and the usual time of diagnosis (by symptoms).
[Figure: timeline showing lead time as the interval from diagnosis by screening to diagnosis via symptoms]

Lead-Time Bias
Consider a condition whose natural history allows for an earlier diagnosis, but where survival does not improve despite identifying it earlier.
With a screening program here:
• Survival will appear to increase,
• but in reality it is increased by exactly the amount of time by which the diagnosis was advanced by the screening program.
• Thus there is no benefit to screening from a survival standpoint.

Lead time bias
• Assumes survival is the time between screening and death.
• Does not take into account the lead time between diagnosis at screening and the usual diagnosis.
[Figure: timeline. Diagnosis by screening in 1994; death in 2008. Survival = 14 years.]
Lead time bias
[Figure: timeline. Diagnosis by screening in 1994; usual time of diagnosis via symptoms in 1998; death in 2008. Survival = 14 years; lead time = 4 years; true survival = 10 years.]
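The arithmetic behind the timeline can be written out as a short sketch. The years are the ones given on the slide; the variable names are illustrative.

```python
# Years taken from the lead time bias timeline.
screen_diagnosis_year = 1994   # diagnosis by screening
symptom_diagnosis_year = 1998  # usual time of diagnosis via symptoms
death_year = 2008

apparent_survival = death_year - screen_diagnosis_year   # measured from screening
true_survival = death_year - symptom_diagnosis_year      # measured from usual diagnosis
lead_time = symptom_diagnosis_year - screen_diagnosis_year

print(f"apparent survival = {apparent_survival} years")
print(f"true survival     = {true_survival} years")
print(f"lead time         = {lead_time} years")
```

The date of death is the same either way; the 4 extra years of "survival" are simply the time by which the diagnosis was brought forward.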
Length Bias
• Most chronic diseases, especially cancers, do not progress at the same rate in everyone.
• Any group of diseased people will include some in whom the disease developed slowly and some in whom it developed rapidly.
• Screening will preferentially pick up slowly developing disease (longer opportunity to be screened), which usually has a better prognosis.
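The "longer opportunity to be screened" argument can be made concrete with a simplified model. The sketch below assumes a steady state in which the chance that a one-off screen catches a case is proportional to the time that case spends in the detectable pre-symptomatic phase. The two disease types, their equal incidence, and the durations are hypothetical, and the function name is an illustrative choice.

```python
def screen_detected_share(incidence_share, detectable_years):
    """Share of each disease type among screen-detected cases.

    Simplifying assumption: the probability of being caught by a one-off
    screen is proportional to the length of the detectable phase.
    """
    weights = {
        kind: incidence_share[kind] * detectable_years[kind]
        for kind in incidence_share
    }
    total = sum(weights.values())
    return {kind: weight / total for kind, weight in weights.items()}


# Hypothetical inputs: both types arise equally often, but slowly
# progressing disease stays detectable four times as long.
shares = screen_detected_share(
    incidence_share={"slow": 0.5, "fast": 0.5},
    detectable_years={"slow": 4.0, "fast": 1.0},
)
print(shares)
```

Under these assumptions, slowly progressing disease makes up half of all cases but 80% of screen-detected cases, so screen-detected cases look as if they have a better prognosis even if screening changes nothing.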

Length bias
[Figure: several disease courses plotted against a time axis, each labelled O-P-Y-D, where O = biological onset of disease, P = disease detectable via screening, Y = symptoms begin, D = death. The time of screening is marked.]
Paneth
Volunteer bias
• A type of bias in which those who choose to participate are likely to differ from those who don’t.
• Volunteers tend to have:
    • Better health
    • Lower mortality
    • A greater likelihood of adhering to prescribed medical regimens

A worked example
The fecal occult blood (FOB) screen test is used in 203 people to look for bowel cancer. The results are compared with endoscopy, which confirms whether bowel cancer is present.

                          Bowel cancer (as confirmed on endoscopy)
                          Present      Absent       Total
FOB screen   Positive     TP = 2       FP = 18      20
test         Negative     FN = 1       TN = 182     183
             Total        3            200          203

• False positive rate (α) = FP / (FP + TN) = 18 / (18 + 182) = 9% = 1 − specificity.
• False negative rate (β) = FN / (TP + FN) = 1 / (2 + 1) = 33% = 1 − sensitivity.
• Power = sensitivity = 1 − β
• Hence, with large numbers of false positives and few false negatives, a positive FOB screen test is in itself poor at confirming cancer (PPV = 10%), and further investigations must be undertaken; it will, though, pick up 66.7% of all cancers (the sensitivity). However, as a screening test, a negative result is very good at reassuring that a patient does not have cancer (NPV = 99.5%), and at this initial screen the test correctly identifies 91% of those who do not have cancer (the specificity).
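The summary quotes the PPV, NPV, sensitivity and specificity without showing the arithmetic. Using the counts stated in the rates above (TP = 2, FP = 18, FN = 1, TN = 182), the calculations are:

```latex
\begin{align*}
\text{PPV} &= \frac{TP}{TP + FP} = \frac{2}{2 + 18} = 10\% \\
\text{NPV} &= \frac{TN}{TN + FN} = \frac{182}{182 + 1} \approx 99.5\% \\
\text{Sensitivity} &= \frac{TP}{TP + FN} = \frac{2}{2 + 1} \approx 66.7\% \\
\text{Specificity} &= \frac{TN}{TN + FP} = \frac{182}{182 + 18} = 91\%
\end{align*}
```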
Reliability
• Validity (accuracy)
• Reliability (repeatability)
• Refer to Epidemiology by Gordis

Review questions from Gordis

• Likelihood ratio positive = sensitivity / (1 − specificity) = 66.67% / (1 − 91%) = 7.4
• Likelihood ratio negative = (1 − sensitivity) / specificity = (1 − 66.67%) / 91% = 0.37
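The two likelihood ratios can be reproduced from the sensitivity and specificity of the FOB example (2/3 and 182/200). This is a small illustrative sketch; the function name `likelihood_ratios` is not from the slides.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# FOB example: sensitivity = 2/3, specificity = 182/200.
lr_pos, lr_neg = likelihood_ratios(sensitivity=2 / 3, specificity=182 / 200)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
# LR+ = 7.4, LR- = 0.37
```

An LR+ well above 1 means a positive result is much more common in people with the disease than in those without; an LR- below 1 means a negative result is less common in people with the disease.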
