(PREVMED) 3.5 Assessing Articles On Diagnosis - Dr. Sta. Maria

PREVENTIVE MEDICINE
EBM 3: ASSESSING ARTICLES ON DIAGNOSIS

Dr. Sta. Maria
EBM Moodle Course Book (2020), Activity 3 Discussion PPT (2020),
INTRODUCTION
DIAGNOSIS
• Diagnosis is an imperfect process
o Results in a probability rather than a certainty of being right
• The doctor’s certainty/uncertainty about a diagnosis is expressed using terms like “rule out” or “possible” before a clinical diagnosis
• Increasingly, clinicians express the likelihood that a patient has a disease as a probability
o Implies being familiar with mathematical relationships between the properties of diagnostic tests and the information they yield in various clinical situations
o Understanding these may:
▪ Help the clinician reduce diagnostic uncertainty
▪ Increase the understanding of the degree of uncertainty (know what you don’t know)
▪ Convince the clinician to increase their level of uncertainty
• Four Possible Types of Test Results
o Diagnostic tests can be either positive (abnormal) or negative (normal)
o Disease can be either present or absent

         | Disease (+)    | Disease (-)
Test (+) | True Positive  | False Positive
Test (-) | False Negative | True Negative

• Reasons for Performing a Diagnostic Test:
o To diagnose the presence or absence of a particular condition
o To identify people who are predisposed to disease
o To identify early asymptomatic disease (screening)
o To plan an intervention (e.g., surgery)
o To monitor response to therapy
o To estimate the risk of future events
o To determine prognosis in patients with a known disease
• Questions on diagnosis should be phrased in terms of the following variables:
o P - the patient population on whom the test might be done
o I/E - the exposure or the test to be performed (referred to as the index test)
o C - the comparator (the gold standard test/reference standard/criterion standard test)
o O - the outcome (or condition) that is supposed to be diagnosed
• Example:
Among patients with febrile illness (P), how accurate is Dengue NS1 (E) compared to viral isolation (C) in establishing the presence or absence of dengue fever (O)?

BIAS AND IMPRECISION
Bias (Review)
• Systematic error or deviation from the truth
o Either in the results or in their inferences
• Can act in either direction
o Can lead to overestimates or underestimates of test accuracy
• Impossible to know for sure whether a study is biased, or the direction and magnitude of the bias
o When weaknesses are identified, judgements can be made of the risk of bias in an individual study
o Its likely direction and size can also occasionally be hypothesized
• Can arise through
o Problems in the design or execution of the study (internal validity)
o Recruiting the wrong participants, using the wrong test, testing in the wrong way (external validity)

Imprecision
• Must not be confused with bias
• Arises when an estimate is based on a small sample
• Caused by random error
• Statistical analysis appropriately describes the uncertainty in an estimate caused by random error using confidence intervals
• In a systematic review, confidence intervals (or regions) are estimated to describe imprecision

Bias vs. Imprecision

Bias | Imprecision
Arises through issues of internal and external validity | Arises when an estimate is based on a small sample
Caused by systematic error | Caused by random error
Statistical analysis cannot describe the uncertainty | Statistical analysis can appropriately describe the uncertainty in an estimate (confidence intervals)
Risk of bias is assessed by study validity in a systematic review | Described by estimating confidence intervals (or regions) in a systematic review

• Assessing study validity and estimating confidence intervals are both essential in all systematic reviews
o Also done to assess the accuracy and precision of the article

Precision and Accuracy
• Some articles can be misleading due to bias

BIAS IN ARTICLES ON DIAGNOSIS
Spectrum of Patients
• An appropriate patient spectrum should be defined in light of the research question
o State key factors that could affect test accuracy
▪ Setting, disease severity, prevalence, and prior testing
o If a small proportion of inappropriate patients will be tolerated, this proportion should be stated
• Exclusion of inappropriate sampling methods may be part of the eligibility criteria of some reviews
o Ex: exclusion of studies that have employed a group of healthy controls

Verification Bias
• The choice of an optimal reference standard is crucial
o Used to determine the presence or absence of the target condition (disease status)
• Indicators of diagnostic accuracy are calculated by comparing the results of the index test with the outcome of the reference standard
o When there are disagreements between the reference standard and the index test, it is assumed that the index test is incorrect
littlemarmaid 1
o Estimates of accuracy are calculated based on the assumption that the reference standard is 100% sensitive and specific
• Perfect reference standards are rare
o Errors due to imperfect reference standards can potentially bias the estimation of the diagnostic accuracy of the index test
• Acceptable reference standards need to be predefined in the review protocol
o Judgements about the accuracy of the reference standard are not always straightforward
▪ Require clinical experience of the topic area to know whether a test or test combination is an appropriate reference standard
o In some research areas, consensus reference standards have been defined
o If a mixture of reference standards is used, consider carefully whether all of these are acceptable

Disease Progression Bias and Recovery Bias
• Ideally, the results of the index test and the reference standard are collected on the same patients at the same time
o If this is not possible/a delay ensues, misclassification may occur due to:
▪ Spontaneous recovery
▪ Benefit from treatment
▪ Progression to a more advanced stage of disease
▪ Occurrence of a new disease
• Disease Progression Bias and Recovery Bias
o Used to describe the associated potential biases
• Treatment Paradox
o Effective treatment of those found positive on the first test undertaken leads them to be negative in a later test
o The length of the time period that may cause such bias can vary between conditions
• Have to make judgements about what is considered “short enough” for the condition being considered
o Pre-state this in the review protocol
o The time period will depend on the:
▪ Speed of progression of the disease
▪ Possible resolution of the disease
▪ The speed at which treatment can be administered and take effect
o The time period is likely longer in chronic diseases than in acute diseases
o Should state whether:
▪ All patients have to be assessed within this interval
▪ It is based on mean or maximum times
▪ It is acceptable for a pre-specified proportion to be outside the required interval

Partial Verification Bias
• Aka work-up bias, (primary) selection bias or sequential ordering bias
• Can occur when not all of the study patients are verified by the reference standard
• Biased estimates of test performance may arise where the choice of patients for verification is not random
o Esp. if it is then influenced by the results of the index test
• The effect of partial verification is complicated to predict
o Depends on:
▪ Whether test-positive or test-negative patients are not verified
▪ Whether unverified patients are omitted from the 2×2 table or classified as true negatives or true positives
▪ Whether unverified patients are random samples of index test negatives and positives
o There is no correct way of handling unverified patients in an analysis
• Sensitivity Analysis
o Unverified patients are alternately considered as different combinations of test-positives and test-negatives
o May allow the potential magnitude of any bias to be ascertained

Differential Verification Bias
• Occurs when some patients are verified by one type of reference standard and other patients by a different standard
o Particularly of concern when those positive to the index test use one method of verification, and negatives receive a second
• Where the reference standard is a composite test (involving a panel of tests and other information), differential verification will not occur if all individuals receive all tests
o Becomes problematic when only selected test information for each individual is available, and the extent of that information relates to the index test finding
• In other situations, differential verification may occur because different tests are available in different centers

Incorporation Bias
• In some primary studies, the reference standard is ascertained through a panel of tests, or on the basis of information collected over a prolonged period of investigation
o Ex: hospital discharge diagnosis
• When the result of the index test is used in establishing the reference standard, incorporation bias may occur
o Incorporation of the index test in the reference standard panel is likely to increase the amount of agreement between index test results and the reference standard
▪ Overestimates diagnostic accuracy

Test and Diagnostic Review
• Similar to the issue of blinded outcome assessment in intervention studies
• Test Review Bias
o Interpretation of the results of the index test may be influenced by the knowledge of the results of the reference standard
• Diagnostic Review Bias
o Interpretation of the results of the reference standard may be influenced by the knowledge of the results of the index test
• The extent to which test results can be influenced depends on the degree of subjectivity involved in interpreting the test
o More subjective reading → more likely for interpreters to be influenced by the results of the reference standard
o Fully automated test → less likely
• Consider the topic area being reviewed and determine whether the interpretation of the index test or reference standard could be influenced by knowledge of the results of the other test
• Empirical evidence shows that both diagnostic and test review bias increase sensitivity but have no systematic effect on specificity
• Whether or not blinding was undertaken in a study may not be explicitly stated
o If index and reference tests were undertaken and interpreted in a clear order, it is evident that the first must have been undertaken blind to the results of the second
o If the index and reference tests were undertaken by different people, a degree of ambiguity may exist about what information was available for each test
o Knowledge of standard laboratory practices may allow reasonable assumptions to be made
▪ Ex: where samples are sent in batches to an independent laboratory
▪ Authors should always try to confirm assumptions
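The sensitivity analysis described under Partial Verification Bias can be sketched in code. This is only one simple extreme-case variant, offered as an illustration: the function name, argument names, and all patient counts are hypothetical. Unverified index-test positives are assumed to be either all diseased or all non-diseased, and likewise for unverified negatives, and sensitivity and specificity are recomputed for each combination.

```python
from itertools import product


def sensitivity_analysis(tp, fp, fn, tn, unverified_pos, unverified_neg):
    """Recompute Sn and Sp under four extreme assumptions about unverified patients.

    tp, fp, fn, tn: counts among patients verified by the reference standard.
    unverified_pos / unverified_neg: index-test positives / negatives never verified.
    """
    scenarios = []
    for pos_diseased, neg_diseased in product((True, False), repeat=2):
        # Unverified positives become true positives or false positives.
        a = tp + (unverified_pos if pos_diseased else 0)
        b = fp + (0 if pos_diseased else unverified_pos)
        # Unverified negatives become false negatives or true negatives.
        c = fn + (unverified_neg if neg_diseased else 0)
        d = tn + (0 if neg_diseased else unverified_neg)
        scenarios.append({
            "unverified_pos_diseased": pos_diseased,
            "unverified_neg_diseased": neg_diseased,
            "sensitivity": a / (a + c),
            "specificity": d / (b + d),
        })
    return scenarios


# Hypothetical study: 100 index-test negatives were never verified.
results = sensitivity_analysis(tp=80, fp=20, fn=10, tn=90,
                               unverified_pos=0, unverified_neg=100)
sens = [r["sensitivity"] for r in results]
print(f"sensitivity ranges from {min(sens):.2f} to {max(sens):.2f}")
```

A wide range between the scenarios suggests that the handling of unverified patients could bias the accuracy estimates substantially.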
Clinical Review Bias
• For some index tests, the availability or absence of relevant patient information when the test is undertaken may affect its performance.
o Age, gender, presence and severity of symptoms, other test results

Uninterpretable Results
• Diagnostic tests may report uninterpretable results, or call results uncertain, indeterminate, or intermediate
• Happens with both index tests and reference standards
• Bias will arise depending on the possible correlation between uninterpretable test results and the true disease status
o If uninterpretable results occur randomly, and are not related to the true disease status of the individual, then these should not affect the test performance

Withdrawals
• Can occur when patients drop out from the study before their results (of either or both index test and reference standard) are known
• Unlikely to occur in truly cross-sectional studies
• A possibility when the reference standard includes a degree of follow-up

APPRAISING DIRECTNESS
(WILL I READ THE ARTICLE?)
• Similar to articles on therapy
• Important to assess for directness
o How well does the PECO of the study (research question) correspond with the PECO of the clinical question?
• Does the study provide a direct enough answer to the clinical question in terms of patients (P), exposure/intervention (E), and outcome (O)?

APPRAISING VALIDITY
(WILL I BELIEVE IT?)
• Done to assess if the article gives an accurate result
• 4 criteria will be used to assess the validity of the study
o Phrased as yes or no questions
o Yes = criterion is satisfied
o More yes answers = better study

QUESTION 1: WAS THE REFERENCE STANDARD ACCEPTABLE?
• Reference Standard
o Unequivocally establishes the presence or absence of a disease
o Used to establish how well an index test can diagnose true disease
• Perfectly accurate tests are very rare, invasive, and expensive
o Researchers evaluating diagnostic tests try to strike a balance between cost, accuracy, and safety in choosing a reference standard
• Examples of Reference Standards
o A biopsy showing cancer in an excised thyroid gland
▪ A single test that is probably close to “perfect.”
o Jones criteria in diagnosing rheumatic fever
▪ Single tests are not always available for complex conditions
▪ Some need multiple criteria to establish a disease
o Reversible airway obstruction in diagnosing bronchial asthma
▪ Response to therapy can also be the reference standard
• Reference standard must always be acceptable to clinicians
o Defining reference standards as acceptable rather than exact definitions of the disease makes it easier for researchers
▪ They can choose a reference standard based on feasibility, even if it is not the most accurate of several choices
• TIP: Look for how the authors define the presence or absence of disease in the study population. Make sure that this definition is at least acceptable in your practice.

QUESTION 2: WERE THE “DEFINITIONS” OF THE INDEX TEST AND THE REFERENCE STANDARD INDEPENDENT?
• The results of a test (the index test or the reference standard) are interpreted based on a defined set of criteria
o Sometimes, 1 criterion is used (ex: pyuria to define urinary tract infection)
o Other times, multiple criteria are used (ex: pyuria AND positive urine culture AND compatible urinary symptoms)
• If any of the criteria for the index test are part of the criteria of the reference standard, there tends to be a falsely high level of agreement between the two tests
o Results in an overestimation of the accuracy of the index test
• Need to make sure that the definitions of the index test and the reference standard DO NOT overlap
• TIP: To ascertain the independence of definitions, look at the criteria used to interpret the index test and the reference standard. Make sure that these criteria do not overlap.

QUESTION 3: WERE THE “PERFORMANCE” OF THE INDEX TEST AND THE REFERENCE STANDARD DONE INDEPENDENTLY?
• In clinical settings, the result of a simple test often becomes the indication for performing more sophisticated tests
o Ex: positive results on a needle biopsy will trigger the performance of an excision biopsy and histopathologic exam
• In a study setting, the result of the index test (usually the simpler test) may become the indication for the reference standard (a more sophisticated test)
o Has to be corrected/avoided
o Even if this happens in only a few cases, the agreement between the index test and the reference standard will increase
• Researchers should not recruit patients in whom both tests were done. Rather, they should do both tests in all recruited patients.
o May present ethical problems
▪ May mean doing invasive tests in patients who probably do not need them
o Presents funding problems
▪ Invasive tests can be quite expensive
o An acceptable alternative would be to use a less expensive and less invasive reference standard that still satisfies the criteria for acceptability
• TIP: To ascertain the independence of performance, check whether measures were taken to ensure that the result of the index test was not an indication for doing the reference standard.

QUESTION 4: WERE THE “INTERPRETATION” OF THE INDEX TEST AND THE REFERENCE STANDARD INDEPENDENT?
• In clinical situations, it is standard practice to supply clinical information (history, PE, and early test results) when requesting certain tests
o Ex: clinical data are routinely required when requesting a chest x-ray
• In a study setting, knowledge of the results of an earlier index test may influence the interpretation of the reference standard
o Makes the index test seem better than it really is
• Need to make sure that those who interpret the results of the index test are blind to the results of the reference standard, and vice versa
• Researchers will rarely point out that the interpretation of the index test and the reference standard were not blinded
o Blinding is less likely to happen in retrospective studies (ex: chart reviews)
▪ Retrospective studies are more feasible, BUT the assurance of independence becomes much more challenging
• TIP: To ascertain the independence of interpretation, look for attempts to make sure interpretation of one test did not affect the interpretation of the other
o Suggested by terms such as “independent” or “blind” interpretation of results
o In general, independence can only be assured through a prospective study.
• DECISION POINT: Will I believe the article?

APPRAISING RESULTS
APPRAISING RESULTS OF PRIMARY TEST ACCURACY STUDY
• The 2X2 Table
o Basic format for evaluating the performance characteristics of a diagnostic test
o In the formulas that follow, a = TP, b = FP, c = FN, d = TN

               | (+) Reference Standard     | (-) Reference Standard     |
(+) Index Test | True Positives (TP) = a    | False Positives (FP) = b   | Positive Predictive Value = TP/(TP+FP)
(-) Index Test | False Negatives (FN) = c   | True Negatives (TN) = d    | Negative Predictive Value = TN/(FN+TN)
               | Sensitivity = TP/(TP+FN)   | Specificity = TN/(FP+TN)   |

• Sensitivity and the Negative Predictive Value provide information on the magnitude of false negatives
o As sensitivity and negative predictive value increase, the proportion of false-negative test errors decreases
• Specificity and the Positive Predictive Value provide information on the magnitude of false-positive test errors
o As specificity and positive predictive value increase, the proportion of false-positive test errors decreases

SENSITIVITY AND SPECIFICITY
• Measures defined conditional on the disease status
• Computed as proportions of the number of diseased and the number of non-diseased respectively

Sensitivity
Sensitivity = TP/(TP+FN) or a/(a+c)
• Aka Detection Rate, True Positive Rate, or True Positive Fraction
• The probability that the index test results will be positive in a diseased case
• Expressed as a proportion or a percentage

Specificity
Specificity = TN/(FP+TN) or d/(b+d)
• Aka True Negative Rate or True Negative Fraction
• The probability that the index test results will be negative in a non-diseased case
• False Positive Rate and False Positive Fraction
o Sometimes used to complement specificity
o Computed as 1 - Specificity

Youden’s Index
• Measure of the combined value of sensitivity and specificity
• Has no probabilistic interpretation
• Provides a general index of test accuracy
o Gives equal weight to errors (false positives and false negatives)
• Values close to 1 = high accuracy
• Value of 0 = uninformed guessing
o Indicates that a test has no diagnostic value

• Knowing the sensitivity and specificity of a test does not necessarily help in making clinical decisions because they are statistics based on knowing whether the patient has a disease
o Except:
▪ A negative test result from a test with high sensitivity (a very low false-negative rate) usually excludes the disease
▪ A positive test result in a test with high specificity (a very low false-positive rate) usually indicates disease
REMEMBER:
SnOUT: Sensitivity, Negative, Rule Out
SpIN: Specificity, Positive, Rule In

PREDICTIVE VALUES
• Predictive Values
o Measures defined as conditional on the index test results
o Computed as proportions of the total with positive and negative index test results
• Although predictive values seem useful, they will vary substantially according to the prevalence of the disease

Positive Predictive Value
PPV = TP/(TP+FP) or a/(a+b)
• The probability that a case with a positive index test result is diseased
• Reported as proportions or percentages

Negative Predictive Value
NPV = TN/(FN+TN) or d/(c+d)
• The probability that a case with a negative index test result is non-diseased
• Reported as proportions or percentages

ADDITIONAL NOTES (from the video):
• Predictive Values
o Help the clinician determine what the probability of disease is given a positive or negative test result
• Positive Predictive Value (PPV)
o The probability that subjects with a positive test truly have the disease
• Negative Predictive Value (NPV)
o The probability that subjects with a negative test truly do not have the disease (disease free)
• The Role of Prevalence
o The PPV of a certain diagnostic test increases as the prevalence of the disease being diagnosed increases
Please watch:
https://www.youtube.com/watch?v=QqgJHryKOSU&feature=emb_title

PREVALENCE OF THE DISEASE
Prevalence = (TP + FN)/(TP + FP + FN + TN) or (a + c)/(a + b + c + d)
• Represented as the proportion of patients that have the disease
• Based on the characteristics of the patient population and clinical setting

How Prevalence Affects PPV and NPV (Example)
• Both Sensitivity and Specificity of the diagnostic test are 90%
• 2x2 table where the prevalence is 10%:

               | Disease (+) | Disease (-) | Total
(+) Index Test | 90          | 90          | 180
(-) Index Test | 10          | 810         | 820
Total          | 100         | 900         | 1000

o Sn = a/(a+c) = 90/100 = 90%
o Sp = d/(b+d) = 810/900 = 90%
o PPV = a/(a+b) = 90/180 = 50%
o NPV = d/(c+d) = 810/820 = 98.8%
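The worked example above can be reproduced with a few lines of code. The sketch below is illustrative only: the class name `TwoByTwo` and the helper `table_from` are hypothetical, the cells follow the a/b/c/d layout used in the formulas (a = TP, b = FP, c = FN, d = TN), and Youden's index is computed with its usual definition, Sn + Sp - 1, which these notes do not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cells of a diagnostic 2x2 table: a = TP, b = FP, c = FN, d = TN."""
    a: float
    b: float
    c: float
    d: float

    @property
    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    @property
    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    @property
    def ppv(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def npv(self) -> float:
        return self.d / (self.c + self.d)

    @property
    def prevalence(self) -> float:
        return (self.a + self.c) / (self.a + self.b + self.c + self.d)

    @property
    def youden(self) -> float:
        # Usual definition; an assumption, since the notes give no formula.
        return self.sensitivity + self.specificity - 1


def table_from(sn: float, sp: float, prevalence: float, n: int = 1000) -> TwoByTwo:
    """Expected 2x2 table for n patients given Sn, Sp and prevalence (hypothetical helper)."""
    diseased = n * prevalence
    healthy = n - diseased
    return TwoByTwo(a=sn * diseased, b=(1 - sp) * healthy,
                    c=(1 - sn) * diseased, d=sp * healthy)


# Same test (Sn = Sp = 90%) at two prevalences.
for prev in (0.10, 0.01):
    t = table_from(0.90, 0.90, prev)
    print(f"prevalence {prev:.0%}: PPV {t.ppv:.1%}, NPV {t.npv:.1%}")
```

With sensitivity and specificity held at 90%, lowering the prevalence from 10% to 1% drops the PPV from 50% to roughly 8%, while the NPV rises slightly.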
• 2x2 table where the prevalence is 1%:

               | Disease (+) | Disease (-) | Total
(+) Index Test | 9           | 99          | 108
(-) Index Test | 1           | 891         | 892
Total          | 10          | 990         | 1000

o Sn = a/(a+c) = 9/10 = 90%
o Sp = d/(b+d) = 891/990 = 90%
o PPV = a/(a+b) = 9/108 = 8.3%
o NPV = d/(c+d) = 891/892 = 99.9%

• The PPV was higher in the case where prevalence was higher (10%), and the NPV was higher in the case where prevalence was lower (1%)
o As prevalence increases, PPV increases but NPV decreases
o As prevalence decreases, PPV decreases but NPV increases

LIKELIHOOD RATIO
• More useful clinically
o Can be used to update the pretest probability of disease using Bayes’ Theorem or nomogram once the test result is known
• Post-Test Probability
o The updated probability
o Should be higher than the pre-test probability if the test result is positive
o Should be lower than the pre-test probability if the test result is negative

Positive Likelihood Ratio
LR (+) = Sn/(1 - Sp) or estimated as (a/(a+c)) / (b/(b+d))
• Describes how many times more likely positive index results were in the disease group compared to the non-disease group
• Should be greater than 1 if the test is informative
o LR (+) > 1.0 increases the likelihood of the disease
o The higher the LR above 1.0, the better it is at ruling in a condition

LR < 3.0 (close to 1.0) | Weakly positive
LR 3.0 to 10.0          | Moderately positive
LR > 10.0               | Strongly positive

Negative Likelihood Ratio
LR (-) = (1 - Sn)/Sp or estimated as (c/(a+c)) / (d/(b+d))
• Describes how many times less likely negative index results were in the disease group compared to the non-disease group
• Should be less than 1 if the test is informative
o LR (-) < 1.0 decreases the likelihood of disease
o The lower the LR below 1.0, the better it is at ruling out a condition

LR > 0.3 (close to 1.0) | Weakly negative
LR 0.3 to 0.1           | Moderately negative
LR < 0.1                | Strongly negative

ASSESSING APPLICABILITY
• After the article is appraised for its directness, validity, and results, and it shows that it is valid and gave acceptable accuracy, the next step is to assess its applicability to the individual patient
• Look for biologic and socio-economic issues that may affect the applicability of the result of the study
o Similar to assessing applicability in articles on therapy

BIOLOGIC FACTORS
Sex
• Consider physiologic, hormonal, or biochemical differences between males and females that might affect the accuracy of the index test
• Ex: creatinine clearance based on a single serum creatinine determination must be adjusted according to sex

Comorbidities
• Consider comorbid conditions that might affect the performance of the index test
• Ex: malnutrition can decrease the sensitivity of a tuberculin skin test

Race
• Consider racial differences that might alter the performance of the index test
• Ex: African American ancestry increases the likelihood of a high-grade prostatic cancer in patients with high levels of prostate-specific antigen

Age
• Consider the age of the population in the study in relation to the patient
• Ex: a sputum AFB stain for PTB performs well in adults, but gastric aspirates are more accurate in infants

Pathology
• Consider differences in the type of pathology for which the index test is being used
• Ex: a plain cranial CT scan is a reasonably accurate diagnostic test within the first few hours of a hemorrhagic stroke, but is less accurate for an ischemic stroke within the same time window

SOCIOECONOMIC FACTORS
Questionnaires
• If the index test is a questionnaire, then it is particularly prone to social, cultural, and economic differences among patients
o A lot can be lost in translation of questionnaires
o Even if the language is the same, interpretation and reaction may vary
• Examples:
o Cognitive tests tend to underestimate the abilities of elderly people from ethnic minorities
▪ Can lead to overdiagnosis of dementia in these communities
o The CAGE questionnaire (commonly used to detect alcoholism) performs poorly in some ethnic groups, especially African American men
o A questionnaire to detect autism developed in the US and UK could not be used in families in Hong Kong because of perceived cultural differences
• Should look for local validation studies before accepting the accuracy of diagnostic tests that come in the form of questionnaires

Laboratory Tests
• Even laboratory tests may sometimes have questionable applicability
• When a lab has limited resources, it may not match the standards of performance defined in a study that uses the best equipment/hires the best interpreters/continuously monitors good laboratory practices
• Need to ensure that these standards are (at least) approximated by the local laboratories
• Ex: RT-PCR used in the diagnosis of COVID-19
o Only laboratories that passed the accreditation are considered

INDIVIDUALIZING THE RESULT
THE DIAGNOSTIC PROCESS
• The algorithm of the Diagnostic Process has 7 steps
• Step 5: Testing the Hypothesis
o Once you have your leading hypothesis, you need to decide whether you need further information before proceeding to treat the patient/before excluding the diagnosis
o Think in terms of probability
▪ Is the pretest probability high enough that no further testing is needed to proceed with treatment?
▪ Is it low enough that no further testing is needed to exclude the diagnosis?
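How far a test result moves that probability is summarized by the likelihood ratios defined under LIKELIHOOD RATIO. The following is a minimal sketch, not a reference implementation: the function names are hypothetical, the "uninformative" label for an LR of exactly 1 is an addition, and treating the boundary values (3, 10, 0.3, 0.1) as "moderate" is an assumption, since the tabulated cut-offs do not say which side the boundaries fall on.

```python
import math


def lr_positive(sn: float, sp: float) -> float:
    """LR(+) = Sn / (1 - Sp); infinite when specificity is perfect."""
    return math.inf if sp == 1 else sn / (1 - sp)


def lr_negative(sn: float, sp: float) -> float:
    """LR(-) = (1 - Sn) / Sp; infinite when specificity is zero."""
    return math.inf if sp == 0 else (1 - sn) / sp


def strength(lr: float) -> str:
    """Label a likelihood ratio using the cut-offs tabulated in these notes."""
    if lr > 10:
        return "strongly positive"
    if lr >= 3:          # boundary handling is an assumption
        return "moderately positive"
    if lr > 1:
        return "weakly positive"
    if lr == 1:
        return "uninformative"
    if lr > 0.3:
        return "weakly negative"
    if lr >= 0.1:        # boundary handling is an assumption
        return "moderately negative"
    return "strongly negative"


# The test from the prevalence example (Sn = Sp = 90%).
pos, neg = lr_positive(0.9, 0.9), lr_negative(0.9, 0.9)
print(f"LR(+) = {pos:.1f} ({strength(pos)}), LR(-) = {neg:.2f} ({strength(neg)})")
```

Unlike the predictive values, these ratios depend only on sensitivity and specificity, so they carry over to patients with a different pretest probability.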
DETERMINE THE PRETEST PROBABILITY
• There are 3 ways to determine the pretest probability of the leading diagnosis and the most important (usually the most serious) active alternatives:
o Use a validated clinical decision rule (CDR)
o Use information about the prevalence of certain symptoms in a given disease
o Use your overall clinical impression

Using a Validated CDR
• Investigators construct a list of potential predictors of the outcome of interest and then examine a group of patients to see whether the predictors and outcome are present
o Logistic regression is then used to determine which predictors are most powerful and which can be omitted
o The model is then validated by applying it to other patient populations
o To simplify use, the clinical predictors in the model are often assigned point values, and different point totals correspond to different pretest probabilities
• CDRs are rarely available, but are the most precise way of estimating pretest probability
• Finding a validated CDR will allow you to come up with an exact number (or a small range of numbers) for your pretest probability

Use Information About the Prevalence of Certain Symptoms
• Ex: 73% of patients with pulmonary embolism (PE) have dyspnea. However, this does not tell you how many patients with dyspnea have PE.
• There is often a lot of information available about symptom prevalence.

Use Your Overall Clinical Impression
• This is a combination of:
o Known symptom prevalence and disease prevalence
o Clinical experience
o “Clinical judgment.”
• This is just as imprecise as it sounds
o It has been shown that physicians are disproportionately influenced by their most recent clinical experience.
o BUT it has also been shown that the overall clinical impression of experienced clinicians has significant predictive value.
• Clinicians generally categorize pretest probability as low, moderate, or high.
o This rather vague categorization is still helpful.
o Do not get distracted thinking a number is necessary.

THE THRESHOLD MODEL

CONCEPTUALIZING PROBABILITIES
• The ends of the bar in the threshold model represent 0% to 100% pretest probability
• Treatment Threshold
o The probability above which the diagnosis is so likely, you would treat the patient without further testing
• Test Threshold
o The probability below which the diagnosis is so unlikely, it is excluded without further testing
• Diagnostic tests are necessary when the pretest probability of disease is in the middle (grey zone)
o Above the test threshold and below the treatment threshold
• A really useful test shifts the probability of disease so much that the post-test probability (the probability after the test is done) crosses one of the thresholds

Factors to consider when setting the test threshold
• The perceived menace
o The more dangerous a disease seems, the lower the test threshold (test at lower probabilities)
• The invasiveness of the test
o The more invasive a test, the higher the test threshold (test at higher probabilities)
• The side effects of the test
o The more side effects a test has, the higher the test threshold (test at higher probabilities)
• The cost of the test
o The more expensive a test, the higher the test threshold (test at higher probabilities)
▪ Esp. if the expenses will be paid out of pocket

Factors to consider when setting the treatment threshold
• The perceived menace
o The more dangerous a disease seems, the lower the treatment threshold (treat at lower probabilities)
• The invasiveness of treatment
o The more invasive a treatment, the higher the treatment threshold (treat at higher probabilities)
• The side effects of the treatment
o The more side effects a treatment has, the higher the treatment threshold (treat at higher probabilities)
• The cost of the treatment
o The more expensive a treatment, the higher the treatment threshold
▪ Esp. if the expenses will be paid out of pocket

BAYES THEOREM
• Step 1: Convert pretest probability to pretest odds.
o Pretest odds = pretest probability / (1 - pretest probability)
• Step 2: Multiply pretest odds by the likelihood ratio to get the post-test odds.
o Post-test odds = pretest odds x LR
• Step 3: Convert post-test odds to post-test probability.
o Post-test probability = post-test odds / (1 + post-test odds)
• DECISION POINT: Did the negative test result make the probability reach the test threshold? Did the positive test result make the probability reach the treatment threshold?

FAGAN NOMOGRAM
• Find the patient’s pretest probability on the left, then draw a line through the likelihood ratio of the test to find the post-test probability
o Once you have arrived at the post-test probability, you can make a clinical decision based on the thresholds of management and the patient’s preferences
• 3 Choices
o Stop testing and get on with the treatment of the probable disease
o Stop testing and reassure the patient that the disease probability is low
o Do more tests before deciding

A useful calculator: http://araw.mede.uic.edu/cgi-bin/testcalc.pl

CASE SCENARIO
You are a doctor in a remote barangay in the mountainous province of Tralala where RT-PCR is not available. A 40-year-old male patient came to your clinic during the enhanced community quarantine due to cough, sore throat, and fever which started 10 days prior to consult. On history, the patient had close contact with a COVID (+) patient 13 days prior to consult. Past medical history revealed that he has uncontrolled T2DM. You decided to do a rapid antigen test. You asked the laboratory you are affiliated with whether the test was available. The laboratory personnel said that it was available and that the name of the test kit is Coris COVID-19 Ag Respi-Strip test. You want to know the accuracy of this test, so you decided to search for an article and found:

Low Performance of Rapid Antigen Detection Test as Frontline Testing for COVID-19 Diagnosis

FOCUSED CLINICAL QUESTION
Among suspected COVID-19 patients (P), how accurate is the Coris COVID-19 Ag Respi-Strip test (E) compared to RT-PCR (C) in the diagnosis of COVID-19 (O)?

APPRAISING DIRECTNESS

                   | Question                           | Article
Population/Patient | COVID-19 suspect                   | Symptomatic and asymptomatic patient
Exposure           | Coris COVID-19 Ag Respi-Strip Test | Coris COVID-19 Ag Respi-Strip Test
Comparator         | RT-PCR                             | RT-PCR
Outcome            | Diagnosis of COVID-19              | Diagnosis of COVID-19

APPRAISING VALIDITY
• Q1: Was the reference standard acceptable? Yes
o RT-PCR is the current gold standard being used for the diagnosis of COVID-19
o It uses a combination of nasopharyngeal and oropharyngeal samples to improve on the sensitivity of one specimen alone.
o In the article, only a nasopharyngeal swab was done
▪ The nasopharyngeal swab specimen only has a 97% (92-100%) sensitivity.
• Q2: Were the definitions of the index test and the reference standard independent? Yes
o There are no criteria used for the index test and the reference standard
• Q3: Were the performance of the index test and the reference standard independent? Yes
o Both tests were done in all collected samples
• Q4: Were the interpretation of the index test and the reference standard independent? Yes
o There is no degree of subjectivity in interpreting the RT-PCR
▪ It is based on the CT value
o The Respi-Strip employs a qualitative interpretation using a control line and a test line
▪ Important to blind the interpreter of the Respi-Strip to the results of the RT-PCR
• DECISION POINT: Will I believe the article?
o The reference standard used in the study only has 97% (92-100%) sensitivity
▪ The result of the sensitivity in the study might be affected
▪ Ideally, the reference standard should be 100% sensitive and 100% specific
o There was also no mention in the article about how they interpreted the result of the Respi-Strip
APPRAISING RESULTS
• Computing for the different test statistics based on the results:
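A minimal sketch of how the test statistics under APPRAISING RESULTS can be computed from a 2x2 table. The study's actual counts are not reproduced in these notes, so the counts below are hypothetical placeholders chosen only to mirror the pattern described (no false positives, many false negatives). The function name `diagnostic_statistics` and the dictionary keys are assumptions made for illustration.

```python
def diagnostic_statistics(tp, fp, fn, tn):
    """Return accuracy measures from the four cells of a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)    # positives among the diseased
    specificity = tn / (tn + fp)    # negatives among the non-diseased
    prevalence = (tp + fn) / total  # diseased among everyone tested

    # LR(+) = sensitivity / (1 - specificity).
    # With no false positives (specificity 100%) the denominator is 0,
    # so LR(+) cannot be computed and is reported as None.
    lr_pos = sensitivity / (1 - specificity) if fp > 0 else None

    # LR(-) = (1 - sensitivity) / specificity.
    lr_neg = (1 - sensitivity) / specificity if tn > 0 else None

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "prevalence": prevalence,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# HYPOTHETICAL counts for illustration only, NOT the article's data.
stats = diagnostic_statistics(tp=30, fp=0, fn=70, tn=100)
for name, value in stats.items():
    print(name, value)
```

With counts like these, LR (+) comes back as `None` because specificity is 100%, and LR (-) is close to 1, which is the situation in which a negative result changes the probability of disease very little.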
• Prevalence of the Disease
o The computed prevalence of the disease is quite high considering that the samples were taken in a hospital
▪ Might not be reflective of the true prevalence of the disease in the setting of the patient in the clinical case scenario
• Likelihood Ratio
o Cannot compute for the LR (+) because the specificity of the test is 100% (LR (+) = sensitivity / (1 - specificity), so the denominator is 0)
▪ The test is highly specific
o The LR (-) is > 0.3 (close to 1), so the test is weakly negative: a negative result only weakly lowers the probability of disease
ASSESSING APPLICABILITY
Factor | Patient | Article
Sex | Male | The study included both males and females
Comorbidity | Type 2 DM | There was no mention of the conditions (concomitant diseases) that will give a false-positive or false-negative result. The test has no cross-reactivity with other viruses, bacteria, or fungi
Race | Asian | European
Age | 40 | The age range included in the study is 0 to 94 years old
Pathology | Symptomatic | Included both asymptomatic and symptomatic patients, but also stated the results of those who are symptomatic only
Pathology | CT-value should be taken | The accuracy of the test is affected by the CT-value
Socioeconomic Factors | Availability of COVID-19 CT-value | Does not require special equipment or skilled laboratory personnel familiar with molecular techniques. Only the test strip is needed.
Socioeconomic Factors | Availability of ultracentrifugation in the laboratory that will conduct the test | Ultracentrifugation is needed.
• Visiting again the website of the manufacturer: https://www.corisbio.com/pdf/Products/COVID-19-Respi-Strip_20200910.pdf
• What CT-value is: https://www.cebm.net/study/duration-of-infectiousness-and-correlation-with-rt-pcr-cycle-threshold-values-in-cases-of-covid-19-in-england/
INDIVIDUALIZING THE RESULT
• Pretest probability: 60% based on the overall clinical impression.
Set the Test Threshold
• The perceived menace
o Among people with diabetes, the mortality rate was 7.3%, more than three times that of the overall population (Mathew et al., 2020). ↓
• The invasiveness of the test
o Samples are collected through a nasopharyngeal swab. ↓
• The side effects of the test
o No noted side effect other than temporary discomfort due to sample collection. ↓
• Cost of the test
o 6,000 PHP ↑
• Test Threshold: 30%
Set the Therapeutic Threshold
• The perceived menace
o Among people with diabetes, the mortality rate was 7.3%, more than three times that of the overall population (Mathew et al., 2020). ↓
• The invasiveness of treatment: ↓
• The cost of treatment: ↓
• Treatment Threshold: 70%
• Discuss with the patient to elicit their values and preferences regarding the test and the treatment
o Helpful in setting the test and treatment thresholds
Bayes’ Theorem
• Pretest Probability: 60%
• Test Threshold: 30%
• Treatment Threshold: 70%
• Decisions
o If the test gave a negative result, the probability went down from 60% to 50.3%, but it did not reach the test threshold of 30%; hence, I need to request another test to exclude COVID-19 as one of the differential diagnoses.
o If the test gave a positive result, COVID-19 is the likely diagnosis of the patient because the COVID-19 Ag test, based on the study result, is highly specific and thus good at ruling in a disease.
ADDITIONAL NOTES FROM ACTIVITY 3 DISCUSSION:
APPRAISING VALIDITY
Was the reference standard an acceptable one?
• Bias: Verification Bias
• Prespecify in protocol
o Acceptable reference standards need to be predefined in the review protocol.
o Judgements as to the accuracy of the reference standard are often not straightforward and require clinical experience of the topic area to know whether a test or test combination is an appropriate reference standard. In some research areas consensus reference standards have been defined.
o If a mixture of reference standards is used, consider carefully whether these were all acceptable.
• Answer YES if all reference standards used meet the pre-stated criteria
• Answer NO if one or more reference standards used do not meet the pre-stated criteria
• Answer UNCLEAR if it is unclear exactly what reference standard was used
• Where facts may be found:
o The methods section of the paper should describe the reference standards that were used
• Description required:
o Report the reference standard(s) used
Were the definition of the index test and the reference standard independent?
• Bias: Incorporation Bias
• Answer YES if the index test did not form part of the reference standard or vice versa.
• Answer NO if the reference standard formally included the result of the index test.
• Answer UNCLEAR if it is unclear whether the results of the index test were used in the final diagnosis.
• Where facts may be found
o Definitions of the reference standard in the methods section
• Details that need to be reported
o Statements from the study about the tests used in the reference standard procedure
Were the performance of the index test and the reference standard independent?
• Bias: Partial Verification Bias (Work-Up Bias, Primary Selection Bias, Sequential Ordering Bias)
• Answer YES if all patients, or a random selection of patients, who received the index test went on to receive verification of their disease status using a reference standard
o Even if the reference standard was not the same for all patients.
• Answer NO if some of the patients who received the index test did not receive verification of their true disease state
o And if the selection of patients to receive the reference standard was not random.
• Answer UNCLEAR if this information is not reported by the study
• Where facts may be found
o The plans for verification may be described in the methods
o The numbers verified given in the results
o Some recent studies may include a patient flow diagram which indicates who was not verified
• Description Required:
o The proportions not verified (if possible, according to index test result)
o Any explanation of how decisions about verification were made, and whether unverified patients were excluded from the 2×2 tables.
Were the interpretation of the index test and the reference standard independent?
• Bias: Test or Diagnostic Review Bias
• Answer YES if:
o Test results (index or reference standard) were interpreted blind to the results of the other test
o Blinding is dictated by the test order, or meets the pre-stated assumptions.
• Answer NO if it is clear that one set of test results was interpreted with knowledge of the other
• Answer UNCLEAR if it is unclear whether blinding took place
• Where facts may be found
o Details of blinding and processes may be described in the methods section outlining testing methods
• Description Required:
o Any clear order of the tests, and methods used to ensure blinding should be described
▪ Such as using code numbers, retrospective testing of samples
o Any ambiguous phrases which are interpreted as indicating or not indicating blinding should be stated.
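Worked sketch of the Bayes’ Theorem steps and the threshold decision used in INDIVIDUALIZING THE RESULT (pretest probability 60%, test threshold 30%, treatment threshold 70%). The notes give the post-test probability after a negative result (50.3%) but not the LR (-) itself, so the LR (-) below is back-calculated from those two numbers; treat it as an assumption and not as a figure reported by the article. The function names `odds`, `post_test_probability`, and `decide` are hypothetical helpers for illustration.

```python
def odds(probability):
    """Step 1: convert a probability to odds."""
    return probability / (1 - probability)


def post_test_probability(pretest_probability, likelihood_ratio):
    """Steps 2 and 3: multiply the pretest odds by the LR, then
    convert the post-test odds back to a probability."""
    post_test_odds = odds(pretest_probability) * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


def decide(probability, test_threshold, treatment_threshold):
    """Map a probability onto the 3 choices of the threshold model."""
    if probability >= treatment_threshold:
        return "stop testing and treat"
    if probability <= test_threshold:
        return "stop testing and reassure"
    return "do more tests"


PRETEST = 0.60
TEST_THRESHOLD = 0.30
TREATMENT_THRESHOLD = 0.70

# ASSUMPTION: LR(-) implied by the notes' own numbers (60% -> 50.3%),
# obtained as post-test odds divided by pretest odds.
implied_lr_neg = odds(0.503) / odds(PRETEST)

after_negative = post_test_probability(PRETEST, implied_lr_neg)
decision = decide(after_negative, TEST_THRESHOLD, TREATMENT_THRESHOLD)

print(round(implied_lr_neg, 3), round(after_negative, 3), decision)
```

The implied LR (-) comes out at roughly 0.67, which is consistent with the notes' description of the test as weakly negative: the probability stays between the two thresholds, so further testing is needed. The positive-result branch is not computed here because LR (+) is undefined when specificity is 100%.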