who are normal, the test will correctly identify as normal all patients who do not exceed the cutoff value (true negatives); however, it will misidentify a small number of normal patients as abnormal (false positives) (Figure 9–2, left). It is important to remember that not every positive test is a true positive; there will always be a small percentage of patients (approximately 1–2%) who are misidentified.

The sensitivity of a test is the percentage of all patients with the condition who have a positive test. When a test is applied to a disease population, it will correctly identify as abnormal all patients who exceed the cutoff value (true positives); however, it will misidentify a small number of abnormal patients as normal (false negatives) (Figure 9–2, right). Thus, it is equally important to remember that not every negative test is a true negative; there will always be a small percentage of abnormal patients (approximately 1–2%) who are misidentified as normal. The specificity and sensitivity can be calculated as follows:

Specificity (%) = True Negatives / (True Negatives + False Positives) × 100

Sensitivity (%) = True Positives / (True Positives + False Negatives) × 100

In an ideal setting, there would be no overlap between a normal and a disease population. A cutoff value could then be placed between the two populations, and such a test would have 100% sensitivity and 100% specificity (Figure 9–3, left). However, in the real world, there is always some overlap between a normal and a disease population (Figure 9–3, right). If a test has very high sensitivity and specificity, it will correctly identify nearly all normals and abnormals; however, there will remain a small number of normal patients misidentified as abnormal (false positives) and a small number of abnormal patients misidentified as normal (false negatives).

Often there is a compromise between sensitivity and specificity when setting a cutoff value. Take the example of a normal and a disease population with significant overlap in the values of a test. If the cutoff value is set low, the test will have high sensitivity but very low specificity (Figure 9–4). In this case, the test will correctly identify nearly all the abnormals (true positives) and will misidentify only a few as normal (false negatives) (Figure 9–4, left). However, the tradeoff for this high sensitivity will be low specificity: a high number of normal patients will be classified as abnormal (false positives) (Figure 9–4, right).

Conversely, take the example where the cutoff value is set high. The test will now have high specificity but very
Downloaded from ClinicalKey.com at Univ Targu Mures Med Pharmacy March 28, 2016.
For personal use only. No other uses without permission. Copyright ©2016. Elsevier Inc. All rights reserved.
92 SECTION III Sources of Error: Anomalies, Artifacts, Technical Factors, and Statistics
[Figure 9–4: Low cutoff value — high sensitivity, low specificity; cutoff marked in both panels.]

[Figure 9–5: High cutoff value — low sensitivity, high specificity; cutoff marked in both panels.]
low sensitivity (Figure 9–5). In this case, the test will correctly identify nearly all the normals (true negatives) and will misidentify only a few normals as abnormal (false positives) (Figure 9–5, left). However, the tradeoff for this high specificity will be low sensitivity: a high number of abnormal patients will be classified as normal (false negatives) (Figure 9–5, right).

False positives and false negatives result in what are termed type I and type II errors, respectively. In a type I error, a diagnosis of an abnormality is made when none is present (i.e., convicting an innocent man). Conversely, in a type II error, a diagnosis of no abnormality is made when one actually is present (i.e., letting a guilty man go free). Although both are important, type I errors are generally considered more unacceptable (i.e., labeling patients as having an abnormality when they are truly normal), because this can lead to a host of problems, among them inappropriate testing and treatment. Thus, the specificity of a test should take precedence over its sensitivity, unless the test is being used solely as a screening tool (i.e., any positive screening test must be confirmed by a much more specific test before any conclusion is reached).

The tradeoff between sensitivity and specificity can be appreciated by plotting a receiver operating characteristic (ROC) curve, which graphs various cutoff values by their sensitivity on the y-axis and specificity on the x-axis (actually, in a typical ROC curve, the x-axis is 1 minus the specificity, which can alternatively be graphed as the specificity running from 100 to 0 instead of 0 to 100). Figure 9–6 shows an ROC curve for the digit 4 sensory nerve conduction study in patients with mild carpal tunnel syndrome. For this nerve conduction study, the sensory latency stimulating the ulnar nerve at the wrist and recording digit 4 is subtracted from the sensory latency stimulating the median nerve at the wrist and recording digit 4, using identical distances. In normals, one expects no significant difference. In patients with carpal tunnel syndrome, the median latency is expected to be longer than the ulnar latency. Note in Figure 9–6 that there is a tradeoff between specificity and sensitivity as the cutoff value changes. For any cutoff value of 0.4 ms or greater, there is very high specificity. As the cutoff value is lowered, the sensitivity increases but at a significant cost to the specificity. In this example, it is easy to appreciate that the 0.4 ms cutoff is where the
Chapter 9 • Basic Statistics for Electrodiagnostic Studies 93
[Figure 9–6: ROC curve for the digit 4 median-versus-ulnar sensory study in mild carpal tunnel syndrome, plotting sensitivity (%) against specificity (%, running 100 to 0), with cutoff values (e.g., >0.5, >0.6 ms) marked along the curve. (Adapted from Nodera, H., Herrmann, D.N., Holloway, R.G., et al., 2003. A Bayesian argument against rigid cutoffs in electrodiagnosis of median neuropathy at the wrist. Neurology 60, 458–464.)]
graph abruptly changes its slope. Setting the cutoff value at 0.4 ms or greater achieves a specificity greater than 97%. The sensitivity is approximately 70%. One could place the cutoff value at 0.1 ms and achieve a sensitivity of 90%; however, the specificity would fall to about 60%, meaning 40% of normal patients would be misidentified as abnormal, a clearly unacceptable level.
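This sensitivity–specificity tradeoff along an ROC curve can be sketched numerically. The two Gaussian populations below are hypothetical stand-ins for the normal and disease latency-difference distributions (they are not the chapter's data), chosen only to show sensitivity rising and specificity falling as the cutoff is lowered:

```python
# Sweep a cutoff across two overlapping populations and tabulate the
# resulting sensitivity and specificity, as in an ROC analysis.
# Both distributions are hypothetical illustrations, not the book's data.
from statistics import NormalDist

normal = NormalDist(mu=0.1, sigma=0.15)   # hypothetical normal population (ms)
disease = NormalDist(mu=0.6, sigma=0.25)  # hypothetical disease population (ms)

for cutoff in (0.1, 0.2, 0.3, 0.4, 0.5):
    sensitivity = (1 - disease.cdf(cutoff)) * 100  # diseased above the cutoff
    specificity = normal.cdf(cutoff) * 100         # normals at or below the cutoff
    print(f"cutoff {cutoff:.1f} ms: sensitivity {sensitivity:5.1f}%, "
          f"specificity {specificity:5.1f}%")
```

As the cutoff rises, specificity climbs toward 100% while sensitivity falls, mirroring the shape of the curve in Figure 9–6.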
Important clinical–electrophysiologic implications are as follows:

[Figure 9–7: Predictive value of a positive test: high prevalence of disease. Population = 1000; prevalence of disease = 80% (Disease = 800, Normal = 200). See text for details.]
BAYES' THEOREM AND THE PREDICTIVE VALUE OF A POSITIVE TEST

Bayes' theorem states that the probability of a test demonstrating a true positive depends not only on the sensitivity and specificity of the test but also on the prevalence of the disease in the population being studied. The chance of a positive test being a true positive is markedly higher in a population with a high prevalence of the disease. In contrast, if a very sensitive and specific test is applied to a population with a very low prevalence of the disease, most positive tests will actually be false positives. The predictive value of a positive test is best explained by contrasting two examples (Figures 9–7 and 9–8). In both examples, the same test, with 95% sensitivity and 95% specificity, is applied to a population of 1000 patients. In Figure 9–7, the prevalence of the disease in the population is high (80%); in Figure 9–8, the prevalence is low (1%). In the population with a disease prevalence of 80%, 760 of the 800 patients with the disease will be correctly identified; of the 200 normals, 10 will be misidentified as abnormal (false positives). The predictive value of a positive test is defined as the number of true positives divided by the number of total positives. The total positives are the true positives added to the false positives. In Figure 9–7, the predictive value that a positive test is a true positive is 760/(760 + 10) = 98.7%. Thus, in this example, where the
disease prevalence in the population is high, a positive test is extremely helpful in correctly identifying the patient as having the disease.

In the example where the disease prevalence is 1% (Figure 9–8), of the 10 patients with the disease, 9.5 will be correctly identified. However, of the 990 normals, 49.5 will be misidentified as abnormal. Thus, the predictive value that a positive test is a true positive is 9.5/(9.5 + 49.5) = 16.1%. This means that 83.9% of the positive results will actually be false! In this setting, where the disease prevalence in the population is low, a highly sensitive and specific test is of absolutely no value.

Although this analysis may seem distressing, the good news is that EDX studies are generally performed in patients with a high index of suspicion for the disorder being questioned; hence, the prevalence of the disease is high. For instance, take the example of a patient referred to the EDX laboratory for possible carpal tunnel syndrome. If the patient has pain in the wrist and hand, paresthesias of the first four fingers, and symptoms provoked by sleep, driving, and holding a phone, the prevalence of carpal tunnel syndrome in patients with such symptoms would be extremely high. Thus, if EDX studies are performed and demonstrate delayed median nerve responses across the wrist, there is a very high likelihood that these positive tests are true positives. However, if the same tests are performed in a patient with back pain and no symptoms in the hands and fingers, the prevalence of carpal tunnel syndrome would be low in such a population. In this situation, any positive finding would have a high likelihood of being a false positive and would likely not be of any clinical significance.
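The arithmetic behind the two contrasting examples can be reproduced in a few lines; this sketch simply restates the predictive-value formula from the text:

```python
# Predictive value of a positive test: true positives / (true + false positives).
# Reproduces the chapter's two examples: the same test (95% sensitivity,
# 95% specificity) applied to 1000 patients at 80% versus 1% prevalence.

def positive_predictive_value(population, prevalence, sensitivity, specificity):
    diseased = population * prevalence
    normals = population - diseased
    true_positives = diseased * sensitivity         # abnormals correctly flagged
    false_positives = normals * (1 - specificity)   # normals incorrectly flagged
    return true_positives / (true_positives + false_positives) * 100

# High prevalence (Figure 9-7): 760/(760 + 10)
print(round(positive_predictive_value(1000, 0.80, 0.95, 0.95), 1))  # 98.7
# Low prevalence (Figure 9-8): 9.5/(9.5 + 49.5)
print(round(positive_predictive_value(1000, 0.01, 0.95, 0.95), 1))  # 16.1
```

The identical test yields a near-certain positive in one population and a mostly false positive in the other, purely because of prevalence.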
[FIGURE 9–8 Predictive value of a positive test: low prevalence of disease. Population = 1000; prevalence of disease = 1% (Disease = 10, Normal = 990). Sensitivity = 95%; specificity = 95%. True positives = 9.5; false negatives = 0.5; false positives = 49.5; true negatives = 940.5. Predictive value of a positive test = (true positives)/(true positives + false positives) = 9.5/(9.5 + 49.5) = 16.1%. See text for details.]

Less well appreciated is that the problem of a false positive in a population with a low prevalence of disease can be overcome by making the cutoff value more stringent (i.e., increasing the specificity). Take the example shown in Figure 9–9 of the palmar mixed latency difference test in patients with suspected carpal tunnel syndrome. For this nerve conduction study, the latency for the ulnar palm-to-wrist segment is subtracted from the latency for the median palm-to-wrist segment, using identical distances. In normals, one expects no significant difference. In patients with carpal tunnel syndrome, the median latency is expected to be longer than the ulnar latency. In this example, the post-test probability (i.e., the predictive value of a positive test) is plotted against different cutoff values for what is considered abnormal, both for patients in whom there is a high pre-test probability of disease and for those in whom there is a low pre-test probability. In the patients with a high pre-test probability of disease, a cutoff value of 0.3 ms (i.e., any value >0.3 ms is abnormal) achieves a 95% or greater chance that a positive test is a true positive. However, the same 0.3 ms
[FIGURE 9–9 Post-test probability (PostTP, %) plotted against the palmar median–ulnar latency difference (ms, cutoffs from <0.1 to >0.6) for pre-test probabilities (PreTP) of 90% and 10%. The PostTP depends on both the actual test value and PreTP, with higher PreTP yielding higher PostTP. A borderline abnormal test value (i.e., 0.4 ms) yields very high PostTP (95%) when PreTP is high, whereas the same test value results in only intermediate PostTP when PreTP is low. In contrast, very abnormal test values (i.e., ≥0.5 ms) result in PostTPs of 100%, regardless of the PreTP. (Adapted from Nodera, H., Herrmann, D.N., Holloway, R.G., et al., 2003. A Bayesian argument against rigid cutoffs in electrodiagnosis of median neuropathy at the wrist. Neurology 60, 458–464.)]
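The curves summarized in Figure 9–9 follow from Bayes' theorem: the pre-test odds are multiplied by the positive likelihood ratio of the test to give the post-test odds. A minimal sketch, where the 90% and 10% pre-test probabilities mirror the figure but the sensitivity/specificity pair is a hypothetical operating point for a single cutoff:

```python
# Post-test probability of disease after a positive test, via Bayes' theorem.
# The 90% and 10% pre-test probabilities mirror Figure 9-9; the sensitivity
# and specificity below are hypothetical values for one cutoff.

def post_test_probability(pre_test, sensitivity, specificity):
    pre_odds = pre_test / (1 - pre_test)
    positive_lr = sensitivity / (1 - specificity)  # likelihood ratio of a positive test
    post_odds = pre_odds * positive_lr
    return post_odds / (1 + post_odds) * 100

for pre_test in (0.90, 0.10):
    post = post_test_probability(pre_test, sensitivity=0.70, specificity=0.97)
    print(f"pre-test probability {pre_test:.0%} -> post-test {post:.1f}%")
```

The same positive result is far more believable in the high pre-test group; raising the cutoff raises the specificity, hence the likelihood ratio, which is what rescues the low pre-test case.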
cutoff in the low pre-test probability population results in only a 55% chance that a positive test is a true positive (and a corresponding 45% false-positive rate). These findings are in accordance with Bayes' theorem, wherein the chance of a positive test being a true positive depends not only on the sensitivity and specificity of the test, but also on the prevalence of the disease in the population being sampled (i.e., the pre-test probability). However, if the cutoff value is increased to 0.5 ms, then the post-test probability that a positive test is a true positive jumps to greater than 95%, even in the population with a low probability of disease.

Important clinical–electrophysiologic implications are as follows:

1. Every EDX study must be individualized, based on the patient's symptoms and signs and the corresponding differential diagnosis. When the appropriate tests are applied for the appropriate reason, any positive test is likely to be a true positive and of clinical significance.
2. A test result that is minimally positive has significance only if there is a high likelihood of the disease being present, based on the presenting symptoms and differential diagnosis.
3. A test that is markedly abnormal is likely a true positive, regardless of the clinical likelihood of the disease.
4. An abnormal test, especially when borderline, is likely a false positive if the clinical symptoms and signs do not suggest the possible diagnosis.

MULTIPLE TESTS AND THE INCREASING RISK OF FALSE POSITIVES

The last relevant statistical issue that every electromyographer needs to appreciate is the increased risk of a false positive when many different tests are applied in an attempt to reach a diagnosis. The most common situation occurs in the electrodiagnosis of median neuropathy at the wrist (i.e., carpal tunnel syndrome), where numerous useful nerve conduction studies have been described. However, when normal values for each individual test are set, an upper limit of normal usually is selected at 2 SD beyond the mean, so that approximately 97.5% of the normal population will be correctly identified. Thus, each test carries a 2.5% false-positive rate. If these tests are independent and used sequentially, the false-positive rate increases and quickly rises to unacceptable levels. For instance, if 10 tests are applied, each with a 2.5% false-positive rate, and only one abnormal test is required to make a diagnosis, the false-positive rate rises above 20%. This situation is similar to a
[Figure 9–10: Two panels ("1 test abnormal" and "≥2 tests abnormal") plotting the cumulative false-positive rate (FPR) against the number of tests (0–30), for individual test FPRs ranging from .005 to .050.]
FIGURE 9–10 Multiple tests and the risk of false positives. The number of tests is plotted against the cumulative false-positive rate (FPR)
for a variety of different individual test FPRs. Note the curve with the (★); this represents a false-positive rate of 2.5%, which carries the most
common test specificity of 97.5%. Left: Cumulative FPR is calculated based on the assumption that only one test needs to be abnormal to
diagnose the condition. Note that if 10 different tests are done, the cumulative FPR is almost 25%. Right: If two or more tests are required to
be abnormal before a diagnosis is reached, the statistics change. When 10 tests are done with an individual FPR of 2.5%, the cumulative FPR
remains less than 2.5%, an acceptable level.
(Adapted from Van Dijk, J.G., 1995. Multiple tests and diagnostic validity. Muscle Nerve 18, 353–355.)
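The cumulative rates in both panels of Figure 9–10 come from the binomial distribution. A short sketch, assuming independent tests that each carry a 2.5% false-positive rate (the 2 SD cutoff):

```python
# Cumulative false-positive rate (FPR) across several independent tests,
# each with a 2.5% individual FPR (i.e., a cutoff at 2 SD, 97.5% specificity).
from math import comb

def cumulative_fpr(n_tests, fpr, k_required):
    """Probability that at least k_required of n_tests are falsely positive."""
    return 1 - sum(
        comb(n_tests, k) * fpr**k * (1 - fpr)**(n_tests - k)
        for k in range(k_required)
    )

# One abnormal test suffices for the diagnosis: 10 tests push the
# cumulative FPR above 20%.
print(round(cumulative_fpr(10, 0.025, k_required=1), 4))  # 0.2237
# Requiring two or more abnormal tests keeps it below the single-test 2.5%.
print(round(cumulative_fpr(10, 0.025, k_required=2), 4))  # 0.0246
```

This is the statistical argument for demanding two or more abnormal results before making a diagnosis from a battery of tests.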
normal person undergoing an SMA-20 blood screen. It is not uncommon for a single test to fall above or below the cutoff range, and, in nearly every case, this represents a false positive.

Fortunately, there is a relatively simple remedy to this problem of multiple tests and the increasing risk of false positives. In Figure 9–10, the number of tests performed is plotted against the cumulative false-positive rate for a variety of different individual test false-positive rates (FPRs). Note the curve with the (★); this represents a false-positive rate of 2.5%, which corresponds to the most common test specificity of 97.5%. In the graph on the left, the cumulative false-positive rate is calculated based on the assumption that only one test needs to be abnormal to diagnose the condition. Note that if 10 different tests are performed, with each individual test carrying a false-positive rate of 2.5%, the cumulative false-positive rate is almost 25%. In contrast, the statistics change significantly if two or more tests are required to be abnormal to diagnose the condition. In the graph on the right, if 10 tests are done, each with an individual false-positive rate of 2.5%, the cumulative false-positive rate remains less than 2.5%, an acceptable level, if two or more of the tests are required to be abnormal.

Important clinical–electrophysiologic implications are as follows:

1. Be very cautious about making any diagnosis based on only one piece of data; if that piece of data is in error, it will be a false positive.
2. Be very cautious about making any diagnosis based on only one piece of data; 2.5% of all tests will be false positives, simply based on how the cutoff values are selected (i.e., 2 SD beyond the mean).
3. Be very cautious about making any diagnosis based on only one piece of data, especially if multiple tests are used; the cumulative false-positive rate quickly rises to unacceptable levels.
4. When multiple tests are used, the false-positive rate can be reduced to an acceptable level if two or more tests must be abnormal before a diagnosis is made.

Suggested Readings

Nodera, H., Herrmann, D.N., Holloway, R.G., et al., 2003. A Bayesian argument against rigid cut-offs in electrodiagnosis of median neuropathy at the wrist. Neurology 60, 458–464.

Rivner, M.H., 1994. Statistical errors and their effect in electrodiagnostic medicine. Muscle Nerve 17, 811–814.

Van Dijk, J.G., 1995. Multiple tests and diagnostic validity. Muscle Nerve 18, 353–355.