Mas S Mohktar
Email: mas_dayana@um.edu.my
Phone (office): 0379677681
The diagnostic test is compared against a reference ('gold')
standard, and results are tabulated in a 2 x 2 table:

                Gold Standard
Test Results    D+        D-
T+              a (TP)    b (FP)
T-              c (FN)    d (TN)

Sensitivity = a / (a + c)
Specificity = d / (b + d)
Positive Predictive Value (PPV) = a / (a + b)
Negative Predictive Value (NPV) = d / (c + d)
Prevalence = (a + c) / (a + b + c + d)
Bayes’ theorem is often employed in issues of diagnostic testing
or screening
Sensitivity and Specificity
[Figure: Bayes’ theorem expressing PPV and NPV in terms of sensitivity (SE) and 1 − specificity (1 − SP)]
The probability that a tested person has the disease
depends on the prevalence of the disease in
the population tested and the validity of the
test (sensitivity and specificity)
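The dependence on prevalence can be illustrated numerically. The following is a minimal sketch that applies Bayes’ theorem to a test with fixed sensitivity and specificity while the prevalence changes. The function name `ppv_from_bayes` and the values SE = 0.90 and SP = 0.90 are illustrative assumptions, not figures from these notes.

```python
def ppv_from_bayes(sensitivity, specificity, prevalence):
    """Return P(D+ | T+) computed with Bayes' theorem."""
    true_pos = sensitivity * prevalence                    # P(T+ | D+) * P(D+)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(T+ | D-) * P(D-)
    return true_pos / (true_pos + false_pos)


# Hypothetical test with SE = 0.90 and SP = 0.90 (illustrative values only)
for prevalence in (0.01, 0.10, 0.50):
    ppv = ppv_from_bayes(0.90, 0.90, prevalence)
    print(f"prevalence = {prevalence:.2f} -> PPV = {ppv:.3f}")
```

With the same test, the PPV rises as the disease becomes more common in the population tested.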
Suppose we have:
D+ = “Drug user”
D- = “Not a drug user”
T+ =“Positive drug test”
T− = “Negative test”
P(“Drug user”) = P(D+)
Using Bayes’ theorem, calculate P(D+|T+) and P(D-|T-)
         D+      D-      Total
T+       20      180     200
T-       10      1820    1830
Total    30      2000    2030
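SE = 20 / (20 + 10) = 67%
SP = 1820 / (180 + 1820) = 91%
P(D+) = 30 / 2030 = 1.48%
PPV = 20 / (20 + 180) = 10%, or by Bayes’ theorem:
PPV = (20/30 × 30/2030) / (20/30 × 30/2030 + 180/2000 × 2000/2030) = 10%
NPV = 1820 / (10 + 1820) = 99.5%, or by Bayes’ theorem:
NPV = (1820/2000 × 2000/2030) / (1820/2000 × 2000/2030 + 10/30 × 30/2030) = 99.5%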
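The ratios for the drug-test table can be checked with a few lines of code. This is a minimal sketch; the function name `two_by_two_metrics` is an assumption made for illustration, and the cell labels a, b, c, d follow the 2 x 2 layout (a = TP, b = FP, c = FN, d = TN).

```python
def two_by_two_metrics(a, b, c, d):
    """Diagnostic measures from a 2 x 2 table: a = TP, b = FP, c = FN, d = TN."""
    total = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": (a + c) / total,
    }


# Counts from the drug-test table: T+ row is (20, 180), T- row is (10, 1820)
metrics = two_by_two_metrics(a=20, b=180, c=10, d=1820)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```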
Variable: any characteristic that can be measured or
categorized
If a variable can assume a number of different values such that
any particular outcome is determined by chance, it is a random
variable (r.v.)
Random variables are typically represented by uppercase
letters such as X, Y , and Z
A discrete random variable can assume only a finite or
countable number of outcomes
Examples include a person’s marital status and the number of ear
infections an infant develops during his or her first year of life
A continuous random variable can take on any value within a
specified interval or continuum, e.g. weight or height
Every random variable has a corresponding probability
distribution
A probability distribution applies the theory of probability to
describe the behavior of the random variable
For a discrete r.v., its probability distribution specifies all
possible outcomes of this r.v. along with the probability that
each will occur
For a continuous r.v., its probability distribution allows us to
determine the probability associated with specified ranges of
values
For instance, let X be the birth order of each child
born to a woman residing in the U.S. (so X is a discrete
r.v.)
To construct a probability distribution for X, we list
each of the values x that the r.v. can assume, along with
the probability that it occurs, P(X = x)
We use an uppercase X to denote the r.v. and a lowercase x to
represent the outcome for a particular child
The probabilities represent the relative frequency of
occurrence of each outcome x
Since all possible values of the r.v. are taken into
account, the outcomes are exhaustive; therefore, the
sum of their probabilities must be 1
Probability distribution of a random variable X representing the birth order of children born in the US
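where, for n = 4, x = 3 and p = 1/2,
$$ P(X = 3) = \binom{n}{x} p^x (1 - p)^{n - x} = \frac{n!}{x!\,(n - x)!} p^x (1 - p)^{n - x} = \frac{4!}{3!\,(4 - 3)!} \left(\frac{1}{2}\right)^{3} \left(1 - \frac{1}{2}\right)^{4 - 3} = 0.25 $$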
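The binomial formula can be evaluated directly in code. This is a minimal sketch using the standard library; the function name `binomial_pmf` is an assumption made for illustration. It computes P(X = 3) for n = 4 and p = 0.5 and then lists the whole distribution.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial r.v. with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_three = binomial_pmf(3, 4, 0.5)
print(f"P(X = 3) = {p_three}")

# Full distribution for n = 4, p = 0.5; the probabilities sum to 1
pmf = [binomial_pmf(x, 4, 0.5) for x in range(5)]
for x, prob in enumerate(pmf):
    print(x, prob)
print("sum =", sum(pmf))
```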
If p = 0.50, the probability distribution is symmetric
When n becomes large, the combination of n objects taken x at
a time, n!/[x!(n − x)!], becomes tedious to evaluate
X is said to have a Poisson distribution with parameter λ
The Poisson distribution involves a set of underlying
assumptions:
1. The probability that a single event occurs within an interval is
proportional to the length of the interval
2. Within a single interval, an infinite number of occurrences of
the event are theoretically possible (not restricted to a fixed
number of trials)
3. The events occur independently both within the same interval
and between consecutive intervals
Recall that the mean of a binomial r.v. is equal to np and that its
variance is np(1−p)
When p is very small, 1−p is close to 1 and np(1−p) is
approximately equal to np
In this case, the mean and the variance of the distribution are
identical and can be represented by the single parameter λ =
np
The property that the mean is equal to the variance is an
identifying characteristic of the Poisson distribution
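The approximation can be seen numerically. The sketch below compares binomial probabilities with Poisson probabilities for λ = np, using the standard Poisson formula P(X = x) = e^(−λ) λ^x / x!. The values n = 1000 and p = 0.002 and the function names are hypothetical choices for illustration only.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial r.v."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson r.v. with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)


n, p = 1000, 0.002   # hypothetical values: large n, very small p
lam = n * p          # Poisson parameter, lambda = np

print(f"binomial mean = {n * p:.3f}, binomial variance = {n * p * (1 - p):.3f}")
for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

The binomial mean and variance are nearly equal here, and the two columns of probabilities agree closely.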
Instead of performing the calculation by hand, we can use a
table of Poisson probabilities to obtain Poisson probabilities
for selected values of λ
The Poisson distribution is highly skewed for small values of λ;
as λ increases, the distribution becomes more symmetric
If λ = 2.5, use the table to find P(X ≥ 7)
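A table lookup of this kind can be checked in code. The following is a minimal sketch that uses the complement rule P(X ≥ 7) = 1 − P(X ≤ 6) together with the standard Poisson formula; the function names are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson r.v. with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_upper_tail(k, lam):
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(x, lam) for x in range(k))


p_tail = poisson_upper_tail(7, 2.5)
print(f"P(X >= 7) = {p_tail:.4f}")
```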
When a r.v. X follows either a binomial or a Poisson distribution,
it is restricted to taking on integer values only
However, some outcomes of a random variable may not be
limited to integers or counts
A smooth curve is used to represent the probability distribution
of a continuous r.v.; the curve is called a probability density
The most common continuous distribution is the normal
distribution, also known as the Gaussian distribution or the
bell-shaped curve
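Its probability density function (p.d.f.) is given by the equation
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty $$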
The normal curve is unimodal and symmetric about its mean (i.e.
mean=median=mode in this special case)
The two parameters μ and σ² completely define a normal curve
X ∼ N(μ, σ²)
Since a normal distribution could have an infinite number of
possible values for its mean and standard deviation, it is
impossible to tabulate the area associated with each and every
normal curve
Instead, only a single curve is tabulated which is known as the
standard normal distribution for which μ = 0 and σ = 1 and we
write it as N(0, 1).
For the standard normal distribution, approximately 68.2% of the area
beneath the curve lies within ± 1 standard deviation from the mean
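Areas like this can be checked numerically. The sketch below is one way to do it, writing the standard normal cumulative distribution function in terms of the error function; the function name `standard_normal_cdf` is an assumption made for illustration.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Area within k standard deviations of the mean, for k = 1, 2, 3
for k in (1, 2, 3):
    area = standard_normal_cdf(k) - standard_normal_cdf(-k)
    print(f"within +/- {k} SD: {area:.4f}")
```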
In general, for any arbitrary normal r.v.
X ∼ N(μ, σ²),
the standard score (Z-value) Z = (X − μ)/σ has a standard normal
distribution
The standardization is done by shifting the mean to 0 and scaling
the variance to 1
By transforming X into Z, we can use a table of areas computed for
the standard normal curve to estimate probabilities associated
with X
An outcome of the r.v. Z, denoted z, is known as a standard normal
deviate or a z-score
Z = (X − μ) / σ
[Figure: a normal distribution with mean μ and standard deviation σ is transformed into the standard normal distribution with mean 0 and σ = 1]
GENERAL FEATURES OF STANDARD
NORMAL DISTRIBUTION
1. The area under the normal curve is 1
2. The distribution is symmetric about 0
3. Table 1 gives the tail area
To read Table 1, find the row with the corresponding Z value and the column with the second decimal place of Z, then look up the cross point
Example 4.1
Suppose that the scores on an aptitude test are normally distributed with
a mean of 100 and standard deviation of 10.
What is the probability that a score falls between 90 and 115?
Calculate Z
Z = (X − μ) / σ for a population, or Z = (X − x̄) / s for a sample
Solution
Z1 = (X − μ) / σ = (90 − 100) / 10 = −1.0
Z2 = (X − μ) / σ = (115 − 100) / 10 = 1.5
Table 1 critical value of standard normal curve
z 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0.00
3.0 0.0010 0.0010 0.0011 0.0011 0.0011 0.0012 0.0012 0.0013 0.0013 0.0013
2.9 0.0014 0.0014 0.0015 0.0015 0.0016 0.0016 0.0017 0.0018 0.0018 0.0019
2.8 0.0019 0.0020 0.0021 0.0021 0.0022 0.0023 0.0023 0.0024 0.0025 0.0026
2.7 0.0026 0.0027 0.0028 0.0029 0.0030 0.0031 0.0032 0.0033 0.0034 0.0035
2.6 0.0036 0.0037 0.0038 0.0039 0.0040 0.0041 0.0043 0.0044 0.0045 0.0047
2.5 0.0048 0.0049 0.0051 0.0052 0.0054 0.0055 0.0057 0.0059 0.0060 0.0062
2.4 0.0064 0.0066 0.0068 0.0069 0.0071 0.0073 0.0075 0.0078 0.0080 0.0082
2.3 0.0084 0.0087 0.0089 0.0091 0.0094 0.0096 0.0099 0.0102 0.0104 0.0107
2.2 0.0110 0.0113 0.0116 0.0119 0.0122 0.0125 0.0129 0.0132 0.0136 0.0139
2.1 0.0143 0.0146 0.0150 0.0154 0.0158 0.0162 0.0166 0.0170 0.0174 0.0179
2.0 0.0183 0.0188 0.0192 0.0197 0.0202 0.0207 0.0212 0.0217 0.0222 0.0228
1.9 0.0233 0.0239 0.0244 0.0250 0.0256 0.0262 0.0268 0.0274 0.0281 0.0287
1.8 0.0294 0.0301 0.0307 0.0314 0.0322 0.0329 0.0336 0.0344 0.0351 0.0359
1.7 0.0367 0.0375 0.0384 0.0392 0.0401 0.0409 0.0418 0.0427 0.0436 0.0446
1.6 0.0455 0.0465 0.0475 0.0485 0.0495 0.0505 0.0516 0.0526 0.0537 0.0548
1.5 0.0559 0.0571 0.0582 0.0594 0.0606 0.0618 0.0630 0.0643 0.0655 0.0668
1.4 0.0681 0.0694 0.0708 0.0721 0.0735 0.0749 0.0764 0.0778 0.0793 0.0808
1.3 0.0823 0.0838 0.0853 0.0869 0.0885 0.0901 0.0918 0.0934 0.0951 0.0968
1.2 0.0985 0.1003 0.1020 0.1038 0.1056 0.1075 0.1093 0.1112 0.1131 0.1151
1.1 0.1170 0.1190 0.1210 0.1230 0.1251 0.1271 0.1292 0.1314 0.1335 0.1357
1.0 0.1379 0.1401 0.1423 0.1446 0.1469 0.1492 0.1515 0.1539 0.1562 0.1587
0.9 0.1611 0.1635 0.1660 0.1685 0.1711 0.1736 0.1762 0.1788 0.1814 0.1841
0.8 0.1867 0.1894 0.1922 0.1949 0.1977 0.2005 0.2033 0.2061 0.2090 0.2119
0.7 0.2148 0.2177 0.2206 0.2236 0.2266 0.2296 0.2327 0.2358 0.2389 0.2420
0.6 0.2451 0.2483 0.2514 0.2546 0.2578 0.2611 0.2643 0.2676 0.2709 0.2743
0.5 0.2776 0.2810 0.2843 0.2877 0.2912 0.2946 0.2981 0.3015 0.3050 0.3085
0.4 0.3121 0.3156 0.3192 0.3228 0.3264 0.3300 0.3336 0.3372 0.3409 0.3446
0.3 0.3483 0.3520 0.3557 0.3594 0.3632 0.3669 0.3707 0.3745 0.3783 0.3821
0.2 0.3859 0.3897 0.3936 0.3974 0.4013 0.4052 0.4090 0.4129 0.4168 0.4207
0.1 0.4247 0.4286 0.4325 0.4364 0.4404 0.4443 0.4483 0.4522 0.4562 0.4602
We wish to find the shaded area from Z = −1.0 to Z = 1.5
Area below Z = −1.0: 15.87%; area above Z = 1.5: 6.68%
Shaded area = 100% − 15.87% − 6.68% = 77.45%
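The table lookup for Example 4.1 can be reproduced in code. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are chosen here for illustration.

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=10)   # aptitude test scores from Example 4.1

# Standardize the two cut-offs: Z = (X - mu) / sigma
z1 = (90 - scores.mean) / scores.stdev
z2 = (115 - scores.mean) / scores.stdev

standard = NormalDist()                 # standard normal, N(0, 1)
below_z1 = standard.cdf(z1)             # lower tail area
above_z2 = 1.0 - standard.cdf(z2)       # upper tail area
shaded = 1.0 - below_z1 - above_z2      # area between z1 and z2

print(f"z1 = {z1}, z2 = {z2}")
print(f"below z1 = {below_z1:.4f}, above z2 = {above_z2:.4f}, shaded = {shaded:.4f}")
```

The same area is obtained without standardizing, as `scores.cdf(115) - scores.cdf(90)`.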
Z1 = (X1 − μ) / σ = −0.625
Z2 = (X2 − μ) / σ = (124 − 100) / 16 = 1.5
We wish to find the area where Z < −0.625 or Z > 1.5
Area below Z = −0.625: 26.43%; area above Z = 1.5: 6.68%
Total area = 26.43% + 6.68% = 33.11%
1. Assume that the age at onset of disease B is distributed normally
with a mean of 50 years and a standard deviation of 12 years.
What is the probability that an individual afflicted with disease B
developed it before age 35?
2. The weights of women aged 18-24 years are normally distributed
with a mean of 65 kg and a standard deviation of 13.6 kg. If one
randomly selected 150 of these women aged 18-24, how many of
them would be expected to weigh 50-70 kg?
3. An instructor is administering a final examination. She tells her
class that she will give an A grade to the 10% of the students
who earn the highest marks. Past experience with the same
examination has yielded grades that are normally distributed
with a mean of 70 and SD of 10. If the present class runs
true to form, what numerical score would a student need to earn
an A grade?
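After working the exercises by hand with the table, the results can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are chosen here for illustration. Small differences from table lookups are expected because the table rounds z to two decimal places.

```python
from statistics import NormalDist

# Exercise 1: P(onset before age 35) with mean 50 and SD 12
p_before_35 = NormalDist(50, 12).cdf(35)

# Exercise 2: expected number, out of 150 women, weighing 50-70 kg (mean 65, SD 13.6)
weights = NormalDist(65, 13.6)
expected_count = 150 * (weights.cdf(70) - weights.cdf(50))

# Exercise 3: score that cuts off the top 10% (mean 70, SD 10)
a_grade_cutoff = NormalDist(70, 10).inv_cdf(0.90)

print(f"Exercise 1: {p_before_35:.4f}")
print(f"Exercise 2: {expected_count:.1f}")
print(f"Exercise 3: {a_grade_cutoff:.1f}")
```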