distributions
Random Variables
Reading
Types of random variables
Discrete random variables
Distribution of discrete random variables
Bar graph showing a distribution
[Figure: bar graph of a probability distribution; x-axis ticks 0 to 3, y-axis "Probabilities" from 0.0 to 0.4]
Distributions
For a discrete random variable, a probability distribution is a table (often shown as a graph) of all disjoint outcomes and their associated probabilities.
For example, counting the number of heads, X, in 4 tosses of a fair coin:
S = {HHHH, HHHT, HHTH, ..., TTTT}
P(HHHH) = P(HHHT) = ... = P(TTTT) = 1/16
• X is a discrete random variable with 5 possible values {0, 1, 2, 3, 4}
• Its probability distribution is:
Value        0     1     2     3     4
Probability  1/16  4/16  6/16  4/16  1/16
If the variable X records the number of heads, then X is a random variable and the graph shows its distribution.
Using a simulation to construct a probability distribution
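As a minimal sketch of this idea, the R code below simulates 4 tosses of a fair coin many times and tabulates how often each number of heads occurs. The seed, the number of simulated experiments (n_sims), and the variable names are assumptions chosen for illustration; they do not come from the slides.

```r
# Simulate the number of heads in 4 tosses of a fair coin, many times over.
set.seed(1)        # arbitrary seed so the run is repeatable (assumed)
n_sims <- 10000    # number of simulated experiments (assumed)
heads <- replicate(
  n_sims,
  sum(sample(c("H", "T"), size = 4, replace = TRUE) == "H")
)

# Relative frequencies approximate the probability distribution of X.
sim_dist <- table(factor(heads, levels = 0:4)) / n_sims
print(sim_dist)

# Exact probabilities for comparison: 1/16, 4/16, 6/16, 4/16, 1/16.
exact <- c(1, 4, 6, 4, 1) / 16
print(exact)
```

With more simulated experiments the relative frequencies should settle closer to the exact probabilities.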
Expectation of a discrete random variable
\[ E(X) = x_1 P(X = x_1) + \cdots + x_k P(X = x_k) = \sum_{i=1}^{k} x_i P(X = x_i) \]
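A short sketch of the expectation as a probability-weighted sum, using the distribution of the number of heads in 4 tosses of a fair coin given earlier. The variable names are illustrative only.

```r
# Values and probabilities for X = number of heads in 4 fair coin tosses.
x <- 0:4
p <- c(1, 4, 6, 4, 1) / 16

# E(X) is the sum of each value times its probability.
expected_value <- sum(x * p)
print(expected_value)  # [1] 2
```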
Simple example
Variance and SD of a discrete random variable
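If X takes on outcomes x1, . . . , xk with probabilities P(X = x1), . . . , P(X = xk) and expected value µ = E(X), then the variance of X, denoted by Var(X) or the symbol σ², is
\[ \sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 P(X = x_i) \]
The standard deviation of X, denoted σ, is the square root of the variance.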
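As an illustration, the sketch below computes the variance as the probability-weighted average of squared deviations from µ, again for the number of heads in 4 tosses of a fair coin. The variable names are assumptions made for the example.

```r
# Values and probabilities for X = number of heads in 4 fair coin tosses.
x <- 0:4
p <- c(1, 4, 6, 4, 1) / 16

mu <- sum(x * p)                  # expected value, 2
variance <- sum((x - mu)^2 * p)   # weighted squared deviations from mu
std_dev <- sqrt(variance)

print(variance)  # [1] 1
print(std_dev)   # [1] 1
```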
Coin tossing again
The standard deviation is √(3/4) = √3/2 ≈ 0.866.
Continuous random variables
The Normal distribution
Continuous distributions
Probabilities for continuous distributions
Two important features of continuous distributions
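[Figure: normal curve showing about 68%, 95%, and 99.7% of the area within 1, 2, and 3 standard deviations of the mean; x-axis marked µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ]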
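The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below uses pnorm() on the standard normal scale, which is enough because the areas depend only on the number of standard deviations from the mean; the variable names are illustrative.

```r
# Area under the standard normal curve within 1, 2, and 3 SDs of the mean.
within_1_sd <- pnorm(1) - pnorm(-1)
within_2_sd <- pnorm(2) - pnorm(-2)
within_3_sd <- pnorm(3) - pnorm(-3)

print(round(c(within_1_sd, within_2_sd, within_3_sd), 4))
# [1] 0.6827 0.9545 0.9973
```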
A normal example
The distributions of test scores on the SAT and the ACT are both nearly normal.
Suppose that one student scores an 1800 on the SAT (Student A)
and another student scores a 24 on the ACT (Student B). Which
student performed better?
[Figure: normal curves for the two tests with the scores of Student A and Student B marked; axis ticks at 11, 16, 21, 26, 31]
A normal example . . .
• SAT scores are N(1500, 300). ACT scores are N(21, 5).
• x_A represents the score of Student A; x_B represents the score of Student B.
\[ Z_B = \frac{x_B - \mu_{ACT}}{\sigma_{ACT}} = \frac{24 - 21}{5} = 0.6 \]
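A small sketch of the comparison, standardizing both scores with the means and standard deviations stated above. The helper name z_score is hypothetical, introduced only for this example.

```r
# Hypothetical helper: how many SDs an observation lies from its mean.
z_score <- function(x, mean, sd) (x - mean) / sd

z_A <- z_score(1800, mean = 1500, sd = 300)  # Student A, SAT
z_B <- z_score(24, mean = 21, sd = 5)        # Student B, ACT

print(c(z_A, z_B))  # [1] 1.0 0.6
```

Student A's score is 1 standard deviation above the SAT mean, while Student B's is 0.6 standard deviations above the ACT mean, so Student A performed better relative to other test takers.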
Calculating normal probabilities (I)
What is the percentile rank for a student who scores an 1800 on the
SAT for a year in which the scores are N(1500, 300)?
pnorm(1)
## [1] 0.8413447
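The call pnorm(1) works because an SAT score of 1800 has Z-score (1800 − 1500)/300 = 1. As a sketch, the same probability can be obtained without standardizing by passing the mean and sd arguments to pnorm(); the variable names are illustrative.

```r
# Percentile rank of an 1800 when SAT scores are N(1500, 300).
p_standardized <- pnorm((1800 - 1500) / 300)      # via the Z-score
p_direct <- pnorm(1800, mean = 1500, sd = 300)    # directly on the SAT scale

print(p_standardized)  # [1] 0.8413447
print(p_direct)        # [1] 0.8413447
```

Either way, the student is at about the 84th percentile.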
Calculating normal probabilities (II)
What score on the SAT would put a student in the 99th percentile?
qnorm(0.99)
## [1] 2.326348
Calculating normal probabilities (II). . .
X = σZ + µ
  = 300(2.33) + 1500
  = 2199
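The conversion from a Z-score back to the SAT scale can also be done in R. This sketch assumes the same N(1500, 300) model; the variable names are illustrative. Using the unrounded quantile gives a slightly smaller score than the hand calculation with Z rounded to 2.33.

```r
# 99th percentile of SAT scores when scores are N(1500, 300).
z_99 <- qnorm(0.99)                                  # standard normal quantile
score_from_z <- 300 * z_99 + 1500                    # X = sigma * Z + mu
score_direct <- qnorm(0.99, mean = 1500, sd = 300)   # same result in one call

print(round(score_from_z, 1))  # [1] 2197.9
print(round(score_direct, 1))  # [1] 2197.9
```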
The Binomial distribution
Binomial random variables (3.2 in OI Biostat)
The binomial coefficient
The binomial coefficient \(\binom{n}{x}\) is the number of ways to choose x items from a set of size n, where the order of the choice is ignored.
Mathematically,
\[ \binom{n}{x} = \frac{n!}{x!\,(n - x)!} \]
• n = 1, 2, . . .
• x = 0, 1, 2, . . . , n
• For any positive integer m, m! = (m)(m − 1)(m − 2) · · · (1), and 0! = 1
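As a quick check of the factorial formula, the sketch below counts the ways to choose 2 items from 4 and compares the result with R's built-in choose(). The values n = 4 and x = 2 are chosen only for illustration.

```r
n <- 4
x <- 2

# Binomial coefficient from the factorial formula.
by_formula <- factorial(n) / (factorial(x) * factorial(n - x))

# R's built-in binomial coefficient.
by_choose <- choose(n, x)

print(by_formula)  # [1] 6
print(by_choose)   # [1] 6
```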
Formula for the binomial distribution
\[ P(x \text{ successes}) = \binom{\#\text{ of trials}}{\#\text{ of successes}} \, p^{\#\text{ of successes}} (1 - p)^{\#\text{ of trials} \,-\, \#\text{ of successes}} \]
\[ P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n \]
• n = number of trials
• p = probability of success
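To make the formula concrete, this sketch evaluates it for the number of heads in 4 tosses of a fair coin (n = 4, p = 0.5) and compares the result with R's dbinom(). The variable names are illustrative.

```r
n <- 4      # number of trials
p <- 0.5    # probability of success on each trial
x <- 0:n    # possible numbers of successes

# Binomial probabilities straight from the formula.
by_formula <- choose(n, x) * p^x * (1 - p)^(n - x)

# The same probabilities from R's built-in function.
by_dbinom <- dbinom(x, size = n, prob = p)

print(by_formula)  # [1] 0.0625 0.2500 0.3750 0.2500 0.0625
```

These match the probabilities 1/16, 4/16, 6/16, 4/16, 1/16 from the coin-tossing table.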
Mean and standard deviation for binomial random variable
• Mean = np
• Standard Deviation = √(np(1 − p))
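A brief numerical check of these two formulas, using a hypothetical binomial variable with n = 10 and p = 0.3 (values assumed for illustration, not taken from the slides). The mean and standard deviation are also computed directly from the probabilities for comparison.

```r
n <- 10     # assumed number of trials
p <- 0.3    # assumed probability of success

# Shortcut formulas for a binomial random variable.
mean_formula <- n * p
sd_formula <- sqrt(n * p * (1 - p))

# Direct calculation from the full probability distribution.
x <- 0:n
probs <- dbinom(x, size = n, prob = p)
mean_direct <- sum(x * probs)
sd_direct <- sqrt(sum((x - mean_direct)^2 * probs))

print(mean_formula)           # [1] 3
print(round(sd_formula, 4))   # [1] 1.4491
```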
Calculating binomial probabilities in R
• dbinom(a, n, p) = P(X = a)
• pbinom(a, n, p) = P(X ≤ a)
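A short usage sketch for the two functions, again with 4 tosses of a fair coin. The variable names are illustrative; the complement shows one way to get an upper-tail probability from pbinom().

```r
# X = number of heads in 4 tosses of a fair coin.
p_exactly_2 <- dbinom(2, 4, 0.5)        # P(X = 2)
p_at_most_2 <- pbinom(2, 4, 0.5)        # P(X <= 2)
p_more_than_2 <- 1 - pbinom(2, 4, 0.5)  # P(X > 2) by the complement rule

print(p_exactly_2)    # [1] 0.375
print(p_at_most_2)    # [1] 0.6875
print(p_more_than_2)  # [1] 0.3125
```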
Normal approximation to the binomial
• We will not cover the formula since you will always be able to
use R to calculate binomial probabilities.
• This will not be on an exam or quiz.
• Read section 3.3.6, OI Biostat if you wish to know more details.
The Poisson distribution
Introduction to the Poisson distribution (Section 3.4 in OI Biostat)
Example: Outbreaks of childhood leukemia
Poisson Distribution
Suppose events occur over time in such a way that
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
\[ P(X = x) = \frac{e^{-\lambda t} (\lambda t)^{x}}{x!}, \quad x = 0, 1, 2, \ldots \]
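As a sketch, both forms of the Poisson formula can be evaluated by hand and compared with R's dpois(). The rate of 2 events per unit time and the window of t = 3 units are hypothetical values chosen for illustration.

```r
rate <- 2    # assumed rate (lambda) per unit of time
t <- 3       # assumed number of time units
x <- 0:10    # counts at which to evaluate the probabilities

# One unit of time: P(X = x) = exp(-lambda) * lambda^x / x!
one_unit_formula <- exp(-rate) * rate^x / factorial(x)
one_unit_dpois <- dpois(x, lambda = rate)

# t units of time: the same formula with lambda replaced by lambda * t.
t_units_formula <- exp(-rate * t) * (rate * t)^x / factorial(x)
t_units_dpois <- dpois(x, lambda = rate * t)

print(all.equal(one_unit_formula, one_unit_dpois))  # [1] TRUE
print(all.equal(t_units_formula, t_units_dpois))    # [1] TRUE
```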
Poisson Distribution
[Figure: Poisson distribution; x-axis: x from 0 to 10, y-axis: Probability from 0.00 to 0.25]
Poisson mean and standard deviation
• The mean is λ.
• The standard deviation is √λ.
In t units of time, the mean and standard deviation are, respectively, λt and √(λt).
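These two facts can be checked numerically. The sketch below uses λ = 2.25 and sums over counts 0 to 100, a truncation point assumed here because the probabilities beyond it are negligible for this λ.

```r
lambda <- 2.25
x <- 0:100                       # truncated range of counts (assumed cutoff)
probs <- dpois(x, lambda)

# Mean and SD computed directly from the probabilities.
mean_direct <- sum(x * probs)
sd_direct <- sqrt(sum((x - mean_direct)^2 * probs))

print(mean_direct)   # close to lambda = 2.25
print(sd_direct)     # close to sqrt(lambda) = 1.5
```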
Childhood leukemia cases. Example 3.37, OI Biostat
What about a city of size 75,000?
What is the probability of 8 cases over 5 years?
\[ P(X = 8) = \frac{e^{-\lambda} \lambda^{x}}{x!} = \frac{e^{-2.25} (2.25)^{8}}{8!} \]
Easiest to calculate this in R using the function dpois() rather
than by hand. . .
Suppose X has a Poisson distribution with parameter λ.
## [1] 0.001717027
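The R call that produced this output is not shown above. A sketch that reproduces the value, assuming λ = 2.25 as in the hand calculation, is:

```r
lambda <- 2.25

# P(X = 8) from the Poisson formula, written out by hand.
p8_formula <- exp(-lambda) * lambda^8 / factorial(8)

# The same probability from dpois().
p8_dpois <- dpois(8, lambda)

print(p8_dpois)  # [1] 0.001717027
```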
What is the probability of 8 or more cases?
Would 8 or more cases be a rare event?
Suppose X has a Poisson distribution with parameter λ.
## [1] 0.002267088
## [1] 0.002267088
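The calls behind these outputs are not shown. One plausible reconstruction, offered as an assumption, is that the same upper-tail probability was computed in two equivalent ways with ppois(), again with λ = 2.25:

```r
lambda <- 2.25

# P(X >= 8) = 1 - P(X <= 7), using the complement rule.
p_upper_complement <- 1 - ppois(7, lambda)

# The same tail probability requested directly from ppois().
p_upper_tail <- ppois(7, lambda, lower.tail = FALSE)

print(p_upper_complement)  # [1] 0.002267088
print(p_upper_tail)        # [1] 0.002267088
```

A probability of about 0.002 means 8 or more cases would be a rare event under this model.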