3200chap3 wk5
ii. Toss a fair coin and let X be the number of tosses until the third
head occurs.
• A continuous random variable can take any value within a finite or infinite
interval of the real number line (−∞, ∞).
i. Let X be the pH level of a lake.
Properties of the CDF:
i. It is non-decreasing: If a ≤ b then F (a) ≤ F (b).
ii. F (−∞) = 0 and F (∞) = 1.
iii. If a < b, then P (a < X ≤ b) = F (b) − F (a).
Example: Toss a fair coin 3 times and let X be the number of heads.
Outcomes:
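A minimal sketch in R (variable names are illustrative, not from the notes) that lists the eight equally likely outcomes and tabulates the PMF and CDF of X:

```r
# Enumerate the 2^3 = 8 equally likely outcomes of three tosses
tosses <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                      third = c("H", "T"), stringsAsFactors = FALSE)

# X = number of heads in each outcome
heads <- rowSums(tosses == "H")

# PMF p(x) = P(X = x) and CDF F(x) = P(X <= x) for x = 0, 1, 2, 3
pmf <- table(heads) / nrow(tosses)
cdf <- cumsum(pmf)

print(tosses)
print(pmf)
print(cdf)
```

The PMF values are 1/8, 3/8, 3/8, 1/8, and the CDF steps up by exactly these amounts at x = 0, 1, 2, 3.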
Properties of the CDF of a discrete random variable: Let x1 < x2 <
· · · denote the possible values of X. Then
i. F is a step function with jumps occurring only at the values x of SX .
The size of the jump at each x of SX equals p(x).
ii. The CDF can be obtained from the PMF:
$$F(x) = \sum_{x_i \le x} p(x_i)$$
Density Function for a Continuous Random Variable
• For a continuous random variable P (X = x) = 0 for all x.
• The CDF is a continuous function.
Example: We say X is a uniform random variable on [0, 1] if X has CDF
$$F(x) = P(X \le x) = \begin{cases} 0, & x < 0 \\ x, & 0 \le x \le 1 \\ 1, & x > 1 \end{cases}$$
Example: If X ∼ U (0, 1), the PDF is
$$f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
Let X be a continuous random variable with PDF f (x) and CDF F (x).
Then
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$
• The CDF can be obtained from the PDF:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$
Find k.
Example: Let X be a random variable with CDF
$$F(x) = \begin{cases} 0, & x \le 0 \\ 1 - e^{-x}, & x > 0 \end{cases}$$
Note: We can simulate a random sample of size n from a U(A, B) distribution
using runif(n, A, B) in R.
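As a small illustration of runif, the sketch below draws a sample and checks it against the interval. The endpoints A = -3 and B = 2, the sample size, and the seed are arbitrary choices for this example.

```r
set.seed(1)                            # arbitrary seed, only for reproducibility
u <- runif(1000, min = -3, max = 2)    # n = 1000 draws from U(-3, 2)

range(u)   # every draw lies strictly between A and B
mean(u)    # should be close to the midpoint (A + B) / 2 = -0.5
```

The sample mean varies from run to run, so only rough agreement with the midpoint should be expected.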
Expected Value
For a discrete random variable X with sample space SX and PMF p(x),
the expected value is
$$E(X) = \mu_X = \sum_{x \in S_X} x\,p(x)$$
x      -1    1    2
p(x)   1/3  1/6  1/2
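The expected value for this table can be computed directly from the definition. The sketch below reads the probabilities as 1/3, 1/6, and 1/2 (an assumption about the flattened fractions, supported by the fact that they sum to 1).

```r
x <- c(-1, 1, 2)           # possible values of X
p <- c(1/3, 1/6, 1/2)      # assumed probabilities p(x)

stopifnot(isTRUE(all.equal(sum(p), 1)))   # a valid PMF sums to 1

EX <- sum(x * p)           # E(X) = sum of x * p(x)
EX
```

Here E(X) = (-1)(1/3) + (1)(1/6) + (2)(1/2) = 5/6.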
For a continuous random variable X with PDF f (x), the expected value is
$$E(X) = \mu_X = \int_{-\infty}^{\infty} x f(x)\,dx$$
Example: Let X ∼ U (−3, 2). Find E(X).
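One way to check the answer numerically is to integrate x f(x) over the support. This is a sketch using R's integrate and dunif, not necessarily the method intended in the notes.

```r
# PDF of U(-3, 2): constant 1/5 on [-3, 2]
f <- function(x) dunif(x, min = -3, max = 2)

# E(X) = integral of x * f(x) over the support
EX <- integrate(function(x) x * f(x), lower = -3, upper = 2)$value
EX
```

By hand, the integral of x/5 from -3 to 2 is (4 - 9)/10 = -0.5, the midpoint of the interval.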
Note: If h(x) = ax + b and Y = aX + b, then
E(h(X)) = aE(X) + b
If $Y = X^3$, find E(Y).
Variance and Standard Deviation
• The variance $\sigma_X^2$, or Var(X), of a random variable X is
$$\sigma_X^2 = E[(X - \mu_X)^2]$$
where µX = E(X) is the expected value of X.
Note: If the variance of X is $\sigma_X^2$ and Y = a + bX, then
$$\sigma_Y^2 = b^2 \sigma_X^2$$
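The definition and the linear-transformation rule can be checked on a small discrete distribution. The sketch below uses values -1, 1, 2 with probabilities 1/3, 1/6, 1/2; the constants a = 4 and b = -3 are arbitrary choices for illustration.

```r
x <- c(-1, 1, 2)
p <- c(1/3, 1/6, 1/2)

mu_x  <- sum(x * p)                  # E(X)
var_x <- sum((x - mu_x)^2 * p)       # Var(X) = E[(X - mu_X)^2]

a <- 4; b <- -3                      # hypothetical constants
y <- a + b * x                       # values taken by Y = a + bX (same probabilities)
mu_y  <- sum(y * p)
var_y <- sum((y - mu_y)^2 * p)

var_x
isTRUE(all.equal(var_y, b^2 * var_x))   # the shift a drops out; the scale enters as b^2
```

For this distribution Var(X) = 65/36, and the standard deviation is sqrt(var_x).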
Population Percentiles
• For continuous random variables.
Let X be a continuous random variable with CDF F and α a number between
0 and 1. The 100(1 − α)-th percentile of X is xα such that
F (xα ) = P (X ≤ xα ) = 1 − α
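In R, percentiles come from the quantile functions (names starting with q). As a sketch, take the CDF F(x) = 1 - e^(-x) for x > 0, which is the exponential distribution with rate 1; the choice α = 0.10 is arbitrary.

```r
alpha <- 0.10

# 100(1 - alpha)-th percentile: the x_alpha that solves F(x_alpha) = 1 - alpha
x_alpha <- qexp(1 - alpha, rate = 1)

pexp(x_alpha, rate = 1)   # plugging back into the CDF recovers 1 - alpha
-log(alpha)               # closed form from solving 1 - exp(-x) = 1 - alpha
```

The quantile function is the inverse of the CDF, so qexp and pexp undo each other here.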
Models for Discrete Random Variables
Recall: A discrete random variable is a random variable whose sample space
has a finite or at most a countably infinite number of values.
• We can group some discrete random variables into classes.
Bernoulli Distribution
• A Bernoulli trial or experiment is one whose outcome is either a success
or a failure.
• Notation: X ∼ Bern(p).
The PMF:
x 0 1
p(x) 1 − p p
What is the expected value of X?
Example: The probability that an electronic product will last more than 5000
hours is 0.05. Let X take the value 1 if a randomly selected product lasts
more than 5000 hours and the value 0 otherwise. Find the mean value and
variance of X.
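A short sketch that works this example from the Bernoulli PMF, using the definitions of expected value and variance (variable names are illustrative):

```r
p   <- 0.05            # P(product lasts more than 5000 hours)
x   <- c(0, 1)         # possible values of X
pmf <- c(1 - p, p)     # Bernoulli PMF

mean_x <- sum(x * pmf)                  # reduces to p
var_x  <- sum((x - mean_x)^2 * pmf)     # reduces to p * (1 - p)

mean_x
var_x
```

For any Bernoulli(p) variable the same sums give E(X) = p and Var(X) = p(1 - p); here that is 0.05 and 0.0475.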
Binomial Distribution
• A binomial experiment is when n Bernoulli experiments, each having
probability of success p, are performed independently.
• The binomial random variable Y is the number of successes in the n
Bernoulli trials.
Example: Suppose 70% of all purchases in a certain store are made with a
credit card. Let Y denote the number of credit card uses in the next 10
purchases. What is P (5 ≤ Y ≤ 8)?
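Treating the 10 purchases as independent trials with p = 0.7, so that Y ∼ Bin(10, 0.7), the probability can be sketched in R in two equivalent ways:

```r
n <- 10
p <- 0.7

# Sum the PMF over y = 5, 6, 7, 8
prob_pmf <- sum(dbinom(5:8, size = n, prob = p))

# Or difference the CDF: P(Y <= 8) - P(Y <= 4)
prob_cdf <- pbinom(8, size = n, prob = p) - pbinom(4, size = n, prob = p)

prob_pmf
prob_cdf
```

Both give approximately 0.803. Note that the lower CDF term is P(Y ≤ 4), not P(Y ≤ 5), because the event includes Y = 5.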
Hypergeometric Distribution
• Suppose a population consists of M1 objects labeled 1 and M2 objects
labeled 0, and that a sample of size n is selected at random without
replacement.
• The hypergeometric random variable X is the number of objects labeled
1 in the sample.
• Notation: X ∼ Hyp(M1 , M2 , n)
The PMF:
$$p(x) = P(X = x) = \frac{\binom{M_1}{x}\binom{M_2}{n-x}}{\binom{M_1+M_2}{n}}$$
Example: A crate contains 50 light bulbs of which 5 are defective and 45 are
not. A quality control inspector randomly samples 4 bulbs without replacement.
Let X be the number of defective bulbs in the sample.
Example: What is the expected number of defective light bulbs in the sample?
What is the variance?
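For the light bulb example, X ∼ Hyp(5, 45, 4). The sketch below gets the mean and variance by summing over the PMF from dhyper, so it relies only on the definitions rather than on closed-form formulas.

```r
M1 <- 5     # defective bulbs (labeled 1)
M2 <- 45    # non-defective bulbs (labeled 0)
n  <- 4     # sample size

x   <- 0:n
pmf <- dhyper(x, M1, M2, n)      # P(X = x) for x = 0, ..., 4

EX <- sum(x * pmf)               # expected number of defectives
VX <- sum((x - EX)^2 * pmf)      # variance

round(pmf, 4)
EX
VX
```

The expected value is 0.4 defective bulbs, which equals n times the defective fraction 5/50.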
• For large population size N , the difference between sampling with and
without replacement is very small.
• Suppose X ∼ Hyp(M1, M2, n) and $n/N \le 0.05$, where N = M1 + M2. Then
$$P(X = x) \approx P(Y = x)$$
where $Y \sim \text{Bin}(n, p = M_1/N)$.
Example: Suppose M1 = 100, M2 = 900, and n = 25. Find P (X = 3).
i. Using the hypergeometric distribution:
> dhyper(3,100,900,25)
[1] 0.229574
ii. Using the binomial approximation:
> dbinom(3,25,0.1)
[1] 0.2264973
Geometric Distribution
• In a geometric experiment, independent Bernoulli trials, each with probability
of success p, are performed until the first success occurs.
• The geometric random variable X is the number of trials up to and including
the first success.
• Notation: X ∼ Geo(p)
The PMF:
$$p(x) = P(X = x) = (1 - p)^{x-1} p, \quad x = 1, 2, 3, \ldots$$
The CDF:
$$F(x) = P(X \le x) = 1 - (1 - p)^x, \quad x = 1, 2, 3, \ldots$$
Example: Suppose you need to find a store that carries a special printer ink.
You know that of the stores that carry printer ink, 15% of them carry the
special ink. You randomly call each store until one has the ink you need.
a. What is the probability that you first find the special ink at the third
store you call?
b. What is the probability you first find the special ink in 3 calls or fewer?
c. What is the expected number of calls until you first find the special ink?
What is the variance?
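A sketch of the three parts in R with p = 0.15. One caveat: R's dgeom and pgeom count the number of failures before the first success, so a trial count x is passed as x - 1. The mean 1/p and variance (1 - p)/p^2 used in part c are the standard results for this parameterization and are not derived in the notes above.

```r
p <- 0.15

# a. First success on the 3rd call: (1 - p)^2 * p
prob_third <- dgeom(3 - 1, prob = p)

# b. First success within 3 calls: 1 - (1 - p)^3
prob_within3 <- pgeom(3 - 1, prob = p)

# c. Mean and variance of the number of calls
mean_calls <- 1 / p
var_calls  <- (1 - p) / p^2

c(prob_third, prob_within3, mean_calls, var_calls)
```

Part a evaluates to 0.108375 and part b to 0.385875, matching the PMF and CDF formulas for Geo(p).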
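• For a negative binomial random variable Y (the number of trials up to and
including the r-th success), R computes the PMF, P(Y = y), with dnbinom(y − r, r, p)
• R also computes the CDF, P(Y ≤ y), with pnbinom(y − r, r, p)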
Example: An oil company conducts a geological study that indicates that an
exploratory oil well should have a 20% chance of striking oil.
a. What is the probability the first strike comes on the third well drilled?
b. What is the probability that the third strike comes on the seventh well
drilled?
> dnbinom(4,3,0.2)
[1] 0.049152
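Part a can be handled the same way, since waiting for the first strike is the case r = 1. The sketch below assumes the wells are independent, each with success probability 0.2, and also reproduces the value for part b.

```r
p <- 0.2

# a. First strike on the 3rd well: two failures, then a success
prob_a_geom   <- dgeom(3 - 1, prob = p)
prob_a_nbinom <- dnbinom(3 - 1, size = 1, prob = p)   # same event, r = 1

# b. Third strike on the 7th well: y - r = 7 - 3 = 4 failures before the 3rd success
prob_b <- dnbinom(7 - 3, size = 3, prob = p)

c(prob_a_geom, prob_a_nbinom, prob_b)
```

Part a is 0.8^2 × 0.2 = 0.128, and part b matches the 0.049152 shown above.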
Poisson Distribution
• The Poisson distribution is used to model the probability that a number
of events occur in an interval of time or space.
• The Poisson random variable X denotes the number of events that occurred.
• Notation: X ∼ Poisson(λ)
Examples:
i. Let X equal the number of cars passing through an intersection in one
minute.
ii. Let X equal the number of students arriving during office hours.
iii. Let X equal the number of Alaskan salmon caught in a squid driftnet.
The PMF:
$$p(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots, \quad \lambda > 0$$
The mean and variance for X ∼ Poisson(λ) are
$$E(X) = \lambda, \qquad \sigma_X^2 = \lambda$$
Example: Let X equal the number of typos on a printed page with a mean
of 3 typos per page.
a. What is the probability of one typo on a randomly selected page?
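With λ = 3, part a is a single PMF evaluation. A minimal sketch, comparing dpois with the formula written out:

```r
lambda <- 3

prob_one      <- dpois(1, lambda)                          # P(X = 1)
prob_one_hand <- exp(-lambda) * lambda^1 / factorial(1)    # PMF formula by hand

prob_one
isTRUE(all.equal(prob_one, prob_one_hand))
```

The result is 3e^(-3), approximately 0.149.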
a. Find the probability of no more than two colds for a person taking supplements
and a person not taking supplements.
> ppois(2,3)
[1] 0.4231901
> ppois(2,5)
[1] 0.124652
Exponential Distribution
• The exponential distribution is often used to model lifetimes of equipment
or waiting times until events occur.
• Notation: X ∼ Exp(λ)
The PDF:
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
Example: Let X be the amount of time (in minutes) a postal clerk spends
with his or her customer. The time spent has an exponential distribution
with λ = 0.25.
a. Find the expected time spent with each customer.
b. What is the probability the clerk spends between 2 and 4 minutes with
a customer?
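Both parts can be sketched in R with rate λ = 0.25. The mean is computed by numerical integration of x f(x), so it relies only on the definition of expected value.

```r
lambda <- 0.25

# a. E(X) = integral of x * f(x) over [0, Inf)
mean_time <- integrate(function(x) x * dexp(x, rate = lambda),
                       lower = 0, upper = Inf)$value

# b. P(2 < X < 4) = F(4) - F(2)
prob_2_to_4 <- pexp(4, rate = lambda) - pexp(2, rate = lambda)

mean_time
prob_2_to_4
```

The mean is 1/λ = 4 minutes, and the probability is e^(-0.5) - e^(-1), approximately 0.239.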
Example: The number of miles that a particular car can run before its battery
wears out is exponentially distributed with an average of 10,000 miles.
The owner of the car needs to take a 5000-mile trip. What is the probability
that he will be able to complete the trip without having to replace the car
battery?
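A sketch for this example, taking the rate as λ = 1/10000 because the mean of an exponential distribution is 1/λ. It also assumes the remaining battery life is exponential with the same rate regardless of miles already driven (the memoryless property), since the problem does not say the battery is new.

```r
rate <- 1 / 10000     # mean of 10,000 miles

# P(X > 5000): the battery outlasts the trip
prob_complete <- pexp(5000, rate = rate, lower.tail = FALSE)
prob_complete
```

This equals e^(-0.5), approximately 0.607.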
Normal Distribution
• A standard normal random variable Z has PDF and CDF:
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \quad \text{and} \quad \Phi(z) = \int_{-\infty}^{z} \phi(y)\,dy$$
for −∞ < z < ∞.
R commands for X ∼ N(µ, σ²):
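The base R functions for the normal distribution are dnorm, pnorm, qnorm, and rnorm. The sketch below shows each one; the values µ = 10 and σ = 2 are hypothetical and chosen only for illustration. Note that these functions take the standard deviation σ in the sd argument, not the variance σ².

```r
mu    <- 10    # hypothetical mean
sigma <- 2     # hypothetical standard deviation (not the variance)

dens  <- dnorm(12, mean = mu, sd = sigma)      # PDF f(12)
prob  <- pnorm(12, mean = mu, sd = sigma)      # CDF P(X <= 12)
quant <- qnorm(0.975, mean = mu, sd = sigma)   # 97.5th percentile of X

set.seed(1)                                    # arbitrary seed
samp  <- rnorm(5, mean = mu, sd = sigma)       # random sample of size 5

c(dens, prob, quant)
samp
```

With the defaults mean = 0 and sd = 1, the same functions give φ(z), Φ(z), and the percentiles of the standard normal.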
Properties of Normal Random Variables
ii. (X − µ)/σ ∼ N(0, 1)
iii. Let xα denote the 100(1 − α)-th percentile of X, and let zα denote the
100(1 − α)-th percentile of Z. Then
xα = µ + σzα
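Both properties can be checked numerically. In this sketch, µ = 10, σ = 2, and α = 0.05 are hypothetical values chosen for illustration.

```r
mu    <- 10
sigma <- 2
alpha <- 0.05

z_alpha <- qnorm(1 - alpha)                         # percentile of Z ~ N(0, 1)
x_alpha <- qnorm(1 - alpha, mean = mu, sd = sigma)  # percentile of X ~ N(mu, sigma^2)

# Property iii: x_alpha = mu + sigma * z_alpha
isTRUE(all.equal(x_alpha, mu + sigma * z_alpha))

# Property ii: standardizing x gives the same probability under N(0, 1)
x <- 12.5
isTRUE(all.equal(pnorm(x, mean = mu, sd = sigma), pnorm((x - mu) / sigma)))
```

Both comparisons should return TRUE, which is why tables of Φ alone suffice for any normal distribution.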
b. The demand for the alternative brand of tires is such that 30% of the
total output should be sold under the alternative brand name. What
should the critical thickness, originally 7.9 mm, be set at in order to
meet the demand?
Q-Q Plots
How do we know when to use the normal distribution to model the distribution
of data from a random sample?
• Plot the sample percentiles against the percentiles from the normal distribution.
• If the normal distribution is a good approximation, the plotted points
should fall approximately on a straight line.
Example: The vector x in R contains data for which the normal distribution
is a good approximation.
> qqnorm(x)
> qqline(x)
Example: The vector y in R contains data for which the normal distribution
is not a good approximation.
> qqnorm(y)
> qqline(y)
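The data in x and y are not shown here, so the following sketch substitutes simulated samples to reproduce the two situations: a normal sample for the good case and a right-skewed exponential sample for the poor case. The sample sizes, parameters, and seed are arbitrary assumptions.

```r
set.seed(42)                           # arbitrary seed
x <- rnorm(200, mean = 50, sd = 5)     # simulated data that are normal
y <- rexp(200, rate = 1)               # simulated data that are right-skewed

op <- par(mfrow = c(1, 2))             # two plots side by side
qqnorm(x, main = "Normal sample");      qqline(x)
qqnorm(y, main = "Exponential sample"); qqline(y)
par(op)                                # restore plotting settings

# Rough numeric summary: correlation between sample and theoretical quantiles
qx <- qqnorm(x, plot.it = FALSE)
qy <- qqnorm(y, plot.it = FALSE)
cor_x <- cor(qx$x, qx$y)
cor_y <- cor(qy$x, qy$y)
c(cor_x, cor_y)
```

In the first plot the points should track the reference line closely; in the second they should bend away from it, especially in the upper tail.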