STA408: Statistics for Science and Engineering

Chapter 1: Probability Distribution

1.1 Probability

Sample Space
The set of all possible outcomes of a statistical experiment is called a sample space and is represented
by the symbol S.

Example 1: Consider tossing a fair die.


If we are interested in the number that shows on the top face, the sample space is
S = {1, 2, 3, 4, 5, 6}.

If we are interested in whether the number is odd or even, the sample space is
S = {odd, even}.

Events
An event is a subset of a sample space.

Example 2: Consider tossing a fair die.


Let A be the event that, when the die is tossed, the number showing on the top face
is divisible by 3.
A = {3, 6}

Complement
The complement of an event A with respect to S is the subset of all elements in S that are not in A. We
denote the complement of A by the symbol A′.

Example 3: Consider the event in Example 2.


The complement of event A is
A′ = {1, 2, 4, 5}.
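
The definitions of sample space, event and complement map directly onto set operations. The following is a minimal Python sketch for the die example; the variable names are illustrative and do not come from the notes.

```python
# Sample space for one toss of a die (Example 1).
S = {1, 2, 3, 4, 5, 6}

# Event A: the number on the top face is divisible by 3 (Example 2).
A = {x for x in S if x % 3 == 0}

# Complement of A with respect to S (Example 3).
A_complement = S - A

print(sorted(A))             # [3, 6]
print(sorted(A_complement))  # [1, 2, 4, 5]
print(A <= S)                # True: an event is a subset of S
```

An event and its complement never overlap, and together they make up the whole sample space.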

Probability of an Event
The probability of an event A is the sum of the weights (probabilities) of all sample points in A and is
denoted by 𝑃(𝐴). Therefore,
0 ≤ 𝑃(𝐴) ≤ 1, 𝑃(∅) = 0, and 𝑃(𝑆) = 1.

If an experiment can result in any one of N different equally likely outcomes, and if exactly n of these
outcomes correspond to event A, then the probability of event A is
$P(A) = \frac{n}{N}$.

If A and A′ are complementary events, then
𝑃(𝐴) + 𝑃(𝐴′) = 1.
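
The equally-likely-outcomes rule and the complement rule can be checked with exact fractions. This is a small sketch that reuses the die event from Example 2; the helper name `classical_probability` is hypothetical.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """P(A) = n / N when all N outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


S = {1, 2, 3, 4, 5, 6}
A = {3, 6}  # number divisible by 3

p_A = classical_probability(A, S)
p_not_A = classical_probability(S - A, S)

print(p_A)            # 1/3
print(p_not_A)        # 2/3
print(p_A + p_not_A)  # 1
```

Using `Fraction` instead of floats keeps the sum exactly 1, which makes the complement rule visible without rounding noise.
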
STA408 Chapter 1: Probability Distribution

1.2 Random Variables

Concept of a Random Variable


A random variable is a function that associates a real number with each element in the sample space.

Example 4: The sample space giving a detailed description of each possible outcome when three
electronic components are tested may be written as
S = {NNN, NND, NDN, DNN, NDD, DND, DDN, DDD}
where N denotes non-defective and D denotes defective.

Let the random variable X be the number of defective items when three electronic
components are tested, then

Outcome(s)             𝑥
NNN                    0
NND or NDN or DNN      1
NDD or DND or DDN      2
DDD                    3
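
A random variable is literally a function on the sample space, which a few lines of Python can make concrete. The sketch below enumerates the sample space of Example 4 and applies the function X to each outcome; the names `sample_space`, `X` and `counts` are illustrative.

```python
from collections import Counter
from itertools import product

# Each of the three components is non-defective (N) or defective (D).
sample_space = ["".join(outcome) for outcome in product("ND", repeat=3)]


def X(outcome):
    """Random variable: number of defective components in the outcome."""
    return outcome.count("D")


# Number of sample points that map to each value x.
counts = Counter(X(outcome) for outcome in sample_space)

for x in sorted(counts):
    print(x, counts[x])
```

The eight outcomes collapse onto the four values 0, 1, 2 and 3, matching the table above.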

Discrete Sample Space


If a sample space contains a finite number of possibilities or an unending sequence with as many
elements as there are whole numbers, it is called a discrete sample space.

A random variable is called a discrete random variable if its set of possible outcomes is countable.

Example 5: Some discrete random variables:


- Number of defective items
- Number of calls in a minute
- Number of accidents on the highway

Continuous Sample Space


If a sample space contains an infinite number of possibilities equal to the number of points on a line
segment, it is called a continuous sample space.

When a random variable can take on values on a continuous scale, it is called a continuous random
variable. In most practical problems, continuous random variables represent measured data.

Example 6: Some continuous random variables:


- Heights
- Weights
- Time
- Distance


1.3 Special Discrete Probability Distribution

Binomial Distribution

A binomial experiment is a probability experiment that satisfies the following four requirements.

- There must be a fixed number of trials.
- Each trial can have only two outcomes, or outcomes that can be reduced to two. These
  outcomes can be considered as either a success or a failure.
- The outcomes of each trial must be independent of one another.
- The probability of success must remain the same for each trial.

The outcomes of a binomial experiment and the corresponding probabilities of these outcomes are called
a binomial distribution.

Notation for symbols used in a Binomial Distribution

𝒑 Probability of a success
𝒒 or 𝟏 − 𝒑 Probability of a failure
𝒏 Number of trials
𝒙 The number of successes in 𝑛 trials.
Note: 𝑥 = 0, 1, 2, … , 𝑛; and 𝑝 + 𝑞 = 1.

Example 7
Ten percent of all DVD players manufactured by a large electronic company are defective. Five DVD
players are randomly selected from the production line of this company. The selected DVD players are
inspected to determine whether each of them is defective or good. Is this experiment a binomial
experiment?
- There are _______________ DVD players (_______________ trials);
- ___________________ outcomes, i.e., ______________________ or _______________________;
- Each DVD player is ______________________________________ of the others;
- The probability of a defective DVD player is __________________, i.e., 𝑝 = _________________.
Since the four conditions of a binomial experiment are satisfied, this is an example of a binomial
experiment.

Binomial Probability Formula


In a binomial experiment, the probability of exactly 𝑥 successes in 𝑛 trials is
$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n.$
Note that 𝑋 is a discrete random variable.

If a random variable 𝑋 has a binomial distribution with 𝑛 trials and the probability of success of each trial
is given as 𝑝, then 𝑋 can be written as
𝑋 ~ 𝐵𝑖𝑛(𝑛, 𝑝)
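
The binomial probability formula translates almost symbol for symbol into code. The sketch below is one possible implementation using only the standard library; the function name `binomial_pmf` is an assumption, and the numbers n = 5, p = 0.10 are taken from the DVD player setting of Example 7.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.10  # five DVD players, 10% defective

# Full probability distribution of the number of defective players.
for x in range(n + 1):
    print(x, round(binomial_pmf(x, n, p), 5))

# The probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```

Summing the probabilities is a quick sanity check that the function really describes a probability distribution.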


Example 8
Refer to Example 7. What is the probability that exactly one of these five DVD players is defective?

Example 9
A fair coin is tossed 3 times. Find the probability of getting exactly two heads.

Example 10
A bag contains a large number of beads of which 45% are yellow. A random sample of 10 beads is taken
from the bag. Use a binomial distribution to calculate the probability that the sample contains
(a) more than 6 yellow beads; (b) exactly 6 yellow beads;
(c) at most 6 yellow beads; (d) less than 6 yellow beads;
(e) at most 6 beads which are not yellow.
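
Most of the work in Example 10 lies in turning each phrase into an inequality on X, the number of yellow beads, with X ~ Bin(10, 0.45). The sketch below shows one way to check hand calculations; the helper names are hypothetical. For part (e), the number of beads that are not yellow is 10 − X, so "at most 6 not yellow" is the same event as "at least 4 yellow".

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 10, 0.45  # X = number of yellow beads in the sample

more_than_6 = 1 - binomial_cdf(6, n, p)          # (a) P(X > 6) = 1 - P(X <= 6)
exactly_6 = binomial_pmf(6, n, p)                # (b) P(X = 6)
at_most_6 = binomial_cdf(6, n, p)                # (c) P(X <= 6)
less_than_6 = binomial_cdf(5, n, p)              # (d) P(X < 6) = P(X <= 5)
at_most_6_not_yellow = binomial_cdf(6, n, 1 - p) # (e) Y = 10 - X ~ Bin(10, 0.55)

for label, value in [("a", more_than_6), ("b", exactly_6), ("c", at_most_6),
                     ("d", less_than_6), ("e", at_most_6_not_yellow)]:
    print(label, round(value, 4))
```

Parts (a) and (c) are complementary, and (c) minus (d) equals (b), which gives two easy consistency checks on the answers.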


The Mean (Expected Value), Variance and Standard Deviation of a Binomial Distribution

Mean, 𝜇 = 𝑛𝑝
Variance, 𝜎² = 𝑛𝑝𝑞 = 𝑛𝑝(1 − 𝑝)
Standard deviation, 𝜎 = √(𝑛𝑝𝑞) = √(𝑛𝑝(1 − 𝑝))

Example 11
A coin is tossed 3 times. Find the mean, variance and standard deviation for the number of tails that will
be obtained.

Example 12
A die is rolled 350 times. Find the expected value, variance and standard deviation for the number of
fours that will be rolled.
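
The three binomial summary formulas need only n and p. The sketch below applies them to the settings of Examples 11 and 12, assuming a fair coin (p = 0.5 for a tail) and a fair die (p = 1/6 for a four); the notes do not state fairness explicitly, so treat both values of p as assumptions. The function name is illustrative.

```python
from math import sqrt


def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of Bin(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)


# Example 11: number of tails in 3 tosses, assuming a fair coin.
print(binomial_summary(3, 0.5))

# Example 12: number of fours in 350 rolls, assuming a fair die.
print(binomial_summary(350, 1 / 6))
```

Note that the standard deviation is always computed from the variance, never the other way round, so rounding should be left until the final step.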

Poisson Distribution

The Poisson distribution is a discrete probability distribution that is useful when

- the sample size 𝑛 is large and the probability 𝑝 is small;
- independent events occur over a period of time, or items are distributed with some density
  over a given area or volume.


Poisson Probability Formula


The probability of 𝑥 occurrences in an interval of time, volume, area, etc., for a variable where 𝜆 is the
mean number of occurrences per unit (time, volume, area, etc.) is
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$
where 𝑒 is the constant 2.718281… .

If a random variable 𝑋 has a Poisson distribution with a mean 𝜆, then 𝑋 can be written as
𝑋 ~ 𝑃𝑜(𝜆)
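
The Poisson probability formula can be sketched in a few lines of standard-library Python. The function name `poisson_pmf` and the value λ = 4 are illustrative choices, not part of the notes.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


lam = 4  # illustrative mean number of occurrences per unit

for x in range(6):
    print(x, round(poisson_pmf(x, lam), 4))

# X has no upper limit, so the probabilities only add up to 1 over all x.
# A long partial sum should already be very close to 1.
partial_total = sum(poisson_pmf(x, lam) for x in range(60))
```

Unlike the binomial case, there is no largest value of x, so "at least" and "more than" probabilities are usually found as one minus a finite sum.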

Example 13
During a laboratory experiment, the average number of radioactive particles passing through a counter in 1
millisecond is 4. What is the probability that, in a given millisecond,
(a) no particles enter the counter; (b) at least 4 particles enter the counter;
(c) more than 4 particles enter the counter; (d) at most 4 particles enter the counter?

Example 14
The number of accidents per week at a certain road intersection has a Poisson distribution with
parameter 2.5. Find the probability that
(a) exactly 5 accidents will occur in a week;
(b) at most 5 accidents will occur in two weeks;
(c) less than 5 accidents will occur in four weeks;
(d) more than 1 accident will occur in three weeks.
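
Parts (b) to (d) of Example 14 ask about intervals longer than one week, so the mean has to be rescaled before the formula is applied. The sketch below assumes the usual Poisson-process property that the count over t weeks is Poisson with mean 2.5t; the helper names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


rate_per_week = 2.5

a = poisson_pmf(5, rate_per_week * 1)      # exactly 5 in one week, mean 2.5
b = poisson_cdf(5, rate_per_week * 2)      # at most 5 in two weeks, mean 5.0
c = poisson_cdf(4, rate_per_week * 4)      # fewer than 5 in four weeks, mean 10.0
d = 1 - poisson_cdf(1, rate_per_week * 3)  # more than 1 in three weeks, mean 7.5

for label, value in [("a", a), ("b", b), ("c", c), ("d", d)]:
    print(label, round(value, 4))
```

The common mistake this guards against is keeping the mean at 2.5 while changing only the number of accidents.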


Example 15
Eight is the average number of oil tankers arriving each day at a certain port. The facilities at the port can
handle at most 15 tankers per day.
(a) What is the probability that on a given day tankers have to be turned away?
(b) Assume that there are 24 hours in a day. What is the probability that there is no tanker arriving
in 6 hours?

Mean (Expected value), Variance and Standard Deviation for a Poisson Distribution

Mean, 𝜇 = 𝜆
Variance, 𝜎² = 𝜆
Standard deviation, 𝜎 = √𝜆

Example 16
Refer to Example 14, where the number of accidents per week at a certain road intersection has a
Poisson distribution with parameter 2.5. Let 𝑋 be the number of accidents per week. Find the expected
number of accidents per week, and the variance and standard deviation of 𝑋.


Probability of Success and the Shape of the Binomial Distribution


For any number of trials 𝑛:
- The binomial probability distribution is symmetric if 𝑝 = 0.5.
- The binomial probability distribution is skewed to the right if 𝑝 < 0.5.
- The binomial probability distribution is skewed to the left if 𝑝 > 0.5.

Probability distribution of 𝑥 for 𝑛 = 4 and 𝑝 = 0.5
𝑥 𝑃(𝑥)
0 0.0625
1 0.2500
2 0.3750
3 0.2500
4 0.0625

Probability distribution of 𝑥 for 𝑛 = 4 and 𝑝 = 0.3
𝑥 𝑃(𝑥)
0 0.2401
1 0.4116
2 0.2646
3 0.0756
4 0.0081

Probability distribution of 𝑥 for 𝑛 = 4 and 𝑝 = 0.8
𝑥 𝑃(𝑥)
0 0.0016
1 0.0256
2 0.1536
3 0.4096
4 0.4096
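
The three tables can be regenerated from the binomial formula, which also makes the direction of the skew easy to see. The sketch below additionally prints the skewness coefficient (1 − 2p)/√(np(1 − p)); that formula is a standard result for the binomial distribution but is not given in these notes, so it is included only as a numerical cross-check. Function names are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_skewness(n, p):
    """Skewness of Bin(n, p); positive means skewed to the right."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))


n = 4
for p in (0.5, 0.3, 0.8):
    table = [round(binomial_pmf(x, n, p), 4) for x in range(n + 1)]
    print("p =", p, table, "skewness =", round(binomial_skewness(n, p), 3))
```

A skewness of zero corresponds to the symmetric case, a positive value to a longer right tail (p < 0.5) and a negative value to a longer left tail (p > 0.5).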


Mean and the Shape of the Poisson Distribution


The Poisson distribution becomes more and more symmetric, even bell-shaped, as the mean
grows large. The figure below shows plots of the probability function for 𝜇 = 0.1, 𝜇 = 2 and 𝜇 = 5.

Poisson Distribution as an Approximation to Binomial Distribution


For a binomial distribution, if 𝑛 is quite large and 𝑝 is small, the conditions begin to resemble the continuous
space or time setting of the Poisson process. Therefore:
- If 𝑛 is large and 𝑝 is close to 0, the Poisson distribution with 𝜇 = 𝑛𝑝 can be used to approximate
  binomial probabilities.
- If 𝑝 is close to 1, we can still use the Poisson distribution to approximate binomial probabilities by
  interchanging what we have defined to be a success and a failure, i.e., by changing 𝑝 to a value
  close to 0.

Let 𝑋 be a binomial random variable with probability distribution 𝐵𝑖𝑛(𝑛, 𝑝). When 𝑛 → ∞, 𝑝 → 0 and
𝑛𝑝 → 𝜇 remains constant,
$\mathrm{Bin}(n, p) \xrightarrow{n \to \infty} \mathrm{Po}(\mu).$

Example 17
In a certain industrial facility, accidents occur infrequently. It is known that the probability of an accident
on any given day is 0.005 and accidents are independent of each other. What is the probability that in any
given period of 400 days there will be an accident on one day?
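
With n = 400 and p = 0.005, Example 17 fits the conditions for the Poisson approximation, with 𝜇 = 𝑛𝑝 = 2. The sketch below compares the exact binomial probability with the Poisson value, reading "an accident on one day" as exactly one day with an accident; that reading is an assumption.

```python
from math import comb, exp, factorial

n, p = 400, 0.005
mu = n * p  # mean of the approximating Poisson distribution

# Exact binomial probability of exactly one accident day.
exact = comb(n, 1) * p**1 * (1 - p) ** (n - 1)

# Poisson approximation with the same mean.
approx = exp(-mu) * mu**1 / factorial(1)

print(round(exact, 4), round(approx, 4))
print("absolute difference:", abs(exact - approx))
```

Both values round to 0.2707, which illustrates how little is lost by the approximation when n is large and p is close to 0.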


1.4 Continuous Probability Distribution

Probability of a continuous random variable


𝑃(−1 ≤ 𝑥 ≤ 2) = area bounded by the curve, 𝑥-axis, 𝑥 = −1 and 𝑥 = 2.

The probability of a continuous random variable 𝑥 between the values 𝑎 and 𝑏 is the area under the
curve and between the vertical lines 𝑥 = 𝑎 and 𝑥 = 𝑏 as given below.
𝑃(𝑎 ≤ 𝑥 ≤ 𝑏) = 𝑃(𝑎 < 𝑥 < 𝑏)
= 𝑃(𝑎 ≤ 𝑥 < 𝑏)
= 𝑃(𝑎 < 𝑥 ≤ 𝑏)

Note: The probability that a continuous random variable 𝑥 assumes a single value is always zero
because a single line has no area under the curve, i.e.,

𝑃(𝑥 = 𝑎) = 𝑃(𝑥 = 𝑏) = 0
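
The "probability as area" idea can be checked numerically. The curve in the original figure is not recoverable from the text, so the sketch below uses the standard normal density purely as a stand-in, and a simple trapezoidal rule for the area; the function names are illustrative.

```python
from math import exp, pi, sqrt


def pdf(x):
    """Standard normal density, used here only as an example curve."""
    return exp(-x * x / 2) / sqrt(2 * pi)


def area(f, a, b, steps=10_000):
    """Trapezoidal approximation of the area under f between a and b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


p_interval = area(pdf, -1, 2)  # P(-1 <= X <= 2): area between x = -1 and x = 2
p_point = area(pdf, 2, 2)      # P(X = 2): an interval of zero width

print(round(p_interval, 4))
print(p_point)
```

Because a single point contributes no area, including or excluding the endpoints a and b does not change the probability.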

A Special Continuous Probability Distribution – Normal Distribution


A normal distribution is a continuous, symmetric and bell-shaped distribution of a variable.

Figure 3: A normal distribution curve

Properties of a normal distribution

- A normal distribution curve is bell-shaped.
- The mean, median and mode are equal and located in the centre of the distribution.
- The curve is symmetric about the mean.
- The curve is continuous.
- The curve never touches the 𝑥-axis.
- The total area under the curve is 1 or 100%.


If the random variable 𝑋 has a normal distribution with a mean 𝜇 and a standard deviation 𝜎, then 𝑋 can
be written as
𝑋 ~ 𝑁(𝜇, 𝜎²).

Standard Normal Distribution


To evaluate probabilities associated with normal distributions, the standard normal distribution is
used.
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation
of 1, denoted by
𝑍 ~ 𝑁(𝜇 = 0, 𝜎² = 1²).

All normally distributed variables 𝑋 can be transformed into the standard normally distributed variable
𝑍 by the formula for a standard score:
$z = \frac{x - \mu}{\sigma}$
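
The standard score and the standard normal table can both be reproduced in code. The sketch below uses the identity Φ(z) = ½(1 + erf(z/√2)) from the standard library in place of a printed table; the function names and the values x = 110, 𝜇 = 100, 𝜎 = 5 are hypothetical.

```python
from math import erf, sqrt


def standardize(x, mu, sigma):
    """Standard score z = (x - mu) / sigma."""
    return (x - mu) / sigma


def phi(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


print(standardize(110, 100, 5))        # 2.0
print(round(phi(1.0), 4))              # 0.8413  P(Z < 1.0)
print(round(1 - phi(1.0), 4))          # 0.1587  P(Z > 1.0)
print(round(phi(1.0) - phi(-1.0), 4))  # 0.6827  P(-1.0 < Z < 1.0)
```

Right-tail areas come from 1 − Φ(z), and areas between two values come from a difference of two Φ values, exactly as when reading a table.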

Example 18
Given 𝑍 ~ 𝑁(0, 1²), find

(a) 𝑃(𝑍 > 0.0)
(b) 𝑃(𝑍 > 1.0)
(c) 𝑃(𝑍 < −1.0)
(d) 𝑃(𝑍 < 1.0)
(e) 𝑃(𝑍 > −1.0)
(f) 𝑃(−1.0 < 𝑍 < 1.0)
(g) 𝑃(𝑍 < −1.0 or 𝑍 > 1.0)
(h) 𝑃(𝑍 = 1.0)

[Figures: one standard normal curve per part, with the 𝑧-axis marked at −1.00, 0 and 1.00, for shading the required area.]


Example 19
Given that 𝑋 ~ 𝑁(56, 10²), find
(a) 𝑃(𝑋 > 68) (b) 𝑃(56 < 𝑋 < 65) (c) 𝑃(42 ≤ 𝑋 ≤ 52) (d) 𝑃(𝑋 = 59)
(e) 𝑃(52 < 𝑋 < 65)
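
Hand answers to Example 19 can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is only a checking aid, not the table-based method the notes intend. Note that `NormalDist` takes the standard deviation, so 𝑁(56, 10²) is entered with `sigma=10`.

```python
from statistics import NormalDist

X = NormalDist(mu=56, sigma=10)  # N(56, 10^2): standard deviation is 10

a = 1 - X.cdf(68)          # P(X > 68)
b = X.cdf(65) - X.cdf(56)  # P(56 < X < 65)
c = X.cdf(52) - X.cdf(42)  # P(42 <= X <= 52)
d = 0.0                    # P(X = 59): a single value has zero probability
e = X.cdf(65) - X.cdf(52)  # P(52 < X < 65)

for label, value in [("a", a), ("b", b), ("c", c), ("d", d), ("e", e)]:
    print(label, round(value, 4))
```

Small differences in the fourth decimal place compared with table answers are expected, because tables round 𝑧 to two decimal places.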

Example 20
Given a standard normal distribution, find the value of 𝑘 such that
(a) 𝑃(𝑍 > 𝑘) = 0.3015 (b) 𝑃(𝑘 < 𝑍 < −0.18) = 0.4197
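
Example 20 runs the table lookup in reverse: a probability is given and the 𝑧 value is wanted. The sketch below uses `NormalDist.inv_cdf` from the standard library as a stand-in for the reverse lookup, after rewriting each condition as a left-tail area. The variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a) P(Z > k) = 0.3015  is the same as  P(Z < k) = 1 - 0.3015.
k_a = Z.inv_cdf(1 - 0.3015)

# (b) P(k < Z < -0.18) = 0.4197  is the same as
#     P(Z < k) = P(Z < -0.18) - 0.4197.
k_b = Z.inv_cdf(Z.cdf(-0.18) - 0.4197)

print(round(k_a, 2), round(k_b, 2))
```

Substituting each 𝑘 back into the original probability statement is a reliable way to confirm that the sign and the tail were handled correctly.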

Example 21
Given a normally distributed random variable with 𝜇 = 40 and 𝜎 = 6, find the value of 𝑥 that has
(a) 15% of the area to the right; (b) 45% of the area to the left.
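
Example 21 combines a reverse lookup with the standard-score formula rearranged as 𝑥 = 𝜇 + 𝑧𝜎. The sketch below assumes the variable is normally distributed, as the surrounding section implies, and uses `NormalDist.inv_cdf` in place of a table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 40, 6
Z = NormalDist()  # standard normal

# (a) 15% of the area to the right means 85% of the area to the left.
z_a = Z.inv_cdf(1 - 0.15)
x_a = mu + z_a * sigma

# (b) 45% of the area to the left: z is negative because 0.45 < 0.5.
z_b = Z.inv_cdf(0.45)
x_b = mu + z_b * sigma

print(round(x_a, 2), round(x_b, 2))
```

A quick plausibility check: the value in (a) must lie above the mean of 40, and the value in (b) slightly below it.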

Example 22
During production in a cement plant, test cubes of cement are taken at regular intervals and their
compressive strengths, in kg/cm², are determined. Analysis of data over a long time has shown that
compressive strength is normally distributed with a mean of 468 kg/cm² and a standard deviation of 16
kg/cm². Calculate the probability that a randomly chosen cube has a compressive strength
(a) greater than 480 kg/cm²; (b) between 450 kg/cm² and 475 kg/cm².
