
Republic of the Philippines

North Eastern Mindanao State University


Rosario, Tandag City, Surigao del Sur 8300
Telefax No. 086-214-4221
Website: www.sdssu.edu.ph

iSTAT 1 – RESEARCH STATISTICS


First Semester, A. Y. 2022 - 2023

MODULE 2 - PROBABILITY DISTRIBUTIONS

Objectives – Probability distributions play an important role in the application of Statistics. They are used
to model the behavior of many variables of interest. By the end of this lesson, students should be able to
distinguish between discrete and continuous random variables and their probability distributions.

Random Variable
A random variable is a function that assigns a real number to each element in the
sample space. It provides a convenient way of expressing elements of a sample space as numbers. Capital
letters such as X, Y or Z are used to denote random variables.

Example 1. Suppose 3 coins are tossed simultaneously. Let X be the random variable denoting
the number of heads that may turn up. The possible values of X are 0, 1, 2 and 3.

Outcomes               HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
X = Number of Heads     3    2    2    1    2    1    1    0
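
The table in Example 1 can be reproduced by listing the sample space programmatically. The following is a minimal Python sketch; the names `outcomes` and `heads` are illustrative and not part of the module.

```python
from itertools import product

# Sample space for three coins: every sequence of H (head) and T (tail).
outcomes = ["".join(toss) for toss in product("HT", repeat=3)]

# The random variable X maps each outcome to its number of heads.
heads = {outcome: outcome.count("H") for outcome in outcomes}

for outcome, x in heads.items():
    print(outcome, x)
```

Each printed pair corresponds to one column of the table: an outcome and the value of X assigned to it.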

Types of Random Variable

Random variables are classified according to the values that they can assume. The two types of random
variables are discrete and continuous.

Discrete Random Variable – a variable that assumes only a finite or countable number of values,
usually integers. Example: number of students in a class.
Continuous Random Variable – a variable which can assume all values between two
points in a continuous scale. Examples: weight, height, age.

Probability Distribution of a Random Variable


When making estimates about unknown population parameters, values are computed based only
on the sample. When a different sample is used, a different value results. The sample mean is an example
of a statistic that takes on different values when different samples are drawn from a population. These
computed values, called statistics, are treated as values of a random variable. For every value that a
statistic takes on in a particular sample, a corresponding probability can be computed. This leads us to the
idea of a probability distribution of a random variable.
A probability distribution can be viewed as a set of ordered pairs in which the first element is a value of
the random variable and the second element is the associated probability. Probability distributions provide
a way of measuring the accuracy of estimates that are based only on sample information.

Types of Probability Distributions


Discrete Probability Distribution – a table or a formula that provides a listing of all possible
values of a discrete random variable X and their associated probabilities P(x).
Example 2. In Example 1 above, the probability distribution of X is:

X       0     1     2     3
P(x)   1/8   3/8   3/8   1/8
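
The probabilities in Example 2 follow from counting outcomes. The sketch below assumes the three coins are fair, so that all eight outcomes are equally likely; the names `counts` and `distribution` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Eight equally likely outcomes when three fair coins are tossed (assumption).
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(x) = (number of outcomes with x heads) / (total number of outcomes)
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P({x}) = {p}")
```

The probabilities of a discrete distribution must add up to 1, which can be confirmed here with `sum(distribution.values())`.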

A Useful Discrete Probability Distribution – The Binomial Probability Distribution


This is an example of a discrete probability distribution. Some statistical problems involve
repeated trials that are independent and dichotomous (i.e., each trial has two possible outcomes, often called
success and failure). If all trials have an identical probability of success, then this type of experiment is called
a binomial experiment.

A binomial experiment is one that has the following properties:


1. The experiment consists of n repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial. The probability
of failure is q = 1 – p.
4. The repeated trials are independent.

In a binomial probability distribution, the probability of x successes in n trials is given by:


P(x) = [n! / (x! (n – x)!)] p^x q^(n – x)     for values of x = 0, 1, 2, 3, . . . , n.
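
The binomial formula translates directly into a short function. This is a minimal sketch, not a reference implementation; the function name `binomial_pmf` and the trial values n = 4, p = 0.5 are illustrative assumptions, not taken from the module.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    # comb(n, x) equals n! / (x! (n - x)!)
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative values only: n = 4 trials with success probability p = 0.5.
probabilities = [binomial_pmf(x, 4, 0.5) for x in range(5)]
print(probabilities)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probabilities))  # 1.0
```

Summing P(x) over x = 0, 1, ..., n gives 1, as it must for any probability distribution.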

Example 3. Some field representatives of the Environmental Protection Agency are doing spot checks for
water pollution in streams. Historically, 8 out of 10 such tests produce a favorable result,
that is, no pollution. If the field group performs 6 tests, what is the probability of getting
exactly 3 favorable results?

Solution: Let p = 8/10 or 0.8, q = 0.2, n = 6, and x = 3


P(3) = [6! / (3! 3!)] (0.8)^3 (0.2)^3 = 0.08192
Example 4. By carefully screening applicants, the Bank of the Philippine Islands has been able to
limit bad-debt losses on consumer loans to 8 percent. What is the probability that at least
four of the five applicants will repay their loans?

Solution: A success is an applicant who repays, so p = 1 – 0.08 = 0.92, q = 0.08, n = 5, and x = 4 or 5


P(4) = [5! / (4! 1!)] (0.92)^4 (0.08)^1 = 0.2866
P(5) = [5! / (5! 0!)] (0.92)^5 (0.08)^0 = 0.6591
Probability that at least 4 of the 5 applicants repay their loans = P(4) + P(5)
= 0.2866 + 0.6591 = 0.9457
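
An "at least" probability such as the one in Example 4 is a sum of binomial terms. The sketch below recomputes it in Python; `binomial_pmf` and `p_at_least_4` are illustrative names, not part of the module.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.92  # five applicants, each repays with probability 0.92

# "At least 4" means x = 4 or x = 5, so the two probabilities are added.
p_at_least_4 = sum(binomial_pmf(x, n, p) for x in (4, 5))
print(round(p_at_least_4, 4))
```

Carried at full precision the sum is about 0.9456; the 0.9457 in the worked solution comes from adding the two terms after each has been rounded to four decimal places.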

A Useful Continuous Probability Distribution – The Normal Probability Distribution


The normal distribution plays a fundamental role in statistics. In fact, most of the techniques we
learn today are based on the assumption that the data were generated from a normal distribution. While
real data are never exactly normal, the normal distribution is often a useful approximation to the true
population distribution. One advantage of the normal distribution stems from the fact that it is
mathematically tractable and nice results can be obtained. This is frequently not the case for other data-
generating distributions.

Properties of the normal curve

1. The normal curve is bell-shaped.


2. The peak of the curve (i.e., the mode) occurs at the population mean μ.
3. The normal curve is symmetric about and centered at the mean μ. So the median, which
is the equal-areas point, is also the mean, the center of gravity.
4. The spread of the normal curve depends on the standard deviation σ: the larger σ is, the
flatter and more spread out the distribution is.

The points at which a change of curvature takes place are located at a distance σ on either side of
the mean μ. Which of the two normal distributions above has a larger spread?

5. The normal curve is completely determined by the mean μ and the standard deviation σ.

That is, two normally distributed variables having the same mean and standard deviation must
have the same distribution. We often identify a normal curve by stating the corresponding mean μ and
standard deviation σ and calling those the parameters of the normal curve.

If X is normally distributed with mean μ and standard deviation σ, we write X ~ N(μ, σ²).

6. The total area under the normal curve is 1 (or 100%). Almost all the area lies within three
standard deviations of the mean.

7. The normal curve extends indefinitely in both directions, approaching but never touching the
horizontal axis as it does so.

Consequently, once we know the mean and standard deviation of a normally distributed random
variable, we can obtain the percentage of all observations that lie within any specified range by
determining the corresponding area under its associated normal curve. Now the question is: How do we
find areas under a normal curve?

About 68% of the measurements fall within one standard deviation σ of the mean μ.
In symbols, we write P(μ − σ < X < μ + σ ) ≈ 0.68

About 95% of the measurements fall within two standard deviations σ of the mean μ.

In symbols, we write P(μ − 2σ < X < μ + 2σ ) ≈ 0.95


About 99.7% of the measurements fall within three standard deviations σ of the mean μ.

In symbols, we write P(μ − 3σ < X < μ + 3σ ) ≈ 0.997
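
The three percentages above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative. Because the areas do not depend on μ and σ, the standard normal curve is used.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean: P(-k < Z < k)
areas = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(k, round(area, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

These values round to the 68%, 95% and 99.7% quoted in the text.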


Example 5. Students’ Achievement Test scores are normally distributed with mean μ = 500 and
standard deviation σ = 100. Why do SAT scores range between 200 and 800?
Answer. Almost all the possible observations of the normal random variable lie within three
standard deviations to either side of the mean, that is, between 500 – 3(100) = 200 and
500 + 3(100) = 800.
Let X be the verbal SAT scores at a certain high school, where the verbal SAT scores follow a
normal distribution with mean 504 and standard deviation 111. We thus write X ~ N(504, 111²).
How do we draw this distribution? What percent of the observations fall within: 1σ of the mean μ,
2σ of μ, and 3σ of μ?
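
One way to approach the question above is to compute the intervals μ ± kσ for k = 1, 2, 3 and the area over each. This is a sketch using the Python standard library; the names `intervals` and `share` are illustrative.

```python
from statistics import NormalDist

mu, sigma = 504, 111          # verbal SAT scores, X ~ N(504, 111^2)
X = NormalDist(mu, sigma)

intervals = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    share = X.cdf(high) - X.cdf(low)   # P(low < X < high)
    intervals[k] = (low, high)
    print(f"within {k} sd: {low} to {high}, share = {share:.4f}")
```

The intervals are 393 to 615, 282 to 726, and 171 to 837, and the shares agree with the 68%, 95% and 99.7% figures because those percentages hold for every normal distribution.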

A standard normal distribution has mean μ = 0 and has standard deviation σ = 1.


Standard normal probabilities are given in Appendix A1 at the end of this book.

How to use the normal table

The table gives the probability that the standard normal variable Z falls below some specified value
z. Read the first decimal place of z down the leftmost column and the second decimal place across the top
row, then read the probability from the body of the table.

In the figure above, the area between the two values a and b on the horizontal axis of the
curve, where a < b, represents the probability that x lies between the two values, denoted by P(a < x < b).
This probability can be obtained using a Table of Areas under the Normal Curve (found at Appendix
1A or 1B). First, however, the normal random variable x must be transformed into a standard normal random
variable z using the transformation z = (x – μ) / σ. Consequently, we write:
P(a < x < b) = P((a – μ)/σ < z < (b – μ)/σ).
The table gives the area under the standard normal curve for the values of z ranging from -3.49 to
3.49. To illustrate the use of this table, let us find the probability that z is less than -0.67. In the normal
curve illustration below, our concern is the shaded area which is to the left of -0.67. First, we locate the
value of z equal to -0.6 in the leftmost column of the table then move across the row to the column 0.07
where we read 0.2514. Thus, we write: P (z < -0.67) = 0.2514 or 25.14%

Example 6. Find P(Z > 2.33)
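
The table lookup described above, and the right-tail probability asked for in Example 6, can be cross-checked in software. This is a sketch using `statistics.NormalDist` from the Python standard library; it is offered as a check on the table values, not as the module's own method, and the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# The table gives areas to the left of z, which is what cdf() returns.
p_below = Z.cdf(-0.67)        # P(Z < -0.67)

# For an area to the right, subtract the left-hand area from the total area 1.
p_above = 1 - Z.cdf(2.33)     # P(Z > 2.33)

print(round(p_below, 4))  # 0.2514
print(round(p_above, 4))  # 0.0099
```

With the table, the same right-tail result comes from reading P(Z < 2.33) = 0.9901 and computing 1 – 0.9901 = 0.0099.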

Standardizing a normally distributed variable

Subtraction of μ converts the mean to 0 and division by σ converts the standard deviation to 1.
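
The effect of standardizing can be seen on a small data set. The scores below are hypothetical values chosen only for illustration; the sketch treats them as a whole population, so it uses the population standard deviation `pstdev`.

```python
from statistics import mean, pstdev

scores = [40, 45, 50, 55, 60]   # hypothetical data, not from the module

mu = mean(scores)               # population mean
sigma = pstdev(scores)          # population standard deviation

# z = (x - mu) / sigma for every observation
z_scores = [(x - mu) / sigma for x in scores]

print(mean(z_scores))    # approximately 0
print(pstdev(z_scores))  # approximately 1
```

Whatever the original mean and standard deviation were, the standardized values have mean 0 and standard deviation 1, up to floating-point rounding.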

Example 7. Given a normal distribution with a mean of 50 and a standard deviation of 10, find
the probability that a random variable x takes a value between 45 and 62.
Solution: The z values corresponding to x1 = 45 and x2 = 62 are respectively:
z1 = (45 – 50) / 10 = -0.5 and z2 = (62 – 50) / 10 = 1.2
Thus, P(45 < x < 62) = P(-0.5< z < 1.2) = P(z < 1.2) – P(z < -0.5)
= 0.8849 – 0.3085 = 0.5764 or 57.64%
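
The steps of Example 7 (standardize both endpoints, then subtract the two left-hand areas) can be sketched in Python as follows. The variable names are illustrative, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

mu, sigma = 50, 10

# Step 1: convert the endpoints to z values with z = (x - mu) / sigma.
z1 = (45 - mu) / sigma   # -0.5
z2 = (62 - mu) / sigma   # 1.2

# Step 2: P(z1 < Z < z2) = P(Z < z2) - P(Z < z1)
Z = NormalDist()
prob = Z.cdf(z2) - Z.cdf(z1)
print(round(prob, 4))    # 0.5764

# The same area computed directly on the original scale, without standardizing.
X = NormalDist(mu, sigma)
direct = X.cdf(62) - X.cdf(45)
```

Both routes give the same area, which is why a single standard normal table is enough for every normal distribution.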

PROBLEM SET 2

I. Binomial Distribution

1. A bank hires 12 MBAs each year and assigns them to various divisions. After a year’s experience, the
chance that any one MBA will be performing satisfactorily has historically been 40%. In this year’s
group, what is the probability that 6 or more will be performing satisfactorily?

2. The probability that a patient recovers from a delicate heart operation is 80%. What is the probability
that at least 4 of the next 9 patients having this operation recover?

3. A nationwide survey of seniors in big universities reveals that 60% disapprove of wearing a school
uniform. If 15 seniors are selected at random and asked for their opinion, find the probability that
the number who disapprove of wearing a school uniform is:
a. Anywhere from 7 to 9.
b. At most 5.
c. Not less than 8.

II. Normal Distribution

1. A research scientist reports that mice will live an average of 42 months when their diets are sharply
restricted and then enriched with vitamins and proteins. Assuming that the lifetimes of these mice are
normally distributed with a standard deviation of 7.2 months, find the probability that a given mouse
will live:
a. More than 32 months.
b. Less than 28 months.
c. Between 37 and 49 months.

2. Given a normally distributed variable x with a mean of 15 and a standard deviation of 2.2, find:
a. P(x < 14)
b. P(17< x < 21)
c. The value of k such that P(x < k) = 32.28%

3. If a set of scores in a Statistics examination is approximately normally distributed with a mean of
74 and a standard deviation of 7.9, find the probability that a student has a score between 75 and
80.

4. Two students were informed that they received standard scores of 1.15 and –0.3, respectively, on an
examination in philosophy. If their marks were 95 and 66, respectively, find the mean and the standard
deviation.

5. The IQs of 600 applicants to a certain college are approximately normally distributed with a mean
of 110 and a standard deviation of 15. If the college requires an IQ of at least 95, how many of these
students will be rejected on this basis regardless of their other qualifications?

Prepared by: GAMALIEL A. SENOC, Ph. D.


Professor 6
