
What is a Probability Distribution?

A probability distribution is a table or an equation that links each outcome of a statistical
experiment with its probability of occurrence.

Probability Distribution Prerequisites


To understand probability distributions, it is important to understand variables, random
variables, and some notation.

A variable is a symbol (A, B, x, y, etc.) that can take on any of a specified set of
values.

When the value of a variable is the outcome of a statistical experiment, that variable is
a random variable.

Generally, statisticians use a capital letter to represent a random variable and a lower-case
letter to represent one of its values. For example,

X represents the random variable X.

P(X) represents the probability of X.

P(X = x) refers to the probability that the random variable X is equal to a particular
value, denoted by x. As an example, P(X = 1) refers to the probability that the random
variable X is equal to 1.

Probability Distributions
An example will make clear the relationship between random variables and probability
distributions. Suppose you flip a coin two times. This simple statistical experiment can have
four possible outcomes: HH, HT, TH, and TT. Now, let the variable X represent the number
of Heads that result from this experiment. The variable X can take on the values 0, 1, or 2. In
this example, X is a random variable because its value is determined by the outcome of a
statistical experiment.
A probability distribution is a table or an equation that links each outcome of a statistical
experiment with its probability of occurrence. Consider the coin flip experiment described
above. The table below, which associates each outcome with its probability, is an example of
a probability distribution.
Number of heads     Probability
0                   0.25
1                   0.50
2                   0.25

The above table represents the probability distribution of the random variable X.
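The table can be reproduced by listing the four equally likely outcomes and counting the heads in each one. The following is a minimal Python sketch of that idea; the variable names are illustrative and are not part of the original lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, float(p))
```

Exact fractions are used here so that the probabilities sum to exactly 1; plain floats would work equally well for this small example.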

Cumulative Probability Distributions


A cumulative probability refers to the probability that the value of a random variable falls
within a specified range.
Let us return to the coin flip experiment. If we flip a coin two times, we might ask: What is
the probability that the coin flips would result in one or fewer heads? The answer would be a
cumulative probability. It would be the probability that the coin flip experiment results in
zero heads plus the probability that the experiment results in one head.
P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.25 + 0.50 = 0.75
Like a probability distribution, a cumulative probability distribution can be represented by a
table or an equation. In the table below, the cumulative probability refers to the probability
that the random variable X is less than or equal to x.
Number of heads: x    Probability: P(X = x)    Cumulative probability: P(X ≤ x)
0                     0.25                     0.25
1                     0.50                     0.75
2                     0.25                     1.00
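A cumulative probability distribution is a running total of the ordinary probabilities. As a small sketch in Python (the list names are chosen here for illustration), the cumulative column for the coin flip experiment can be computed like this:

```python
from itertools import accumulate

# Probability distribution of X = number of heads in two coin flips.
values = [0, 1, 2]
probabilities = [0.25, 0.50, 0.25]

# P(X <= x) is the sum of P(X = v) over every value v up to and including x.
cumulative = list(accumulate(probabilities))

for x, p, c in zip(values, probabilities, cumulative):
    print(x, p, c)
```

The last cumulative value is always 1, because the random variable must take one of its possible values.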

Uniform Probability Distribution


The simplest probability distribution occurs when all of the values of a random variable occur
with equal probability. This probability distribution is called the uniform distribution.
Uniform Distribution. Suppose the random variable X can assume k different values.
Suppose also that P(X = xk) is the same for every value. Then,
P(X = xk) = 1/k

Example 1
Suppose a die is tossed. What is the probability that the die will land on 5?
Solution: When a die is tossed, there are 6 possible outcomes, represented by S = { 1, 2, 3, 4,
5, 6 }. Let the random variable X be the number on which the die lands. Each outcome is equally
likely to occur, so X has a uniform distribution. Therefore, P(X = 5) = 1/6.

Example 2
Suppose we repeat the die-tossing experiment described in Example 1. This time, we ask:
What is the probability that the die will land on a number that is smaller than 5?

Solution: When a die is tossed, there are 6 possible outcomes represented by: S = { 1, 2, 3, 4,
5, 6 }. Each possible outcome is equally likely to occur. Thus, we have a uniform distribution.
This problem involves a cumulative probability. The probability that the die will land on a
number smaller than 5 is equal to:
P( X < 5 ) = P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) = 1/6 + 1/6 + 1/6 + 1/6 = 2/3
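Both die examples can be checked with a few lines of Python. This is only a sketch, assuming a fair six-sided die; the names faces and p_each are illustrative.

```python
from fractions import Fraction

# A fair die: k = 6 equally likely values, so P(X = x) = 1/k for each face.
faces = [1, 2, 3, 4, 5, 6]
p_each = Fraction(1, len(faces))

# Example 1: P(X = 5).
p_five = p_each

# Example 2: P(X < 5) is the sum of P(X = x) over the faces 1, 2, 3, and 4.
p_less_than_five = sum(p_each for face in faces if face < 5)

print(p_five, p_less_than_five)
```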
All probability distributions can be classified as discrete probability distributions or as
continuous probability distributions, depending on whether they define probabilities
associated with discrete variables or continuous variables.

Discrete vs. Continuous Variables


If a variable can take on any value between two specified values, it is called a continuous
variable; otherwise, it is called a discrete variable.
Some examples will clarify the difference between discrete and continuous variables.

Suppose the fire department mandates that all fire fighters must weigh between 150
and 250 pounds. The weight of a fire fighter would be an example of a continuous
variable, since a fire fighter's weight could take on any value between 150 and 250
pounds.

Suppose we flip a coin and count the number of heads. The number of heads could be
any integer value between 0 and plus infinity. However, it could not be any number
between 0 and plus infinity. We could not, for example, get 2.5 heads. Therefore, the
number of heads must be a discrete variable.

Just like variables, probability distributions can be classified as discrete or continuous.

Discrete Probability Distributions


If a random variable is a discrete variable, its probability distribution is called a discrete
probability distribution.
An example will make this clear. Suppose you flip a coin two times. This simple statistical
experiment can have four possible outcomes: HH, HT, TH, and TT. Now, let the random
variable X represent the number of Heads that result from this experiment. The random
variable X can only take on the values 0, 1, or 2, so it is a discrete random variable.
The probability distribution for this statistical experiment appears below.
Number of heads     Probability
0                   0.25
1                   0.50
2                   0.25

The above table represents a discrete probability distribution because it relates each value of
a discrete random variable with its probability of occurrence. In subsequent lessons, we will
cover the following discrete probability distributions.

Binomial probability distribution

Hypergeometric probability distribution

Multinomial probability distribution

Negative binomial distribution

Poisson probability distribution

Note: With a discrete probability distribution, each possible value of the discrete random
variable can be associated with a non-zero probability. Thus, a discrete probability
distribution can always be presented in tabular form.

Continuous Probability Distributions


If a random variable is a continuous variable, its probability distribution is called a
continuous probability distribution.
A continuous probability distribution differs from a discrete probability distribution in several
ways.

The probability that a continuous random variable will assume a particular value is
zero.

As a result, a continuous probability distribution cannot be expressed in tabular form.

Instead, an equation or formula is used to describe a continuous probability
distribution.

Most often, the equation used to describe a continuous probability distribution is called a
probability density function. Sometimes, it is referred to as a density function, a PDF, or a
pdf. For a continuous probability distribution, the density function has the following
properties:

Since the continuous random variable is defined over a continuous range of values
(called the domain of the variable), the graph of the density function will also be
continuous over that range.

The area bounded by the curve of the density function and the x-axis is equal to 1,
when computed over the domain of the variable.

The probability that a random variable assumes a value between a and b is equal to
the area under the density function bounded by a and b.

For example, consider the probability density function shown in the graph below. Suppose we
wanted to know the probability that the random variable X was less than or equal to a. The
probability that X is less than or equal to a is equal to the area under the curve bounded by a
and minus infinity - as indicated by the shaded area.

Note: The shaded area in the graph represents the probability that the random variable X is
less than or equal to a. This is a cumulative probability. However, the probability that X is
exactly equal to a would be zero. A continuous random variable can take on an infinite
number of values. The probability that it will equal a specific value (such as a) is always zero.
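To make the "area under the curve" idea concrete, the sketch below assumes that X follows a standard normal distribution. That choice is an assumption made only for illustration, since the lesson's graph does not name a specific density. It uses NormalDist from Python's standard library, whose cdf method returns the area under the density to the left of a point.

```python
from statistics import NormalDist

# Assumed distribution for illustration: standard normal (mean 0, sd 1).
X = NormalDist(mu=0.0, sigma=1.0)

a, b = -1.0, 1.0

# P(X <= a): area under the density from minus infinity up to a.
p_le_a = X.cdf(a)

# P(a <= X <= b): area under the density between a and b.
p_between = X.cdf(b) - X.cdf(a)

# P(X = a): an interval of zero width has zero area.
p_exactly_a = X.cdf(a) - X.cdf(a)

print(round(p_le_a, 4), round(p_between, 4), p_exactly_a)
```

The same pattern applies to any continuous distribution with a known cumulative distribution function: interval probabilities are differences of cumulative probabilities.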

Binomial Distribution
A binomial random variable is the number of successes x in n repeated trials of a binomial
experiment. The probability distribution of a binomial random variable is called a binomial
distribution.
Suppose we flip a coin two times and count the number of heads (successes). The binomial
random variable is the number of heads, which can take on values of 0, 1, or 2. The binomial
distribution is presented below.
Number of heads     Probability
0                   0.25
1                   0.50
2                   0.25

The binomial distribution has the following properties:

The mean of the distribution (μ_x) is equal to n * P.

The variance (σ_x^2) is n * P * ( 1 - P ).

The standard deviation (σ_x) is sqrt[ n * P * ( 1 - P ) ].

Binomial Formula and Binomial Probability

The binomial probability refers to the probability that a binomial experiment results in
exactly x successes. For example, in the above table, we see that the binomial probability of
getting exactly one head in two coin flips is 0.50.
Given x, n, and P, we can compute the binomial probability based on the binomial formula:
Binomial Formula. Suppose a binomial experiment consists of n trials and results in x
successes. If the probability of success on an individual trial is P, then the binomial
probability is:
b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
or
b(x; n, P) = { n! / [ x! (n - x)! ] } * P^x * (1 - P)^(n - x)

Example 1
Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?
Solution: This is a binomial experiment in which the number of trials is equal to 5, the
number of successes is equal to 2, and the probability of success on a single trial is 1/6 or
about 0.167. Therefore, the binomial probability is:
b(2; 5, 0.167) = 5C2 * (0.167)^2 * (0.833)^3
b(2; 5, 0.167) = 0.161
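The binomial formula translates directly into code. The sketch below is one way to write it in Python; the function name binomial_probability is a label chosen here, not something defined in the lesson. It reproduces the die example and also evaluates the mean, variance, and standard deviation formulas given earlier.

```python
from math import comb, sqrt


def binomial_probability(x: int, n: int, p: float) -> float:
    """b(x; n, P) = nCx * P^x * (1 - P)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# A die is tossed 5 times; a "success" is rolling a four, so P = 1/6.
n, p = 5, 1 / 6
prob_two_fours = binomial_probability(2, n, p)

# Properties of the binomial distribution.
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print(round(prob_two_fours, 3), mean, variance, std_dev)
```

Using the exact value 1/6 rather than the rounded 0.167 avoids a small rounding error, although both give 0.161 to three decimal places.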

To solve this problem, we start by using the binomial distribution to find
the probability that a packet of 10 blades has zero defective blades or two
defective blades.
Let X be the number of defective blades in a packet. X has the binomial
distribution with n = 10 trials and success probability p = 0.002 .
In general, if X has the binomial distribution with n trials and a success
probability of p then
P[X = x] = n!/(x!(n-x)!) * p^x * (1-p)^(n-x)
for values of x = 0, 1, 2, ..., n
P[X = x] = 0 for any other value of x.
The probability mass function is derived by counting the number of combinations
of x objects chosen from n objects, each with a total of x successes and n - x failures.
Or, to be more accurate, the binomial is the sum of n independent and identically
distributed Bernoulli trials.
X ~ Binomial( n , p )
the mean of the binomial distribution is n * p = 0.02
the variance of the binomial distribution is n * p * (1 - p) = 0.01996
the standard deviation is the square root of the variance = sqrt( n * p * (1 - p) ) =
0.1412799
P( X = 0 ) = 0.980179043351949
P( X = 2 ) = 0.0001771400795612779

the Poisson approximation to the binomial works when n is large and p is small.
Here you have n = 10 and p = 0.002.
Let Y be the number of defects in a packet of blades. Y has the Poisson
distribution, approximately, with parameter 0.02.
In general you have:
Y ~ Poisson( λt )

P(Y = y) = ( λt )^y * exp( -λt ) / y! for y = 0, 1, 2, 3, 4, ...

P(Y = y) = 0 otherwise
the mean of the Poisson distribution is the parameter, λt
the variance of the Poisson distribution is the parameter, λt
In this problem we have
λ = 0.002
t = 10 time unit(s)
this results in our random variable Y ~ Poisson( 0.02 )

Find P(Y = 0 ) = 0.02 ^ 0 * exp( -0.02 ) / 0 ! = 0.9801987


Find P(Y = 2 ) = 0.02 ^ 2 * exp( -0.02 ) / 2 ! = 0.0001960397
these approximations are not too bad.
if you have 100,000 packets we would expect
100,000 * 0.9801987 = 98019.87 to have zero defects
100,000 * 0.0001960397 = 19.60397 to have two defects
if you have 1,000,000 packets we would expect
980198.7 to have zero defects
196.0397 to have two defects
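The comparison between the exact binomial probabilities and the Poisson approximation can be reproduced with a short Python sketch. The function names binomial_pmf and poisson_pmf are illustrative, and the Poisson mean is taken to be n * p = 0.02, as in the text.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P[X = x] = n!/(x!(n-x)!) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(y: int, mean: float) -> float:
    """P(Y = y) = mean^y * exp(-mean) / y!."""
    return mean**y * exp(-mean) / factorial(y)


n, p = 10, 0.002   # 10 blades per packet, defect probability 0.002
mean = n * p       # Poisson parameter, 0.02

for defects in (0, 2):
    exact = binomial_pmf(defects, n, p)
    approx = poisson_pmf(defects, mean)
    print(defects, exact, approx)
```

With n this small, the two-defect probabilities differ by roughly ten percent in relative terms, even though both are tiny in absolute terms.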

Sampling Distributions
Suppose that we draw all possible samples of size n from a given population. Suppose further
that we compute a statistic (e.g., a mean, proportion, standard deviation) for each sample. The
probability distribution of this statistic is called a sampling distribution. And the standard
deviation of this statistic is called the standard error.

Variability of a Sampling Distribution


The variability of a sampling distribution is measured by its variance or its standard
deviation. The variability of a sampling distribution depends on three factors:

N: The number of observations in the population.

n: The number of observations in the sample.

The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution has
roughly the same standard error, whether we sample with or without replacement. On the
other hand, if the sample represents a significant fraction (say, 1/20) of the population size,
the standard error will be meaningfully smaller, when we sample without replacement.

Sampling Distribution of the Mean


Suppose we draw all possible samples of size n from a population of size N. Suppose further
that we compute a mean score for each sample. In this way, we create a sampling distribution
of the mean.
We know the following about the sampling distribution of the mean. The mean of the
sampling distribution (μ_x) is equal to the mean of the population (μ). And the standard error of
the sampling distribution (σ_x) is determined by the standard deviation of the population (σ),
the population size (N), and the sample size (n). These relationships are shown in the
equations below:
μ_x = μ

and

σ_x = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]

In the standard error formula, the factor sqrt[ (N - n ) / (N - 1) ] is called the finite population
correction or fpc. When the population size is very large relative to the sample size, the fpc is
approximately equal to one; and the standard error formula can be approximated by:
σ_x = σ / sqrt(n).
You often see this "approximate" formula in introductory statistics texts. As a general rule, it
is safe to use the approximate formula when the sample size is no bigger than 1/20 of the
population size.

Sampling Distribution of the Proportion


In a population of size N, suppose that the probability of the occurrence of an event (dubbed a
"success") is P; and the probability of the event's non-occurrence (dubbed a "failure") is Q.
From this population, suppose that we draw all possible samples of size n. And finally, within
each sample, suppose that we determine the proportion of successes p and failures q. In this
way, we create a sampling distribution of the proportion.
We find that the mean of the sampling distribution of the proportion (μ_p) is equal to the
probability of success in the population (P). And the standard error of the sampling
distribution (σ_p) is determined by the standard deviation of the population (σ), the population
size, and the sample size. These relationships are shown in the equations below:
μ_p = P
σ_p = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]
σ_p = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]
where σ = sqrt[ PQ ].
Like the formula for the standard error of the mean, the formula for the standard error of the
proportion uses the finite population correction, sqrt[ (N - n ) / (N - 1) ]. When the population
size is very large relative to the sample size, the fpc is approximately equal to one; and the
standard error formula can be approximated by:
σ_p = sqrt[ PQ/n ]
You often see this "approximate" formula in introductory statistics texts. As a general rule, it
is safe to use the approximate formula when the sample size is no bigger than 1/20 of the
population size.

Central Limit Theorem


The central limit theorem states that the sampling distribution of the mean of any
independent, random variable will be normal or nearly normal, if the sample size is large
enough.
How large is "large enough"? The answer depends on two factors.

Requirements for accuracy. The more closely the sampling distribution needs to
resemble a normal distribution, the more sample points will be required.

The shape of the underlying population. The more closely the original population
resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when the
population distribution is roughly bell-shaped. Others recommend a sample size of at least
40. But if the original population is distinctly not normal (e.g., is badly skewed, has multiple
peaks, and/or has outliers), researchers like the sample size to be even larger.
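A small simulation can illustrate the theorem. The sketch below is an assumption-laden example rather than a proof: it assumes an exponential population with mean 1 and standard deviation 1 (strongly skewed), a sample size of 40, and 2,000 repeated samples, all chosen here for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

n = 40              # observations per sample
num_samples = 2000  # number of samples drawn

# Assumed population: exponential with mean 1 and standard deviation 1,
# which is strongly right-skewed and clearly not normal.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

center = mean(sample_means)
spread = stdev(sample_means)
expected_spread = 1 / sqrt(n)  # sigma / sqrt(n) with sigma = 1

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean.
within_one = sum(abs(m - center) <= spread for m in sample_means) / num_samples

print(center, spread, expected_spread, within_one)
```

If the sample means behave approximately normally, the center should be close to 1, the spread close to 1 / sqrt(40), and the share within one standard deviation close to 0.68.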

T-Distribution vs. Normal Distribution


The t distribution and the normal distribution can both be used with statistics that have a
bell-shaped distribution. This suggests that we might use either the t-distribution or the normal
distribution to analyze sampling distributions. Which should we choose?
Guidelines exist to help you make that choice. Some focus on the population standard
deviation.

If the population standard deviation is known, use the normal distribution.

If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

If the sample size is large, use the normal distribution. (See the discussion above in
the section on the Central Limit Theorem to understand what is meant by a "large"
sample.)

If the sample size is small, use the t-distribution.

In practice, researchers employ a mix of the above guidelines. On this site, we use the normal
distribution when the population standard deviation is known and the sample size is large. We
might use either distribution when standard deviation is unknown and the sample size is very
large. We use the t-distribution when the sample size is small, unless the underlying
distribution is not normal. The t distribution should not be used with small samples from
populations that are not approximately normal.

Test Your Understanding


In this section, we offer two examples that illustrate how sampling distributions are used to
solve common statistical problems. In each of these problems, the population standard deviation
is known, and the sample size is large. So you should use the Normal Distribution Calculator,
rather than the t-Distribution Calculator, to compute probabilities for these problems.

Normal Distribution Calculator

The normal calculator solves common statistical problems, based on the normal distribution.
The calculator computes cumulative probabilities, based on three simple inputs. Simple
instructions guide you to an accurate solution, quickly and easily. If anything is unclear,
frequently-asked questions and sample problems provide straightforward explanations. The
calculator is free. It can be found under the Stat Tables tab, which appears in the header of
every Stat Trek web page.
Example 1
Assume that a school district has 10,000 6th graders. In this district, the average weight of a
6th grader is 80 pounds, with a standard deviation of 20 pounds. Suppose you draw a random
sample of 50 students. What is the probability that the average weight of a sampled student
will be less than 75 pounds?
Solution: To solve this problem, we need to define the sampling distribution of the mean.
Because our sample size is greater than 30, the Central Limit Theorem tells us that the
sampling distribution will approximate a normal distribution.
To define our normal distribution, we need to know both the mean of the sampling
distribution and the standard deviation. Finding the mean of the sampling distribution is easy,
since it is equal to the mean of the population. Thus, the mean of the sampling distribution is
equal to 80.
The standard deviation of the sampling distribution can be computed using the following
formula.
σ_x = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]
σ_x = [ 20 / sqrt(50) ] * sqrt[ (10,000 - 50 ) / (10,000 - 1) ] = (20/7.071) * (0.9975) = 2.82
Let's review what we know and what we want to know. We know that the sampling
distribution of the mean is normally distributed with a mean of 80 and a standard deviation of
2.82. We want to know the probability that a sample mean is less than or equal to 75 pounds.
Because we know the population standard deviation and the sample size is large, we'll use the
normal distribution to find probability. To solve the problem, we plug these inputs into the
Normal Probability Calculator: mean = 80, standard deviation = 2.82, and normal random
variable = 75. The Calculator tells us that the probability that the average weight of a sampled
student is less than 75 pounds is equal to 0.038.
Note: Since the population size is more than 20 times greater than the sample size, we could
have used the "approximate" formula σ_x = [ σ / sqrt(n) ] to compute the standard error. Had
we done that, we would have found a standard error equal to [ 20 / sqrt(50) ] or 2.83.
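For readers who prefer code to an online calculator, the same computation can be sketched in Python with the standard library's NormalDist class. This is an alternative to the calculator described in the lesson, not a description of how that calculator works; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N, n = 10_000, 50        # population size and sample size
mu, sigma = 80.0, 20.0   # population mean and standard deviation (pounds)

# Standard error of the mean, including the finite population correction.
fpc = sqrt((N - n) / (N - 1))
standard_error = (sigma / sqrt(n)) * fpc

# Sampling distribution of the mean: approximately normal(mu, standard_error).
sampling_distribution = NormalDist(mu, standard_error)

# P(sample mean < 75) is the cumulative probability at 75.
p_below_75 = sampling_distribution.cdf(75)

print(round(standard_error, 2), round(p_below_75, 3))
```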
Example 2
Find the probability that of the next 120 births, no more than 40% will be boys. Assume equal
probabilities for the births of boys and girls. Assume also that the number of births in the
population (N) is very large, essentially infinite.
Solution: The Central Limit Theorem tells us that the proportion of boys in 120 births will be
approximately normally distributed.
The mean of the sampling distribution will be equal to the mean of the population
distribution. In the population, half of the births result in boys; and half, in girls. Therefore,
the probability of boy births in the population is 0.50. Thus, the mean proportion in the
sampling distribution should also be 0.50.
The standard deviation of the sampling distribution (i.e., the standard error) can be computed
using the following formula.
σ_p = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]
Here, the finite population correction is equal to 1.0, since the population size (N) was
assumed to be infinite. Therefore, the standard error formula reduces to:
σ_p = sqrt[ PQ/n ]
σ_p = sqrt[ (0.5)(0.5)/120 ] = sqrt[0.25/120 ] = 0.04564
Let's review what we know and what we want to know. We know that the sampling
distribution of the proportion is normally distributed with a mean of 0.50 and a standard
deviation of 0.04564. We want to know the probability that no more than 40% of the sampled
births are boys.
Because we know the population standard deviation and the sample size is large, we'll use the
normal distribution to find probability. To solve the problem, we plug these inputs into the
Normal Probability Calculator: mean = .5, standard deviation = 0.04564, and the normal
random variable = .4. The Calculator tells us that the probability that no more than 40% of
the sampled births are boys is equal to 0.014.
Note: This problem can also be treated as a binomial experiment. Elsewhere, we showed how
to analyze a binomial experiment. The binomial experiment is actually the more exact
analysis. It produces a probability of 0.018 (versus a probability of 0.014 that we found using
the normal distribution). Without a computer, the binomial approach is computationally
demanding. Therefore, many statistics texts emphasize the approach presented above, which
uses the normal distribution to approximate the binomial.
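With a computer, both approaches take only a few lines. The Python sketch below computes the normal approximation and the exact binomial probability side by side; it assumes that "no more than 40%" of 120 births means at most 48 boys, and the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, P = 120, 0.5   # number of births and probability that a birth is a boy
Q = 1 - P

# Normal approximation: the sample proportion is roughly normal(P, sqrt(PQ/n)).
standard_error = sqrt(P * Q / n)
p_normal = NormalDist(P, standard_error).cdf(0.40)

# Exact binomial: 40% of 120 births is 48, so sum P(X = k) for k = 0..48.
max_boys = 48
p_exact = sum(comb(n, k) * P**k * Q ** (n - k) for k in range(max_boys + 1))

print(round(standard_error, 5), round(p_normal, 3), round(p_exact, 3))
```

The gap between the two answers comes mostly from treating a count, which moves in whole births, as a continuous quantity.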

Difference Between Proportions


Statistics problems often involve comparisons between two independent sample proportions.
This lesson explains how to compute probabilities associated with differences between
proportions.

Difference Between Proportions: Theory

Suppose we have two populations with proportions equal to P1 and P2. Suppose further that
we take all possible samples of size n1 and n2. And finally, suppose that the following
assumptions are valid.

The size of each population is large relative to the sample drawn from the population.
That is, N1 is large relative to n1, and N2 is large relative to n2. (In this context,
populations are considered to be large if they are at least 20 times bigger than their
sample.)

The samples from each population are big enough to justify using a normal
distribution to model differences between proportions. The sample sizes will be big
enough when the following conditions are met: n1P1 > 10, n1(1 - P1) > 10, n2P2 > 10,
and n2(1 - P2) > 10. (This criterion requires that more than 20 observations be sampled
from each population. When P1 or P2 is farther from 0.5, even more
observations are required.)

The samples are independent; that is, observations in population 1 are not affected by
observations in population 2, and vice versa.

Given these assumptions, we know the following.

The set of differences between sample proportions will be normally distributed. We
know this from the central limit theorem.

The expected value of the difference between all possible sample proportions is equal
to the difference between population proportions. Thus, E(p1 - p2) = P1 - P2.

The standard deviation of the difference between sample proportions (σ_d) is
approximately equal to:
σ_d = sqrt{ [P1(1 - P1) / n1] + [P2(1 - P2) / n2] }

It is straightforward to derive the last bullet point, based on material covered in previous
lessons. The derivation starts with a recognition that the variance of the difference between
independent random variables is equal to the sum of the individual variances. Thus,
σ_d^2 = σ_(p1 - p2)^2 = σ1^2 + σ2^2
where σ1^2 and σ2^2 are the variances of the sample proportions p1 and p2.
If the populations N1 and N2 are both large relative to n1 and n2, respectively, then
σ1^2 = P1(1 - P1) / n1

And

σ2^2 = P2(1 - P2) / n2

Therefore,
σ_d^2 = [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ]
And
σ_d = sqrt{ [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ] }

Difference Between Proportions: Sample Problem

In this section, we work through a sample problem to show how to apply the theory presented
above. In this example, we will use Stat Trek's Normal Distribution Calculator to compute
probabilities. The calculator is free.

Problem 1
In one state, 52% of the voters are Republicans, and 48% are Democrats. In a second state,
47% of the voters are Republicans, and 53% are Democrats. Suppose 100 voters are surveyed
from each state. Assume the survey uses simple random sampling.
What is the probability that the survey will show a greater percentage of Republican voters in
the second state than in the first state?
(A) 0.04
(B) 0.05
(C) 0.24
(D) 0.71
(E) 0.76
Solution
The correct answer is C. For this analysis, let P1 = the proportion of Republican voters in the
first state, P2 = the proportion of Republican voters in the second state, p1 = the proportion of
Republican voters in the sample from the first state, and p2 = the proportion of Republican
voters in the sample from the second state. The number of voters sampled from the first state
(n1) = 100, and the number of voters sampled from the second state (n2) = 100.
The solution involves four steps.

Make sure the samples from each population are big enough to model differences with
a normal distribution. Because n1P1 = 100 * 0.52 = 52, n1(1 - P1) = 100 * 0.48 = 48,
n2P2 = 100 * 0.47 = 47, and n2(1 - P2) = 100 * 0.53 = 53 are each greater than 10, the
sample size is large enough.

Find the mean of the difference in sample proportions: E(p1 - p2) = P1 - P2 = 0.52 - 0.47 = 0.05.

Find the standard deviation of the difference.

σ_d = sqrt{ [ P1(1 - P1) / n1 ] + [ P2(1 - P2) / n2 ] }
σ_d = sqrt{ [ (0.52)(0.48) / 100 ] + [ (0.47)(0.53) / 100 ] }
σ_d = sqrt (0.002496 + 0.002491) = sqrt(0.004987) = 0.0706

Find the probability. This problem requires us to find the probability that p1 is less
than p2. This is equivalent to finding the probability that p1 - p2 is less than zero. To
find this probability, we need to transform the random variable (p1 - p2) into a z-score.
That transformation appears below.
z_(p1 - p2) = (x - μ_(p1 - p2)) / σ_d = (0 - 0.05)/0.0706 = -0.7082
Using Stat Trek's Normal Distribution Calculator, we find that the probability of a z-score
being -0.7082 or less is 0.24.

Therefore, the probability that the survey will show a greater percentage of Republican voters
in the second state than in the first state is 0.24.
Note: Some analysts might have used the t-distribution to compute probabilities for this
problem. We chose the normal distribution because the population variance was known and
the sample size was large. In a previous lesson, we offered some guidelines for choosing
between the normal and the t-distribution.
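The four steps of this solution can also be carried out in Python. The sketch below uses the standard library's NormalDist in place of the online calculator; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P1, n1 = 0.52, 100   # Republican proportion and sample size, first state
P2, n2 = 0.47, 100   # Republican proportion and sample size, second state

# Step 1: check that each sample is large enough for a normal model.
checks = [n1 * P1, n1 * (1 - P1), n2 * P2, n2 * (1 - P2)]
large_enough = all(value > 10 for value in checks)

# Step 2: mean of the difference in sample proportions.
mean_diff = P1 - P2

# Step 3: standard deviation of the difference.
sd_diff = sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)

# Step 4: P(p1 - p2 < 0), computed through the z-score.
z = (0 - mean_diff) / sd_diff
probability = NormalDist().cdf(z)

print(large_enough, round(sd_diff, 4), round(z, 4), round(probability, 2))
```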

Difference Between Means


Statistics problems often involve comparisons between two independent sample means. This
lesson explains how to compute probabilities associated with differences between means.

Difference Between Means: Theory


Suppose we have two populations with means equal to μ1 and μ2. Suppose further that we
take all possible samples of size n1 and n2. And finally, suppose that the following
assumptions are valid.

The size of each population is large relative to the sample drawn from the population.
That is, N1 is large relative to n1, and N2 is large relative to n2. (In this context,
populations are considered to be large if they are at least 10 times bigger than their
sample.)

The samples are independent; that is, observations in population 1 are not affected by
observations in population 2, and vice versa.

The set of differences between sample means is normally distributed. This will be true
if each population is normal or if the sample sizes are large. (Based on the central
limit theorem, sample sizes of 40 would probably be large enough.)

Given these assumptions, we know the following.

The expected value of the difference between all possible sample means is equal to
the difference between population means. Thus, E(x1 - x2) = μ_d = μ1 - μ2.

The standard deviation of the difference between sample means (σ_d) is approximately
equal to:
σ_d = sqrt( σ1^2 / n1 + σ2^2 / n2 )

It is straightforward to derive the last bullet point, based on material covered in previous
lessons. The derivation starts with a recognition that the variance of the difference between
independent random variables is equal to the sum of the individual variances. Thus,
σ_d^2 = σ_(x1 - x2)^2 = σ_x1^2 + σ_x2^2
If the populations N1 and N2 are both large relative to n1 and n2, respectively, then
σ_x1^2 = σ1^2 / n1

And

σ_x2^2 = σ2^2 / n2

Therefore,
σ_d^2 = σ1^2 / n1 + σ2^2 / n2

And

σ_d = sqrt( σ1^2 / n1 + σ2^2 / n2 )

Difference Between Means: Sample Problem


In this section, we work through a sample problem to show how to apply the theory presented
above. In this example, we will use Stat Trek's Normal Distribution Calculator to compute
probabilities.

Problem 1
For boys, the average number of absences in the first grade is 15 with a standard deviation of
7; for girls, the average number of absences is 10 with a standard deviation of 6.
In a nationwide survey, suppose 100 boys and 50 girls are sampled. What is the probability
that the male sample will have at most three more days of absences than the female sample?
(A) 0.025
(B) 0.035

(C) 0.045
(D) 0.055
(E) None of the above
Solution
The correct answer is B. The solution involves three or four steps, depending on whether you
work directly with raw scores or z-scores. The "raw score" solution appears below:

Find the mean difference (male absences minus female absences) in the population.
μ_d = μ1 - μ2 = 15 - 10 = 5

Find the standard deviation of the difference.

σ_d = sqrt( σ1^2 / n1 + σ2^2 / n2 )
σ_d = sqrt(7^2/100 + 6^2/50) = sqrt(49/100 + 36/50) = sqrt(0.49 + 0.72) = sqrt(1.21) = 1.1

Find the probability. This problem requires us to find the probability that the average
number of absences in the boy sample minus the average number of absences in the
girl sample is less than 3. To find this probability, we use Stat Trek's Normal
Distribution Calculator. Specifically, we enter the following inputs: 3, for the normal
random variable; 5, for the mean; and 1.1, for the standard deviation. We find that the
probability of the mean difference (male absences minus female absences) being 3 or
less is about 0.035.

Thus, the probability that the difference between samples will be no more than 3 days is
0.035.

Alternatively, we could have worked with z-scores (which have a mean of 0 and a standard
deviation of 1). Here's the z-score solution:

Find the mean difference (male absences minus female absences) in the population.
μ_d = μ1 - μ2 = 15 - 10 = 5

Find the standard deviation of the difference.

σ_d = sqrt( σ1^2 / n1 + σ2^2 / n2 )
σ_d = sqrt(7^2/100 + 6^2/50) = sqrt(49/100 + 36/50) = sqrt(0.49 + 0.72) = sqrt(1.21) = 1.1

Find the z-score that is produced when boys have three more days of absences than
girls. When boys have three more days of absences, the number of male absences
minus female absences is three. And the associated z-score is
z = (x - μ)/σ = (3 - 5)/1.1 = -2/1.1 = -1.818

Find the probability. To find this probability, we use Stat Trek's Normal Distribution
Calculator. Specifically, we enter the following inputs: -1.818, for the normal random
variable; 0, for the mean; and 1, for the standard deviation. We find that the
probability of a z-score being -1.818 or less is about 0.035.

Of course, the result is the same, whether you work with raw scores or z-scores.
Note: Some analysts might have used the t-distribution to compute probabilities for this
problem. We chose the normal distribution because the population variance was known and
the sample size was large. In a previous lesson, we offered some guidelines for choosing
between the normal and the t-distribution.
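Both the raw-score and the z-score solutions can be reproduced in Python. The sketch below uses the standard library's NormalDist instead of the online calculator, with illustrative variable names, and confirms that the two routes give the same probability.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 15.0, 7.0, 100   # boys: mean, standard deviation, sample size
mu2, sigma2, n2 = 10.0, 6.0, 50    # girls: mean, standard deviation, sample size

# Mean and standard deviation of the difference between sample means.
mean_diff = mu1 - mu2
sd_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Raw-score solution: P(difference <= 3) under normal(mean_diff, sd_diff).
p_raw = NormalDist(mean_diff, sd_diff).cdf(3)

# z-score solution: standardize 3, then use the standard normal distribution.
z = (3 - mean_diff) / sd_diff
p_z = NormalDist().cdf(z)

print(round(sd_diff, 2), round(z, 3), round(p_raw, 3), round(p_z, 3))
```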
