
Sampling Distributions

Book: Statistics for Business and Economics (Chapter 7)


Author: Anderson, Sweeney, et al.
Edition: 13th Edition

Faculty: Suvechcha Sengupta


Introduction

• An element is the entity on which data are collected.
• A population is a collection of all the elements of interest.
• A sample is a subset of the population.
• The sampled population is the population from which the sample is drawn.
• A frame is a list of the elements that the sample will be selected from.

K J Somaiya Institute of Management, India 2


• The reason we select a sample is to collect data to answer a research question about a population.
• The sample results provide only estimates of the values of the population characteristics.
• The reason is simply that the sample contains only a portion of the population.
• With proper sampling methods, the sample results can provide “good” estimates of the population
characteristics.



Selecting a Sample

Finite Population

• A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Infinite Population

• A sample selected such that the following conditions are satisfied:
    • Each element selected comes from the population of interest.
    • Each element is selected independently.
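
As a rough illustration of simple random sampling from a finite population, the sketch below draws n elements without replacement from a frame. The frame of 900 ID numbers, the function name, and the seed are assumptions made for this example only.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: ID numbers 1..900 (illustrative, not real data)
frame = list(range(1, 901))
sample_ids = simple_random_sample(frame, 30, seed=7)
print(sorted(sample_ids))
```

Changing the seed selects a different sample, which is why repeated sampling produces different estimates.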



Point Estimation

• Point estimation is a form of statistical inference.
• In point estimation we use the data from the sample to compute a value of a sample statistic that serves
as an estimate of a population parameter.
• We refer to x̄ as the point estimator of the population mean µ.
• s is the point estimator of the population standard deviation σ.



Example

• St. Andrew’s College received 900 applications from prospective students. The application form
contains a variety of information including the individual’s Scholastic Aptitude Test (SAT) score and
whether or not the individual desires on-campus housing.
• At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score
for the population of 900 applicants.
• However, the necessary data on the applicants have not yet been entered in the college’s computerized
database. So, the Director decides to estimate the values of the population parameters of interest based
on sample statistics. The sample of 30 applicants is selected using computer-generated random
numbers.



Sample Statistic

• x̄ as Point Estimator of µ

$$\bar{x} = \frac{\sum x_i}{30} = 1684$$

• s as Point Estimator of σ

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = 85.2$$

• Note: Different random numbers would have identified a different sample, which would have resulted
in different point estimates.
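
The two point estimators can be computed with Python's standard library, as in the minimal sketch below. The five scores are made-up illustrative values, not the St. Andrew's data; `statistics.stdev` uses the n − 1 divisor, matching the sample standard deviation s.

```python
import statistics

# Hypothetical sample of SAT scores (illustrative values only)
scores = [1650, 1720, 1580, 1790, 1690]

x_bar = statistics.mean(scores)   # point estimate of the population mean
s = statistics.stdev(scores)      # point estimate of the population std. deviation (n - 1 divisor)

print(f"x-bar = {x_bar}, s = {s:.2f}")
```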



Population Parameter

• Once all the data for the 900 applicants were entered in the college’s database, the values of the
population parameters of interest were calculated.
• Population Mean SAT Score

$$\mu = \frac{\sum x_i}{900} = 1697$$

• Population Standard Deviation for SAT Score

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 87.4$$



Summary of Estimates from a Simple Random Sample

Population Parameter                          Parameter Value   Point Estimator                                Point Estimate
µ = Population mean SAT score                 1697              x̄ = Sample mean SAT score                      1684
σ = Population std. deviation for SAT score   87.4              s = Sample standard deviation for SAT score    85.2



Sampling Distribution of x̄

• Process of Statistical Inference

1. Population with mean µ = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean x̄.
4. The value of x̄ is used to make inferences about the value of µ.



Sampling Distribution of x̄

• The sampling distribution of x̄ is the probability distribution of all possible values of the sample mean x̄.
• Expected Value of x̄

$$E(\bar{x}) = \mu$$

where: µ = the population mean
• When the expected value of the point estimator equals the population
parameter, we say the point estimator is unbiased.
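
Unbiasedness can be checked directly on a tiny population by listing every possible sample and averaging the sample means. The sketch below assumes a hypothetical five-element population and samples of size 2 drawn without replacement; exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]   # hypothetical population (illustrative values)
n = 2                           # sample size

mu = Fraction(sum(population), len(population))

# Every possible simple random sample of size n is equally likely,
# so the expected value of x-bar is the plain average of all sample means.
sample_means = [Fraction(sum(c), n) for c in combinations(population, n)]
expected_x_bar = sum(sample_means) / len(sample_means)

print(f"mu = {mu}, E(x-bar) = {expected_x_bar}")
```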



Sampling Distribution of x̄

• We will use the following notation to define the standard deviation of the sampling distribution of x̄:
σ_x̄ = the standard deviation of x̄
σ = the standard deviation of the population
n = the sample size
N = the population size
• σ_x̄ is also referred to as the standard error of the mean and is given by

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{(infinite population)}, \qquad \sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}}\,\frac{\sigma}{\sqrt{n}} \quad \text{(finite population of size } N\text{)}$$

• When the population has a normal distribution, the sampling distribution of x̄ is normally distributed
for any sample size.
• In most applications, the sampling distribution of x̄ can be approximated by a normal distribution
whenever the sample size is 30 or more.
• The sampling distribution of x̄ can be used to provide probability information about how close the
sample mean x̄ is to the population mean µ.
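
A small helper makes the standard error concrete. This is a sketch that assumes the usual forms σ/√n for an infinite population and σ/√n multiplied by the finite population correction factor √((N − n)/(N − 1)) when a population size N is given; the function name is illustrative. The inputs are the SAT example's values (σ = 87.4, n = 30, N = 900).

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population correction if N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_infinite = standard_error(87.4, 30)        # infinite-population form
se_finite = standard_error(87.4, 30, N=900)   # with finite population correction

print(f"{se_infinite:.2f} {se_finite:.2f}")
```

Because 30 is a small fraction of 900, the correction changes the result only slightly.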



Central Limit Theorem

• When the population from which we are selecting a random sample does not have a normal
distribution, the central limit theorem is helpful in identifying the shape of the sampling
distribution of x̄.

CENTRAL LIMIT THEOREM
In selecting random samples of size n from a population, the sampling distribution of the
sample mean x̄ can be approximated by a normal distribution as the sample size becomes large.

• The larger the sample size, the more closely the sampling distribution of x̄ will resemble a
normal distribution.



Central Limit Theorem…

If the population is normal, then x̄ is normally distributed for all values of n.

If the population is non-normal, then x̄ is approximately normal only for larger values of n.

In most practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of x̄. But if the population is extremely non-normal (e.g., heavily skewed or multimodal), the sampling distribution will also be non-normal even for moderately large values of n.
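
The central limit theorem can be explored with a short simulation. The sketch below assumes a skewed exponential population (an arbitrary choice for illustration) and measures how the skewness of the simulated sample means shrinks as n grows; the seed, repetition count, and function names are all assumptions of this example.

```python
import random
import statistics

def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an exponential population; return their means."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(42)  # fixed seed so the run is repeatable
skew_by_n = {}
for n in (1, 5, 30):
    means = simulate_sample_means(n, reps=5000, rng=rng)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: skewness of sample means = {skew_by_n[n]:.2f}")
```

A normal distribution has skewness 0, so values closer to 0 indicate a sampling distribution closer to normal in shape.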



Sampling Distribution of the
Sample Mean
1. µ_x̄ = µ

2. σ²_x̄ = σ²/n and σ_x̄ = σ/√n

3. If X is normal, x̄ is normal. If X is non-normal, x̄ is approximately normal for sufficiently large sample sizes. The definition of “sufficiently large” depends on the extent of non-normality of X.



Sampling Distribution of the
Sample Mean
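Therefore, if X is normally distributed with mean µ and standard deviation σ, then x̄ is normally distributed with mean µ and standard deviation σ/√n, so that

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

follows the standard normal distribution.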



Example

The foreman of a bottling plant has observed that the amount of soda in each “32-
ounce” bottle is actually a normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.

If a customer buys one bottle, what is the probability that the bottle will contain more
than 32 ounces?



Example

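We want to find P(X > 32), where X is normally distributed with µ = 32.2 and σ = 0.3.

$$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 1 - 0.2514 = 0.7486$$

(Excel formula: 1-NORM.S.DIST(-0.67,TRUE))

“There is about a 75% chance that a single bottle of soda contains more than 32 oz.”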
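
For readers working outside Excel, the same standardization can be sketched in Python with `statistics.NormalDist`, whose `cdf` method on a standard normal plays the role of NORM.S.DIST(z, TRUE). The variable names are illustrative. Using the unrounded z-score gives a slightly different third decimal than the table lookup at z = -0.67.

```python
from statistics import NormalDist

mu, sigma = 32.2, 0.3

z = (32 - mu) / sigma                     # standardize: about -0.667
p_exact = 1 - NormalDist().cdf(z)         # P(X > 32) with the unrounded z
p_rounded = 1 - NormalDist().cdf(-0.67)   # same calculation with z rounded to -0.67

print(f"z = {z:.3f}, exact = {p_exact:.4f}, rounded z = {p_rounded:.4f}")
```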



Example

The foreman of a bottling plant has observed that the amount of soda in each “32-
ounce” bottle is actually a normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.

If a customer buys a carton of four bottles, what is the probability that the mean
amount of the four bottles will be greater than 32 ounces?



Example

We want to find P(x̄ > 32), where X is normally distributed with µ = 32.2 and σ = 0.3.

Things we know:
1) X is normally distributed, therefore so is x̄
2) µ_x̄ = µ = 32.2
3) σ_x̄ = σ/√n = 0.3/√4 = 0.15



Example

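If a customer buys a carton of four bottles, what is the probability that the mean
amount of the four bottles will be greater than 32 ounces?

$$P(\bar{x} > 32) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{0.15}\right) = P(Z > -1.33) = 1 - 0.0918 = 0.9082$$

(Excel formula: 1-NORM.S.DIST(-1.33,TRUE))

“There is about a 91% chance the mean of the four bottles will exceed 32 oz.”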
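
The one-bottle and four-bottle questions differ only in the sample size, which the hypothetical helper below makes explicit. It is a sketch that assumes the population is normal, so that the sample mean is normal with standard deviation σ/√n; the function name is an invention for this example.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean of n observations > threshold) for a normal population."""
    x_bar_dist = NormalDist(mu, sigma / sqrt(n))   # sampling distribution of the mean
    return 1 - x_bar_dist.cdf(threshold)

p_one = prob_mean_exceeds(32, 32.2, 0.3, n=1)    # a single bottle
p_four = prob_mean_exceeds(32, 32.2, 0.3, n=4)   # mean of a carton of four

print(f"one bottle: {p_one:.4f}, mean of four: {p_four:.4f}")
```

The larger probability for n = 4 reflects the smaller standard error: the mean of four bottles varies less around 32.2 than a single bottle does.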



Graphically Speaking…

[Figure: two panels, each marked with mean = 32.2. First panel: what is the probability that one bottle will contain more than 32 ounces? Second panel: what is the probability that the mean of four bottles will exceed 32 oz?]



Practical Value of the Sampling Distribution of x̄

Whenever a simple random sample is selected and the value of the sample mean x̄ is used to estimate the
value of the population mean µ, we cannot expect the sample mean to exactly equal the population mean.
The practical reason we are interested in the sampling distribution of x̄ is that it can be used to provide
probability information about the difference between the sample mean (x̄) and the population mean (µ).

Example: A personnel director claims that the mean salary of managers is $51,800 with a standard
deviation of $4,000. To test this claim, he drew a sample of 30 managers and recorded their annual
salaries. The personnel director believes that the sample mean will be an acceptable representation of the
population mean only if the sample mean is within $500 of the population mean.
(It is not possible to guarantee that the sample mean will be within $500 of the population mean.
Therefore, we talk in terms of probability, i.e., what is the probability that the sample mean computed
using a simple random sample of 30 managers will be within $500 of the population mean?)



Probability Of A Sample Mean Being Within $500 Of The Population Mean For A Simple
Random Sample Of 30 Managers



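Solution

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4000}{\sqrt{30}} = 730.30$$

$$P(51300 \le \bar{x} \le 52300) = P\left(-\frac{500}{730.30} \le Z \le \frac{500}{730.30}\right) = P(-0.68 \le Z \le 0.68) = 0.7517 - 0.2483 = 0.5034$$

(Excel formula: NORM.S.DIST(0.68,TRUE)-NORM.S.DIST(-0.68,TRUE))

Therefore, there is approximately a 50% chance that a sample of 30 managers will provide a sample
mean that lies within $500 of the population mean.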
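
The salary calculation can be reproduced in a few lines of Python. This sketch assumes the sample mean is approximately normal with standard error σ/√n (n = 30) and uses the unrounded z-score, so its result differs slightly in the third decimal from a table lookup at z = 0.68. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, tolerance = 4000, 30, 500

standard_error = sigma / sqrt(n)    # about 730.30
z = tolerance / standard_error      # about 0.68

# P(-z < Z < z): probability the sample mean lands within $500 of the population mean
p_within = NormalDist().cdf(z) - NormalDist().cdf(-z)

print(f"SE = {standard_error:.2f}, z = {z:.2f}, P = {p_within:.4f}")
```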



Using the Sampling Distribution for Inference

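Here’s another way of expressing the probability calculated from a sampling distribution.

$$P(-1.96 < Z < 1.96) = 0.95$$

Substituting the formula for the sampling distribution:

$$P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$$

With a little algebra:

$$P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

[Figure: normal curve with the interval from -z to +z marked]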



Using the Sampling Distribution for Inference

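We can also produce a general form of this statement:

$$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

In this formula α (Greek letter alpha) is the probability that x̄ does not fall into the interval.

To apply this formula, all we need to do is substitute the values for µ, σ, n, and α.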
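
As a sketch of that substitution, the hypothetical function below returns the interval µ ± z_{α/2}·σ/√n, taking z_{α/2} from `NormalDist().inv_cdf`. The inputs reuse the managers' salary example (µ = 51,800, σ = 4,000, n = 30) with α = 0.05, which is an assumed choice for illustration.

```python
from math import sqrt
from statistics import NormalDist

def x_bar_interval(mu, sigma, n, alpha):
    """Interval that contains the sample mean with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    margin = z * sigma / sqrt(n)
    return mu - margin, mu + margin

low, high = x_bar_interval(mu=51800, sigma=4000, n=30, alpha=0.05)
print(f"P({low:.2f} < x-bar < {high:.2f}) = 0.95")
```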

[Figure: normal curve with central area (1 − α) between −z and +z and area α/2 in each tail]

From Here to Inference

The figure below symbolically represents the use of probability distributions.

Simply put, knowledge of the population and its parameter(s) allows us to use the
probability distribution to make probability statements about individual members of the
population.

[Figure: Probability Distribution → Individual]



From Here to Inference

In this chapter we developed the sampling distribution, wherein knowledge of the parameter(s) and some information about the distribution allow us to make probability statements about a sample statistic.

[Figure: Sampling Distribution → Statistic]



From Here to Inference

Statistical inference works by reversing the direction of the flow of knowledge in the previous figure.

[Figure: Statistic → Parameter]



Thank You
simsr.somaiya.edu

