Lecture 4 - Sampling Distributions

Sampling Distributions
Book: Statistics for Business and Economics (Chapter 7)

Author: Anderson, Sweeney, et al.
Edition: 13th Edition
Faculty: Suvechcha Sengupta

Introduction
 An element is the entity on which data are collected.

 A population is a collection of all the elements of interest.
 A sample is a subset of the population.
 The sampled population is the population from which the sample is drawn.
 A frame is a list of the elements that the sample will be selected from.
K J Somaiya Institute of Management, India 2

 The reason we select a sample is to collect data to answer a research question about a population.
 The sample results provide only estimates of the values of the population characteristics.
 The reason is simply that the sample contains only a portion of the population.
 With proper sampling methods, the sample results can provide “good” estimates of the population
characteristics.

Selecting a Sample
Finite Population
• A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Infinite Population
• A sample selected such that the following conditions are satisfied:
  • Each element selected comes from the population of interest.
  • Each element is selected independently.
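
The finite-population case can be sketched in a few lines of Python. This is only an illustration: the frame of applicant IDs 1 to 900, the helper name simple_random_sample, and the seed are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from a finite frame.

    random.sample selects without replacement, so every possible
    subset of size n has the same probability of being chosen.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: applicant IDs 1..900 (illustrative only).
frame = list(range(1, 901))
sample_ids = simple_random_sample(frame, 30, seed=42)
print(sorted(sample_ids))
```

Fixing the seed makes the draw reproducible; a different seed would identify a different sample.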

Point Estimation
 Point estimation is a form of statistical inference.

 In point estimation we use the data from the sample to compute a value of a sample statistic that serves
as an estimate of a population parameter.
 We refer to x̄ as the point estimator of the population mean µ.
 s is the point estimator of the population standard deviation σ.

Example
 St. Andrew’s College received 900 applications from prospective students. The application form
contains a variety of information including the individual’s Scholastic Aptitude Test (SAT) score and
whether or not the individual desires on-campus housing.
 At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score
for the population of 900 applicants.
 However, the necessary data on the applicants have not yet been entered in the college’s computerized
database. So, the Director decides to estimate the values of the population parameters of interest based
on sample statistics. The sample of 30 applicants is selected using computer-generated random
numbers.

Sample Statistic
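 x̄ as Point Estimator of µ
$$\bar{x} = \frac{\sum x_i}{n} = \frac{\sum x_i}{30} = 1684$$
 s as Point Estimator of σ
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{29}} = 85.2$$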
 Note: Different random numbers would have identified a different sample which would have resulted
in different point estimates.
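
As a rough sketch of how the two point estimates are computed, the snippet below uses Python's statistics module. The eight scores are hypothetical values invented for illustration; the 30 sampled SAT scores from the lecture are not listed in the slides.

```python
import statistics

# Hypothetical SAT scores for illustration only; the lecture's
# actual 30 sampled values are not given.
sample_scores = [1620, 1710, 1580, 1750, 1690, 1640, 1800, 1670]

x_bar = statistics.mean(sample_scores)   # point estimate of the population mean
s = statistics.stdev(sample_scores)      # point estimate of sigma (divides by n - 1)

print(f"x_bar = {x_bar:.1f}, s = {s:.1f}")
```

Note that statistics.stdev uses the n - 1 divisor, which matches the sample standard deviation s; statistics.pstdev would divide by n instead.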

Population Parameter
 Once all the data for the 900 applicants were entered in the college’s database, the values of the
population parameters of interest were calculated.
 Population Mean SAT Score
$$\mu = \frac{\sum x_i}{900} = 1697$$
 Population Standard Deviation for SAT Score
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 87.4$$

Summary of Estimates from a Simple Random Sample

Population Parameter                          Parameter Value   Point Estimator                               Point Estimate
µ = Population mean SAT score                 1697              x̄ = Sample mean SAT score                     1684
σ = Population std. deviation for SAT score   87.4              s = Sample standard deviation for SAT score   85.2
Sampling Distribution of x̄
 Process of Statistical Inference
1. Population with mean µ = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean x̄.
4. The value of x̄ is used to make inferences about the value of µ.

Sampling Distribution of x̄
• The sampling distribution of x̄ is the probability distribution of all possible values of the sample mean x̄.
• Expected Value of x̄
E(x̄) = µ
where: µ = the population mean
• When the expected value of the point estimator equals the population parameter, we say the point estimator is unbiased.
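
Unbiasedness can be checked directly on a tiny population by listing every possible sample. The sketch below assumes a hypothetical population of five values and samples of size 2; both are chosen only to keep the enumeration short.

```python
from itertools import combinations
from statistics import mean

# Hypothetical finite population of N = 5 values (illustration only).
population = [2, 4, 6, 8, 10]
n = 2

# Under simple random sampling every sample of size n is equally likely,
# so the sampling distribution of x_bar is the list of all sample means.
sample_means = [mean(sample) for sample in combinations(population, n)]

expected_x_bar = mean(sample_means)   # E(x_bar) over all possible samples
mu = mean(population)                 # population mean

print(len(sample_means), expected_x_bar, mu)
```

The average of all possible sample means equals the population mean, which is what E(x̄) = µ states.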

Sampling Distribution of x̄
• We will use the following notation to define the standard deviation of the sampling distribution of x̄.
σ_x̄ = the standard deviation of x̄, also referred to as the standard error of the mean
σ = the standard deviation of the population
n = the sample size
N = the population size
• The standard error of the mean is given by
Finite population: $$\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \left( \frac{\sigma}{\sqrt{n}} \right)$$
Infinite population: $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
• When the population has a normal distribution, the sampling distribution of x̄ is normally distributed for any sample size.
• In most applications, the sampling distribution of x̄ can be approximated by a normal distribution whenever the sample size is 30 or more.
• The sampling distribution of x̄ can be used to provide probability information about how close the sample mean x̄ is to the population mean µ.
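
A minimal sketch of the standard error calculation, assuming the usual textbook forms: σ/√n for an infinite population, multiplied by the finite population correction factor √((N − n)/(N − 1)) when the population size N is known. The function name standard_error is an assumption for this example; the numbers come from the SAT example (σ = 87.4, n = 30, N = 900).

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    With N given, applies the finite population correction factor;
    with N omitted, treats the population as infinite.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Values from the SAT example: sigma = 87.4, n = 30, N = 900.
print(round(standard_error(87.4, 30), 2))        # infinite-population form
print(round(standard_error(87.4, 30, 900), 2))   # with finite population correction
```

Because n/N is small here (30/900), the correction changes the result only slightly.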

Central Limit Theorem
 When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of x̄.
CENTRAL LIMIT THEOREM
In selecting random samples of size n from a population, the sampling distribution of the sample mean x̄ can be approximated by a normal distribution as the sample size becomes large.
 The larger the sample size, the more closely the sampling distribution of x̄ will resemble a normal distribution.

Central Limit Theorem…
If the population is normal, then x̄ is normally distributed for all values of n.
If the population is non-normal, then x̄ is approximately normal only for larger values of n.
In most practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of x̄.
But if the population is extremely non-normal (e.g., heavily skewed or multimodal), the sampling distribution will also be non-normal even for moderately large values of n.
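
The central limit theorem can be explored with a small simulation. The sketch below is an illustration under stated assumptions: the population is exponential with mean 1 and standard deviation 1 (a skewed distribution chosen for the example, not taken from the lecture), and the function name, number of repetitions, and seed are arbitrary.

```python
import random
import statistics

def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw repeated samples of size n from a skewed (exponential) population
    with mean 1 and standard deviation 1, and return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

for n in (2, 30):
    means = simulate_sample_means(n)
    # Mean of the sample means should be near 1; their spread near 1 / sqrt(n).
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Plotting a histogram of the returned means for n = 2 and n = 30 would show the skew fading as n grows.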

Sampling Distribution of the Sample Mean
1. $\mu_{\bar{x}} = \mu$
2. $\sigma_{\bar{x}}^2 = \sigma^2 / n$ and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
3. If X is normal, x̄ is normal. If X is non-normal, x̄ is approximately normal for sufficiently large sample sizes. The definition of “sufficiently large” depends on the extent of non-normality of X.

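Sampling Distribution of the Sample Mean
Therefore, if X is normally distributed with mean µ and standard deviation σ, then x̄ is normally distributed with mean µ and standard deviation σ/√n, so that
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
follows the standard normal distribution.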

Example
The foreman of a bottling plant has observed that the amount of soda in each “32-
ounce” bottle is actually a normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.
If a customer buys one bottle, what is the probability that the bottle will contain more
than 32 ounces?

Example
We want to find P(X > 32), where X is normally distributed with µ = 32.2 and σ = 0.3.
$$P(X > 32) = P\left(Z > \frac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 1 - 0.2514 = 0.7486$$
(Excel formula: 1-NORM.S.DIST(-0.67,TRUE))

“There is about a 75% chance that a single bottle of soda contains more than 32 oz.”

Example
The foreman of a bottling plant has observed that the amount of soda in each “32-
ounce” bottle is actually a normally distributed random variable, with a mean of 32.2
ounces and a standard deviation of .3 ounce.
If a customer buys a carton of four bottles, what is the probability that the mean
amount of the four bottles will be greater than 32 ounces?

Example
We want to find P(x̄ > 32), where X is normally distributed with µ = 32.2 and σ = 0.3.
Things we know:
1) X is normally distributed, therefore so is x̄.
2) $\mu_{\bar{x}} = \mu = 32.2$
3) $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 0.3 / \sqrt{4} = 0.15$

Example
If a customer buys a carton of four bottles, what is the probability that the mean
amount of the four bottles will be greater than 32 ounces?
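$$P(\bar{x} > 32) = P\left(Z > \frac{32 - 32.2}{0.15}\right) = P(Z > -1.33) = 1 - 0.0918 = 0.9082$$
(Excel formula: 1-NORM.S.DIST(-1.33,TRUE))

“There is about a 91% chance the mean of the four bottles will exceed 32 oz.”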
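
The two bottling probabilities can be reproduced with Python's statistics.NormalDist as a cross-check on the Excel formulas. This is a sketch rather than the lecture's own method; the variable names are chosen for the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3   # bottling example: mean and standard deviation in ounces

# One bottle: X ~ N(mu, sigma)
p_single = 1 - NormalDist(mu, sigma).cdf(32)

# Mean of a carton of n = 4 bottles: x_bar ~ N(mu, sigma / sqrt(n))
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(f"P(X > 32)     = {p_single:.4f}")
print(f"P(x_bar > 32) = {p_carton:.4f}")
```

The results can differ from the table-based answers in the third decimal place, because the code does not round z to -0.67 and -1.33 before looking up the probability.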

Graphically Speaking…
[Figure: two panels, each with mean = 32.2. Left panel: what is the probability that one bottle will contain more than 32 ounces? Right panel: what is the probability that the mean of four bottles will exceed 32 oz?]

Practical Value of Sampling Distribution of x̄
Whenever a simple random sample is selected and the value of the sample mean x̄ is used to estimate the value of the population mean µ, we cannot expect the sample mean to exactly equal the population mean.
The practical reason we are interested in the sampling distribution of x̄ is that it can be used to provide probability information about the difference between the sample mean (x̄) and the population mean (µ).
Example: A personnel director claims that the mean salary of managers is $51,800 with a standard deviation of $4,000. To test this claim, he drew a sample of 30 managers and recorded their annual salaries. The personnel director believes that the sample mean will be an acceptable representation of the population mean only if the sample mean is within $500 of the population mean.
(It is not possible to guarantee that the sample mean will be within $500 of the population mean. Therefore, we talk in terms of probability: what is the probability that the sample mean computed using a simple random sample of 30 managers will be within $500 of the population mean?)

Probability Of A Sample Mean Being Within $500 Of The Population Mean For A Simple
Random Sample Of 30 Managers

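Solution
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4000}{\sqrt{30}} = 730.30$$
$$P(51300 \le \bar{x} \le 52300) = P\left(-\frac{500}{730.30} \le Z \le \frac{500}{730.30}\right) = P(-0.68 \le Z \le 0.68) = 0.7517 - 0.2483 = 0.5034$$
(Excel formula: NORM.S.DIST(0.68,TRUE)-NORM.S.DIST(-0.68,TRUE))
Therefore, there is approximately a 50% chance that a sample of 30 managers will provide a sample mean that lies within $500 of the population mean.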
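
The same calculation can be sketched in Python. The helper prob_within is a hypothetical name introduced for this example, and the n = 100 case is an added illustration, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(margin, sigma, n):
    """P(|x_bar - mu| <= margin), assuming x_bar is approximately normal
    with standard error sigma / sqrt(n)."""
    z = margin / (sigma / sqrt(n))
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)

print(f"{prob_within(500, 4000, 30):.4f}")    # salary example, n = 30
print(f"{prob_within(500, 4000, 100):.4f}")   # hypothetical larger sample, n = 100
```

A larger sample shrinks the standard error, so the probability of landing within $500 of the population mean rises.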

Using the Sampling Distribution for Inference
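Here’s another way of expressing the probability calculated from a sampling distribution.
P(-1.96 < Z < 1.96) = .95
Substituting the formula for the sampling distribution
$$P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = .95$$
With a little algebra
$$P\left(\mu - 1.96 \frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96 \frac{\sigma}{\sqrt{n}}\right) = .95$$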

Using the Sampling Distribution for Inference
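We can also produce a general form of this statement
$$P\left(\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
In this formula α (Greek letter alpha) is the probability that x̄ does not fall into the interval.
To apply this formula all we need to do is substitute the values for µ, σ, n, and α.
[Figure: standard normal curve with central area (1-α) between -z and +z and area α/2 in each tail.]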
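
A minimal sketch of applying the general statement, assuming x̄ is normally distributed with mean µ and standard error σ/√n. The function name x_bar_interval is hypothetical, and the inputs reuse the bottling example (µ = 32.2, σ = 0.3, n = 4) with α = 0.05 chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def x_bar_interval(mu, sigma, n, alpha):
    """Interval that contains x_bar with probability 1 - alpha,
    assuming x_bar is normal with mean mu and standard error sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width

# Bottling example values with alpha = 0.05 (alpha is an illustrative choice).
low, high = x_bar_interval(32.2, 0.3, 4, 0.05)
print(f"{low:.3f} to {high:.3f}")
```

With α = 0.05 the critical value is about 1.96, matching the .95 statement for the standard normal variable.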
From Here to Inference
The figure below symbolically represents the use of probability distributions.
Simply put, knowledge of the population and its parameter(s) allows us to use the probability distribution to make probability statements about individual members of the population.
[Figure labels: Probability Distribution, Individual]

In this chapter we developed the sampling distribution, wherein knowledge of the parameter(s) and some information about the distribution allow us to make probability statements about a sample statistic.
[Figure label: Statistic]

Statistical inference works by reversing the direction of the flow of knowledge in the previous figure.
[Figure labels: Statistic, Parameter]

Thank You
simsr.somaiya.edu
K J Somaiya Institute of Management, India
