Sampling Distribution
ISOM2500 BUSINESS STATISTICS
(L1-3, Spring 2022/23)
Jason HO
Contents
Samples and surveys
Sampling distribution of the sample mean
Central limit theorem
Sampling distribution of any statistic
2
Journey in This Course
Descriptive statistics (Module 1 & 2)
• Graphical Tools
• Numerical Tools
Building Blocks of Theory of Statistics (Module 3; Module 4a, 4b & 5)
• Probability
• Random Variables: discrete or continuous; jointly distributed
Inferential statistics
3
Inferential Statistics: A General Set-up
Population: parameter(s) to describe a characteristic of interest
Survey: draws a sample from the population
Sample: find and compute sample statistic(s) to estimate the parameter
6
Samples and Surveys
A survey gathers information on a subgroup of entities (i.e., a sample)
that belongs to a much larger group (i.e., the population), providing the
necessary ingredients, THE DATA, for parameter estimation
7
Example 1: Use of Surveys in Daily Lives
• When an election is approaching, there are nonstop reports and news
about the latest opinion poll
• A retailer wants to know the market share of a brand before deciding
to stock the items on its shelves
• The foreman of a warehouse will not accept a shipment of electronic
components unless virtually all the components in the shipment
operate correctly
• Managers in the human resources department determine the salary
for the new employees based on wages paid around the country
8
Representative Sample and Sampling Bias
A sample that presents a “good” snapshot of the population (i.e.,
showing/preserving systematic patterns of the population) is said
to be representative
Samples that distort the population (e.g., one that systematically
omits a portion of the population) are said to have sampling bias
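The contrast can be illustrated with a minimal sketch. It assumes a hypothetical population of the values 1 to 100 and compares a simple random sample with a sample that systematically omits the upper half of the population; the population, the seed, and all names are illustrative assumptions.

```python
import random
import statistics

random.seed(1)

# Hypothetical population (assumption): the values 1, 2, ..., 100.
population = list(range(1, 101))
population_mean = statistics.mean(population)

# Simple random sample: every entity has the same chance of selection.
srs = random.sample(population, 20)

# Biased sample: entities above 50 can never be selected.
biased = random.sample([x for x in population if x <= 50], 20)

print("population mean:   ", population_mean)
print("random sample mean:", statistics.mean(srs))
print("biased sample mean:", statistics.mean(biased))
```

The biased sample can never reach the population mean, however many entities are drawn, because the omitted portion is never observed.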
14
Population = An (Underlying) Probability
Model or Distribution X or f(x)
Most statistical methods are developed
by often, if not always, assuming an
underlying probability distribution X
or f(x) for the population:
• The population comprises infinitely
many figures
• The percentage histogram (or the
red smooth curve) of all these
figures mimics the probability
distribution f(x) of a RV, say, X
[Figure: a population of infinitely many values, its percentage histogram, and an underlying probability model (a random variable X or a probability distribution f(x))]
15
Data = IID Samples From X
Assume that the data arise as a representative sample of size n from
the population (with an underlying probability distribution X or f(x))
• The data are modeled as RVs, which are independent and
identically distributed (iid) samples/draws from X or f(x),
represented by $X_1, X_2, \ldots, X_n$
• As all $X_i$'s are RVs, the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is also a RV;
its probability distribution is called the sampling distribution
Sampling distribution of the sample mean is the distribution of
the sample mean computed from a sample of size n. In theory, it
can be obtained from ALL possible samples of size n from the
population through repeated sampling
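Enumerating ALL possible samples is rarely feasible, so repeated sampling is usually approximated by simulation. The sketch below assumes a hypothetical population (a fair six-sided die, mean 3.5) and a hypothetical helper `sample_means`; it draws many samples of size n and collects their means.

```python
import random
import statistics

random.seed(2)

def sample_means(draw, n, reps):
    """Draw `reps` samples of size n and return the list of sample means."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(reps)]

# Hypothetical population (assumption): a fair six-sided die, mean 3.5.
def roll():
    return random.randint(1, 6)

means = sample_means(roll, n=10, reps=5000)

# The collection of sample means approximates the sampling distribution.
print("average of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:     ", round(statistics.pstdev(means), 3))
```

A percentage histogram of `means` would approximate the sampling distribution of the sample mean for n = 10.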
17
18
Example 1: Sampling Distribution
https://onlinestatbook.com/stat_sim/sampling_dist/index.html
19
Example 2: Sampling Distribution of the
Sample Mean
20
Normality of the Sample Mean from
Normal Population
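When the population is normal, i.e., $X \sim N(\mu, \sigma^2)$, the sample mean is exactly
normally distributed for all sample sizes (n = 1, 2, 3, …): $\bar{X} \sim N(\mu, \sigma^2/n)$
[Figure: sampling distributions of $\bar{X}$ from a normal population]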
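A quick simulation can check this claim for small n. The sketch below assumes a hypothetical normal population with mean 50 and SD 10 (values chosen for illustration only) and compares the simulated SD of the sample mean with σ/√n, together with the share of sample means within one standard error of μ (about 68% for a normal distribution).

```python
import math
import random
import statistics

random.seed(3)

MU, SIGMA = 50.0, 10.0   # hypothetical normal population (assumption)
REPS = 20000

results = {}
for n in (1, 2, 5):
    means = [statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))
             for _ in range(REPS)]
    se = SIGMA / math.sqrt(n)                      # theoretical SD of the sample mean
    simulated_sd = statistics.pstdev(means)        # SD observed across repeated samples
    within_1se = sum(abs(m - MU) <= se for m in means) / REPS
    results[n] = (simulated_sd, se, within_1se)
    print(n, round(simulated_sd, 2), round(se, 2), round(within_1se, 3))
```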
21
CENTRAL LIMIT
THEOREM
22
Central Limit Theorem
For a random sample of size n from a
population with mean $\mu$ and variance $\sigma^2$ (both
finite), the sample mean $\bar{X}$ is approximately
normal when n is large (rule of thumb: n ≥ 30):
$\bar{X} \approx N(\mu, \sigma^2/n)$
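The theorem can be seen at work with a clearly non-normal population. The sketch below assumes a hypothetical exponential population with rate 1 (so mean and SD are both 1) and measures the share of sample means that fall within one standard error of μ; for a normal distribution that share is about 68%.

```python
import math
import random
import statistics

random.seed(4)

RATE = 1.0                 # hypothetical exponential population (assumption)
MU = SIGMA = 1.0 / RATE    # its mean and SD are both 1/RATE
REPS = 10000

def share_within_one_se(n):
    """Share of simulated sample means within one standard error of MU."""
    se = SIGMA / math.sqrt(n)
    means = (statistics.fmean(random.expovariate(RATE) for _ in range(n))
             for _ in range(REPS))
    return sum(abs(m - MU) <= se for m in means) / REPS

shares = {n: share_within_one_se(n) for n in (1, 30)}
for n, share in shares.items():
    print(f"n = {n:>2}: share within one SE = {share:.3f}")
```

With n = 1 the share is far from the normal benchmark because the population is strongly skewed; with n = 30 it is close to it.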
25
Importance & Takeaway From CLT
1. Justifies that the best way to
estimate a population mean $\mu$
is to use the sample mean $\bar{X}$
─ Centers around $\mu$
─ Smaller variation as n increases
─ Bell-shaped (by CLT)
27
Example 5: CLT
A recent report stated that the day-care cost per week in a region is
$109. Suppose this figure is taken as the mean cost per week and
that the standard deviation is known to be $20
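The question that accompanies this example is not shown here, so the following is only a sketch of how the CLT would be applied. The sample size of 64 and the $112 cut-off are hypothetical values chosen for illustration; only the mean ($109) and SD ($20) come from the example.

```python
import math
from statistics import NormalDist

MU = 109.0         # mean weekly day-care cost from the example ($)
SIGMA = 20.0       # known population SD from the example ($)
N = 64             # hypothetical sample size (assumption)
THRESHOLD = 112.0  # hypothetical cut-off in $ (assumption)

# By the CLT, the sample mean is approximately N(MU, SIGMA^2 / N).
se = SIGMA / math.sqrt(N)
sampling_dist = NormalDist(mu=MU, sigma=se)

p_above = 1 - sampling_dist.cdf(THRESHOLD)
print(f"standard error = {se:.2f}")
print(f"P(sample mean > {THRESHOLD}) = {p_above:.4f}")
```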
31
Example 6: Sample Proportion
Coke bottles are filled by a machine so that the contents X have a normal
distribution with mean 298 ml and SD 3 ml
What is the proportion of bottles with less than 295 ml?
Let X be the content (in ml) of any Coke bottle, then
$p = P(X < 295) = P\left(Z < \frac{295 - 298}{3}\right) = P(Z < -1) \approx 0.1587$
What if we have a carton of 100 bottles of Coke?
Example 6 (Cont’d 1)
Let $\hat{p}$ be the proportion of the 100 bottles with less than 295 ml
By CLT, since np = 100 × 0.1587 > 10 and n(1 − p) > 10,
$\hat{p} \approx N\left(p, \frac{p(1-p)}{n}\right)$
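The calculation can be sketched with the standard library. The population values (298 ml, 3 ml, 100 bottles) come from the example; the final question, the chance that more than 20% of a carton is under-filled, is a hypothetical illustration and not part of the original example.

```python
import math
from statistics import NormalDist

N = 100                                  # bottles per carton (from the example)
content = NormalDist(mu=298, sigma=3)    # content X in ml (from the example)

# Proportion of bottles with less than 295 ml: P(X < 295) = P(Z < -1).
p = content.cdf(295)

# Rule-of-thumb conditions for the normal approximation of the sample proportion.
conditions_ok = N * p > 10 and N * (1 - p) > 10

# Approximate sampling distribution of the sample proportion.
se_p_hat = math.sqrt(p * (1 - p) / N)
p_hat_dist = NormalDist(mu=p, sigma=se_p_hat)

# Hypothetical question (assumption): chance that more than 20% are under-filled.
p_more_than_20pct = 1 - p_hat_dist.cdf(0.20)

print(f"p = {p:.4f}, conditions met: {conditions_ok}")
print(f"SE of sample proportion = {se_p_hat:.4f}")
print(f"P(sample proportion > 0.20) = {p_more_than_20pct:.4f}")
```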
34
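Student’s t Statistic: Replacing an
Unknown σ with s
When the population SD σ is unknown, the standardized sample
mean with σ replaced by the sample SD s is
$T = \frac{\bar{X} - \mu}{s/\sqrt{n}}$
• $T \sim t_{n-1}$ (Student’s t with n − 1 degrees of freedom) when the population is normal
• $T$ is approximately $t_{n-1}$ (and close to N(0, 1)) when n ≥ 30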
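The effect of replacing σ with s can be seen by simulation. The sketch below assumes a hypothetical standard normal population and a hypothetical helper `t_statistic`; it estimates how often |T| exceeds 1.96, which would happen about 5% of the time if T were exactly standard normal.

```python
import math
import random

random.seed(5)

MU, SIGMA = 0.0, 1.0   # hypothetical normal population (assumption)
REPS = 10000

def t_statistic(sample, mu):
    """Standardized sample mean with sigma replaced by the sample SD s."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n))

def share_beyond(n, cutoff=1.96):
    """Share of simulated samples of size n with |T| above the cutoff."""
    count = 0
    for _ in range(REPS):
        sample = [random.gauss(MU, SIGMA) for _ in range(n)]
        if abs(t_statistic(sample, MU)) > cutoff:
            count += 1
    return count / REPS

tail_n5 = share_beyond(5)
tail_n30 = share_beyond(30)
print(f"n = 5:  share with |T| > 1.96 = {tail_n5:.3f}")
print(f"n = 30: share with |T| > 1.96 = {tail_n30:.3f}")
```

For small n the tails are noticeably heavier than the normal 5%, which is why the t distribution is used; by n = 30 the difference is small.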
35
SAMPLING
DISTRIBUTION OF
ANY STATISTIC
36
Are Other Statistics Approximately Normal?
[Figure: a population summary other than the mean and its sample counterpart, a statistic other than the sample mean]
40
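Statistic → Sampling distribution of the statistic
Sample mean
• Normal X? Yes; known σ? Yes → Slide 21
• Normal X? Yes; known σ? No; sample size n ≥ 30? No → Slide 35
• Normal X? Yes; known σ? No; sample size n ≥ 30? Yes → Slide 23 or Slide 35
• Normal X? No; sample size n ≥ 30? Yes → Slide 23 or Slide 35
• Normal X? No; sample size n ≥ 30? No → none listed
Summary statistic other than sample mean
• An approximate sampling distribution constructed via repeated sampling → Slide 37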
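Repeated sampling for a statistic other than the sample mean can be sketched as follows. The example assumes a hypothetical skewed population (exponential with rate 1, whose median is ln 2 ≈ 0.693) and a hypothetical helper `repeated_sampling`; the statistic here is the sample median, but any function of a sample could be passed in.

```python
import random
import statistics

random.seed(6)

def repeated_sampling(statistic, draw, n, reps):
    """Compute `statistic` on `reps` independent samples of size n."""
    return [statistic([draw() for _ in range(n)]) for _ in range(reps)]

# Hypothetical skewed population (assumption): exponential with rate 1.
def draw():
    return random.expovariate(1.0)

medians = repeated_sampling(statistics.median, draw, n=25, reps=5000)

centre = statistics.fmean(medians)   # centre of the approximate sampling distribution
spread = statistics.pstdev(medians)  # its spread (standard error of the sample median)
print(f"mean of sample medians = {centre:.3f}")
print(f"SD of sample medians   = {spread:.3f}")
```

A percentage histogram of `medians` gives the approximate sampling distribution of the sample median for n = 25.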
41
Takeaway
Sampling variation; repeated sampling
Sampling distribution of the sample mean
Central Limit Theorem
Sampling distribution of the sample proportion
Sampling distribution of other sample statistics
42