Considerations
Statistics is a Science of
Inference
• Statistical Inference: On the basis of sample statistics derived from limited and incomplete sample information
– Predict and forecast values of population parameters...
– Test hypotheses about values of population parameters...
– Make decisions...
Unbiased Sample: an unbiased, representative sample drawn at random from the entire population (Democrats and Republicans).
Biased Sample: a biased, unrepresentative sample drawn from people who have cars and/or telephones and/or read the Digest.
Sampling vs. Census ?
Go On-Line
www.surveysampling.com
Sampling Design Process
To obtain a representative
sample . . . .
Steps to follow:
Representative Sample
▪Population inferences can be made...
▪...by selecting a representative sample
from the population
Target Population
Sampling Unit
Sampling Frame
Sampling Methods
Go On-Line
www.svys.com
Probability
Non-Probability
Probability vs. Non-Probability Sampling
Convenience Sampling
Judgment Sampling
Simple Random Sampling
Random Sampling Techniques
▪ N = 30
▪ n=6
Simple Random Sampling:
Random Number Table
N = 30
n=6
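A random number table can be mimicked with a pseudo-random generator. The sketch below draws n = 6 units without replacement from a frame of N = 30; numbering the units 1 to 30 and the helper name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)
    frame = list(range(1, population_size + 1))  # hypothetical numbered sampling frame
    # random.sample selects without replacement, so every unit has the same chance
    return sorted(rng.sample(frame, sample_size))

sample = simple_random_sample(population_size=30, sample_size=6, seed=42)
print(sample)
```

Fixing the seed only makes the draw repeatable; omit it to get a fresh sample each run.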
Systematic Sampling
. . . a process that involves randomly selecting an initial starting point on a list, and thereafter every kth element in the sampling frame.
▪ $k = \frac{N}{n}$, where:
▪ n = sample size
▪ N = population size
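A minimal sketch of systematic selection, assuming a frame of N = 30 numbered units and n = 6 so that the sampling interval is N / n = 5. The function name and the choice to round the interval down when N is not a multiple of n are illustrative assumptions.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start, then take every k-th element of the frame."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and the frame length")
    rng = random.Random(seed)
    k = len(frame) // sample_size   # sampling interval, N / n rounded down
    start = rng.randrange(k)        # random starting point within the first k elements
    return [frame[start + i * k] for i in range(sample_size)]

frame = list(range(1, 31))          # hypothetical sampling frame, units 1..30
picked = systematic_sample(frame, sample_size=6, seed=1)
print(picked)
```

Only the starting point is random; once it is chosen, the rest of the sample is determined by the interval.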
Cluster Sampling
. . . a form of probability
sampling in which the
relatively homogeneous
individual clusters where
sampling occurs are chosen
randomly and not all
clusters are sampled.
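As a rough illustration of the definition, the sketch below picks whole clusters at random and keeps every element inside the chosen clusters (one-stage cluster sampling). The block names and household IDs are made-up data, and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and return all of their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # clusters are selected, not elements
    return {name: clusters[name] for name in chosen}

# Hypothetical population: households grouped by city block
blocks = {
    "block_A": [101, 102, 103],
    "block_B": [201, 202],
    "block_C": [301, 302, 303, 304],
    "block_D": [401, 402, 403],
}
selected = cluster_sample(blocks, n_clusters=2, seed=7)
print(selected)
```

Not all clusters end up in the sample, which is what keeps travel and administration costs down.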
Cluster Sampling
▪ Advantages
• More convenient for geographically
dispersed populations
• Reduced travel costs to contact sample
elements
• Simplified administration of the survey
• Unavailability of sampling frame prohibits
using other random sampling methods
Cluster Sampling
▪ Disadvantages
• Statistically less efficient when the cluster elements are similar
• Costs and problems of statistical analysis are greater than for simple random sampling
Determining sample size involves achieving a
balance between several factors:
Three decisions to make when statistical
formulas are used to determine sample size:
SAMPLING
DISTRIBUTION
=====================
Sample Statistics as Estimators of
Population Parameters
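When sampling from a normal population with mean $\mu$ and standard deviation $\sigma$, the sample mean has a normal sampling distribution (the second argument is the variance):
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
As the sample size $n$ increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean.
[Figure: f(X) for a normal population and the sampling distribution of the sample mean, n = 2]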
The Central Limit Theorem
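When sampling from a population with mean $\mu$ and finite standard deviation $\sigma$, the sampling distribution of the sample mean tends to a normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ as the sample size becomes large (n > 30).
[Figure: P(X) and f(X) panels showing the sampling distribution of the sample mean, including n = 5]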
The Central Limit Theorem Applies to
Sampling Distributions from Any Population
[Figure: population distribution and the sampling distributions of the sample mean for n = 2 and n = 30]
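The effect can be checked with a small simulation. The sketch below assumes a skewed exponential population with mean 1 and standard deviation 1 (an illustrative choice, not taken from the slides) and compares the simulated sample means for n = 2 and n = 30 with the theoretical standard deviation $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

def sample_means(n, reps, seed=None):
    """Return reps sample means, each from n draws of an Exponential(1) population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

results = {}
for n in (2, 30):
    means = sample_means(n, reps=10000, seed=n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    # Theoretical values: mean 1, standard deviation 1 / sqrt(n)
    print(f"n={n}: mean={results[n][0]:.3f}, sd={results[n][1]:.3f}, "
          f"theory sd={1 / math.sqrt(n):.3f}")
```

Plotting a histogram of `means` for each n would show the shape moving from skewed toward bell-shaped as n grows.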
Confidence Intervals
Sample
General Formula
• Confidence Level
• Confidence that the interval will contain the unknown population parameter
• A percentage (less than 100%)
Confidence Level, (1 − α) (continued)
Confidence intervals covered:
▪ Population Mean: σ Known, σ Unknown
▪ Population Proportion
Confidence Interval for μ (σ Known)
• Assumptions
• Population standard deviation σ is known
• Population is normally distributed
• If population is not normal, use large sample
• Consider a 95% confidence interval: $1 - \alpha = .95$, so $\alpha/2 = .025$ in each tail
• The corresponding critical value is Z = 1.96
Confidence Level    Confidence Coefficient, 1 − α    Z value
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.58
99.8%               .998                             3.09
99.9%               .999                             3.29
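The Z values can be recomputed from the inverse CDF of the standard normal distribution: for a confidence coefficient $1 - \alpha$, Z is the point with area $\alpha/2$ in the upper tail. This sketch uses Python's standard-library `statistics.NormalDist`; the helper name `z_value` is an assumption, and printed tables may differ in the last digit because of rounding.

```python
from statistics import NormalDist

def z_value(confidence_coefficient):
    """Critical Z with area alpha/2 in each tail of the standard normal."""
    # The cumulative area up to +Z is 1 - alpha/2 = (1 + (1 - alpha)) / 2
    return NormalDist().inv_cdf((1 + confidence_coefficient) / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  Z = {z_value(level):.3f}")
```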
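Intervals and Level of Confidence
[Figure: Sampling Distribution of the Mean, centered at $\mu_{\bar{X}} = \mu$, with area $1 - \alpha$ in the middle and $\alpha/2$ in each tail; confidence intervals built from sample means $\bar{x}_1, \bar{x}_2, \ldots$ are drawn beneath it]
Intervals extend from $\bar{X} - Z\frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z\frac{\sigma}{\sqrt{n}}$.
$(1-\alpha) \times 100\%$ of intervals constructed contain μ; $\alpha \times 100\%$ do not.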
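The claim about the share of intervals that contain μ can be illustrated by simulation. The sketch below repeatedly samples from a normal population with σ known, builds $\bar{X} \pm Z\sigma/\sqrt{n}$ each time, and counts how often the interval covers μ. The values μ = 50, σ = 8 and n = 25 are assumptions chosen only for the demonstration.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(mu, sigma, n, level, trials, seed=None):
    """Fraction of simulated sigma-known intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf((1 + level) / 2)   # critical value for the given level
    half_width = z * sigma / math.sqrt(n)       # Z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

rate = coverage_rate(mu=50.0, sigma=8.0, n=25, level=0.95, trials=5000, seed=0)
print(f"share of intervals containing mu: {rate:.3f}")
```

With a 95% level the printed share should land close to 0.95, varying slightly from run to run when the seed is changed.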
Example
Confidence Interval for μ
(σ Unknown)
(continued)
▪ Assumptions
▪ Population standard deviation is unknown
▪ Population is normally distributed
▪ If population is not normal, use large sample
▪ Use Student’s t Distribution
▪ Confidence Interval Estimate:
$\bar{X} \pm t_{n-1} \frac{S}{\sqrt{n}}$
(where t is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail)
Student’s t Distribution
d.f. = n - 1
Degrees of Freedom (df)
Idea: Number of observations that are free to vary
after sample mean has been calculated
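Example: Suppose the mean of 3 numbers is 8.0. Once any two of the numbers are chosen, the third is fixed by the mean, so only n − 1 = 2 values are free to vary.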
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.
[Figure: densities of the Standard Normal (t with df = ∞), t (df = 13), and t (df = 5), centered at 0 on the t axis]
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 7-54
Student’s t Table
Columns: Confidence Level, t (10 d.f.), t (20 d.f.), t (30 d.f.), Z
Note: t → Z as n increases
Example
A random sample of n = 25 has $\bar{X} = 50$ and S = 8. Form a 95% confidence interval for μ.
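One way to work this example is to plug the numbers into $\bar{X} \pm t_{n-1} S/\sqrt{n}$. The sketch below hard-codes the critical value 2.0639 for 24 d.f. with 0.025 in the upper tail, as read from a standard t table; that constant is an assumption of this sketch rather than something computed by the code.

```python
import math

n = 25          # sample size
x_bar = 50.0    # sample mean
s = 8.0         # sample standard deviation
t_crit = 2.0639 # assumed t table value: 24 d.f., area 0.025 in the upper tail

margin = t_crit * s / math.sqrt(n)            # t * S / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"95% confidence interval for mu: {lower:.3f} to {upper:.3f}")
```

The interval is slightly wider than the one Z = 1.96 would give, reflecting the extra uncertainty from estimating σ with S.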
Determining Sample Size
For the mean, the confidence interval is $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}$.
Sampling error (margin of error): $e = Z\frac{\sigma}{\sqrt{n}}$
Determining Sample Size
(continued)
For the mean: $e = Z\frac{\sigma}{\sqrt{n}}$
Now solve for n to get $n = \frac{Z^2 \sigma^2}{e^2}$
Determining Sample Size
(continued)
Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$
Since n must be a whole number, round up to n = 220.
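The same calculation as a short sketch, using the formula $n = Z^2\sigma^2/e^2$ and rounding up to the next whole unit. The function name `required_sample_size` is a hypothetical helper, not part of any library.

```python
import math

def required_sample_size(z, sigma, e):
    """Smallest whole n with Z * sigma / sqrt(n) no larger than e."""
    return math.ceil((z * sigma / e) ** 2)   # always round up

n_raw = (1.645 * 45 / 5) ** 2                # unrounded value from the formula
n_needed = required_sample_size(z=1.645, sigma=45, e=5)
print(f"unrounded n = {n_raw:.2f}, required sample size = {n_needed}")
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.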