• Aims of Sampling
• Probability Distributions
• Sampling Distributions
• The Central Limit Theorem
• Types of Samples
Aims of sampling
• Reduces the cost of research (e.g., political polls)
• Allows generalization about a larger population (e.g.,
the benefits of sampling a city rather than a single neighborhood)
• In some cases (e.g., industrial production) the
analysis may be destructive, so sampling is
needed
Probability
• Probability: what is the chance that a given
event will occur?
• Probability is expressed in numbers between 0
and 1. Probability = 0 means the event never
happens; probability = 1 means it always
happens.
• The total probability of all possible events
always sums to 1.
Probability distributions: Permutations
[Figure: bar chart of a probability distribution; x-axis: 0, 1, 2; y-axis: 0 to 0.4]
How about a family of three?

Num. girls   Child #1   Child #2   Child #3
0            B          B          B
1            B          B          G
1            B          G          B
1            G          B          B
2            B          G          G
2            G          B          G
2            G          G          B
3            G          G          G
Probability distribution of number of girls
[Figure: bar chart; P(0 girls) = 1/8, P(1) = 3/8, P(2) = 3/8, P(3) = 1/8; y-axis: 0 to 0.5]
How about a family of 10?
[Figure: bar chart of P(number of girls) for a family of 10; x-axis: 0 to 10; y-axis: 0 to 0.3]
As family size increases, the binomial distribution
looks more and more normal.
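The pattern above can be checked directly with a short Python sketch (standard library only) that computes the binomial probabilities for each family size:

```python
from math import comb

def binomial_pmf(n, k, p=0.5):
    """P(exactly k girls among n children), assuming P(girl) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Family of three: matches the 8 equally likely B/G orderings in the table.
print([binomial_pmf(3, k) for k in range(4)])   # [0.125, 0.375, 0.375, 0.125]

# Family of ten: the histogram is already close to a bell curve.
print([round(binomial_pmf(10, k), 3) for k in range(11)])
```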
Coin toss
[Figure: histogram, "Sampling Distribution of Income, 1992 (thousands)"; x-axis: 18 to 26; y-axis: number of samples]
Standard Deviation

s = √[ Σ(X − X̄)² / (n − 1) ] = √(67,300 / 4) = √16,825 = 129.71
Standard Deviation and Normal Distribution
Distribution of Sample Means with 21 Samples
[Figure: histogram of sample means, x-axis 37 to 46; S.D. = 2.02, mean of means = 41.0, number of means = 21]
Distribution of Sample Means with 96 Samples
[Figure: histogram of sample means, x-axis 37 to 46; S.D. = 1.80, mean of means = 41.12, number of means = 96]
Distribution of Sample Means with 170 Samples
[Figure: histogram of sample means, x-axis 37 to 46; S.D. = 1.71, mean of means = 41.12, number of means = 170]
The standard deviation of the sampling
distribution is called the standard error
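The sequence of histograms above can be reproduced with a minimal simulation (the population parameters below are hypothetical, Python standard library only): the standard deviation of many sample means lands near σ/√n.

```python
import random
from statistics import mean, stdev

random.seed(1)
mu, sigma, n = 50.0, 10.0, 25        # hypothetical population mean, SD, sample size

# Draw many samples of size n and record each sample mean.
sample_means = [mean(random.gauss(mu, sigma) for _ in range(n))
                for _ in range(2000)]

# The SD of the sample means (the standard error) should be near sigma/sqrt(n) = 2.
print(round(stdev(sample_means), 2))
```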
The Central Limit Theorem
Standard error can be estimated from a single sample:

SE(x̄) = s / √n

where
s is the sample standard deviation (i.e., the
sample-based estimate of the standard deviation of the
population), and
n is the size (number of observations) of the sample.
Confidence intervals
Because we know that the sampling distribution is normal,
we know that 95.45% of samples will fall within two
standard errors.
σ_x̄ = σ / √n

© 2002 The Wadsworth Group
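The 95.45% figure can be verified with Python's standard library, since it is just the area of the standard normal curve within ±2 standard deviations:

```python
from statistics import NormalDist

# Area of the standard normal distribution between z = -2 and z = +2.
coverage = NormalDist().cdf(2) - NormalDist().cdf(-2)
print(round(coverage, 4))   # 0.9545
```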
Standardizing a Sample Mean
on a Normal Curve
• The standardized z-score is how far the sample
mean is above or below the population mean, in
units of standard error.
– “How far above or below” = sample mean minus µ
– “In units of standard error” = divide by σ/√n
• Standardized sample mean:

z = (sample mean − µ) / standard error = (x̄ − µ) / (σ/√n)
Central Limit Theorem
• According to the Central Limit Theorem (CLT),
the larger the sample size, the more normal
the distribution of sample means becomes. The
CLT is central to the concept of statistical
inference because it permits us to draw
conclusions about the population based strictly
on sample data without having knowledge
about the distribution of the underlying
population.
Sampling Distribution of the Mean
• When the population is not normally distributed
– Shape: When the sample size taken from such a
population is sufficiently large, the distribution of
its sample means will be approximately normally
distributed regardless of the shape of the
underlying population those samples are taken
from. According to the Central Limit Theorem, the
larger the sample size, the more normal the
distribution of sample means becomes.
Sampling Distribution of the Mean
• When the population is not normally distributed
– Center: The mean of the distribution of sample
means is the mean of the population, µ. Sample
size does not affect the center of the distribution.
– Spread: The standard deviation of the distribution
of sample means, or the standard error, is
σ_x̄ = σ / √n
Example: Standardizing a Mean
• Problem: When a production machine is properly calibrated, it
requires an average of 25 seconds per unit produced, with a
standard deviation of 3 seconds. For a simple random sample of n
= 36 units, the sample mean is found to be 26.2 seconds per unit.
When the machine is properly calibrated, what is the probability
that the mean for a simple random sample of this size will be at
least 26.2 seconds?
– Sample mean: x̄ = 26.2; population mean: µ = 25
– Standardized mean:

z = (x̄ − µ) / (σ/√n) = (26.2 − 25) / (3/√36) = 1.2 / 0.5 = 2.40

P(x̄ ≥ 26.2) = P(z ≥ 2.40) = 0.0082
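The machine-calibration calculation can be reproduced with Python's standard library:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 25, 3, 36, 26.2   # values from the example above

se = sigma / sqrt(n)                 # 3 / 6 = 0.5
z = (xbar - mu0) / se                # (26.2 - 25) / 0.5 = 2.40
p = 1 - NormalDist().cdf(z)          # P(z >= 2.40)
print(round(z, 2), round(p, 4))      # 2.4 0.0082
```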
Sampling Distribution of the Proportion
• When the sample statistic is generated by a count
rather than a measurement, the proportion of successes in a
sample of n trials is p = x/n, where x is the number of successes
– Shape: Whenever both nπ and n(1 – π) are
greater than or equal to 5, the distribution of
sample proportions will be approximately
normally distributed.
Sampling Distribution of the Proportion
• When the sample proportion of successes in a sample
of n trials is p,
– Center: The center of the distribution of sample
proportions is the population proportion, π.
– Spread: The standard deviation of the distribution
of sample proportions, or the standard error, is

σ_p = √[ π(1 – π) / n ]
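A small sketch of the standard error of a proportion and the normal-approximation rule of thumb (the numeric values below are hypothetical):

```python
from math import sqrt

def proportion_se(pi, n):
    """Standard error of a sample proportion: sqrt(pi * (1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)

def normal_ok(pi, n):
    """Rule of thumb: both n*pi and n*(1 - pi) must be at least 5."""
    return n * pi >= 5 and n * (1 - pi) >= 5

print(round(proportion_se(0.5, 100), 3))          # 0.05
print(normal_ok(0.5, 100), normal_ok(0.02, 100))  # True False
```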
Standardizing a Sample Proportion on a Normal Curve
Hypothesis Testing
• The general goal of a hypothesis test is to rule
out chance (sampling error) as a plausible
explanation for the results from a research
study.
• Hypothesis testing is a technique to help
determine whether a specific treatment has
an effect on the individuals in a population.
Hypothesis Testing
The hypothesis test is used to evaluate the
results from a research study in which
1. A sample is selected from the
population.
2. The treatment is administered to the
sample.
3. After treatment, the individuals in the
sample are measured.
Hypothesis Testing (cont.)
• If the individuals in the sample are noticeably
different from the individuals in the original
population, we have evidence that the
treatment has an effect.
• However, it is also possible that the difference
between the sample and the population is
simply sampling error.
Hypothesis Testing (cont.)
• The purpose of the hypothesis test is to decide
between two explanations:
1. The difference between the sample and the
population can be explained by sampling error
(there does not appear to be a treatment effect)
2. The difference between the sample and the
population is too large to be
explained by sampling error (there does
appear to be a treatment effect).
The Null Hypothesis, the Alpha Level, the Critical
Region, and the Test Statistic
Step 4
A large value for the test statistic shows that the
obtained mean difference is more than would be
expected if there is no treatment effect. If it is large
enough to be in the critical region, we conclude that
the difference is significant or that the treatment has
a significant effect. In this case we reject the null
hypothesis. If the mean difference is relatively
small, then the test statistic will have a low value. In
this case, we conclude that the evidence from the
sample is not sufficient, and the decision is fail to
reject the null hypothesis.
Errors in Hypothesis Tests
• Just because the sample mean (following
treatment) is different from the original
population mean does not necessarily indicate
that the treatment has caused a change.
• You should recall that there usually is some
discrepancy between a sample mean and the
population mean simply as a result of
sampling error.
Errors in Hypothesis Tests (cont.)
• Because the hypothesis test relies on sample
data, and because sample data are not
completely reliable, there is always the risk
that misleading data will cause the hypothesis
test to reach a wrong conclusion.
• Two types of error are possible.
Type I Errors
• A Type I error occurs when the sample data appear to show
a treatment effect when, in fact, there is none.
• In this case the researcher will reject the null hypothesis and
falsely conclude that the treatment has an effect.
• Type I errors are caused by unusual, unrepresentative
samples. Just by chance the researcher selects an extreme
sample with the result that the sample falls in the critical
region even though the treatment has no effect.
• The hypothesis test is structured so that Type I errors are
very unlikely; specifically, the probability of a Type I error is
equal to the alpha level.
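The last point can be demonstrated with a quick simulation (hypothetical population values, Python standard library): when H0 is true, a two-sided z test at α = 0.05 rejects about 5% of the time.

```python
import random
from math import sqrt
from statistics import mean

random.seed(42)
mu0, sigma, n = 100, 15, 25         # hypothetical population; H0 is TRUE here
z_crit = 1.96                        # two-sided cutoff at alpha = 0.05

trials = 10000
rejections = 0
for _ in range(trials):
    xbar = mean(random.gauss(mu0, sigma) for _ in range(n))
    z = (xbar - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1              # a Type I error, since H0 is actually true

print(rejections / trials)           # close to alpha = 0.05
```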
Type II Errors
• A Type II error occurs when the sample does not
appear to have been affected by the treatment
when, in fact, the treatment does have an effect.
• In this case, the researcher will fail to reject the null
hypothesis and falsely conclude that the treatment
does not have an effect.
• Type II errors are commonly the result of a very small
treatment effect. Although the treatment does have
an effect, it is not large enough to show up in the
research study.
Directional Tests
• When a research study predicts a specific
direction for the treatment effect (increase or
decrease), it is possible to incorporate the
directional prediction into the hypothesis test.
Measuring Effect Size
• A hypothesis test evaluates the statistical
significance of the results from a research study.
• That is, the test determines whether or not it is likely
that the obtained sample mean occurred without
any contribution from a treatment effect.
• The hypothesis test is influenced not only by the size
of the treatment effect but also by the size of the
sample.
• Thus, even a very small effect can be significant if it is
observed in a very large sample.
Measuring Effect Size
• Because a significant effect does not necessarily
mean a large effect, it is recommended that the
hypothesis test be accompanied by a measure of the
effect size.
• We use Cohen's d as a standardized measure of
effect size.
• Much like a z-score, Cohen's d measures the size of
the mean difference in terms of the standard
deviation.
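A minimal sketch of Cohen's d for a one-sample design (the numbers below are hypothetical):

```python
def cohens_d(sample_mean, mu, s):
    """Cohen's d: the mean difference in units of the standard deviation."""
    return (sample_mean - mu) / s

# Hypothetical numbers: a 6-point gain with s = 12 gives d = 0.5,
# conventionally considered a medium-sized effect.
print(cohens_d(56, 50, 12))   # 0.5
```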
Power of a Hypothesis Test
• The power of a hypothesis test is defined as
the probability that the test will reject the null
hypothesis when the treatment does have an
effect.
• The power of a test depends on a variety of
factors including the size of the treatment
effect and the size of the sample.
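For the one-sample z test covered later in these slides, power can be computed analytically. The sketch below (Python standard library) uses a hypothetical true mean of 180 against H0: µ = 170 with σ = 40 to show that a larger sample raises power:

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a one-sided (upper-tail) one-sample z test."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    xbar_crit = mu0 + z_crit * se              # same cutoff on the raw scale
    # Probability a sample mean exceeds the cutoff when the true mean is mu_true.
    return 1 - NormalDist(mu_true, se).cdf(xbar_crit)

# Larger samples raise power (effect and sigma held fixed):
print(round(power_one_sided_z(170, 180, 40, 64), 2))    # 0.64
print(round(power_one_sided_z(170, 180, 40, 256), 2))   # 0.99
```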
Hypothesis Testing
Outline
• Null and Alternative Hypotheses
• Test Statistic
• P-Value
• Significance Level
• One-Sample z Test
• Power and Sample Size
Terms Introduced in Prior Chapter
• Population: all possible values
• Sample: a portion of the population
• Statistical inference: generalizing from a sample
to a population with a calculated degree of certainty
• Two forms of statistical inference
– Hypothesis testing
– Estimation
• Parameter: a characteristic of the population, e.g., population
mean µ
• Statistic: calculated from data in the sample, e.g., sample
mean (x̄)
Distinctions Between Parameters and Statistics

              Parameters   Statistics
Vary          No           Yes
Calculated    No           Yes
Sampling Distributions of a Mean

x̄ ~ N(µ, SE_x̄), where SE_x̄ = σ / √n
Hypothesis Testing
• Is also called significance testing
• Tests a claim about a parameter using
evidence (data in a sample)
• The technique is introduced by considering a
one-sample z test
• The procedure is broken into four steps
• Each element of the procedure must be
understood
Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Test statistic
C. P-value and interpretation
D. Significance level (optional)
Null and Alternative Hypotheses
• Convert the research question to null and
alternative hypotheses
• The null hypothesis (H0) is a claim of “no
difference in the population”
• The alternative hypothesis (Ha) claims “H0 is
false”
• Collect data and seek evidence against H0 as a
way of bolstering Ha (deduction)
Illustrative Example: “Body Weight”
• The problem: In the 1970s, 20–29 year old
men in the U.S. had a mean μ body weight of
170 pounds. Standard deviation σ was 40
pounds. We test whether mean body weight
in the population now differs.
• Null hypothesis H0: μ = 170 (“no difference”)
• The alternative hypothesis can be either Ha: μ
> 170 (one-sided test) or
Ha: μ ≠ 170 (two-sided test)
Test Statistic
This is an example of a one-sample test of a
mean when σ is known. Use this statistic to
test the problem:
z_stat = (x̄ − µ0) / SE_x̄

where µ0 = population mean assuming H0 is true,
and SE_x̄ = σ / √n
Illustrative Example: z statistic
• For the illustrative example, μ0 = 170
• We know σ = 40
• Take an SRS of n = 64. Therefore
SE_x̄ = σ / √n = 40 / √64 = 5

• If we found a sample mean of 173, then

z_stat = (x̄ − µ0) / SE_x̄ = (173 − 170) / 5 = 0.60
Illustrative Example: z statistic
If we found a sample mean of 185, then

z_stat = (x̄ − µ0) / SE_x̄ = (185 − 170) / 5 = 3.00
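Both z-statistic calculations for the body-weight example fit in one short Python sketch:

```python
from math import sqrt

mu0, sigma, n = 170, 40, 64          # H0: mu = 170, sigma known, SRS of 64
se = sigma / sqrt(n)                 # 40 / 8 = 5.0

for xbar in (173, 185):
    z = (xbar - mu0) / se
    print(xbar, z)                   # 173 0.6, then 185 3.0
```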
Reasoning Behind z_stat

x̄ ~ N(170, 5)

Sampling distribution of x̄ under H0: µ = 170 for n = 64
P-value
• The P-value answers the question: What is the
probability of the observed test statistic or one
more extreme when H0 is true?
• This corresponds to the AUC (area under the curve) in the tail of the
Standard Normal distribution beyond the z_stat.
• Converting z statistics to P-values:
For Ha: µ > µ0, P = Pr(Z > z_stat) = right tail beyond z_stat
For Ha: µ < µ0, P = Pr(Z < z_stat) = left tail beyond z_stat
For Ha: µ ≠ µ0, P = 2 × one-tailed P-value
One-sided P-value for zstat of 0.6
One-sided P-value for zstat of 3.0
Two-Sided P-Value
• One-sided Ha: AUC in the tail beyond z_stat
• Two-sided Ha: consider potential deviations in both
directions; double the one-sided P-value

Examples: If one-sided P = 0.0010, then two-sided
P = 2 × 0.0010 = 0.0020. If one-sided P = 0.2743,
then two-sided P = 2 × 0.2743 = 0.5486.
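These conversions can be sketched as a small Python helper (standard library only); note it computes the two-sided P directly, so the last digit can differ slightly from doubling an already-rounded one-sided P:

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """Convert a z statistic to a P-value for the chosen alternative."""
    if tail == "upper":                          # Ha: mu > mu0
        return 1 - NormalDist().cdf(z)
    if tail == "lower":                          # Ha: mu < mu0
        return NormalDist().cdf(z)
    return 2 * (1 - NormalDist().cdf(abs(z)))   # Ha: mu != mu0

print(round(p_value(0.6, "upper"), 4))   # 0.2743
print(round(p_value(3.0, "upper"), 4))   # 0.0013
print(round(p_value(0.6, "two"), 4))     # 0.5485
```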
Interpretation
• The P-value answers the question: What is the
probability of the observed test statistic …
when H0 is true?
• Thus, smaller and smaller P-values provide
stronger and stronger evidence against H0
• Small P-value → strong evidence against H0
Interpretation
Conventions*
P > 0.10: non-significant evidence against H0
0.05 < P ≤ 0.10: marginally significant evidence against H0
0.01 < P ≤ 0.05: significant evidence against H0
P ≤ 0.01: highly significant evidence against H0
Examples
P = 0.27: non-significant evidence against H0
P = 0.01: highly significant evidence against H0
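The full procedure, applied to the body-weight example and labeled with the conventions above, can be sketched as one function (a two-sided test, Python standard library):

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n):
    """One-sample z test (two-sided): returns z statistic, P-value, verdict."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    if p > 0.10:
        verdict = "non-significant evidence against H0"
    elif p > 0.05:
        verdict = "marginally significant evidence against H0"
    elif p > 0.01:
        verdict = "significant evidence against H0"
    else:
        verdict = "highly significant evidence against H0"
    return z, p, verdict

# Body-weight example: H0: mu = 170, sigma = 40, n = 64.
print(one_sample_z_test(173, 170, 40, 64))   # z = 0.6, P about 0.55: non-significant
print(one_sample_z_test(185, 170, 40, 64))   # z = 3.0, P about 0.003: highly significant
```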