MATH2114 Week 10, Tuesday A&B: Fundamentals of Hypothesis Testing

MATH2114
Week 10, Tuesday A&B

Fundamentals of Hypothesis Testing
Slides adapted and modified from “Business Statistics: A First Course”, 7th Ed., by Levine, Szabat and Stephan, 2016, Pearson Education Inc.
Objectives
 The basic principles of hypothesis testing

 How to use hypothesis testing to test a mean or proportion
 The assumptions of each hypothesis-testing procedure, how to evaluate them,
and the consequences if they are seriously violated
 Pitfalls & ethical issues involved in hypothesis testing
 How to avoid the pitfalls involved in hypothesis testing
DCOVA
What is a Hypothesis?
▪ A hypothesis is a claim (assertion) about a population parameter:
  ▪ Population Mean
    Example: The mean monthly cell phone bill in this city is μ = $42
  ▪ Population Proportion
    Example: The proportion of adults in this city with cell phones is π = 0.68
The Hypothesis Testing Philosophy
Null Hypothesis, H0
▪ States the claim or assertion to be tested
  ▪ E.g. The mean diameter of a manufactured bolt is 30mm: H0: μ = 30
▪ Is always about a population parameter, not about a sample statistic
  ▪ Correct: H0: μ = 30. Incorrect: H0: X̄ = 30
Null Hypothesis, H0
▪ Begin with the assumption that the null hypothesis is true
  ▪ Similar to the notion of innocent until proven guilty
▪ Refers to the status quo, or historical value
▪ Always and ONLY contains “=”
  ▪ A null hypothesis is precise
  ▪ Precision is needed to calculate the numbers
▪ May or may not be rejected, depending on data/evidence

The Alternative Hypothesis, H1
▪ Differs from the null hypothesis.

▪ Example: The mean diameter of a manufactured bolt is not equal to 30mm: H1: μ ≠ 30
▪ Challenges the status quo.
▪ Never contains the “=” sign.

▪ May contain the “≠”, or “<”, or “>” sign.
▪ May or may not be proven.
▪ Is generally the hypothesis that the researcher is trying to prove.
The Alternative Hypothesis, H1
 Claim: The population mean age is 50.

 H0: μ = 50, H1: μ ≠ 50
 Sample the population and find the sample mean.
(Figure: a sample drawn from the population)
▪ Suppose the sample mean age was X̄ = 20.
▪ This is far lower than the claimed population mean age of 50.
▪ If the null hypothesis were true, the probability of getting a sample mean this different would be very small, so you reject the null hypothesis.
▪ In other words, getting a sample mean of 20 is so unlikely if the population mean were 50 that you conclude the population mean must not be 50.
Sampling Distribution of X̄
(Figure: sampling distribution of X̄ centred at μ = 50, as it would be if H0 is true, with the observed sample mean of 20 far out in the lower tail.)
If it is unlikely that you would get a sample mean of this value when in fact μ = 50 were the population mean, then you reject the null hypothesis that μ = 50.
Test Statistic and Critical Values
▪ How far is “far enough” to reject H0?
▪ The critical value of a test statistic creates a “line in the sand” for decision making: it answers the question of how far is far enough.
Critical Values
(Figure: sampling distribution of the test statistic, with a region of non-rejection in the centre bounded by two critical values and a region of rejection in each tail.)
Values in the rejection regions are “too far away” from the mean of the sampling distribution.
Risks in Decision Making
▪ Type I Error
  ▪ Reject a true null hypothesis
  ▪ A Type I error is a “false alarm”
  ▪ The probability of a Type I error is α
    ▪ Called the level of significance of the test
    ▪ Set by the researcher in advance
▪ Type II Error
  ▪ Failure to reject a false null hypothesis
  ▪ A Type II error represents a “missed opportunity”
  ▪ The probability of a Type II error is β
Risks in Decision Making
Possible Hypothesis Test Outcomes

                     Actual Situation
Decision             H0 True                          H0 False
Do Not Reject H0     No Error (Probability 1 - α)     Type II Error (Probability β)
Reject H0            Type I Error (Probability α)     No Error (Power 1 - β)
Measures in Decision Making
▪ The confidence coefficient (1 - α) is the probability of not rejecting H0 when it is true.
▪ The confidence level of a hypothesis test is (1 - α) × 100%.
▪ The power of a statistical test (1 - β) is the probability of rejecting H0 when it is false.
Type I & II Error Relationship
▪ Type I and Type II errors cannot happen at the same time
  ▪ A Type I error can only occur if H0 is true
  ▪ A Type II error can only occur if H0 is false
▪ If the Type I error probability (α) increases, then the Type II error probability (β) decreases, all else being equal
A game of trade-offs: Shrinking areas
(Figure: four panels, each showing the α and β areas under two overlapping sampling distributions.)
▪ Decrease α (increases β)
▪ Decrease β (increases α)
▪ Increase the size of the effect (largely impossible)
▪ Reduce SE for better distinction (increase sample size)
Hypothesis test for the mean
Hypothesis Tests for μ:
▪ σ Known (Z test)
▪ σ Unknown (t test)
Z Test of Hypothesis for the Mean (σ Known)
▪ Convert the sample statistic (X̄) to a ZSTAT test statistic:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Critical Value Approach
▪ For a two-tail test for the mean, σ known:
  ▪ Convert the sample statistic (X̄) to a test statistic (ZSTAT)
  ▪ Determine the critical Z values for a specified level of significance α from a table or by using computer software
  ▪ Decision rule: If the test statistic falls in the rejection region, reject H0; otherwise do not reject H0
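Two-Tail Tests
▪ There are two cutoff values (critical values) defining the rejection regions
H0: μ = 30
H1: μ ≠ 30
(Figure: sampling distribution of X̄ centred at 30 with area α/2 in each tail. On the Z scale: reject H0 below the lower critical value -Zα/2, do not reject H0 between -Zα/2 and +Zα/2, reject H0 above the upper critical value +Zα/2.)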
6 Steps in Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n. The level of significance is based on the relative importance of Type I and Type II errors
3. Determine the appropriate test statistic and sampling distribution
4. Determine the critical values that divide the rejection and nonrejection regions
5. Collect data and compute the value of the test statistic
6. Make the statistical decision and state a conclusion. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis. Express the conclusion in the context of the problem
Example
Test the claim that the true mean diameter of a manufactured bolt is 30mm. (Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
  ▪ H0: μ = 30, H1: μ ≠ 30 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
  ▪ Suppose that α = 0.05 and n = 100 are chosen for this test
Example
3. Determine the appropriate technique
  ▪ σ is assumed known, so this is a Z test.
4. Determine the critical values
  ▪ For α = 0.05 the critical Z values are ±1.96
5. Collect the data and compute the test statistic
  ▪ Suppose the sample results are n = 100, X̄ = 29.84 (σ = 0.8 is assumed known)
So the test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{29.84 - 30}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
Example
6. Is the test statistic in the rejection region?
Decision rule: reject H0 if ZSTAT < -1.96 or ZSTAT > 1.96; otherwise do not reject H0.
(Figure: standard normal curve with α/2 = 0.025 in each tail; reject H0 below -Zα/2 = -1.96, do not reject H0 between the critical values, reject H0 above +Zα/2 = +1.96.)
Here, ZSTAT = -2.0 < -1.96, so the test statistic is in the rejection region.
Example
6 (continued). Reach a decision and interpret the result
(Figure: standard normal curve with α/2 = 0.05/2 in each tail and critical values -Zα/2 = -1.96 and +Zα/2 = +1.96; ZSTAT = -2.0 is marked in the lower rejection region.)
Since ZSTAT = -2.0 < -1.96, reject the null hypothesis and conclude there is statistically sufficient evidence that the mean diameter of a manufactured bolt is not equal to 30mm.
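The critical value approach for the bolt example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_test_two_tail` and its parameter names are illustrative choices, not part of the slides or the textbook.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tail(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-tail Z test for the mean, sigma known (critical value approach)."""
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    # Upper critical value; the lower critical value is its negative.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject H0 when the statistic falls in either tail.
    reject = abs(z_stat) > z_crit
    return z_stat, z_crit, reject


# Values from the bolt example: X-bar = 29.84, mu0 = 30, sigma = 0.8, n = 100.
z_stat, z_crit, reject = z_test_two_tail(x_bar=29.84, mu0=30, sigma=0.8, n=100)
print(f"ZSTAT = {z_stat:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
```

With the example's inputs this should agree with the slides: ZSTAT of about -2.0 against critical values of about ±1.96, so H0 is rejected.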
Do you ever truly know σ?
▪ Probably not!
▪ In virtually all real-world engineering / business situations, σ is not known.
▪ If there is a situation where σ is known, then µ is also known (since to calculate σ you need to know µ).
▪ If you truly know µ, there would be no need to gather a sample to estimate it.
Hypothesis Testing where σ is unknown
▪ If the population standard deviation is unknown, you instead use the sample standard deviation S.
▪ Because of this change, you use the t distribution instead of the Z distribution to test the null hypothesis about the mean:
$$t_{STAT} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
▪ When using the t distribution you must assume that the population you are sampling from follows a normal distribution (or that the sample is large).
▪ All other steps, concepts, and conclusions are the same.

Example
The average cost of a hotel room in New York is said to be $168 per night. To determine if this is true, a random sample of 25 hotels is taken, resulting in an X̄ of $172.50 and an S of $15.40. Test the appropriate hypotheses at α = 0.05. (Assume the population distribution is normal)
H0: μ = 168
H1: μ ≠ 168
Example
H0: μ = 168
H1: μ ≠ 168
▪ α = 0.05
▪ n = 25, df = 25 - 1 = 24
▪ σ is unknown, so use a t statistic
▪ Critical value: ±t24,0.025 = ±2.0639
$$t_{STAT} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$
(Figure: t distribution with 24 degrees of freedom and α/2 = .025 in each tail; reject H0 below -2.0639 or above 2.0639; tSTAT = 1.46 falls in the non-rejection region.)
With tSTAT = 1.46 < tα/2 = 2.0639, do not reject H0. There is statistically insufficient evidence that the true mean cost is different from $168.
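The hotel example's arithmetic can be reproduced in a few lines. This is a sketch only: the Python standard library has no t distribution, so the critical value 2.0639 is copied from the slide rather than computed, and the names `t_statistic` and `T_CRIT_24_0025` are illustrative.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """t test statistic for the mean when sigma is unknown."""
    return (x_bar - mu0) / (s / sqrt(n))


# Critical value t(24, 0.025) as quoted on the slide (assumed, not computed here).
T_CRIT_24_0025 = 2.0639

# Values from the hotel example: X-bar = 172.50, mu0 = 168, S = 15.40, n = 25.
t_stat = t_statistic(x_bar=172.50, mu0=168, s=15.40, n=25)

# Two-tail decision rule: reject H0 only if |tSTAT| exceeds the critical value.
reject = abs(t_stat) > T_CRIT_24_0025
print(f"tSTAT = {t_stat:.2f}, reject H0: {reject}")
```

The standard error here is 15.40 / 5 = 3.08, so the statistic is 4.50 / 3.08, which rounds to the 1.46 shown on the slide.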
Normality Assumption
 As long as the sample size is not very small and the population is not very
skewed, the t-test can be used.
 To evaluate the normality assumption:
 Determine how closely sample statistics match the normal distribution’s
theoretical properties.
 Construct a histogram or stem-and-leaf display or boxplot or a normal probability
plot.
 Produce a normal probability plot
10 Minute Break
Lady Tasting Tea
A historical story in randomised trials
▪ Bristol claims that she can taste whether milk or tea was poured into a cup first
▪ Fisher provided her with 8 cups, 4 with milk first, 4 with tea first
  ▪ Random order, presented all at once
  ▪ Bristol had to choose 4 cups prepared by one method
▪ H0: Bristol can’t tell and is just guessing:
$$\binom{8}{4} = \frac{8!}{4!\,(8-4)!} = 70 \text{ combinations}$$
▪ So, under H0 there is a 1/70 chance of selecting exactly the 4 cups of one given method, or a 2/70 = 1/35 chance of selecting all 4 cups of either method
(Photos: Sir Ronald Fisher; Dr. Muriel Bristol)
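The counting argument for the tea experiment can be verified directly. A minimal sketch, assuming only that every choice of 4 cups out of 8 is equally likely under H0; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Number of ways to choose 4 cups out of 8 when order does not matter.
n_ways = comb(8, 4)

# Chance of picking exactly the 4 cups of one given method by pure guessing.
p_one_method = Fraction(1, n_ways)

# Chance of picking 4 cups that all share a method (either of the two).
p_either_method = Fraction(2, n_ways)

print(n_ways, p_one_method, p_either_method)
```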

p-Value approach to testing
▪ p-value: the probability of obtaining a test statistic equal to or more extreme than the observed sample value, given H0 is true
▪ The p-value is also called the observed level of significance
▪ It is the smallest value of α for which H0 can be rejected
p-Value approach to testing
▪ Compare the p-value with α
  ▪ If p-value < α, reject H0
  ▪ If p-value ≥ α, do not reject H0
▪ Remember
  ▪ If the p-value is low then H0 must go
The 5 Step p-value approach to Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n. The level of significance is based on the relative importance of the risks of a Type I and a Type II error.
3. Determine the appropriate test statistic and sampling distribution
4. Collect data and compute the value of the test statistic and the p-value
5. Make the statistical decision and state a conclusion. If the p-value is < α then reject H0, otherwise do not reject H0. State the conclusion in the context of the problem
Calculating p-values in Excel
t Test for the Hypothesis of the Mean

Data
  Null Hypothesis µ =              $168.00
  Level of Significance            0.05
  Sample Size                      25
  Sample Mean                      $172.50
  Sample Standard Deviation        $15.40
Intermediate Calculations
  Standard Error of the Mean       $3.08      =B8/SQRT(B6)
  Degrees of Freedom               24         =B6-1
  t Test Statistic                 1.46       =(B7-B4)/B11
Two-Tail Test
  Lower Critical Value             -2.0639    =-TINV(B5,B12)
  Upper Critical Value             2.0639     =TINV(B5,B12)
  p-value                          0.157      =TDIST(ABS(B13),B12,2)
  Do Not Reject Null Hypothesis               =IF(B18<B5, "Reject null hypothesis", "Do not reject null hypothesis")

p-value = 0.157 > α = 0.05, so do not reject H0.
Example
Test the claim that the true mean diameter of a manufactured bolt is 30mm. (Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
  ▪ H0: μ = 30, H1: μ ≠ 30 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
  ▪ Suppose that α = 0.05 and n = 100 are chosen for this test
Example
3. Determine the appropriate technique
  ▪ σ is assumed known, so this is a Z test.
4. Collect the data, compute the test statistic and the p-value
  ▪ Suppose the sample results are n = 100, X̄ = 29.84 (σ = 0.8 is assumed known)
So the test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{29.84 - 30}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
Example
4. (continued) Calculate the p-value.
  ▪ How likely is it to get a ZSTAT of -2 (or something further from the mean (0), in either direction) if H0 is true?
(Figure: standard normal curve with both tails shaded, below Z = -2.0 and above Z = 2.0.)
P(Z < -2.0) = 0.0228 and P(Z > 2.0) = 0.0228
p-value = 0.0228 + 0.0228 = 0.0456
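The two-tail p-value can also be computed from the standard normal CDF instead of a table. A minimal sketch using the standard library; `two_tail_p_value` is an illustrative name, not something defined in the slides.

```python
from statistics import NormalDist


def two_tail_p_value(z_stat):
    """P(|Z| >= |z_stat|) under the standard normal distribution."""
    # By symmetry, both tails have the same area, so double the lower tail.
    return 2 * NormalDist().cdf(-abs(z_stat))


p_value = two_tail_p_value(-2.0)
print(f"p-value = {p_value:.4f}")
```

The unrounded result is slightly below the slide's 0.0456, because the slide adds two tail areas that were each already rounded to 0.0228. The decision at α = 0.05 is the same either way.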
Example
▪ 5. Is the p-value < α?
▪ Since p-value = 0.0456 < α = 0.05, reject H0
▪ 5. (continued) State the conclusion in the context of the situation:
With α = 0.05 and p-value = 0.0456, we reject the null hypothesis in favour of the alternative hypothesis. There is statistically sufficient evidence to conclude the mean diameter of a manufactured bolt is not equal to 30mm (in other words, the manufacturer’s claim is not true).
Connection between Two-Tail Tests and Confidence Intervals
▪ For X̄ = 29.84, σ = 0.8 and n = 100, the 95% confidence interval is:
$$29.84 - (1.96)\frac{0.8}{\sqrt{100}} \le \mu \le 29.84 + (1.96)\frac{0.8}{\sqrt{100}}$$
29.6832 ≤ μ ≤ 29.9968
▪ Since this interval does not contain the hypothesized mean (30), we reject the null hypothesis at α = 0.05
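The link between the confidence interval and the two-tail test can be sketched as follows. This assumes σ is known, as in the bolt example; the function name `z_confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    # Critical Z value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


lower, upper = z_confidence_interval(x_bar=29.84, sigma=0.8, n=100)

# H0: mu = 30 is rejected at alpha = 0.05 exactly when 30 lies outside the interval.
contains_h0 = lower <= 30 <= upper
print(f"{lower:.4f} <= mu <= {upper:.4f}, contains 30: {contains_h0}")
```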
One-Tail Tests
▪ In many cases, the alternative hypothesis focuses on a particular direction
H0: μ = 3, H1: μ < 3. This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
H0: μ = 3, H1: μ > 3. This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3
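Lower-Tail Tests
H0: μ = 3
H1: μ < 3
▪ There is only one critical value, since the rejection area is in only one tail
(Figure: sampling distribution of X̄ centred at μ, with area α in the lower tail. On the Z or t scale: reject H0 below the critical value -Zα or -tα, do not reject H0 above it.)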
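Upper-Tail Tests
H0: μ = 3
H1: μ > 3
▪ There is only one critical value, since the rejection area is in only one tail
(Figure: sampling distribution of X̄ centred at μ, with area α in the upper tail. On the Z or t scale: do not reject H0 below the critical value Zα or tα, reject H0 above it.)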
Example
A phone industry manager thinks that customer monthly cell
phone bills have increased, and now average over $52 per
month. The company wishes to test this claim. (Assume a
normal population)
Form hypothesis test:

H0: μ = 52 the mean is $52 per month
H1: μ > 52 the mean is greater than $52 per month
(i.e., sufficient evidence exists to support the
manager’s claim)
Example
▪ Suppose that α = 0.10 is chosen for this test and n = 25.
Find the rejection region:
(Figure: t distribution with the upper-tail rejection region of area α = 0.10 to the right of the critical value 1.318.)
Reject H0 if tSTAT > tcritical = 1.318
Example
Obtain sample and compute the test statistic
Suppose a sample is taken with the following results: n = 25, X̄ = 53.1, and S = 10
▪ Then the test statistic is:
$$t_{STAT} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{25}} = 0.55$$
Example
Reach a decision and interpret the result:
(Figure: t distribution with the upper-tail rejection region of area α = 0.10 to the right of 1.318; tSTAT = 0.55 lies in the non-rejection region.)
Conclusion:
Do not reject H0 since tSTAT = 0.55 < tcritical = 1.318.
With α = 0.10, there is no statistically sufficient evidence that the mean bill is over $52.
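The upper-tail decision for the phone bill example can be sketched in the same way. The critical value 1.318 is taken from the slide (df = 24, α = 0.10) rather than computed, since the Python standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

# Upper-tail critical value as quoted on the slide (assumed, not computed here).
T_CRITICAL = 1.318

# Values from the example: X-bar = 53.1, mu0 = 52, S = 10, n = 25.
t_stat = (53.1 - 52) / (10 / sqrt(25))

# Upper-tail rule: only large positive values of tSTAT lead to rejection.
reject = t_stat > T_CRITICAL
print(f"tSTAT = {t_stat:.2f}, reject H0: {reject}")
```

Unlike the two-tail rule, no absolute value is taken: a large negative tSTAT would not support H1: μ > 52.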
Utilizing the p-value approach
(Figure: t distribution with the upper-tail rejection region of area α = .10 to the right of 1.318; tSTAT = .55 lies in the non-rejection region, and the area to its right is the p-value = .2937.)
Do not reject H0 since p-value = .2937 > α = .10
There is no statistically sufficient evidence that the mean bill is over $52.
Hypothesis Tests for Proportions
▪ Involves categorical variables
▪ Two possible outcomes
  ▪ Possesses characteristic of interest
  ▪ Does not possess characteristic of interest
▪ Fraction or proportion of the population in the category of interest is denoted by π
Proportions
▪ Sample proportion in the category of interest is denoted by p
$$p = \frac{X}{n} = \frac{\text{number in category of interest in sample}}{\text{sample size}}$$
▪ When both nπ and n(1-π) are at least 5, p can be approximated by a normal distribution with mean and standard deviation
$$\mu_p = \pi, \qquad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$
Hypothesis Tests for Proportions
▪ The sampling distribution of p is approximately normal (with known σ), so the test statistic is a ZSTAT value:
$$Z_{STAT} = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}$$
▪ This applies when nπ ≥ 5 and n(1-π) ≥ 5
▪ The case nπ < 5 or n(1-π) < 5 is out of scope for this class
Example
A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = 0.05 significance level.
Check:
nπ = (500)(.08) = 40 ✓
n(1-π) = (500)(.92) = 460 ✓
Critical Value Solution
H0: π = 0.08
H1: π ≠ 0.08
α = 0.05
n = 500, p = 0.05
Critical Values: ±1.96
Test Statistic:
$$Z_{STAT} = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}} = \frac{.05 - .08}{\sqrt{\dfrac{.08(1 - .08)}{500}}} = -2.47$$
(Figure: standard normal curve with rejection regions of area .025 below -1.96 and above 1.96; ZSTAT = -2.47 lies in the lower rejection region.)
Decision: Reject H0 at α = 0.05
Conclusion: With Zα/2 = ±1.96 and ZSTAT = -2.47, we can reject the null hypothesis in favour of the alternative hypothesis. There is statistically sufficient evidence that the company’s claim of an 8% response rate is not true.
P-Value Solution
Calculate the p-value and compare to α
(For a two-tail test the p-value is always two-tail)
(Figure: standard normal curve with critical values ±1.96 and α/2 = .025 in each tail; Z = -2.47 and Z = 2.47 lie in the rejection regions, each cutting off a tail area of 0.0068.)
p-value = P(Z < -2.47) + P(Z > 2.47) = 2(0.0068) = 0.0136
Reject H0 since p-value = 0.0136 < α = 0.05
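The marketing example can be reproduced end to end, including the sample-size check. A minimal sketch using only the standard library; `proportion_z_test` and its parameter names are illustrative and not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, pi0):
    """Two-tail Z test for a proportion using the normal approximation."""
    # The approximation is only used when both expected counts are at least 5.
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("n*pi and n*(1 - pi) must both be at least 5")
    p = x / n  # sample proportion
    z_stat = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z_stat))
    return p, z_stat, p_value


# Values from the example: 25 responses out of 500, claimed rate 8%.
p, z_stat, p_value = proportion_z_test(x=25, n=500, pi0=0.08)
print(f"p = {p:.2f}, ZSTAT = {z_stat:.2f}, p-value = {p_value:.4f}")
```

The unrounded p-value comes out marginally below the slide's 0.0136, which is based on a table lookup at Z = -2.47 rounded to 0.0068 per tail. The conclusion at α = 0.05 does not change.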

Questions to Address in the Planning Stage
 What is the goal of the survey, study, or experiment?
 How can you translate this goal into a null and an alternative hypothesis?
 Is the hypothesis test one or two tailed?
 Can a random sample be selected?

 What types of data will be collected? Numerical? Categorical?
 What level of significance should be used?
 Is the intended sample size large enough to achieve the desired power?
 What statistical test procedure should be used and why?
 What conclusions & interpretations can you reach from the results of the planned
hypothesis test?
Failing to consider these questions can lead to:

 Bias or incomplete results
 p-hacking (more on this later)
Ethical Issues
 Should document & report both good & bad results

 Should not just report statistically significant results
 Poor research methodology vs. unethical behavior?
 People make mistakes, and pitfalls can be subtle
 Ethical issues can arise in:

 (D) The use of human subjects
 (C) The data collection method
 (O) The cleansing and discarding of data
 (A) The type of test being used
 The assumptions being made
 (A) The level of significance being used
 (A) The failure to report pertinent findings
Statistical Significance vs Practical Significance (effect size)
 Statistically significant results (rejecting the null hypothesis) are not always
of practical significance
 This is more likely to happen when the sample size gets very large
 Practically important results might be found to be statistically insignificant
(failing to reject the null hypothesis)
 This is more likely to happen when the sample size is relatively small
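The effect of sample size on statistical significance can be illustrated with a small calculation. The numbers below are hypothetical and chosen only for illustration: a sample mean of 30.01mm against H0: μ = 30 with σ = 0.8, a difference that is probably of no practical importance. The helper name `two_tail_p` is also illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_p(x_bar, mu0, sigma, n):
    """Two-tail p-value of a Z test for the mean, sigma known."""
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z_stat))


# Same tiny observed difference (0.01mm), increasingly large samples.
p_values = {n: two_tail_p(30.01, 30, 0.8, n) for n in (100, 10_000, 1_000_000)}

for n, p_value in p_values.items():
    verdict = "reject H0" if p_value < 0.05 else "do not reject H0"
    print(f"n = {n:>9,}: p-value = {p_value:.4g} ({verdict})")
```

The observed difference never changes, but the standard error shrinks as n grows, so a very large sample flags the same 0.01mm gap as statistically significant.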
Summary
 The basic principles of hypothesis testing

 How to use hypothesis testing to test a mean or proportion
 The assumptions of each hypothesis-testing procedure, how to evaluate them,
and the consequences if they are seriously violated
 Pitfalls & ethical issues involved in hypothesis testing
 How to avoid the pitfalls involved in hypothesis testing
Self-Learning Exercises
EXCEL exercises (p341- 242):

- Complete EG9.1 – EG9.4
- Attempt the following Problems of Chapter 9

9.2, 9.8, 9.16, 9.22, 9.28, 9.58, 9.72.
Note: answers to the questions are provided at end of the textbook p577-578.
