
Topic: Foundations of Statistics and Probability for Data Science

Sub-Topics: Hypothesis Testing

LEARNING OBJECTIVES:

1. Be able to explain hypothesis testing

1.1. Be able to define hypothesis testing and describe its real-world applications

1.2. Be able to explain the null and alternative hypotheses and the significance level

1.3. Be able to explain the critical region and acceptance of a hypothesis, p-values, and errors in hypothesis testing

Name of Presenter: Pavan Kumar S


Date of Presentation: 12/08/2021

QUESTION 1 :

State the null and alternative hypotheses to be used in testing the following claims,
and determine generally where the critical region is located:

Hypothesis Testing
A hypothesis test uses a sample to make an inference about the population.
Null Hypothesis ( Ho ) : The claim about the population, assumed true until the sample gives strong evidence against it
Alternative Hypothesis ( Ha ) : The competing claim, accepted only when the sample gives strong evidence against Ho

i. A bag of candies should contain exactly 50 candies. If it holds fewer, customers
may feel cheated; if it holds more, the maker loses money.
𝐻𝑜 : µ = 50 | 𝐻𝑎 : µ ≠ 50 | Critical region in both tails.

ii. No more than 20% of the faculty at the local university contributed to the annual
giving fund.
𝐻𝑜 : p ≤ 0.2 | 𝐻𝑎 : p > 0.2 | Critical region in right tail.
(The claim concerns a population proportion, so the parameter is written p rather than µ.)
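
The link between the direction of Ha and the location of the critical region can be sketched with Python's standard library, where `NormalDist().inv_cdf` plays the role of R's `qnorm`. The function name `critical_region`, its `alternative` labels and the default α = 0.05 are illustrative assumptions, not part of the original exercise, and the sketch assumes a z-statistic.

```python
from statistics import NormalDist

def critical_region(alternative, alpha=0.05):
    """Return (lower, upper) z cut-offs; None means that tail is not used.

    alternative: "less" (Ha: <), "greater" (Ha: >) or "two-sided" (Ha: !=).
    These labels and the default alpha are assumptions for illustration.
    """
    z = NormalDist()  # standard normal distribution
    if alternative == "less":
        return (z.inv_cdf(alpha), None)        # reject when z < lower
    if alternative == "greater":
        return (None, z.inv_cdf(1 - alpha))    # reject when z > upper
    if alternative == "two-sided":
        # alpha is split equally between the two tails
        return (z.inv_cdf(alpha / 2), z.inv_cdf(1 - alpha / 2))
    raise ValueError("unknown alternative: " + alternative)

print(critical_region("two-sided"))  # claim i: both tails
print(critical_region("greater"))    # claim ii: right tail
```

A "not equal" alternative puts half of α in each tail, while a "greater than" alternative puts all of α in the right tail.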


QUESTION 2 :
Suppose a manufacturer claims that the mean lifetime of a light bulb is at least
10,000 hours. In a sample of 30 light bulbs, it was found that they only last 9,900
hours on average. Assuming the population standard deviation to be 120 hours, at
0.05 significance level, can we reject the claim by the manufacturer?

i. State Null and Alternative hypothesis


Ho : μ ≥ 10000

Ha : μ < 10000

This implies a lower tail test

ii. Choose Statistic


Sample Size, n = 30
Sample Mean = 9900
Population Standard Deviation = 120

As the population standard deviation is given and the sample size is large enough
(n = 30), we will use the z-statistic
Z score = ( x̅ - μ) / ( σ /√n)

iii. Determine critical region; Compute critical value


Significance level, α = 0.05

The critical value at α = 0.05 is qnorm(0.05) ≈ -1.645
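
For readers without R at hand, the same lower-tail critical value can be obtained in Python. This is a minimal sketch that assumes the standard library's `statistics.NormalDist` as a stand-in for `qnorm`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
# Lower-tail critical value: the z for which P(Z < z) = alpha.
z_critical = NormalDist().inv_cdf(alpha)
print(round(z_critical, 3))  # -1.645
```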


iv. Determine the statistic value and p-value


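Z score = ( x̅ - μ) / ( σ/ √n) = ( 9900 - 10000) / ( 120/ √30) = -4.564

P ( Z < -4.564 ) = 2.509404e-06 ≈ 0.0000025

In R, the same lower-tail probability can be computed directly from the sample mean,
without rounding z first (the results agree up to that rounding):
pnorm(9900, 10000, 120/sqrt(30))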
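
The statistic and its lower-tail p-value can be checked with a short Python sketch. It assumes `statistics.NormalDist` as a counterpart to R's `pnorm`; the variable names are illustrative, and the two p-value computations mirror the two routes shown in this step (from the z score, and directly from the sample mean).

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 9900, 10000, 120, 30   # values from Question 2

standard_error = sigma / sqrt(n)
z_score = (x_bar - mu0) / standard_error

# Lower-tail p-value, P(Z < z_score), computed two equivalent ways.
p_from_z = NormalDist().cdf(z_score)
# Counterpart of pnorm(9900, 10000, 120/sqrt(30)) in R.
p_from_mean = NormalDist(mu0, standard_error).cdf(x_bar)

print(round(z_score, 3))  # -4.564
print(p_from_z, p_from_mean)
```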

v. Make your decision


Does the calculated sample statistic value lie in the critical region?
i. Critical value method: Z score < Z critical, i.e. -4.564 < -1.645
The sample statistic lies in the critical region, so we have enough evidence
to reject the Null Hypothesis

ii. P-value method: p-value < α, i.e. 2.509404e-06 (about 0.0000025) < 0.05
A sample mean this low would be very unlikely if Ho were true, so again we have
enough evidence to reject the Null Hypothesis

Both methods lead to the same conclusion: at the 0.05 significance level we reject
the Null Hypothesis that the mean lifetime of a light bulb is at least 10,000 hours.
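
The two decision rules can be placed side by side in a small Python sketch. The function name `lower_tail_decision` and its return shape are assumptions made for illustration; the sketch covers only the lower-tail case (Ha: μ < μ0) used in this question.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_decision(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (reject_by_critical_value, reject_by_p_value) for Ha: mu < mu0."""
    z_score = (x_bar - mu0) / (sigma / sqrt(n))
    z_critical = NormalDist().inv_cdf(alpha)   # like qnorm(alpha)
    p_value = NormalDist().cdf(z_score)        # like pnorm(z_score)
    return z_score < z_critical, p_value < alpha

print(lower_tail_decision(9900, 10000, 120, 30))  # (True, True)
```

The two rules agree because the normal distribution function is increasing: the z score falls below the critical value exactly when its tail probability falls below α.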

QUESTION 3 :
A car manufacturer claims that a model gets a mileage of over 25 miles per gallon.
A consumer group asked 40 owners of this model to calculate their mpg; the mean
value was found to be 22, with a standard deviation of 1.8. Do you support
the manufacturer’s claim?


i. State Null and Alternative hypothesis.


𝐻𝑜 : µ ≥ 25
𝐻𝑎 : µ < 25
This implies a lower tail test

ii. Choose Statistic


Sample Size, n = 40
Sample Mean = 22
Sample Standard Deviation = 1.8

As the sample size is large enough (n = 40), we will use the z-statistic, with the
sample standard deviation serving as an estimate of σ


Z score = ( x̅ - μ) / ( σ /√n)

iii. Determine critical region; Compute critical value


Significance level, α = 0.05 (the question does not state one; 0.05 is assumed here)

The critical value at α = 0.05 is qnorm(0.05) ≈ -1.645

iv. Determine the statistic value and p-value

Z score = ( x̅ - μ) / ( σ/ √n) = ( 22 - 25 ) / ( 1.8/ √40 ) ≈ -10.54

P ( z < -10.540 ) = 2.824945e-26
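
With a statistic this extreme the p-value is tiny, so it helps to compute the tail probability in a numerically careful way. The sketch below assumes the identity Φ(z) = ½·erfc(−z/√2) and Python's `math.erfc`; the helper name `normal_cdf` is illustrative, and the sample standard deviation is used in place of σ, as in the worked solution.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal P(Z < z), written with erfc to stay accurate in the far tails."""
    return 0.5 * erfc(-z / sqrt(2))

x_bar, mu0, s, n = 22, 25, 1.8, 40   # values from Question 3; s stands in for sigma

z_score = (x_bar - mu0) / (s / sqrt(n))
p_value = normal_cdf(z_score)

print(round(z_score, 2))  # -10.54
print(p_value)            # on the order of 1e-26
```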

v. Make your decision


Does the calculated sample statistic value lie in the critical region?
i. Critical value method: Z score < Z critical, i.e. -10.540 < -1.645
The sample statistic lies in the critical region, so we have enough evidence
to reject the Null Hypothesis

ii. P-value method: p-value < α, i.e. 2.824945e-26 < 0.05
A sample mean this low would be extremely unlikely if Ho were true, so again we have
enough evidence to reject the Null Hypothesis

Both methods lead to the same conclusion: we reject the Null Hypothesis that the mean
mileage is at least 25 mpg, so the data do not support the manufacturer’s claim.

QUESTION 4 :
Suppose the mean weight of King Penguins found in an Antarctic colony last year
was 15.4 kg. From a sample of 35 penguins at about the same time of this year in the
same colony, the mean penguin weight is 14.6 kg. Assume the population standard
deviation is 2.5 kg. At the 0.05 significance level, can we reject the null hypothesis that
the mean penguin weight does not differ from last year?

i. State Null and Alternative hypothesis


𝐻𝑜 : µ = 15.4
𝐻𝑎 : µ ≠ 15.4
This implies a two-tailed test

ii. Choose Statistic


Sample Size, n = 35
Sample Mean = 14.6
Population Standard Deviation = 2.5

As the population standard deviation is given and the sample size is large enough
(n = 35), we will use the z-statistic


Z score = ( x̅ - μ) / ( σ /√n)

iii. Determine critical region; Compute critical value


Significance level, α = 0.05, split equally between the two tails: α/2 = 0.025 in each

The left tail critical value is qnorm(0.025) ≈ -1.96 and the right tail critical value
is qnorm(0.975) ≈ 1.96

iv. Determine the statistic value and p-value


Z score = ( x̅ - μ) / ( σ/ √n) = ( 14.6 - 15.4 ) / ( 2.5/ √35 ) = -1.893146

P ( z < -1.893146 ) = 0.02916926
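
For a two-tailed test the p-value counts both tails, so the one-tail probability above is doubled before it is compared with α. A minimal Python sketch, assuming `statistics.NormalDist` in place of R's `pnorm` (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 14.6, 15.4, 2.5, 35   # values from Question 4

z_score = (x_bar - mu0) / (sigma / sqrt(n))
lower_tail = NormalDist().cdf(z_score)               # P(Z < z)
two_sided_p = 2 * min(lower_tail, 1 - lower_tail)    # both tails

print(round(z_score, 4))      # -1.8931
print(round(lower_tail, 4))   # 0.0292
print(round(two_sided_p, 4))  # 0.0583
```

Comparing the doubled value with α = 0.05 is equivalent to comparing the one-tail probability with α/2 = 0.025.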

v. Make your decision


Does the calculated sample statistic value lie in the critical region?
i. Critical value method: Z score > lower Z critical, i.e. -1.893146 > -1.96
The sample statistic does not lie in the critical region, so we do not have
enough evidence to reject the Null Hypothesis

ii. P-value method: two-tailed p-value > α, i.e. 2 × 0.02916926 ≈ 0.0583 > 0.05
(equivalently, the one-tail probability 0.0292 exceeds α/2 = 0.025)


The p-value is larger than the significance level, so again we do not have enough
evidence to reject the Null Hypothesis

Both methods lead to the same conclusion: at the 0.05 significance level we fail to
reject the Null Hypothesis that the mean weight of King Penguins in the Antarctic
colony does not differ from last year.
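
The five steps used in Questions 2 to 4 can be collected into one helper. This is a sketch under stated assumptions: the function name `z_test`, its `alternative` labels and its return tuple are illustrative, α defaults to 0.05, and the tail probabilities use the identity Φ(z) = ½·erfc(−z/√2) from Python's `math` module.

```python
from math import erfc, sqrt

def z_test(x_bar, mu0, sd, n, alternative, alpha=0.05):
    """One-sample z-test; returns (z_score, p_value, reject_null).

    alternative is "less", "greater" or "two-sided" (labels assumed here).
    """
    z_score = (x_bar - mu0) / (sd / sqrt(n))
    lower = 0.5 * erfc(-z_score / sqrt(2))   # P(Z < z)
    upper = 0.5 * erfc(z_score / sqrt(2))    # P(Z > z)
    if alternative == "less":
        p_value = lower
    elif alternative == "greater":
        p_value = upper
    elif alternative == "two-sided":
        p_value = 2 * min(lower, upper)
    else:
        raise ValueError("unknown alternative: " + alternative)
    return z_score, p_value, p_value < alpha

print(z_test(9900, 10000, 120, 30, "less")[2])       # Question 2: True
print(z_test(22, 25, 1.8, 40, "less")[2])            # Question 3: True
print(z_test(14.6, 15.4, 2.5, 35, "two-sided")[2])   # Question 4: False
```

The null hypothesis is rejected in Questions 2 and 3 but not in Question 4, matching the conclusions reached by hand.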
