Tests of Significance
Large Sample Test
● One sample
● Two sample

Number of Defective Items    Number of Defective Items
from Production Line A       from Production Line B
1                            5
3                            8
5                            2
6                            5
2                            3
4                            6

Can we say that Production Line B produces fewer defectives than Production Line A?
This file is meant for personal use by gaurrohit0071@gmail.com only.
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Sharing or publishing the contents in part or full is liable for legal action.
Two sample tests
● The two sample tests are used to test the equality of the means of two populations
● Used to compare:
○ Performance of two machines
● Let there be two samples of sizes n1 and n2 drawn from the normal populations
N(µ1, σ1²) and N(µ2, σ2²) respectively
● The hypothesis to test whether the population means are equal:
H0 : µ1 = µ2 against H1 : µ1 ≠ µ2
● It implies
H0: The two population means are equal (i.e. µ1 = µ2)
against H1: The two population means are not equal (i.e. µ1 ≠ µ2)
Two sample test - hypothesis
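● If σ1² and σ2² are not known then they are replaced by the sample estimates s1² and s2²
respectively, provided n1 ≥ 30 and n2 ≥ 30
● The test statistic is Z = (x̄1 - x̄2) / sqrt(s1²/n1 + s2²/n2), where x̄1 and x̄2 are the sample means
● Decision rules for each alternative hypothesis H1, where Zα denotes the upper-tail critical value:
Based on critical region:
○ For two tailed test (H1 : µ1 ≠ µ2): Reject H0 if |Z| ≥ Zα/2
○ For left tailed test (H1 : µ1 < µ2): Reject H0 if Z ≤ -Zα
○ For right tailed test (H1 : µ1 > µ2): Reject H0 if Z ≥ Zα
Based on p-value (all three tests):
○ Reject H0 if the p-value is less than or equal to the level of significance
Based on confidence interval (all three tests):
○ Reject H0 if the hypothesized value of µ1 - µ2 (0 under H0) does not lie in the confidence interval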
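The critical-region and p-value rules for a Z test can be sketched in Python as follows. This is a minimal illustration, assuming SciPy is available; the function names reject_h0 and p_value are hypothetical helpers, and the critical value is taken as the upper-tail point of the standard normal distribution.

```python
from scipy.stats import norm


def reject_h0(z, alpha, tail):
    """Critical-region rule for a Z test; tail is 'two', 'left' or 'right'."""
    if tail == "two":
        return bool(abs(z) >= norm.ppf(1 - alpha / 2))
    if tail == "left":
        return bool(z <= -norm.ppf(1 - alpha))
    if tail == "right":
        return bool(z >= norm.ppf(1 - alpha))
    raise ValueError("tail must be 'two', 'left' or 'right'")


def p_value(z, tail):
    """p-value of an observed Z statistic for the given alternative."""
    if tail == "two":
        return float(2 * norm.sf(abs(z)))
    if tail == "left":
        return float(norm.cdf(z))
    if tail == "right":
        return float(norm.sf(z))
    raise ValueError("tail must be 'two', 'left' or 'right'")


# Both rules give the same decision for the same alpha
z, alpha = -1.7, 0.05
print(reject_h0(z, alpha, "left"), p_value(z, "left") <= alpha)
```

Either rule can be used on its own; they agree because the p-value is at most α exactly when the statistic falls in the critical region.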
This file is meant for personal use by gaurrohit0071@gmail.com only.
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Sharing or publishing the contents in part or full is liable for legal action.
Two sample test for population mean (σ unknown)
Question:
A study was carried out to understand the amount of hemoglobin in blood for males and females.
A random sample of 160 males and 180 females have means of 13 g/dl and 15 g/dl respectively. The two
samples have standard deviations of 4.1 g/dl for male donors and 3.5 g/dl for female donors.
Can it be said that the population means of hemoglobin are the same for men and women? Use α
= 0.01.
Solution:
n1 = 160, s1 = 4.1, x̄1 = 13
n2 = 180, s2 = 3.5, x̄2 = 15
To test, H0 : µ1 = µ2 against H1 : µ1 ≠ µ2
For α = 0.01, the two tailed critical value is Zα/2 = 2.58,
i.e. if the test statistic is less than -2.58 or greater than 2.58 then we reject H0.
Two sample test for population mean (σ unknown)
Python solution: Calculate test statistic
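A minimal sketch of this calculation from the summary statistics in the question, assuming SciPy is available. The variable names are illustrative, and the sample variances stand in for the unknown population variances because both samples are large.

```python
import math
from scipy.stats import norm

# Summary statistics from the question (males = sample 1, females = sample 2)
n1, mean1, s1 = 160, 13.0, 4.1
n2, mean2, s2 = 180, 15.0, 3.5
alpha = 0.01

# Standard error of the difference between the two sample means
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Z test statistic under H0: mu1 = mu2
z_stat = (mean1 - mean2) / se

# Two-tailed critical value and p-value
z_crit = norm.ppf(1 - alpha / 2)
p_val = 2 * norm.sf(abs(z_stat))

print(f"Z = {z_stat:.3f}, critical value = {z_crit:.2f}, p-value = {p_val:.2e}")
print("Reject H0" if abs(z_stat) >= z_crit else "Fail to reject H0")
```

With these inputs the statistic is about -4.81, which is below -2.58, so H0 is rejected at α = 0.01.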
● Like the two sample Z test, the two sample t test is used to test the equality
of means of two populations for unpaired data when the population standard
deviations are not known and the sample sizes are less than 30
● For unpaired data use the t-test to compare the relative performance of two machines,
two investment portfolios, two drugs to reduce an outcome, etc.
● Let there be two different populations such that they follow normal distributions
● Two samples of sizes n1 and n2 are drawn from the normal populations N(µ1, σ1²) and N(µ2,
σ2²) respectively
● The hypothesis to test whether the population means are equal:
H0 : µ1 = µ2 against H1 : µ1 ≠ µ2
● It implies
H0: The two population means are equal (i.e. µ1 = µ2)
against H1: The two population means are not equal (i.e. µ1 ≠ µ2)
● For one tailed tests:
H0 : µ1 ≤ µ2 vs H1 : µ1 > µ2, or H0 : µ1 ≥ µ2 vs H1 : µ1 < µ2
Two sample tests - test statistic
● Note: If σ1² and σ2² are not known then they are replaced by the sample
estimates s1² and s2² respectively
Two sample tests - options
● If sample sizes are small and the standard deviations of the populations are unknown
then we use the t-test. The t-test for small independent samples has two options
● First option is when the variances are assumed equal. In this case the pooled sample
variance sp² is used. Under H0, the test statistic follows a t distribution with n1 + n2
- 2 df
● Second option is when the variances are assumed unequal. In this case we use the
sample variances s1² and s2² with degrees of freedom equal to the smaller of n1 -
1 and n2 - 1
● In either option, failure to reject H0 means that there is not enough evidence to
conclude that the two population means differ (i.e. µ1 = µ2 remains plausible)
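The two options can be compared with scipy.stats.ttest_ind, sketched here on the defective-item counts for Production Lines A and B from the opening example. This is an illustration only: the variable names are assumptions, and for the unequal-variance option SciPy uses the Welch-Satterthwaite approximation for the degrees of freedom rather than the smaller of n1 - 1 and n2 - 1.

```python
from scipy import stats

# Defective-item counts from the production line example
line_a = [1, 3, 5, 6, 2, 4]
line_b = [5, 8, 2, 5, 3, 6]

# Option 1: variances assumed equal (pooled variance, n1 + n2 - 2 df)
t_pooled, p_pooled = stats.ttest_ind(line_a, line_b, equal_var=True)
df_pooled = len(line_a) + len(line_b) - 2

# Option 2: variances assumed unequal (Welch's t-test)
t_welch, p_welch = stats.ttest_ind(line_a, line_b, equal_var=False)

print(f"Pooled: t = {t_pooled:.3f}, df = {df_pooled}, p = {p_pooled:.3f}")
print(f"Welch:  t = {t_welch:.3f}, p = {p_welch:.3f}")
```

Because both samples here have the same size, the two statistics coincide and only the degrees of freedom, and therefore the p-values, differ slightly.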
Two sample tests - test statistic
Note that this assumes unequal population variance and performs Welch’s t-test
Question:
An experiment was conducted to compare the pain relieving hours of two new medicines. Two
groups of 14 and 15 patients were selected and were given comparable doses. Group 1 was given
medicine 1 and group 2 was given medicine 2. The following data were obtained from the two samples.
Test whether the two populations give the same mean hours of relief. Assume the data come
from normal distributions with equal variance. [Use α = 0.01]

                           Medicine 1    Medicine 2
Mean of hours of relief    6.4           7.3
S.D. of hours of relief    1.4           1.5
Two sample test for unpaired data
Solution:
Critical values: -2.771 and 2.771 (t distribution with 14 + 15 - 2 = 27 df, two tailed, α = 0.01)
If test statistic is less than -2.77 or greater than 2.77, then we reject H0.
The test statistic is -1.667. Since 2.771 > |-1.667|, we fail to reject H0.
Two sample test for unpaired data
Python solution: Calculate test statistic
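A minimal sketch of the pooled-variance t-test for the two medicines, working from the summary statistics with scipy.stats.ttest_ind_from_stats. The variable names are illustrative, and the standard deviations are taken to be sample standard deviations.

```python
from scipy import stats

# Summary statistics from the question
n1, mean1, sd1 = 14, 6.4, 1.4   # medicine 1
n2, mean2, sd2 = 15, 7.3, 1.5   # medicine 2
alpha = 0.01

# Pooled-variance (equal variance) two sample t-test
t_stat, p_val = stats.ttest_ind_from_stats(
    mean1=mean1, std1=sd1, nobs1=n1,
    mean2=mean2, std2=sd2, nobs2=n2,
    equal_var=True,
)

# Two-tailed critical value with n1 + n2 - 2 degrees of freedom
df = n1 + n2 - 2
t_crit = stats.t.ppf(1 - alpha / 2, df)

print(f"t = {t_stat:.3f}, df = {df}, critical value = {t_crit:.3f}, p = {p_val:.3f}")
print("Reject H0" if abs(t_stat) >= t_crit else "Fail to reject H0")
```

Passing equal_var=False instead would run Welch's t-test on the same summary statistics.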
● If the population variances cannot be assumed equal, then the Welch test is used, which
is the t-test for independent samples with unequal variances
● The observations are recorded on the same individual/item twice resulting in pairs of
observations
Example:
H0 : µd ≥ µ0 against H1 : µd < µ0
Two sample tests for paired data
● Under H0, the test statistic follows t-distribution with n-1 degrees of
freedom
scipy.stats.ttest_rel(Sample_1, Sample_2)
Question:
An energy drink distributor claims that a new advertisement poster, featuring a life-size picture of
a well-known athlete, increases sales. For a random sample of 9 outlets, the following data were
collected. Test whether the advertisement was effective in increasing the sales.

Before    33   32   38   45   37   47   48   41   45
After     42   35   31   41   37   36   49   49   48

To test,
H0: The advertisement was not effective (µd ≤ 0)
against H1: The advertisement was effective (µd > 0)

Compute di = yi - xi, where xi is the value before and yi the value after:

di        9    3    -7   -4   0    -11  1    8    3
Two sample t-test for paired data
Solution:
To test,
H0: The advertisement was not effective (µd ≤ 0)
against H1: The advertisement was effective (µd > 0)
The test statistic (=0.1) < critical value (=1.86), also the p-value > 0.05, thus we fail to reject H0.
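The paired test can be sketched with scipy.stats.ttest_rel on the before and after values. This is an illustration under stated assumptions: the variable names are hypothetical, α = 0.05 is taken from the comparison with 0.05 above, and the one-sided p-value is computed from the t distribution directly so that the sketch does not depend on the SciPy version.

```python
from scipy import stats

before = [33, 32, 38, 45, 37, 47, 48, 41, 45]
after = [42, 35, 31, 41, 37, 36, 49, 49, 48]
alpha = 0.05

# Paired t-test on d_i = after_i - before_i (the returned p-value is two-sided)
t_stat, p_two_sided = stats.ttest_rel(after, before)

# Degrees of freedom: number of pairs minus one
df = len(before) - 1

# Right-tailed p-value and critical value for H1: mu_d > 0
p_right = stats.t.sf(t_stat, df)
t_crit = stats.t.ppf(1 - alpha, df)

print(f"t = {t_stat:.3f}, df = {df}, critical value = {t_crit:.2f}, p = {p_right:.3f}")
print("Reject H0" if t_stat >= t_crit else "Fail to reject H0")
```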
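Test for proportion
H0 : P = P0 against H1 : P ≠ P0
● It implies
H0: The population proportion is equal to P0 (P = P0)
against H1: The population proportion is not equal to P0 (P ≠ P0)
● The test statistic is Z = (p - P0) / sqrt(P0(1 - P0)/n), where p is the sample proportion
and n is the sample size
● The decision on H0 depends on H1 and can be based on the critical region, the p-value,
or the confidence interval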
Question:
A sample of 361 business owners had gone into bankruptcy due to the recession. On taking a
survey, it was found that 105 of them had not consulted any professional for managing their
finances before opening the business. Test the null hypothesis that at most 25% of all businesses
had not consulted before opening the business.
Solution:
A sample of 361 business owners had gone into bankruptcy due to the recession,
i.e. n = 361
On taking a survey, it was found that 105 of them had not consulted any professional for
managing their finances before opening the business,
i.e. x = 105
To test: The null hypothesis that at most 25% of all businesses had not consulted before
opening the business,
i.e. H0 : P ≤ 0.25 against H1 : P > 0.25
Here P0 = 0.25
On rejecting H0, we may conclude that more than 25% of all businesses had not consulted before
starting the business.
One sample test for proportion
Python solution: Calculate test statistic
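A minimal sketch of the one sample Z test for a proportion on the bankruptcy data, assuming SciPy is available. The variable names are illustrative, and α = 0.05 is an assumption because the question does not state a level of significance.

```python
import math
from scipy.stats import norm

n, x = 361, 105      # sample size and number who had not consulted
p0 = 0.25            # hypothesized proportion under H0
alpha = 0.05         # assumed level of significance (not given in the question)

# Sample proportion
p_hat = x / n

# Standard error under H0 and Z test statistic
se = math.sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se

# Right-tailed test: H0: P <= 0.25 against H1: P > 0.25
z_crit = norm.ppf(1 - alpha)
p_val = norm.sf(z_stat)

print(f"p = {p_hat:.3f}, Z = {z_stat:.3f}, critical value = {z_crit:.3f}, p-value = {p_val:.3f}")
print("Reject H0" if z_stat >= z_crit else "Fail to reject H0")
```

The decision is sensitive to the assumed α: the statistic is about 1.79, which exceeds the 5% critical value but not the 1% one.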
Test for proportion
● Let there be two samples of sizes n1 and n2 from different populations such that x1 and
x2 are the numbers of specific items in each of them respectively
● Suppose these samples have proportions of specific items p1 and p2 respectively
● To test the equality of the population proportions from which these samples are chosen
H0 : P1 = P2 against H1 : P1 ≠ P2
● It implies
H0: The two population proportions are equal (P1 = P2)
against H1: The two population proportions are not equal (P1 ≠ P2)
● Failing to reject H0 means that there is not enough evidence to conclude that the two
population proportions differ
Test for proportions
● The decision on H0 depends on H1 and can be based on the critical region, the p-value,
or the confidence interval
Steve owns a kiosk where he sells two magazines, A and B. In a month, he buys 100 copies of
magazine A out of which 78 were sold and 70 copies of magazine B out of which 65 were sold. Is
there enough evidence to say that magazine B is more popular?
To test whether magazine B is more popular than magazine A,
i.e. H0 : P1 ≥ P2 against H1 : P1 < P2
Where
P1: denotes population proportion of magazine A sold
P2: denotes population proportion of magazine B sold
-2.59
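A minimal sketch of the two sample Z test for proportions on the magazine data, assuming SciPy is available. The variable names are illustrative, α = 0.05 is an assumption because the question does not state a level of significance, and the pooled proportion is used for the standard error under H0. The computed statistic may differ slightly from a printed figure depending on rounding and on which standard error is used.

```python
import math
from scipy.stats import norm

n1, x1 = 100, 78     # magazine A: copies bought, copies sold
n2, x2 = 70, 65      # magazine B: copies bought, copies sold
alpha = 0.05         # assumed level of significance (not given in the question)

# Sample proportions and pooled proportion under H0
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)

# Standard error of p1 - p2 under H0 and Z test statistic
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / se

# Left-tailed test: H0: P1 >= P2 against H1: P1 < P2
z_crit = -norm.ppf(1 - alpha)
p_val = norm.cdf(z_stat)

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, Z = {z_stat:.3f}, p-value = {p_val:.4f}")
print("Reject H0" if z_stat <= z_crit else "Fail to reject H0")
```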
● When these assumptions are not satisfied, use non-parametric tests