Tests of Significance

This file is meant for personal use by gaurrohit0071@gmail.com only.


Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Sharing or publishing the contents in part or full is liable for legal action.
Agenda
● Large Sample Test
○ Two Sample

● Small Sample Test


○ One Sample
○ Two Sample

● Test for Population Proportion


○ One Sample
○ Two Sample

Large sample tests

● Large Sample Test
○ One Sample
○ Two Sample
Large Sample Tests
Large sample tests

● One sample
● Two sample

Number of defective items from two production lines:

Production Line A    Production Line B
1                    5
3                    8
5                    2
6                    5
2                    3
4                    6

Can we say that Production Line B produces fewer defectives than Production Line A?
Two sample tests

● Two sample tests are used to test the equality of the means of two populations

● Used to compare, for example:
○ Performance of two machines
○ Performance of two portfolios
Assumptions of two sample Z tests for large samples
● The two populations follow normal distributions

● The two samples are independent of each other

● The population standard deviations of the variables must be known
Two sample test - hypothesis

● Let there be two samples of sizes n1 and n2 drawn from normal populations N(µ1, σ1²) and N(µ2, σ2²) respectively

● The hypothesis to test whether the population means are equal:

H0 : µ1 = µ2 against H1 : µ1 ≠ µ2

● It implies
H0: The two population means are equal (i.e. µ1 = µ2)

against H1: The two population means are not equal (i.e. µ1 ≠ µ2)
Two sample test - hypothesis

● Like the one sample tests, it is possible to test for

H0 : µ1 ≤ µ2 against H1 : µ1 > µ2

Or

H0 : µ1 ≥ µ2 against H1 : µ1 < µ2

● Failing to reject H0 does not prove that the null hypothesis is true; it only means that the sample does not provide enough evidence against it
Two sample tests - Z test statistic

● The test statistic is Z given by

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where x̄1 - x̄2 is the difference of the sample means and µ1 - µ2 is the difference specified under H0 (zero when testing equality)

● Under H0, the test statistic follows the standard normal distribution
Two sample tests - Z test statistic

● If σ1² and σ2² are not known, they are replaced by the sample estimates s1² and s2² respectively, provided n1 ≥ 30 and n2 ≥ 30

● The test statistic is Z given by

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where s1² and s2² are the sample variances of the two samples
The Python code to conduct a Z test for two population means is

statsmodels.stats.weightstats.ztest(Sample_1, Sample_2, value=0, alternative='two-sided')

Here value is the difference in means specified under H0, and alternative is one of 'two-sided', 'larger' or 'smaller'.
Two sample tests - decision rule

                        H1          Based on critical region
For two tailed test     µ1 ≠ µ2     Reject H0 if |Z| ≥ Zα/2
For left tailed test    µ1 < µ2     Reject H0 if Z ≤ -Zα
For right tailed test   µ1 > µ2     Reject H0 if Z ≥ Zα

For all three tests:
● Based on p-value: Reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: Reject H0 if the hypothesized value of µ1 - µ2 does not lie in the confidence interval
Two sample test for population mean (σ unknown)
Question:

A study was carried out to understand the amount of hemoglobin in blood for males and females. A random sample of 160 males and 180 females have means of 13 g/dl and 15 g/dl respectively. The two samples have standard deviations of 4.1 g/dl for male donors and 3.5 g/dl for female donors. Can it be said that the population means of hemoglobin are the same for men and women? Use α = 0.01.
Two sample test for population mean (σ unknown)
Solution:

X: Amount of hemoglobin in blood for males
Y: Amount of hemoglobin in blood for females

Here n1 = 160, s1 = 4.1, x̄1 = 13

n2 = 180, s2 = 3.5, x̄2 = 15

To test H0: µ1 = µ2 against H1: µ1 ≠ µ2
Two sample test for population mean (σ unknown)
Solution:

The test statistic

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{13 - 15}{\sqrt{\dfrac{4.1^2}{160} + \dfrac{3.5^2}{180}}} \approx -4.807$$

Decision Rule: Reject H0 if |Zcalc| ≥ Zα/2

Here Zα/2 = 2.58, so the critical region is Z ≤ -2.58 or Z ≥ 2.58

Since |Zcalc| = 4.807 > 2.58, reject H0.

We may conclude that males and females have different hemoglobin averages
Two sample test for population mean (σ unknown)
Python solution: Calculate critical z-value
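
A minimal sketch of this step, assuming SciPy is available. The variable names are illustrative and are not taken from the original slide:

```python
from scipy import stats

alpha = 0.01  # level of significance given in the question

# Two-tailed test: alpha/2 goes in each tail of the standard normal distribution.
# isf is the inverse survival function, i.e. the z with P(Z > z) = alpha/2.
z_crit = stats.norm.isf(alpha / 2)

print(f"Critical values: -{z_crit:.2f} and +{z_crit:.2f}")
```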

i.e. If test statistic is less than -2.58 or greater than 2.58 then we reject H0.
Two sample test for population mean (σ unknown)
Python solution: Calculate test statistic
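
A minimal sketch of the calculation from the summary statistics in the question, using only the standard library. The variable names are illustrative assumptions:

```python
import math

# Summary statistics from the question (males = sample 1, females = sample 2).
n1, mean1, s1 = 160, 13.0, 4.1
n2, mean2, s2 = 180, 15.0, 3.5

# Standard error of the difference in means; the sample variances stand in
# for the unknown population variances because both samples have n >= 30.
std_err = math.sqrt(s1**2 / n1 + s2**2 / n2)

# Under H0 the hypothesised difference mu1 - mu2 is 0.
z_calc = ((mean1 - mean2) - 0) / std_err

print(f"Z = {z_calc:.4f}")
```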

As test statistic (= -4.8068) < critical value (= -2.58), we reject H0.
Two sample test for population mean (σ unknown)
Python solution: Calculate p-value
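
A minimal sketch of the two-tailed p-value, assuming SciPy is available. The variable names are illustrative:

```python
import math
from scipy import stats

# Summary statistics from the question (males = sample 1, females = sample 2).
n1, mean1, s1 = 160, 13.0, 4.1
n2, mean2, s2 = 180, 15.0, 3.5
alpha = 0.01

z_calc = (mean1 - mean2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Two-tailed p-value: probability of a |Z| at least this large under H0.
p_value = 2 * stats.norm.sf(abs(z_calc))

print(f"p-value = {p_value:.2e}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```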

As p-value < 0.01, we reject H0.
Small sample tests

Two sample tests

● Like the two sample Z test, the two sample t-test is used to compare the means of two populations for unpaired data; it is used when the population standard deviations are not known and the sample sizes are less than 30

● For unpaired data, use the t-test to compare the relative performance of two machines, two investment portfolios, two drugs intended to reduce an outcome, etc.
Two sample tests for unpaired data

● To test the difference between the means of two populations

● Let there be two different populations, each following a normal distribution

● Two samples of sizes n1 and n2 are drawn from the normal populations N(µ1, σ1²) and N(µ2, σ2²) respectively

● The two samples are independent of each other
Two sample test - hypothesis

● The hypothesis to test whether the population means are equal

H0 : µ1 = µ2 against H1 : µ1 ≠ µ2

It implies

H0: The two population means are equal (i.e. µ1 = µ2)

against H1: The two population means are not equal (i.e. µ1 ≠ µ2)

● The one sided hypotheses are

H0 : µ1 ≤ µ2 vs H1 : µ1 > µ2
Or
H0 : µ1 ≥ µ2 vs H1 : µ1 < µ2
Two sample tests - test statistic

● Recall that for large samples, when σ1² and σ2² are known, the test statistic Z is given by

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

● Note: If σ1² and σ2² are not known, they are replaced by the sample estimates s1² and s2² respectively
Two sample tests - options
● If the sample sizes are small and the standard deviations of the populations are unknown, then we use the t-test. The t-test for small independent samples has two options
● The first option is when the variances are assumed equal. In this case the pooled sample variance sp² is used. Under H0, the test statistic follows the t distribution with n1 + n2 - 2 df
● The second option is when the variances are assumed unequal. In this case we use the sample variances s1² and s2², with degrees of freedom equal to the smaller of n1 - 1 and n2 - 1 (a conservative choice; statistical software usually uses the Welch-Satterthwaite approximation instead)
● In either option, failure to reject H0 means that there is not enough evidence to conclude that the two population means differ; it does not prove that µ1 = µ2
Two sample tests - test statistic

● For small samples the test statistic t is given by

Unequal variance:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Equal variance:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where sp² is the pooled sample variance given by

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The Python code to conduct a t-test for two population means which are not paired is

scipy.stats.ttest_ind(Sample_1, Sample_2, equal_var=True)

Note that this assumes equal population variance.
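
A short usage sketch of this call. The two samples are hypothetical, randomly generated data (not from the slides), and the manual pooled-variance calculation is included only as a cross-check of what `equal_var=True` computes:

```python
import numpy as np
from scipy import stats

# Hypothetical data: two small independent samples drawn from normal
# populations with the same variance (illustrative only).
rng = np.random.default_rng(seed=0)
sample_1 = rng.normal(loc=50.0, scale=5.0, size=12)
sample_2 = rng.normal(loc=50.0, scale=5.0, size=15)

# Pooled-variance (equal variance) two-sample t-test, two-sided by default.
t_stat, p_value = stats.ttest_ind(sample_1, sample_2, equal_var=True)

# Cross-check with the pooled-variance formula.
n1, n2 = len(sample_1), len(sample_2)
sp2 = ((n1 - 1) * sample_1.var(ddof=1) + (n2 - 1) * sample_2.var(ddof=1)) / (n1 + n2 - 2)
t_manual = (sample_1.mean() - sample_2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

print(f"t = {t_stat:.4f} (manual: {t_manual:.4f}), p-value = {p_value:.4f}")
```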

The Python code to conduct a t-test for two population means which are not paired is

scipy.stats.ttest_ind(Sample_1, Sample_2, equal_var=False)

Note that this assumes unequal population variances and performs Welch's t-test.
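
A short usage sketch of Welch's t-test. The samples are hypothetical, randomly generated data. The manual part assumes the Welch-Satterthwaite degrees of freedom, which is what SciPy uses for `equal_var=False`:

```python
import numpy as np
from scipy import stats

# Hypothetical data: two independent samples with clearly different spreads.
rng = np.random.default_rng(seed=1)
sample_1 = rng.normal(loc=50.0, scale=3.0, size=10)
sample_2 = rng.normal(loc=50.0, scale=9.0, size=14)

# Welch's t-test: the variances are not pooled.
t_stat, p_value = stats.ttest_ind(sample_1, sample_2, equal_var=False)

# Cross-check: unpooled standard error and Welch-Satterthwaite df.
n1, n2 = len(sample_1), len(sample_2)
v1 = sample_1.var(ddof=1) / n1
v2 = sample_2.var(ddof=1) / n2
t_manual = (sample_1.mean() - sample_2.mean()) / np.sqrt(v1 + v2)
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
p_manual = 2 * stats.t.sf(abs(t_manual), df_welch)

print(f"t = {t_stat:.4f}, df = {df_welch:.2f}, p-value = {p_value:.4f}")
```

The Welch-Satterthwaite df always lies between the smaller of n1 - 1 and n2 - 1 and n1 + n2 - 2.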

Two sample test for unpaired data
Question:

An experiment was conducted to compare the pain relieving hours of two new medicines. Two groups of 14 and 15 patients were selected and were given comparable doses. Group 1 was given medicine 1 and group 2 was given medicine 2. The following data was obtained from the two samples. Test whether the two populations give the same mean hours of relief. Assume the data come from normal distributions with equal variance. [Use α = 0.01]

                          Medicine 1    Medicine 2
Mean of hours of relief   6.4           7.3
S.D. of hours of relief   1.4           1.5
Two sample test for unpaired data
Solution:

Let X: hours of relief for patients receiving medicine 1
Y: hours of relief for patients receiving medicine 2

To test H0: µ1 = µ2 against H1: µ1 ≠ µ2

The pooled variance is

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(13)(1.4)^2 + (14)(1.5)^2}{27} = \frac{56.98}{27} \approx 2.110$$
Two sample test for unpaired data
Solution:

The test statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{6.4 - 7.3}{\sqrt{2.110}\,\sqrt{\dfrac{1}{14} + \dfrac{1}{15}}} \approx -1.667$$

Decision rule: Reject H0 if |tcal| ≥ t_{n1+n2-2, α/2}

t_{n1+n2-2, α/2} = t_{27, 0.005} = 2.771, so the critical region is t ≤ -2.771 or t ≥ 2.771

Since |tcal| = 1.667 < 2.771, we fail to reject H0.

There is not enough evidence to conclude that the two medicines differ in mean hours of relief.
Two sample test for unpaired data
Python solution: Calculate critical t-value
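
A minimal sketch of this step, assuming SciPy is available. The variable names are illustrative:

```python
from scipy import stats

n1, n2 = 14, 15   # group sizes from the question
alpha = 0.01      # level of significance

# Equal-variance (pooled) t-test: degrees of freedom are n1 + n2 - 2.
df = n1 + n2 - 2

# Two-tailed test: alpha/2 in each tail of the t distribution.
t_crit = stats.t.isf(alpha / 2, df)

print(f"df = {df}, critical values: -{t_crit:.2f} and +{t_crit:.2f}")
```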

If test statistic is less than -2.77 or greater than 2.77, then we reject H0.
Two sample test for unpaired data
Python solution: Calculate test statistic
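
A minimal sketch of the pooled-variance calculation from the summary statistics in the question, using only the standard library. The variable names are illustrative:

```python
import math

# Summary statistics from the question.
n1, mean1, s1 = 14, 6.4, 1.4   # medicine 1
n2, mean2, s2 = 15, 7.3, 1.5   # medicine 2

# Pooled sample variance (equal variances assumed, as stated in the question).
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Test statistic under H0: mu1 = mu2.
t_calc = (mean1 - mean2) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))

print(f"pooled variance = {sp2:.4f}, t = {t_calc:.3f}")
```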

As test statistic (= -1.667) > critical value (= -2.77), we fail to reject H0.
Two sample test for unpaired data
Python solution: Calculate p-value
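
A minimal sketch of this step. Because the question gives only summary statistics, it assumes `scipy.stats.ttest_ind_from_stats` is an acceptable substitute for a call on raw data:

```python
from scipy import stats

alpha = 0.01

# Pooled-variance t-test computed directly from the means, standard
# deviations and sample sizes given in the question.
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=6.4, std1=1.4, nobs1=14,
    mean2=7.3, std2=1.5, nobs2=15,
    equal_var=True,
)

print(f"t = {t_stat:.3f}, two-tailed p-value = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```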

As p-value > 0.01, we fail to reject H0.
● The test is carried out under the assumption that the samples are drawn from independent normal populations
● If the populations from which the samples are drawn do not have equal variances, then the Welch test is used, which is the t-test for independent samples with unequal variances
Two sample tests for paired data

● For paired data, the effectiveness of some training/treatment is measured

● The observations are recorded on the same individual/item twice, resulting in pairs of observations

Example:

An energy drink manufacturing company wants to test if sales increase after they advertise the drink with a life-sized picture of a well-known athlete.
Two sample tests for paired data

● Let {(xi, yi), i = 1, 2, 3, …, n} be paired observations on X and Y

● Let µx, µy be the means of X and Y respectively
● Define di = yi - xi
● Let µd = µy - µx
● In the paired t-test, we test one of
H0 : µd = µ0 against H1 : µd ≠ µ0
H0 : µd ≤ µ0 against H1 : µd > µ0
H0 : µd ≥ µ0 against H1 : µd < µ0
Two sample tests for paired data

● Suppose the mean of di is d̄ and the variance of di is sd², where

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n} (d_i - \bar{d})^2$$

● The test statistic is

$$t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}}$$

● Under H0, the test statistic follows the t-distribution with n-1 degrees of freedom

● Failing to reject H0 does not prove that the null hypothesis is true; it only means that the sample does not provide enough evidence against it
The Python code to conduct a t-test for two population means which are paired is

scipy.stats.ttest_rel(Sample_1, Sample_2)
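
A short usage sketch of this call. The before/after values are hypothetical and chosen only for illustration; the manual calculation on the differences is a cross-check of what the paired test computes:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements on the same 8 items before and after a treatment.
before = np.array([72.0, 68.0, 75.0, 80.0, 66.0, 71.0, 77.0, 69.0])
after = np.array([74.0, 70.0, 74.0, 85.0, 69.0, 70.0, 80.0, 73.0])

# Paired t-test (two-sided by default); the order of arguments sets the sign of t.
t_stat, p_value = stats.ttest_rel(after, before)

# Cross-check: one-sample t statistic on the differences d = after - before.
d = after - before
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

print(f"t = {t_stat:.4f} (manual: {t_manual:.4f}), p-value = {p_value:.4f}")
```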

Two sample t-test for paired data
Question:

An energy drink distributor claims that a new advertisement poster, featuring a life-size picture of a well-known athlete, increases sales. For a random sample of 9 outlets, the following sales data were collected before and after the poster was displayed. Test the claim that the advertisement was effective in increasing the sales.

Before   33  32  38  45  37  47  48  41  45
After    42  35  31  41  37  36  49  49  48

Test the hypothesis using the critical region technique. [Use α = 0.05].
Two sample t-test for paired data
Solution:

To test,

H0: The advertisement was not effective (µd ≤ 0)
against
H1: The advertisement was effective (µd > 0)

Compute di:

Before      33  32  38  45  37   47  48  41  45
After       42  35  31  41  37   36  49  49  48
di=yi-xi     9   3  -7  -4   0  -11   1   8   3
Two sample t-test for paired data
Solution:

To test,
H0: The advertisement was not effective (µd ≤ 0)
against
H1: The advertisement was effective (µd > 0)

We have mean d̄ = 2/9 ≈ 0.22 and variance sd² ≈ 43.69

The test statistic is

$$t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} = \frac{0.22}{\sqrt{43.69} / \sqrt{9}} \approx 0.0998$$
Two sample t-test for paired data
Solution:

Decision Rule: Reject H0 if tcalc ≥ t_{df, α}

Here t_{df, α} = t_{n-1, α} = t_{8, 0.05} = 1.86

Since 0.0998 < 1.86, we fail to reject H0

Thus, there is not enough evidence that the advertisement increased sales.
Two sample t-test for paired data
Python solution: Calculate critical t-value
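
A minimal sketch of this step, assuming SciPy is available. The variable names are illustrative:

```python
from scipy import stats

n = 9         # number of before/after pairs in the data
alpha = 0.05  # level of significance

# Right-tailed test: all of alpha is in the upper tail, with n - 1 df.
t_crit = stats.t.isf(alpha, df=n - 1)

print(f"Critical value: {t_crit:.2f}")
```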

If test statistic is greater than 1.86, then we reject H0.
Two sample t-test for paired data
Python solution: Calculate test statistic and p-value
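
A minimal sketch of this step using the data from the question. It computes the statistic by hand from the differences rather than through a single library call, and the variable names are illustrative:

```python
import numpy as np
from scipy import stats

# Sales data from the question.
before = np.array([33, 32, 38, 45, 37, 47, 48, 41, 45])
after = np.array([42, 35, 31, 41, 37, 36, 49, 49, 48])
alpha = 0.05

# Paired differences d_i = y_i - x_i.
d = after - before
n = len(d)

# Test statistic under H0: mu_d <= 0, using the sample s.d. of the differences.
t_calc = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Right-tailed p-value with n - 1 degrees of freedom.
p_value = stats.t.sf(t_calc, df=n - 1)

print(f"t = {t_calc:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Without intermediate rounding the statistic comes out slightly above the hand-calculated 0.0998, which used d̄ rounded to 0.22.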

The test statistic (=0.1) < critical value (=1.86), also the p-value > 0.05, thus we fail to reject H0.
Test for Population Proportion
Test for proportion

● For qualitative data, the proportion of items with a desired characteristic is obtained

● Test for proportion:
○ One sample: Testing whether the population proportion (P) is equal to a specified value (P0)
○ Two sample: Testing the equality of two population proportions (P1 = P2)

● Similar to the tests of population mean
Test for proportion

● Test for proportion
○ One Sample
○ Two Sample
One sample test - hypothesis

● The hypothesis to test whether the population proportion is equal to a specified value

H0 : P = P0 against H1 : P ≠ P0

It implies

H0: The population proportion is equal to P0

against H1: The population proportion is not equal to P0

● Failing to reject H0 means that there is not enough evidence to conclude that the population proportion differs from P0
Test for proportions

● The test statistic is given by

$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0(1 - P_0)}{n}}}$$

where p is the sample proportion, P0 is the specified proportion and n is the sample size

● Under H0, the test statistic follows the standard normal distribution
One sample test for proportion - decision rule

                        H1        Based on critical region
For two tailed test     P ≠ P0    Reject H0 if |Z| ≥ Zα/2
For left tailed test    P < P0    Reject H0 if Z ≤ -Zα
For right tailed test   P > P0    Reject H0 if Z ≥ Zα

For all three tests:
● Based on p-value: Reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: Reject H0 if P0 does not lie in the confidence interval
One sample test for proportion
Question:

A sample of 361 business owners who had gone into bankruptcy due to recession was surveyed. It was found that 105 of them had not consulted any professional for managing their finances before opening the business. Test the null hypothesis that at most 25% of all businesses had not consulted a professional before opening the business.

Test the claim using the p-value technique. [Use α = 0.05].
One sample test for proportion
Solution:

The sample consists of 361 business owners who had gone into bankruptcy due to recession, i.e. n = 361

On taking a survey, it was found that 105 of them had not consulted any professional for managing their finances before opening the business.

Let X: number of businesses which did not consult a professional before opening, so x = 105

The sample proportion (p) = x/n = 105/361 = 0.2909
One sample test for proportion
Solution:

To test: The null hypothesis that at most 25% of all businesses had not consulted before opening the business

Here P0 = 0.25

To test, H0: P ≤ 0.25 against H1: P > 0.25
One sample test for proportion
Solution:

The test statistic

$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0(1 - P_0)}{n}}} = \frac{0.2909 - 0.25}{\sqrt{\dfrac{0.25 \times 0.75}{361}}} \approx 1.79$$

The p-value = P(Z > 1.79) = 0.0367

As p-value < 0.05, reject H0.

We may conclude that more than 25% of all businesses had not consulted a professional before starting the business.
One sample test for proportion
Python solution: Calculate test statistic
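
A minimal sketch of this step from the figures in the question, using only the standard library. The variable names are illustrative:

```python
import math

n = 361      # business owners surveyed
x = 105      # owners who had not consulted a professional
p0 = 0.25    # proportion specified under H0

# Sample proportion.
p_hat = x / n

# Z statistic; the standard error uses P0 because it is computed under H0.
z_calc = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

print(f"sample proportion = {p_hat:.4f}, Z = {z_calc:.2f}")
```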

One sample test for proportion
Python solution: Calculate p-value
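
A minimal sketch of the right-tailed p-value, assuming SciPy is available. The variable names are illustrative:

```python
import math
from scipy import stats

n, x, p0 = 361, 105, 0.25
alpha = 0.05

z_calc = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# H1: P > 0.25 is right-tailed, so the p-value is the upper-tail area P(Z > z).
p_value = stats.norm.sf(z_calc)

print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```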

As the p-value < 0.05, we reject H0.
Test for proportion

● Test for proportion
○ One Sample
○ Two Sample
Two sample tests for population proportion

● Let there be two samples of sizes n1 and n2 from different populations, such that x1 and x2 are the numbers of items with a specific characteristic in each of them respectively

● Suppose these samples have proportions of such items p1 = x1/n1 and p2 = x2/n2 respectively

● We test the equality of the population proportions from which these samples are chosen
Two sample test - hypothesis

● The hypothesis to test the equality of the population proportions

H0 : P1 = P2 against H1 : P1 ≠ P2

It implies

H0: The two population proportions are equal (P1 = P2)

against H1: The two population proportions are not equal (P1 ≠ P2)

● Failing to reject H0 means that there is not enough evidence to conclude that the two population proportions differ
Test for proportions

● The test statistic is given by

$$Z = \frac{p_1 - p_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where p̄ is the proportion of the pooled sample such that

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

● Under H0, the test statistic follows the standard normal distribution
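The Python function to conduct a Z test for two population proportions is

statsmodels.api.stats.proportions_ztest(count, nobs, alternative)

where count holds the numbers of specific items (x1, x2), nobs holds the sample sizes (n1, n2), and alternative is 'two-sided', 'smaller' or 'larger'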

Two sample test for proportion - decision rule

H1                                 Based on critical region
For two tailed test: P1 ≠ P2       Reject H0 if |Z| ≥ Zα/2
For left tailed test: P1 < P2      Reject H0 if Z ≤ -Zα
For right tailed test: P1 > P2     Reject H0 if Z ≥ Zα

Based on p-value (all three cases): Reject H0 if the p-value is less than or equal to the level of significance

Based on confidence interval (all three cases): Reject H0 if the hypothesized value of P1 - P2 (zero under H0) does not lie in the confidence interval
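
The critical-region and p-value rules can be sketched as follows. This is an illustrative helper built on the standard library's NormalDist; the function name reject_h0 and the labels "two-sided", "smaller" and "larger" are assumptions chosen for this example, not part of the slides.

```python
from statistics import NormalDist


def reject_h0(z, alpha=0.05, alternative="two-sided"):
    """Return (reject by critical region, reject by p-value, p-value)."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":            # H1: P1 != P2
        critical = std_normal.inv_cdf(1 - alpha / 2)
        by_region = abs(z) >= critical
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "smaller":            # H1: P1 < P2 (left tailed)
        critical = std_normal.inv_cdf(1 - alpha)
        by_region = z <= -critical
        p_value = std_normal.cdf(z)
    elif alternative == "larger":             # H1: P1 > P2 (right tailed)
        critical = std_normal.inv_cdf(1 - alpha)
        by_region = z >= critical
        p_value = 1 - std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'smaller' or 'larger'")
    return by_region, p_value <= alpha, p_value


# Hypothetical statistic: Z = -2.0 tested two-sided at the 5% level
print(reject_h0(-2.0, alpha=0.05, alternative="two-sided"))
```

Both rules lead to the same decision, since the p-value falls below α exactly when Z crosses the corresponding critical value.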

Two sample test for proportion
Question:

Steve owns a kiosk where he sells two magazines, A and B. In a month, he buys 100 copies of
magazine A, out of which 78 were sold, and 70 copies of magazine B, out of which 65 were sold. Is
there enough evidence to say that magazine B is more popular?

Test the claim using p-value technique. [Use α = 0.05].

Two sample test for proportion
Solution:

Steve owns a kiosk where he sells two magazines, A and B, in a month. Let X: the number of magazines sold

Magazine A:
Out of 100 copies of magazine A, 78 are sold
Here, x1 = 78 and n1 = 100
Let p1 be the proportion of copies of magazine A sold
p1 = x1/n1 = 78/100 = 0.78

Magazine B:
Out of 70 copies of magazine B, 65 are sold
Here, x2 = 65 and n2 = 70
Let p2 be the proportion of copies of magazine B sold
p2 = x2/n2 = 65/70 = 0.928 (truncated to three decimals)

Two sample test for proportion
Solution:

To test whether magazine B is more popular

i.e. H0: P1 ≥ P2 against H1: P1 < P2

where
P1 denotes the population proportion of copies of magazine A sold
P2 denotes the population proportion of copies of magazine B sold

Two sample test for proportion
Solution:

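The pooled proportion is

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{78 + 65}{100 + 70} = \frac{143}{170} \approx 0.84$$

The test statistic is

$$Z = \frac{p_1 - p_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.78 - 0.928}{\sqrt{0.84 \times 0.16 \times \left(\frac{1}{100} + \frac{1}{70}\right)}} \approx -2.5905$$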

Two sample test for proportion
Solution:

The test statistic Z = -2.5905

The p-value = P(Z < Zcalc, under H0) = P(Z < -2.5905) = 0.0048

Since p-value < 0.05, we reject H0.

Thus there is enough evidence to conclude that magazine B is more popular.

(Without rounding the intermediate proportions, Z ≈ -2.608 and the p-value ≈ 0.0046; the conclusion is the same.)
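
The hand calculation can be checked with a short standard-library sketch. It assumes the reported Z was obtained from the rounded values p2 = 0.928 and pooled proportion 0.84 (an inference, since those values reproduce -2.5905), and compares that with the unrounded computation. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 78, 100   # magazine A: copies sold, copies bought
x2, n2 = 65, 70    # magazine B: copies sold, copies bought

# Rounded intermediate values (assumed to match the hand calculation)
p1, p2_rounded, pooled_rounded = 0.78, 0.928, 0.84
z_rounded = (p1 - p2_rounded) / sqrt(
    pooled_rounded * (1 - pooled_rounded) * (1 / n1 + 1 / n2)
)

# Same formula without rounding the intermediate proportions
pooled = (x1 + x2) / (n1 + n2)
z_exact = (x1 / n1 - x2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Left-tailed p-value: P(Z < z) under the standard normal distribution
p_rounded = NormalDist().cdf(z_rounded)
p_exact = NormalDist().cdf(z_exact)

print(f"rounded:   Z = {z_rounded:.4f}, p-value = {p_rounded:.4f}")
print(f"unrounded: Z = {z_exact:.4f}, p-value = {p_exact:.4f}")
```

Either way the p-value is well below α = 0.05, so the decision to reject H0 does not depend on the rounding.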

Two sample test for proportion
Python solution: Calculate test statistic and p-value
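
A minimal sketch of this calculation, assuming statsmodels and NumPy are installed. It uses proportions_ztest with alternative="smaller" because H1 is P1 < P2; the variable names are illustrative.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Copies sold (x1, x2) and copies bought (n1, n2) for magazines A and B
counts = np.array([78, 65])
nobs = np.array([100, 70])

# Left-tailed test of H0: P1 >= P2 against H1: P1 < P2.
# With two samples, the default variance uses the pooled proportion.
z_stat, p_value = proportions_ztest(count=counts, nobs=nobs, alternative="smaller")

print(f"Z = {z_stat:.4f}, p-value = {p_value:.4f}")
```

The function works with unrounded proportions, so it returns a Z of about -2.61 rather than the hand-calculated -2.5905.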


As the p-value < 0.05, we reject H0.


Summary


Parametric tests

● The tests considered so far have two features:

○ The probability distribution of the samples was assumed to be known

○ The hypothesis test was about a parameter of that probability distribution

● These tests are known as parametric tests

● When these assumptions are not satisfied, use non-parametric tests

Thank You
