Research Methodology
Unit – 4
Testing of Hypothesis, Small and large sample tests
Compiled by Dr. Ankit Bhojak
Testing of Hypothesis
Population:
A population is a well-defined collection of a large number of individuals, objects or units that possess the same characteristics. In other words, a population is the total set of observations. For example, if we are studying the height of men, the population is the set of heights of all the men in the world.
Sample:
A sample is a set of data collected and/or selected from a statistical population. It is a part of the population, it is representative of the population, and it contains all the characteristics of the population. For example, if we are studying the height of men, the population is the set of heights of all the men in the world and the sample is the set of heights of men belonging to a particular country such as India or China.
The main objective of selecting a sample from a population is to get information about the population under study. Statistical inference is the technique of estimating the unknown characteristics of a population from a sample. It can be classified into two parts:
(1) Estimation
(2) Testing of hypothesis
Parameter:
Parameter is any numerical quantity that characterizes a given population. In simple terms, a
constant obtained from all the observations of a population is known as parameter. This means
the parameter tells us something about the whole population.
Statistic:
Statistic is any numerical quantity that characterizes a given sample. In simple terms, a constant
obtained from all the observations of a sample is known as statistic. This means the statistic tells
us something about the sample.
Estimator:
A statistic that is used to predict the value of a parameter of the population is called an estimator. The procedure of estimating the value of a parameter with the help of a statistic is known as estimation.
Sampling distribution:
From a population of size N, different samples of size n are selected. From these selected samples, different values of a statistic are obtained and arranged in the form of a frequency distribution, which is known as the sampling distribution of that statistic. That is, a sampling distribution is a probability distribution of a statistic obtained through a large number of samples drawn from a specific population.
Standard Error:
The standard deviation of the sample statistic obtained from the sampling distribution is known
as standard error of that statistic.
Uses:
1) To test the randomness of a sample
2) To get the confidence interval for the parameter of the population.
3) To test whether the difference between the value of sample statistic and population
parameter is significant or not.
4) To determine the precision of the sample estimates
Precision of statistic = 1 / (standard error of statistic)
Statistical Hypothesis: A statistical hypothesis is a logical statement or assumption made about a parameter of the population. A statistical hypothesis is a conjecture which can be tested by some procedure; at the end of the procedure we may either accept or reject the hypothesis. A hypothesis can never be proved.
Simple Hypothesis: If the hypothesis specifies the value of the parameter of the population completely, it is known as a simple hypothesis, e.g. H0 : µ = 10 in the examples below.
Composite Hypothesis: If the hypothesis does not specify the value of the parameter of the population completely, it is known as a composite hypothesis, e.g. H1 : µ ≠ 10 or H1 : µ1 > µ2 in the examples below.
Null Hypothesis: A hypothesis under consideration is called the null hypothesis. It is written for possible acceptance. The null hypothesis is denoted by H0. Examples:
(1) The mean of the population is 10, i.e. H0 : µ = 10
(2) The die is fair, i.e. H0 : P = 1/6
(3) The proportions of smokers in the state of Gujarat and the state of Maharashtra are equal, i.e. H0 : P1 = P2
(4) The means of both the populations are equal, i.e. H0 : µ1 = µ2
Alternative Hypothesis: A hypothesis complementary to the null hypothesis, which is accepted when the null hypothesis is rejected, is called the alternative hypothesis. It is denoted by H1. Examples:
(1) The mean of the population is not 10, i.e. H1 : µ ≠ 10
(2) The die is unfair, i.e. H1 : P ≠ 1/6
(3) The proportion of smokers in the state of Gujarat is less than that in the state of Maharashtra, i.e. H1 : P1 < P2
(4) The mean of one population is larger than that of the other population, i.e. H1 : µ1 > µ2
Test of hypothesis:
A test of a hypothesis is a procedure to decide whether to accept or to reject the null hypothesis. If the test procedure rejects the null hypothesis, then we must have an alternative hypothesis which is to be accepted.
Type I and Type II errors: In testing of any statistical hypothesis the following situation may
arise.
                 accept H0            reject H0
H0 is true       correct decision     incorrect decision
H0 is false      incorrect decision   correct decision
Here, accepting a false hypothesis and rejecting true hypothesis are considered as incorrect
decisions or errors.
The error committed in rejecting a true null hypothesis is called a Type – I error and its probability is denoted by α.
α = P (reject H0 / H0 is true)
The error committed in accepting a false null hypothesis is called a Type – II error and its probability is denoted by β.
β = P (accept H0 / H0 is false)
Level of Significance:
In any test procedure, both types of errors should be kept to a minimum. But as these are interrelated, it is not possible to minimize both errors simultaneously. Hence, the probability of the Type – I error is fixed and the probability of the Type – II error is minimised. This predetermined fixed value of the probability of the Type – I error is called the level of significance and is denoted by α. Hence, the level of significance is the probability of rejecting a null hypothesis which is actually true. The commonly used levels of significance are 5 % and 1 %. If we consider the 5 % level of significance, it means that out of 100 cases, in 5 cases we reject a null hypothesis which is actually true.
Critical Region:
The area of probability curve is divided into two regions by predetermined level of significance.
The area of probability curve corresponding to type – I error is called critical region. It is also
known as region of rejection. The area of probability curve other than critical region is called the
acceptance region.
If it is required to test whether the sample statistic is significantly different from the population parameter, it is called a two tailed test. If it is required to test whether the sample statistic is significantly greater than or less than the population parameter, it is called a one tailed test.
The probability of rejecting a null hypothesis when it is false is known as the power of the test and is given by
Power = 1 – β
      = 1 – P (accept H0 / H0 is false)
      = P (reject H0 / H0 is false)
Illustration:
Q.1 A hypothesis H0 : P = 1/2 vs H1 : P = 2/3 is to be tested. For this a coin is tossed 10 times, and H0 is rejected if the number of heads x is 8 or more. Find α, β and the power of the test.
Info: Here, n = 10, q = 1 – p
α = P (x ≥ 8 / P = 1/2)
  = P (x = 8) + P (x = 9) + P (x = 10)
  = 10C8 (1/2)^8 (1/2)^(10−8) + 10C9 (1/2)^9 (1/2)^(10−9) + 10C10 (1/2)^10 (1/2)^(10−10)
  = 45 (1/2)^10 + 10 (1/2)^10 + 1 (1/2)^10
  = (1/2)^10 (45 + 10 + 1)
  = 56/1024 = 7/128 = 0.0547
β = P (accept H0 / H1 is true)
  = P (x < 8 / P = 2/3)
  = 1 – P (x ≥ 8 / P = 2/3)
  = 1 – [P (x = 8) + P (x = 9) + P (x = 10)]
  = 1 – [10C8 (2/3)^8 (1/3)^2 + 10C9 (2/3)^9 (1/3)^1 + 10C10 (2/3)^10 (1/3)^0]
  = 1 – [45 × 2^8/3^10 + 10 × 2^9/3^10 + 2^10/3^10]
  = 1 – (2^8/3^10) [45 + (10 × 2) + 2^2]
  = 1 – 256/(243 × 243) × (45 + 20 + 4)
  = 1 – 0.2991
  = 0.7009
Power of test = 1 – β = 1 – 0.7009 = 0.2991
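The binomial computations in Q.1 can be cross-checked with a short Python sketch (the helper name `binom_pmf` is illustrative, not part of the notes):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 10
# alpha: reject H0 (x >= 8) when P = 1/2 is actually true
alpha = sum(binom_pmf(x, n, 0.5) for x in range(8, n + 1))
# beta: accept H0 (x < 8) when P = 2/3 is actually true
beta = sum(binom_pmf(x, n, 2/3) for x in range(0, 8))
power = 1 - beta
print(round(alpha, 4), round(beta, 4), round(power, 4))  # 0.0547 0.7009 0.2991
```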
Q.2 A coin is tossed 6 times and the hypothesis H0 : P = 1/2 is rejected in favour of H1 : p = 3/4 if the number of heads is greater than 4. Find α and β.
Here, n = 6
α = P (x > 4 / P = 1/2)
  = P (x = 5) + P (x = 6)
  = 6C5 (1/2)^5 (1/2) + 6C6 (1/2)^6
  = 6 (1/2)^6 + (1/2)^6
  = (1/2)^6 (6 + 1)
  = 7/64 = 0.1094
β = P (x ≤ 4 / P = 3/4)
  = 1 – [P (x = 5) + P (x = 6)]
  = 1 – [6C5 (3/4)^5 (1/4) + 6C6 (3/4)^6]
  = 1 – [6 × 3^5/4^6 + 3^6/4^6]
  = 1 – (3^5/4^6) (6 + 3)
  = 1 – (243 × 9)/(64 × 64)
  = 1 – 2187/4096
  = 1 – 0.534
  = 0.466
Q.3 It is claimed that on an average 3 accidents per month occur on a particular road. For this a test of H0 : m = 3 v/s H1 : m = 2 is conducted. If fewer than 3 accidents occur during the last month, the null hypothesis is rejected. Find the probabilities of Type – I and Type – II errors and also find the power of the test.
Info: p(x) = e^(−m) m^x / x!, where x = 0, 1, 2, .....
(e^(−1) = 0.3679)
Here, m = 3 and m = 2.
α = P (x < 3 / m = 3)
  = P (x = 0) + P (x = 1) + P (x = 2)
  = e^(−3) [1 + 3 + 4.5]
  = 0.0498 (8.5)
  = 0.4233
β = P (x ≥ 3 / m = 2)
  = 1 – P (x < 3 / m = 2)
  = 1 – [P (x = 0) + P (x = 1) + P (x = 2)]
  = 1 – e^(−2) [1 + 2 + 2]
  = 1 – 0.1354 (5)
  = 1 – 0.6768
  = 0.3232
Power of test = 1 – β = 1 – 0.3232 = 0.6768
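The Poisson computations of Q.3 can be reproduced with a similar Python sketch; the last digit differs slightly from the hand calculation because e^(−3) and e^(−2) are not rounded here (`pois_pmf` is an illustrative name):

```python
from math import exp, factorial

def pois_pmf(x, m):
    """P(X = x) for a Poisson variable with mean m."""
    return exp(-m) * m**x / factorial(x)

# alpha: reject H0 (x < 3) when m = 3 is actually true
alpha = sum(pois_pmf(x, 3) for x in range(3))
# beta: accept H0 (x >= 3) when m = 2 is actually true
beta = 1 - sum(pois_pmf(x, 2) for x in range(3))
power = 1 - beta
print(round(alpha, 4), round(beta, 4), round(power, 4))
```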
Q.4 It is observed that there are 0.5 misprints per page of a book. It is necessary to test H0 : m = 0.5 vs H1 : m = 1. For this 10 pages are observed, and if they contain more than 3 misprints, then H0 is rejected. Find α, β and the power of the test.
α = 1 – P (x ≤ 3 / m = 0.5)
  = 1 – e^(−0.5) [1 + 0.5 + 0.125 + 0.0208]
  = 1 – 0.6065 (1.6458)
  = 1 – 0.9982
  = 0.0018
β = P (accept H0 / H1 is true)
  = P (x ≤ 3 / m = 1)
  = P (x = 0) + P (x = 1) + P (x = 2) + P (x = 3)
  = e^(−1) (1)^0/0! + e^(−1) (1)^1/1! + e^(−1) (1)^2/2! + e^(−1) (1)^3/3!
  = 0.3679 (1 + 1 + 0.5 + 0.1667)
  = 0.9811
Power of test = 1 – β = 1 – 0.9811 = 0.0189
(1) In an experiment of tossing a coin, p denotes the probability of getting a head. In order to test the hypothesis H0 : P = 1/2 against H1 : p = 3/4, the coin is tossed 5 times and if a head is obtained more than 3 times then H0 is rejected. Find the probabilities of Type – I and Type – II errors. Also find the power of the test. (Ans: 0.1875, 0.3672, 0.6328)
(2) It is desired to test the hypothesis that a coin is unbiased. It is agreed to reject the hypothesis if the number of heads (x) in 9 different tosses is x ≤ 2 or x ≥ 7. What is the probability of committing a Type – I error? (Ans: 0.3594)
(3) It is observed that on an average 3 items are defective in a lot of items. A sample is
selected from the lot and if it shows 3 or more defective items then the lot is rejected.
Find probability of committing type – I error. The test is Ho: m = 3 v/s H1 : m ≠ 3.
(Ans: 0.5767)
(4) In order to test the hypothesis, that Ho: m = 2 v/s H1 : m = 3, a sample of size 100 units
is selected at random from the big lot. If the sample shows 2 or less defective items,
then the lot is accepted. Find probabilities for type – I and type – II errors. Also find
power of test.
(Ans:0.3235, 0.4233, 0.5767)
Large Sample Tests
For testing a given hypothesis a random sample is drawn from a population. If the number of units in the sample is greater than 30, it is generally regarded as a large sample. We shall study the tests of significance for large samples. The tests to be discussed in this chapter are
(1) Tests of Variables
(i) Test of significance of a mean
(ii) Test of significance of difference between two means
(iii) Test of significance of difference between two standard deviations
(2) Tests of Attributes
(iv) Test of significance of proportion of successes
(v) Test of significance of difference between two proportions
For reference, the critical values of Z at important levels of significance are given in the table below.
                     5%       1%
Two tailed test      1.96     2.58
One tailed test      1.645    2.33
(i) Test of significance of a mean:
Step 1: H0: µ = µ0 Vs H1: µ ≠ µ0
Step 2: Difference = |x̄ − µ0|
Step 3: Standard Error S.E. of x̄ = σ/√n
Step 4: Calculate Z = |x̄ − µ0| / (σ/√n)
Then compare Z with the critical value of Z. If the calculated value of Z > 1.96 at the 5% level of significance, then the null hypothesis is rejected and it may be concluded that the difference between the sample mean and the population mean is significant. If the calculated value of Z < 1.96, then the null hypothesis is accepted and it may be concluded that the difference between the sample mean and the population mean is not significant.
Example: The mean life time of 100 light tubes produced by a company is computed to be 1570
hours with a standard deviation of 120 hours. The company claims that the average life of the
tubes produced by the company is 1600 hours. Is the claim justified? Use 5 % level of
significance.
Solution:
Step 1: H0: 𝜇 = 1600 Vs H1: 𝜇 ≠ 1600
Step 2: Difference = |𝑥̅ − 𝜇0|
⇒ |𝑥̅ − 𝜇0| = |1570 − 1600|
⇒ |𝑥̅ − 𝜇0| = 30
Step 3: Standard Error S.E. of x̄ = σ/√n
⇒ σ/√n = 120/√100
⇒ σ/√n = 120/10
⇒ σ/√n = 12
Step 4: Calculate Z = |x̄ − µ0| / (σ/√n)
⇒ Z = 30/12
⇒ Z = 2.5
Here, Z = 2.5 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ The company's claim is not justified.
⇒ The mean life time of the light tubes is not 1600 hours.
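The Z statistic of this example can be computed with a small Python helper (`z_mean` is an illustrative name, not from the notes):

```python
from math import sqrt

def z_mean(xbar, mu0, sigma, n):
    """Large-sample Z statistic for a single mean: |xbar - mu0| / (sigma / sqrt(n))."""
    return abs(xbar - mu0) / (sigma / sqrt(n))

z = z_mean(xbar=1570, mu0=1600, sigma=120, n=100)
print(z)  # 2.5, which exceeds 1.96, so H0 is rejected at the 5 % level
```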
(ii) Test of significance of difference between two means:
Suppose two independent random samples of sizes n1 and n2 are drawn from two different populations, with sample means x̄1 and x̄2 respectively. If we want to test the hypothesis that the population means are equal, i.e. H0: µ1 = µ2, we can use the following steps.
Step 1: H0: 𝜇1 = 𝜇2. Vs H1: 𝜇1 ≠ 𝜇2.
Step 2: Difference = |𝑥̅1 − 𝑥̅2|
Step 3: Standard Error S.E. of (x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
Step 4: Calculate Z = |x̄1 − x̄2| / √(σ1²/n1 + σ2²/n2)
1 2
Then compare Z with the critical value of Z. at required level of significance and decide whether
to accept or to reject null hypothesis.
Example: The following information is about the height of students of two colleges.
College A College B
Mean Height (in inches) 67.42 67.25
S.D. (in inches) 2.58 2.50
Sample size 1000 1200
Solution:
Step 1: H0: µ1 = µ2 Vs H1: µ1 ≠ µ2
Step 2: Difference = |x̄1 − x̄2| = |67.42 − 67.25| = 0.17
Step 3: Standard Error S.E. of (x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
⇒ √(σ1²/n1 + σ2²/n2) = √(2.58²/1000 + 2.50²/1200)
⇒ √(σ1²/n1 + σ2²/n2) = √(0.0066 + 0.0052)
⇒ √(σ1²/n1 + σ2²/n2) = √0.0118
⇒ √(σ1²/n1 + σ2²/n2) = 0.1086
Step 4: Calculate Z = |x̄1 − x̄2| / √(σ1²/n1 + σ2²/n2)
⇒ Z = 0.17/0.1086
⇒ Z = 1.5654
Here, Z = 1.5654 < 1.96 at 5 % level of significance.
⇒ H0 is accepted.
⇒ There is no significant difference in mean height of students of two colleges.
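The same Z value can be obtained in Python (`z_two_means` is an illustrative name; computed without intermediate rounding it gives 1.5607 rather than 1.5654, and the conclusion is unchanged):

```python
from math import sqrt

def z_two_means(x1, x2, s1, s2, n1, n2):
    """Large-sample Z statistic for the difference between two means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of (x1 - x2)
    return abs(x1 - x2) / se

z = z_two_means(67.42, 67.25, 2.58, 2.50, 1000, 1200)
print(round(z, 4))
```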
(iii) Test of significance of difference between two standard deviations
Here, we want to test the hypothesis that the standard deviations of the two populations do not
differ significantly. So we can apply the following steps.
Step 1: H0: 𝜎1 = 𝜎2 Vs H1: 𝜎1 ≠ 𝜎2
Step 2: Difference = |S1 − S2|
Step 3: Standard Error S.E. of (S1 − S2) = √(σ1²/2n1 + σ2²/2n2) = √(S1²/2n1 + S2²/2n2)
Step 4: Calculate Z = |S1 − S2| / √(S1²/2n1 + S2²/2n2)
Then compare Z with the critical value of Z. at required level of significance and decide whether
to accept or to reject null hypothesis.
Example: In a sample of 1000 the mean is 17.5 and the S.D. 2.5. In another sample of 800 the
mean is 18 and S.D. 2.7. Assuming that the samples are independent, discuss whether the two
samples can have come from a population which have the same S.D.
Solution:
Step 1: H0: 𝜎1 = 𝜎2 i.e. Two samples have come from a population which have the
same S.D..
Step 2: Difference = |S1 − S2|
⇒ |S1 − S2| = |2.5 − 2.7|
⇒ |S1 − S2| = 0.2
Step 3: Standard Error S.E. of (S1 − S2) = √(S1²/2n1 + S2²/2n2)
⇒ √(S1²/2n1 + S2²/2n2) = √(2.5²/2000 + 2.7²/1600)
⇒ √(S1²/2n1 + S2²/2n2) = √0.0077
⇒ √(S1²/2n1 + S2²/2n2) = 0.0877
Step 4: Calculate Z = |S1 − S2| / √(S1²/2n1 + S2²/2n2)
⇒ Z = 0.2/0.0877
⇒ Z = 2.28
Here, Z = 2.28 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ The two samples have not come from populations with the same S.D.
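This test, too, reduces to a few lines of Python (`z_two_sds` is an illustrative name):

```python
from math import sqrt

def z_two_sds(s1, s2, n1, n2):
    """Large-sample Z statistic for the difference between two standard deviations."""
    se = sqrt(s1**2 / (2 * n1) + s2**2 / (2 * n2))   # S.E. of (S1 - S2)
    return abs(s1 - s2) / se

z = z_two_sds(2.5, 2.7, 1000, 800)
print(round(z, 2))  # 2.28
```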
(iv) Test of significance of proportion of successes
Suppose a random sample of n units is drawn from a population and x units of them possess a particular characteristic. The sample proportion of the attribute is p, given by p = x/n. In order to test the null hypothesis that the population proportion of the attribute is P, we apply the following steps:
Step 1: H0: Population Proportion = P Vs H1: Population Proportion ≠ P
Step 2: Difference = |p − P| = |x/n − P|
Step 3: Standard Error S.E. of p = √(PQ/n), where Q = 1 − P
Step 4: Calculate Z = |p − P| / √(PQ/n)
Then compare Z with the critical value of Z at the required level of significance and decide whether to accept or to reject the null hypothesis.
Example: In a certain city 380 men out of 800 men were found to be smokers. Discuss whether
this information support the view that the majority of men in the city are smokers?
Solution:
Step 1: H0: Population Proportion P = 1/2 Vs H1: Population Proportion P > 1/2
Step 2: Difference = |p − P| = |x/n − P|
⇒ |x/n − P| = |380/800 − 0.5|
⇒ |x/n − P| = |0.475 − 0.5|
⇒ |x/n − P| = 0.025
Step 3: √(PQ/n) = √(0.5 × 0.5/800) = √0.0003125
⇒ √(PQ/n) = 0.0177
Step 4: Calculate Z = |x/n − P| / √(PQ/n)
⇒ Z = 0.025/0.0177
⇒ Z = 1.4069
Here, Z = 1.4069 < 1.645 at 5 % level of significance.
⇒ H0 is accepted.
⇒ Majority of the men in the city are not smokers.
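Without rounding the standard error to 0.0177, Z comes out as 1.4142; either way it is below 1.645 and the conclusion stands. A Python sketch (`z_prop` is an illustrative name):

```python
from math import sqrt

def z_prop(x, n, P):
    """Large-sample Z statistic for a single proportion against a hypothesised P."""
    p = x / n                    # sample proportion
    se = sqrt(P * (1 - P) / n)   # standard error under H0
    return abs(p - P) / se

z = z_prop(x=380, n=800, P=0.5)
print(round(z, 4))
```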
(v) Test of significance of difference between two proportions
Suppose a random sample of size n1 is taken from one population and x1 units of them possess some attribute, so that p1 = x1/n1 is the proportion of units possessing the attribute in the sample. Suppose another independent random sample of size n2 is taken from another population and x2 units of them possess the same attribute, so that p2 = x2/n2. Here we want to test the hypothesis that the population proportions of the attribute are equal. We apply the following steps:
Step 1: H0: P1 = P2 Vs H1: P1 ≠ P2
Step 2: Difference = |p1 − p2|
Step 3: Standard Error S.E. of (p1 − p2) = √(P1Q1/n1 + P2Q2/n2)
Step 4: Calculate Z = |p1 − p2| / √(P1Q1/n1 + P2Q2/n2)
If the population proportions P1 & P2 are unknown, their estimates are obtained by combining the two sample proportions. The pooled estimate is P = (x1 + x2)/(n1 + n2) with Q = 1 − P, and then S.E. = √(PQ(1/n1 + 1/n2)).
Then compare Z with the critical value of Z at the required level of significance and decide whether to accept or to reject the null hypothesis.
Example: In a large city A, 20 % of a random sample of 900 school boys had defective eye
sight. In another large city B, 15.5 % of a random sample of 1600 school boys has the same
defect. Is the difference between two proportions significant?
Solution:
Step 1: H0: P1 = P2 i.e. there is no significant difference between two proportions.
Step 2: Difference = |p1 − p2|
⇒ |p1 − p2| = |180/900 − 248/1600|
⇒ |p1 − p2| = |0.20 − 0.155| = 0.045
Step 3: Pooled estimate P = (x1 + x2)/(n1 + n2)
⇒ P = (180 + 248)/(900 + 1600)
⇒ P = 428/2500
⇒ P = 0.1712
So, Q = 1 − P = 0.8288
Now, √(PQ(1/n1 + 1/n2)) = √(0.1712 × 0.8288 × (1/900 + 1/1600))
⇒ √(PQ(1/n1 + 1/n2)) = 0.0157
Step 4: Calculate Z = |p1 − p2| / √(PQ(1/n1 + 1/n2))
⇒ Z = 0.045/0.0157
⇒ Z = 2.8662
Here, Z = 2.8662 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ There is a significant difference between the two proportions.
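The pooled two-proportion computation can be checked in Python (`z_two_props` is an illustrative name; the small difference from 2.8662 is the rounding of the S.E. in the hand calculation):

```python
from math import sqrt

def z_two_props(x1, n1, x2, n2):
    """Large-sample Z statistic for two proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)                   # pooled proportion
    se = sqrt(P * (1 - P) * (1 / n1 + 1 / n2))  # pooled standard error
    return abs(p1 - p2) / se

z = z_two_props(x1=180, n1=900, x2=248, n2=1600)
print(round(z, 2))
```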
Exercise
Question 1. A sample of 400 students have a mean height of 171.38 cm. Can it be regarded as
a random sample from a large population with mean height 171.17 cm and
standard deviation 3.3 cm? (Ans: Z = 1.27)
Question 2. The mean of a random sample of 1000 units is 17.6 and the mean of another sample
of 800 units is 18. Can it be concluded that both the samples come from the same
population with S.D. = 2.6? (Ans: Z = 3.24)
Question 3. The information regarding marks of boys and girls of a college is given below:
Sample Mean S.D. n
Boys 83 10 121
Girls 81 12 81
Test whether the difference in standard deviation is significant.
(Ans: Z = 1.75)
Question 4. In a large consignment of fruits, 64 fruits out of sample of 400 fruits are found to
be bad. Test the hypothesis that the population proportion of bad fruits in the
consignment is 20% at 1% level of significance. (Ans: Z = 2)
Question 5. A machine produced 16 defective articles in a batch of 500 articles. After overhauling, it produced 3 defective articles in a sample of 100 articles. Has the machine improved? (Ans: Z = 0.104)
Chi Square Test
Definition of χ² :
If x1, x2, x3, ....., xn is a random sample of size n from a normal population with mean 0 and S.D. 1, then the distribution of ∑xi² is called the χ² distribution with n degrees of freedom. Similarly, if x1, x2, x3, ....., xn is a random sample of size n from a normal population with mean µ and S.D. σ, then the distribution of ∑[(xi − µ)/σ]² is a χ² distribution with n degrees of freedom.
The probability density function of the χ² distribution is
f(χ²) = [1 / (2^(n/2) Γ(n/2))] e^(−χ²/2) (χ²)^((n/2) − 1), 0 < χ² < ∞
χ² is a continuous distribution and the form of the distribution depends upon the degrees of freedom n. The mean of the distribution is n and its variance is 2n.
Uses of χ²
The χ² distribution has a large number of applications in statistics. We shall discuss the following three main uses of χ²:
1. To test goodness of fit
2. To test independence of attributes
3. To test a specified value of the variance of the population.
Goodness of Fit Test :
Suppose we have obtained an observed frequency distribution and we are interested in knowing whether the observed frequency distribution supports a particular hypothesis. For this, a very powerful test for testing the significance of the discrepancy between an observed frequency distribution and an expected frequency distribution was given by Karl Pearson in 1900. The test is known as the χ² test of goodness of fit.
Under the null hypothesis that there is no significant difference between the observed and expected frequencies, the value of χ² is calculated by the formula:
χ² = ∑ (oi − ei)² / ei
If all the observed frequencies and expected frequencies are equal, the value of χ² will be zero. This signifies a perfect agreement of observations with expectations. The greater the value of χ², the greater the divergence between the observed and expected frequencies.
The value of χ² is calculated from the given data and compared with the table value of χ² on n − 1 degrees of freedom at the required level of significance. If the calculated value of χ² is less than the table value of χ², the null hypothesis may be accepted and it may be concluded that the given frequency distribution fits the hypothesis. If the calculated value of χ² is greater than the table value of χ², the hypothesis may be rejected and it may be concluded that the observed frequency distribution does not fit the hypothesis.
Note : The degrees of freedom in applying the goodness of fit test are n − k − 1, where k is the number of parameters estimated.
Limitations of the χ² Test
(1) The observations of the sample should be independent.
(2) Absolute frequencies should always be used.
(3) If there are any constraints on class frequencies, then they must be linear.
(4) The frequency of any class should not be less than 5. If any class frequency is less than five, it should be combined with the frequency of the adjoining class or classes, so that the total frequency of the combined classes is more than 5.
(5) The class frequencies should be combined in such a way that the degrees of freedom are more than 0.
Illustration 1 : A die is thrown 300 times and the following distribution is obtained. Can the die be regarded as unbiased?
Number on the die    1    2    3    4    5    6
Frequency            41   44   49   53   57   56
Ans : H0 : The die is unbiased, i.e. the probability of obtaining any number is 1/6.
Number    Observed oi    Expected ei    (oi − ei)²/ei
1         41             50             1.62
2         44             50             0.72
3         49             50             0.02
4         53             50             0.18
5         57             50             0.98
6         56             50             0.72
Total     300            300            4.24
χ² = ∑ (oi − ei)²/ei = 4.24
D.f. = n − 1 = 6 − 1 = 5
The table value of χ² on 5 d.f. and at the 5 % level of significance = 11.07
χ²cal < χ²tab
⇒ H0 may be accepted.
⇒ The die may be regarded as unbiased.
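The goodness-of-fit statistic of Illustration 1 takes only a few lines of Python (`chi_square` is an illustrative name):

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (o - e)^2 / e over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [41, 44, 49, 53, 57, 56]   # die frequencies from Illustration 1
expected = [300 / 6] * 6              # 50 per face under H0: the die is unbiased
chi2 = chi_square(observed, expected)
print(round(chi2, 2))  # 4.24, below the table value 11.07 on 5 d.f.
```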
Illustration 2: The units produced by a plant are classified into four grades. The past performance of the plant shows that the respective proportions are 8 : 4 : 2 : 1. To check the run of the plant, 600 parts were examined and classified as follows. Is there any evidence of a change in production standards?
Grade    First    Second    Third    Fourth    Total
Units    340      130       100      30        600
Ans : H0 : There is no change in production standards.
Grade     Observed oi    Expected ei         (oi − ei)²/ei
First     340            600 × 8/15 = 320    1.25
Second    130            600 × 4/15 = 160    5.625
Third     100            600 × 2/15 = 80     5.00
Fourth    30             600 × 1/15 = 40     2.50
Total     600            600                 14.375
χ² = ∑ (oi − ei)²/ei = 14.375
D.f. = n − 1 = 4 − 1 = 3
The table value of χ² on 3 d.f. and at the 5 % level of significance = 7.81
χ²cal > χ²tab
⇒ H0 may be rejected.
⇒ There is evidence of a change in production standards.
Illustration 3 : Five coins are tossed 320 times and the following distribution of the number of heads is obtained.
Number of heads    0    1    2    3    4    5
Frequency          8    42   116  90   52   12
Test the hypothesis that the coins are unbiased.
Ans : H0 : The coins are unbiased, i.e. P = 1/2
The expected frequencies are 320 × 5Cx (1/2)^5, i.e. 10, 50, 100, 100, 50, 10.
χ² = ∑ (oi − ei)²/ei = 5.72
D.f. = n − 1 = 6 − 1 = 5
The table value of χ² on 5 d.f. and at the 5 % level of significance = 11.07
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the coins may be regarded as unbiased.
Illustration 4 : The following data show the number of mistakes per page observed in 100 pages of a book. Fit a Poisson distribution and test the goodness of fit.
Number of      Number of    fx     P(xi)    Expected frequency    (oi − ei)²/ei
mistakes x     pages f                      ei = N × P(xi)
0              11           0      0.135    13.5                  0.4630
1              31           31     0.270    27.0                  0.5926
2              26           52     0.270    27.0                  0.0370
3              17           51     0.180    18.0                  0.0556
4              10           40     0.090    9.0
5              4            20     0.036    3.6                   0.1043 (classes 4–6 combined)
6              1            6      0.012    1.2
Total          100          200                                   1.2525
Mean m = ∑fx/N = 200/100 = 2
For the Poisson distribution, P(x) = e^(−m) m^x / x!
P(0) = e^(−m) = e^(−2) = 0.135
P(1) = (m/1) P(0) = 2 (0.135) = 0.270
P(2) = (m/2) P(1) = (2/2) (0.270) = 0.270, and so on.
(The last three classes are combined so that no expected frequency is less than 5: observed 10 + 4 + 1 = 15 against expected 9.0 + 3.6 + 1.2 = 13.8, giving (15 − 13.8)²/13.8 = 0.1043.)
χ² = ∑ (oi − ei)²/ei = 1.2525
D.f. = n − 1 − 1 = 5 − 2 = 3 (one degree of freedom is lost because the parameter m is estimated)
The table value of χ² on 3 d.f. and at the 5 % level of significance = 7.81
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the Poisson distribution fits the data well.
Test of Independence of Two Attributes :
When the data are classified according to two attributes, 2 can also be used to test the
hypothesis that the two attributes are independent.
Suppose the data are classified into r classes A1, A2, A3, ..... Ar according to attribute A and into c classes B1, B2, B3, ..... Bc according to attribute B. The representation of the data in a cross-classified table, known as a contingency table, is given below. In the r × c contingency table the observed frequencies of the different cells are shown.
       B1     B2     B3   ...   Bj   ...   Bc
A1     O11    O12    O13  ...   O1j  ...   O1c    (A1)
A2     O21    O22    O23  ...   O2j  ...   O2c    (A2)
A3     O31    O32    O33  ...   O3j  ...   O3c    (A3)
...    ...
Ai     Oi1    Oi2    Oi3  ...   Oij  ...   Oic    (Ai)
...    ...
Ar     Or1    Or2    Or3  ...   Orj  ...   Orc    (Ar)
       (B1)   (B2)   (B3) ...   (Bj) ...   (Bc)   N
The total of the ith row is denoted by (Ai) and the total of the jth column is denoted by (Bj). Oij denotes the frequency of the cell common to the ith row and jth column. The total frequency is N, i.e.
∑(Ai) = ∑(Bj) = N
Under the null hypothesis that the two attributes A and B are independent, we shall find the expected frequency of the (i, j)th cell.
The probability that any observation will fall in the ith row = (Ai)/N
Similarly, the probability that any observation will fall in the jth column = (Bj)/N
Under the hypothesis of independence, the probability that any observation will fall in the ith row and jth column = (Ai)/N × (Bj)/N
Hence the expected frequency of the (i, j)th cell is eij = N × (Ai)/N × (Bj)/N = (Ai)(Bj)/N
29
Thus, we can find the expected frequencies of all the cells. From the observed frequencies oij and the expected frequencies eij, the value of χ² can be obtained by the following formula:
χ² = ∑i ∑j (oij − eij)² / eij
The number of independent cells in an r × c contingency table is (r − 1)(c − 1). Hence the degrees of freedom in an r × c table are (r − 1)(c − 1).
For testing the hypothesis of independence of two attributes A and B, the value of χ² is found and compared with the table value of χ² on (r − 1)(c − 1) d.f. at the required level of significance. If the calculated χ² is greater than the table value of χ², the hypothesis may be rejected, i.e. the two attributes may not be regarded as independent. If the calculated χ² is less than the table value of χ², the hypothesis that the attributes are independent may be accepted.
Illustration 5: In an industry, 200 workers employed for a specific job were classified according
to their performance and training received / not received. Test independence of training and
performance. The data are summarized as follows.
             Performance
             Good    Not Good    Total
Trained      100     50          150
Untrained    20      30          50
Total        120     80          200
Under the hypothesis of independence, the expected frequencies are
e11 = 150 × 120/200 = 90, e12 = 150 × 80/200 = 60, e21 = 50 × 120/200 = 30, e22 = 50 × 80/200 = 20
χ² = (100 − 90)²/90 + (50 − 60)²/60 + (20 − 30)²/30 + (30 − 20)²/20
   = 1.11 + 1.67 + 3.33 + 5
   = 11.11
On 1 d.f. and at the 5 % level of significance the table value of χ² = 3.84
χ²cal > χ²tab
⇒ H0 may be rejected.
⇒ Performance depends upon training.
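The independence test of Illustration 5 can be written generically for any r × c table (`chi_square_independence` is an illustrative name):

```python
def chi_square_independence(table):
    """Chi-square statistic for an r x c contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    N = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / N   # expected frequency (Ai)(Bj)/N
            chi2 += (o - e) ** 2 / e
    return chi2

table = [[100, 50],   # trained:   good, not good
         [20, 30]]    # untrained: good, not good
print(round(chi_square_independence(table), 2))  # 11.11 > 3.84, so H0 is rejected
```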
Yates’ Correction :
When a 2 × 2 contingency table contains a small expected cell frequency, the continuity correction suggested by Yates is applied: χ² = ∑ (|oij − eij| − 0.5)² / eij. The expected frequencies under the hypothesis of independence are found and shown in brackets in the cells.
Test of a specified value of the variance of the population :
To test the hypothesis H0 : σ² = σ0² about the population variance, the statistic
χ² = ∑ (xi − x̄)² / σ0²
is calculated from the given data and compared with the table value of χ² on n − 1 d.f. at the required level of significance. The decision regarding acceptance or rejection of the hypothesis is then taken.
Illustration 7 : Ten observations drawn randomly from a normal population are given
below 68, 72, 68, 74, 77, 61, 63, 69, 73, 75
Test the hypothesis that the population variance is 32.
Ans : H0 : 𝜎2 = 32
x       xi − x̄    (xi − x̄)²
68      −2        4
72      2         4
68      −2        4
74      4         16
77      7         49
61      −9        81
63      −7        49
69      −1        1
73      3         9
75      5         25
Total   700       242
x̄ = ∑xi/n = 700/10 = 70
χ² = ∑(xi − x̄)²/σ² = 242/32 = 7.5625
D.f. = n − 1 = 10 − 1 = 9
The table value of χ² on 9 d.f. and at the 5 % level of significance = 16.92
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the population variance may be regarded as 32.
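The variance-test statistic of Illustration 7 is easy to verify in Python (`chi_square_variance` is an illustrative name):

```python
def chi_square_variance(data, sigma0_sq):
    """Chi-square statistic for testing a specified population variance sigma0_sq."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / sigma0_sq

data = [68, 72, 68, 74, 77, 61, 63, 69, 73, 75]
print(chi_square_variance(data, 32))  # 7.5625, compared on n - 1 = 9 d.f.
```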
Exercise
Question -1 The number of road accidents on a high way during a week is given below. Can it
be concluded that the proportion of accidents are equal for all days.
Day Mon. Tue. Wed. Thurs. Fri. Sat. Sun.
Number of accidents 14 16 8 12 11 9 14
(Ans: χ² = 4.17)
Question – 4 A sample of size 20 drawn from a normal population gives mean and S.D. as 42 and 6 respectively. Test the hypothesis that the population S.D. is 9. (Ans: χ² = 8.89)
Small Sample Tests
Degrees of Freedom
The number of degrees of freedom is the number of independent observations available in a set of data, i.e. the total number of observations minus the number of independent constraints (or estimated quantities) imposed on them.
t Distribution or Student’s t Distribution
Introduction
• The t distribution was given by W. S. Gosset in 1908.
• He published his work under the pen name “Student”, so this distribution is known as Student’s t distribution.
• Let x1, x2, x3, ....., xn be a random sample of n observations from a normal population with mean µ and S.D. σ. Then the distribution of
t = (x̄ − µ) / (S/√(n − 1))
is defined as the t distribution with n − 1 degrees of freedom. Here x̄ = (1/n) ∑ xi and S² = (1/n) ∑ (xi − x̄)².
The probability density function of the t distribution is
f(t) = 1 / [√n · B(1/2, n/2) · (1 + t²/n)^((n+1)/2)], −∞ < t < ∞
Assumptions
1) The population from which the sample is drawn is normal.
2) The sample is random.
3) The population standard deviation 𝜎 is not known.
Properties
1) The probability curve of this distribution is symmetrical.
2) The tails of the curve are asymptotic to X axis.
3) When n is very large then t distribution tends to normal distribution.
4) The form of this distribution varies with the degrees of freedom.
Test the significance of the difference between sample mean and population mean
• Suppose, a random sample x1, x2, x3.........xn is drawn from a normal population and the
mean and the variance of the sample are 𝑥̅ and 𝑆2 respectively. If we want to test the
hypothesis that there is no significant difference between sample mean 𝑥̅ and
assumed mean 𝜇 of the population then we apply t test.
• Step – 1 Define H0: Population Mean = 𝜇 Vs
H1: Population Mean ≠ 𝜇
• Step – 2 Find positive difference of 𝑥̅ and 𝜇, i.e. |𝑥̅ − 𝜇|
• Step – 3 Find the S.E., i.e. S/√(n − 1), where S² = (1/n) ∑(x − x̄)²
• Step – 4 Find the t statistic, i.e. tcal = |x̄ − µ| / (S/√(n − 1)) = |x̄ − µ|√(n − 1) / S
Example: A sample of 10 observations gives ∑xi = 1683; taking deviations di = xi − 168, ∑di = 3 and ∑di² = 43. Test the hypothesis that the population mean is 170.
x̄ = ∑xi/n = 1683/10 = 168.3
S² = (1/n) ∑(xi − x̄)² = (∑di²/n) − (∑di/n)²
⇒ S² = (43/10) − (3/10)²
⇒ S² = 4.3 − 0.09
⇒ S² = 4.21
⇒ S = √4.21
⇒ S = 2.0518
tcal = |x̄ − µ|√(n − 1) / S
tcal = |168.3 − 170|√(10 − 1) / 2.0518
tcal = (1.7 × 3)/2.0518 = 2.4856
The t table value at the 5% level of significance and 9 d.f.: ttab = 2.26
Here, tcal > ttab
⇒ H0 is rejected, i.e. the population mean cannot be regarded as 170.
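The t value can be reproduced from the summary figures (`t_from_summary` is an illustrative name; S here is the divisor-n standard deviation, as used throughout these notes):

```python
from math import sqrt

def t_from_summary(xbar, mu, S, n):
    """t = |xbar - mu| * sqrt(n - 1) / S, with S the divisor-n sample S.D."""
    return abs(xbar - mu) * sqrt(n - 1) / S

t = t_from_summary(xbar=168.3, mu=170, S=2.0518, n=10)
print(round(t, 4))  # 2.4856 > 2.26, so H0 is rejected
```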
Test of the significance of the difference between two sample means
• Suppose two independent small samples of sizes n1 and n2, with means x̄1 and x̄2, are drawn from two normal populations. If we want to test the hypothesis that the population means are equal, we apply the t test.
Step – 1 Define H0: µ1 = µ2 Vs H1: µ1 ≠ µ2
Step – 2 Find |x̄1 − x̄2|
Step – 3 Find the pooled S.D. S, where
S² = [1/(n1 + n2 − 2)] {∑(x1 − x̄1)² + ∑(x2 − x̄2)²}
S² = [1/(n1 + n2 − 2)] {n1S1² + n2S2²}, with S1² = (1/n1) ∑(x1 − x̄1)² and S2² = (1/n2) ∑(x2 − x̄2)²
Step – 4 tcal = |x̄1 − x̄2| / (S √(1/n1 + 1/n2)) = (|x̄1 − x̄2|/S) √(n1n2/(n1 + n2))
x̄1 = ∑x1/n1 = 219/7 = 31.29
x̄2 = ∑x2/n2 = 169/6 = 28.17
S² = [1/(n1 + n2 − 2)] {∑d1² − (∑d1)²/n1 + ∑d2² − (∑d2)²/n2}
⇒ S² = [1/(7 + 6 − 2)] {32 − (2)²/7 + 27 − (1)²/6}
⇒ S² = (1/11) {32 − (2)²/7 + 27 − (1)²/6}
⇒ S² = 5.2965
⇒ S = √5.2965
⇒ S = 2.3014
Now, tcal = (|x̄1 − x̄2|/S) √(n1n2/(n1 + n2))
⇒ tcal = (|31.29 − 28.17|/2.3014) √(7 × 6/(7 + 6))
⇒ tcal = 2.4368
The t table value at the 5% level of significance and (7 + 6 − 2) = 11 d.f.: ttab = 1.796 (one tailed)
Here, tcal > ttab
⇒ H0 is rejected
⇒ Horse A is faster than horse B.
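The pooled two-sample t of this example follows from the summary figures (`t_two_sample` is an illustrative name; S is the pooled S.D. computed above):

```python
from math import sqrt

def t_two_sample(x1bar, x2bar, S, n1, n2):
    """Pooled two-sample t = (|x1 - x2| / S) * sqrt(n1*n2 / (n1 + n2))."""
    return abs(x1bar - x2bar) / S * sqrt(n1 * n2 / (n1 + n2))

t = t_two_sample(x1bar=31.29, x2bar=28.17, S=2.3014, n1=7, n2=6)
print(round(t, 4))
```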
Paired t test for the difference of two means
• Suppose n pairs of observations are obtained from the same n units under two different conditions (e.g. before and after a treatment), and let d denote the difference of each pair.
• If we want to test the hypothesis that the two population means are equal, i.e. that the mean difference is zero, then we apply the paired t test.
Step – 1 Define H0: µd = 0 Vs H1: µd ≠ 0
Step – 2 Find d̄ = ∑di/n
Step – 3 Find the S.E., i.e. S/√(n − 1), where S² = (1/n) ∑(d − d̄)²
Step – 4 Find the t statistic, i.e. tcal = |d̄| / (S/√(n − 1)) = |d̄|√(n − 1) / S
Shops 1 2 3 4 5 6 7 8 9 10
Sales before AD. 9 4 3 5 7 9 6 9 8 10
Sales after AD. 8 6 8 4 10 6 6 11 7 11
n = 10, d = sales before − sales after
d̄ = Σdi/n
⇒ d̄ = −7/10
⇒ d̄ = −0.7
S² = Σdi²/n − (Σdi/n)²
⇒ S² = 55/10 − (−7/10)²
⇒ S² = 5.5 − 0.49
⇒ S² = 5.01
⇒ S = √5.01 = 2.2383
t_cal = |d̄| √(n − 1) / S
⇒ t_cal = 0.7 √(10 − 1) / 2.2383
⇒ t_cal = 0.9382
t table value at 5% level of significance and 9 d.f.: t_tab = 2.26
Here, t_cal < t_tab
⇒ H0 is accepted
⇒ The advertisement is not effective.
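Because the full shop data are given above, the whole paired calculation can be reproduced in a short sketch (not part of the original notes), using the same divide-by-n variance as the worked example:

```python
from math import sqrt

before = [9, 4, 3, 5, 7, 9, 6, 9, 8, 10]
after  = [8, 6, 8, 4, 10, 6, 6, 11, 7, 11]

d = [b - a for b, a in zip(before, after)]   # differences, before − after
n = len(d)
dbar = sum(d) / n                             # −0.7
S2 = sum(x * x for x in d) / n - dbar ** 2    # S² = Σd²/n − d̄² = 5.01
t_cal = abs(dbar) * sqrt(n - 1) / sqrt(S2)
print(round(t_cal, 4))  # 0.9382
```

Since 0.9382 < t_tab = 2.26, the sketch agrees with the notes' conclusion that H0 is accepted.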
Exercise
Question 1. A company is producing steel tubes of mean inner diameter of 2.00 cm. A
sample of 10 tubes gives a mean inner diameter of 2.01 cm and a variance of
0.004 cm². Is the difference in the means significant? (Ans: t_cal = 0.4747)
Question 2. Ten persons are chosen at random from a population and their heights are found
to be in inches as 63, 63, 66, 67, 68, 69, 70, 70, 71, 71. Test the hypothesis that
the mean height of the population is 66. (Ans: t𝑐𝑎𝑙 =1.89)
Question 3. A sample of 8 observations gives sample mean 1134 and SD 35 units whereas
another sample of 7 observations gives sample mean 1024 and SD 40 units. At
5% level of significance, test the significant difference between two sample
means. (Ans: t𝑐𝑎𝑙 = 32.3831)
Question 4. Two random samples of sizes 9 and 7 respectively are drawn from two different
populations. The means of the samples are 196.4 and 198.8 respectively. The sums
of the squares of the deviations from their respective means are 26.94 and 18.73.
Test the hypothesis that the population means are equal. (Ans: t_cal = 2.637)
Question 5. The sales data of an index in six shops before and after a special promotion
campaign are as under:
Shops A B C D E F
Before campaign 53 28 32 48 50 42
After campaign 58 32 30 50 56 45
Test whether the campaign was effective. (Ans: t_cal = 2.6)
Question 6. An IQ test was conducted to 5 persons before and after they were trained. The
results are given below:
Students 1 2 3 4 5
Before training 110 120 123 132 125
After training 120 118 125 136 121
Test whether there is any change in IQ after the training. (Ans: t𝑐𝑎𝑙 = 2.6)
F test and ANOVA
Introduction
To test the hypothesis of equality of means of two small samples we use the t-test. In applying the t test,
it is assumed that the populations from which the samples are drawn have equal variances. If this
assumption is not correct, the results obtained may not be reliable. Hence, before applying the t test, it
is necessary to test that the population variances are equal, i.e. σ1² = σ2².
Snedecor's F test can be used for testing the hypothesis that the variances of the populations are
equal. The statistic F is defined as
F = ŝ1²/ŝ2²
where ŝ1² = n1S1²/(n1 − 1) and ŝ2² = n2S2²/(n2 − 1) are the unbiased estimates of the population
variances, with
S1² = (1/n1) Σ(x1 − x̄1)², S2² = (1/n2) Σ(x2 − x̄2)²
F is thus the ratio of two independent unbiased estimates of the population variances, and is based
on (n1 − 1, n2 − 1) degrees of freedom. It should be noted that F is defined so that the numerator is
greater than the denominator. Accordingly,
F = [n1S1²/(n1 − 1)] / [n2S2²/(n2 − 1)] on (n1 − 1, n2 − 1) d.f.
or
F = [n2S2²/(n2 − 1)] / [n1S1²/(n1 − 1)] on (n2 − 1, n1 − 1) d.f.
From the given data, the value of F is computed and it is compared with the table value of F on
appropriate degrees of freedom and at a required level of significance. The decision regarding
acceptance or rejection of the hypothesis is then taken.
Example 1. The following figures give the weights of items produced by two machines. Test the
hypothesis that there is no significant variation in the products of the two machines.
Machine X 3 7 5 6 5 4 4 5 3 3
Machine Y 8 5 7 8 3 2 7 6 5 7
Solution :
H0: σ1² = σ2², i.e. there is no significant variation in the products of the two machines.
x1    x2    d1 = x1 − 4    d1²    d2 = x2 − 6    d2²
3     8     −1             1      2              4
7     5     3              9      −1             1
5     7     1              1      1              1
6     8     2              4      2              4
5     3     1              1      −3             9
4     2     0              0      −4             16
4     7     0              0      1              1
5     6     1              1      0              0
3     5     −1             1      −1             1
3     7     −1             1      1              1
45    58    5              19     −2             38
x̄1 = Σx1/n1 = 45/10 = 4.5
x̄2 = Σx2/n2 = 58/10 = 5.8
S1² = (1/n1) Σ(x1 − x̄1)² = Σd1²/n1 − (Σd1/n1)²
⇒ S1² = 19/10 − (5/10)²
⇒ S1² = 1.9 − 0.25
⇒ S1² = 1.65
⇒ ŝ1² = n1S1²/(n1 − 1) = (10/9)(1.65) = 1.8333
S2² = (1/n2) Σ(x2 − x̄2)² = Σd2²/n2 − (Σd2/n2)²
⇒ S2² = 38/10 − (−2/10)²
⇒ S2² = 3.8 − 0.04
⇒ S2² = 3.76
⇒ ŝ2² = n2S2²/(n2 − 1) = (10/9)(3.76) = 4.1778
Now, F_cal = ŝ2²/ŝ1² (larger estimate in the numerator)
⇒ F_cal = 4.1778/1.8333
⇒ F_cal = 2.2792
F table value at 5% level of significance and (9, 9) d.f.: F_tab = 3.18
Here, F_cal < F_tab
⇒ H0 is accepted, i.e. there is no significant variation in the products of the two machines.
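The F computation above can be sketched in Python (not part of the original notes; the function name is illustrative). Working from the raw data gives ≈ 2.2788 rather than 2.2792, because the notes round ŝ1² and ŝ2² before dividing.

```python
def f_statistic(x1, x2):
    """Snedecor's F as defined in the notes: unbiased variance estimates
    ŝ² = n·S²/(n−1), with S² = Σ(x − x̄)²/n; larger estimate on top."""
    def unbiased_var(x):
        n = len(x)
        xbar = sum(x) / n
        S2 = sum((v - xbar) ** 2 for v in x) / n
        return n * S2 / (n - 1)
    v1, v2 = unbiased_var(x1), unbiased_var(x2)
    return max(v1, v2) / min(v1, v2)

machine_x = [3, 7, 5, 6, 5, 4, 4, 5, 3, 3]
machine_y = [8, 5, 7, 8, 3, 2, 7, 6, 5, 7]
print(round(f_statistic(machine_x, machine_y), 4))  # 2.2788
```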
Assumptions of ANOVA
1. Populations from which samples are drawn are normal.
2. Each of the populations has the same variance σ².
3. Samples are drawn randomly.
4. Samples are independent.
5. Errors are normally distributed with mean 0 and variance 𝜎2.
One Way ANOVA
We will understand one-way ANOVA through a simple example.
Example: Prices of a commodity were collected from six different cities. Examine whether there
is significant difference in prices in the cities.
City Price
A 20 26 24 26
B 22 24 22 24
C 20 20 22 24
D 22 24 26 30
E 24 22 24 26
F 24 22 22 22
Solution:
H0: μA = μB = μC = μD = μE = μF, i.e. there is no significant difference in the prices in the cities.
Subtract 22 from each observation (coding the data does not change the sums of squares). The coded row totals for the six cities are 8, 4, −2, 14, 8 and 2, so the grand total is T = 34.
Correction Factor: C.F. = T²/N, where T = 34 = grand total and N = 24 = total number of observations
⇒ C.F. = (34)²/24 = 48.1667
Total Sum of Squares: T.S.S. = ΣΣx² − C.F. = 172 − 48.1667 = 123.8333
Between-cities Sum of Squares: R.S.S. = Σ(Ri²/ki) − C.F.
⇒ Σ(Ri²/ki) = (8)²/4 + (4)²/4 + (−2)²/4 + (14)²/4 + (8)²/4 + (2)²/4 = 16 + 4 + 1 + 49 + 16 + 1 = 87
⇒ R.S.S. = 87 − 48.1667 = 38.8333
Error Sum of Squares: E.S.S. = T.S.S. − R.S.S. = 123.8333 − 38.8333 = 85
ANOVA TABLE
Source of Variation    S.S.        d.f.            M.S.S. = S.S./d.f.    F_cal     F_tab
Due to cities          38.8333     6 − 1 = 5       7.7667                1.6447    F(5, 18) = 2.7729
Due to error           85          23 − 5 = 18     4.7222
Total                  123.8333    24 − 1 = 23
Here, F_cal < F_tab ⇒ H0 is accepted, i.e. there is no significant difference in the prices in the cities.
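The one-way calculation can be verified with a short sketch (not part of the original notes), working directly from the uncoded prices; coding by subtracting 22 would give the same sums of squares.

```python
groups = {
    "A": [20, 26, 24, 26], "B": [22, 24, 22, 24], "C": [20, 20, 22, 24],
    "D": [22, 24, 26, 30], "E": [24, 22, 24, 26], "F": [24, 22, 22, 22],
}

all_obs = [x for g in groups.values() for x in g]
N = len(all_obs)
CF = sum(all_obs) ** 2 / N                                      # correction factor
TSS = sum(x * x for x in all_obs) - CF                          # total S.S. = 123.8333
RSS = sum(sum(g) ** 2 / len(g) for g in groups.values()) - CF   # between-cities S.S. = 38.8333
ESS = TSS - RSS                                                 # error S.S. = 85
F_cal = (RSS / (len(groups) - 1)) / (ESS / (N - len(groups)))
print(round(F_cal, 4))  # 1.6447
```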
R 39 39 41 41
Solution:
H01: There is no significant difference between rows.
H02: There is no significant difference between columns.
Let’s subtract 40 from all the entries.
Treatment
Field A B C D 𝑹𝒊
P 5 0 –2 –3 0
Q 3 1 5 –2 7
R –1 –1 1 1 0
𝑪𝒋 7 0 4 –4 T=7
Correction Factor: C.F. = T²/N, where T = 7 = grand total and N = 12 = total number of observations
⇒ C.F. = (7)²/12 = 4.0833
Total Sum of Squares: T.S.S. = ΣΣx² − C.F. = 81 − 4.0833 = 76.9167
Row Sum of Squares: R.S.S. = Σ(Ri²/ki) − C.F. = {(0)² + (7)² + (0)²}/4 − 4.0833 = 12.25 − 4.0833 = 8.1667
Column Sum of Squares: C.S.S. = Σ(Cj²/kj) − C.F.
⇒ Σ(Cj²/kj) = 16.3333 + 0 + 4 + 4 = 24.3333
⇒ C.S.S. = 24.3333 − 4.0833 = 20.25
ANOVA TABLE
Source of Variation    S.S.       d.f.              M.S.S. = S.S./d.f.    F_cal     F_tab
Due to rows            8.1667     3 − 1 = 2         4.0833                1.9796    F(2, 6) = 5.1433
Due to columns         20.25      4 − 1 = 3         6.75                  1.1975    F(3, 6) = 4.7571
Due to error           48.5       11 − 2 − 3 = 6    8.0833
Total                  76.9167    12 − 1 = 11
In both cases F_cal < F_tab ⇒ H01 and H02 are accepted, i.e. there is no significant difference between rows or between columns.
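The correction factor and the total and row sums of squares from the coded table can be checked with a short sketch (not part of the original notes):

```python
# Coded two-way table (40 already subtracted, as in the notes)
table = [
    [5, 0, -2, -3],   # field P
    [3, 1, 5, -2],    # field Q
    [-1, -1, 1, 1],   # field R
]

all_obs = [x for row in table for x in row]
N = len(all_obs)
CF = sum(all_obs) ** 2 / N                           # (7)²/12 = 4.0833
TSS = sum(x * x for x in all_obs) - CF               # 81 − 4.0833 = 76.9167
RSS = sum(sum(r) ** 2 / len(r) for r in table) - CF  # 12.25 − 4.0833 = 8.1667
print(round(CF, 4), round(TSS, 4), round(RSS, 4))
```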
Exercise
Question 1. The following samples are drawn from two normal populations. Test the
hypothesis that the population variances are equal.
Sample 1 8 10 14 10 13
Sample 2 12 15 11 16 14 14 16
(Ans: F𝑐𝑎𝑙 = 1.63)
Question 2. It is known that the mean diameters of rivets produced by two firms A and B are
practically the same but the S.D.s may differ. For 22 rivets produced by firm A, the
S.D. is 2.9 mm, while for 16 rivets manufactured by firm B, the S.D. is 3.8 mm.
Compute the statistic you would use to test whether the products of firm A have
the same variability as those of firm B. Test its significance.
(Ans: F_cal = 1.75)
Question 3. Prices of a commodity were collected from four different cities. Examine whether
there is significant difference in prices in the cities.
City Price
A 12 16 16
B 15 14 14 15
C 17 16 15 14
D 15 12 15 16 16
(Ans: F𝑐𝑎𝑙 = 3.01)
Question 4. The performance of 3 operators on 4 machines is given below. Analyse the data.
Machines
Operators A B C D
I 560 540 580 560
II 580 550 600 590
III 570 560 560 590
(Ans: F_cal,row = 2.4, F_cal,col = 3.6)
Multiple Choice Questions
1. Rejecting the null hypothesis when it is true is considered as
(a) Type – I error
(b) Type – II error
(c) Probable error
(d) Random error
2. Accepting the null hypothesis when it is false is considered as
(a) Type – I error
(b) Type – II error
(c) Probable error
(d) Random error
3. Which test is used to test the goodness of fit?
(a) t – test
(b) F - test
(c) Z - test
(d) Chi square test
4. To test the significance of independence of attributes, which of the following tests is used?
(a) t – test
(b) F - test
(c) Chi square test
(d) Z - test
5. The hypothesis under consideration is called ______.
(a) alternative hypothesis
(b) simple hypothesis
(c) null hypothesis
(d) composite hypothesis
6. The hypothesis which is opposite of the null hypothesis is called ______.
(a) alternative hypothesis
(b) simple hypothesis
(c) null hypothesis
(d) composite hypothesis
7. Which test is used in analysis of variance?
(a) t – test
(b) F - test
(c) Z - test
(d) Chi square test
8. Variance ratio test is also known as ______.
(a) t – test
(b) Z - test
(c) Chi square test
(d) F – test
9. What is your conclusion of the test if χ²cal > χ²tab?
13. Which distributions are used in small sample tests?
(a) t and F distributions
(b) Normal and t distributions
(c) Normal and F distributions
(d) Normal and chi square distributions
14. Which test is used in analysis of variance?
(a) t – test
(b) Z - test
(c) Chi square test
(d) F – test
15. What is the full form of d.f. in t test?
(a) degrees of fire
(b) degrees of freedom
(c) defective fraction
(d) discrete function
16. Which of the following test is used to compare two variances?
(a) t – test
(b) F - test
(c) Chi square test
(d) Z – test
17. Analysis of variance is a statistical method of comparing the ______ of several
populations.
(a) standard deviations
(b) variances
(c) means
(d) proportions
18. A t-test is a significance test that assesses ______.
(a) the means of two independent groups
(b) the medians of two dependent groups
(c) the modes of two independent variables
(d) the standard deviation of three independent variables
19. The degrees of freedom for Chi square test statistics when testing for independence in a
contingency table with 4 rows and 4 columns would be
(a) 9
(b) 7
(c) 5
(d) 12
20. When σ is known, the hypothesis about population mean is tested by
(a) t – test
(b) F - test
(c) Chi square test
(d) Z – test
21. An advertising agency wants to test the hypothesis that the proportion of adults in
America who read a Sunday Magazine is 25 percent. The null hypothesis is that the
proportion reading the Sunday Magazine is:
(a) Not equal to 25 %
(b) Less than 25 %
(c) Equal to 25 %
(d) More than 25 %
22. The number of independent values in a set of values is called ______.
(a) test-statistic
(b) degree of freedom
(c) level of significance
(d) level of confidence
23. If the critical region is located equally in both sides of the sampling distribution of test-
statistic, the test is called
(a) One tailed
(b) Two tailed
(c) Right tailed
(d) Left tailed
24. The degrees of freedom for the paired t-test based on n pairs of observations is ______.
(a) 2n – 1
(b) n – 2
(c) 2 (n – 1)
(d) n – 1
25. What is the range of the test statistic Z?
(a) 0 to 1
(b) –1 to +1
(c) –∞ to +∞
(d) 0 to ∞
References:
1. Business Research Methods – Sudhir Prakashan
2. Research Methodology – C.R. Kothari
3. Fundamentals of Statistics – S. P. Gupta
4. Testing Statistical Hypotheses – Lehmann & Romano
5. Statistical Methods – S. P. Gupta