Research Methodology
Unit – 4
Testing of Hypothesis, Small and large sample tests
Compiled by Dr. Ankit Bhojak
Testing of Hypothesis
Population:
A population is a well-defined collection of a large number of individuals, objects or units that possess the same characteristics. In other words, a population is the total set of observations. For example, if we are studying the height of men, the population is the set of heights of all the men in the world.
Sample:
A sample is a set of data collected and/or selected from a statistical population. It is a part of the population, it is representative of the population, and it contains all the characteristics of the population. For example, if we are studying the height of men, the population is the set of heights of all the men in the world and the sample is the set of heights of men belonging to a particular country such as India or China.
The main objective of selecting a sample from a population is to get information about the population under study. Statistical inference is the technique of estimating the unknown characteristics of a population from a sample. It can be classified into two parts:
(1) Estimation
(2) Testing of hypothesis
Parameter:
Parameter is any numerical quantity that characterizes a given population. In simple terms, a
constant obtained from all the observations of a population is known as parameter. This means
the parameter tells us something about the whole population.
Statistic:
Statistic is any numerical quantity that characterizes a given sample. In simple terms, a constant
obtained from all the observations of a sample is known as statistic. This means the statistic tells
us something about the sample.
Estimator:
A statistic that is used to predict the value of a parameter of the population is called an estimator. The procedure of estimating the value of a parameter with the help of a statistic is known as estimation.
Sampling distribution:
From a population of size N, different samples of size n are selected. From these selected samples, different values of a statistic are obtained and arranged in the form of a frequency distribution, which is known as the sampling distribution of that statistic. That is, a sampling distribution is a probability distribution of a statistic obtained through a large number of samples drawn from a specific population.
Standard Error:
The standard deviation of the sample statistic obtained from the sampling distribution is known
as standard error of that statistic.
Uses:
1) To test the randomness of a sample
2) To get the confidence interval for the parameter of the population.
3) To test whether the difference between the value of sample statistic and population
parameter is significant or not.
4) To determine the precision of the sample estimates
Precision of statistic = 1 / (standard error of statistic)
Statistical Hypothesis: A statistical hypothesis is a logical statement or assumption made about a parameter of the population. A statistical hypothesis is a conjecture which can be tested by some procedure; at the end of the procedure we may either accept or reject the hypothesis. A hypothesis can never be proved.
Simple Hypothesis: If the hypothesis specifies the value of the parameter of the population completely, it is known as a simple hypothesis, e.g. H0 : µ = 10 in the examples below.
Composite Hypothesis: If the hypothesis does not specify the value of the parameter of the population completely, it is known as a composite hypothesis, e.g. H1 : µ ≠ 10 or H1 : µ1 > µ2 in the examples below.
Null Hypothesis: A hypothesis under consideration is called the null hypothesis. It is written for possible acceptance. The null hypothesis is denoted by H0. Examples:
(1) The mean of the population is 10, i.e. H0 : µ = 10
(2) The die is fair, i.e. H0 : P = 1/6
(3) The proportions of smokers in the state of Gujarat and the state of Maharashtra are equal, i.e. H0 : P1 = P2
(4) The means of both the populations are equal, i.e. H0 : µ1 = µ2
Alternative Hypothesis: A hypothesis complementary to the null hypothesis, which is accepted when the null hypothesis is rejected, is called the alternative hypothesis. It is denoted by H1. Examples:
(1) The mean of the population is not 10, i.e. H1 : µ ≠ 10
(2) The die is unfair, i.e. H1 : P ≠ 1/6
(3) The proportion of smokers in the state of Gujarat is less than that in the state of Maharashtra, i.e. H1 : P1 < P2
(4) The mean of one population is larger than that of the other population, i.e. H1 : µ1 > µ2
Test of hypothesis:
A test of a hypothesis is a procedure to decide whether to accept or to reject the null hypothesis. If the test procedure rejects the null hypothesis, then we must have an alternative hypothesis which is to be accepted.
Type I and Type II errors: In testing of any statistical hypothesis the following situation may
arise.
                 accept H0            reject H0
H0 is true       correct decision     incorrect decision
H0 is false      incorrect decision   correct decision
Here, accepting a false hypothesis and rejecting true hypothesis are considered as incorrect
decisions or errors.
The error committed in rejecting a true null hypothesis is called a Type – I error and its probability is denoted by α.
α = P (reject H0 / H0 is true)
The error committed in accepting a false null hypothesis is called a Type – II error and its probability is denoted by β.
β = P (accept H0 / H0 is false)
Level of Significance:
In any test procedure, both types of errors should be kept to a minimum. But as these are interrelated, it is not possible to minimize both errors simultaneously. Hence, the probability of the Type – I error is fixed and the probability of the Type – II error is minimised. This predetermined fixed value of the probability of the Type – I error is called the level of significance and is denoted by α. Hence, the level of significance is the probability of rejecting a null hypothesis which is actually true. The commonly used levels of significance are 5 % and 1 %. If we consider the 5 % level of significance, it means that out of 100 cases, in 5 cases we reject a null hypothesis which is actually true.
Critical Region:
The area of probability curve is divided into two regions by predetermined level of significance.
The area of probability curve corresponding to type – I error is called critical region. It is also
known as region of rejection. The area of probability curve other than critical region is called the
acceptance region.
If it is required to test whether the sample statistic is significantly different from the population parameter, it is called a two tailed test. If it is required to test whether the sample statistic is significantly greater than or less than the population parameter, it is called a one tailed test.
The probability of rejecting a null hypothesis when it is false is known as the power of the test and is given by
Power = 1 – β
      = 1 – P (accept H0 / H0 is false)
      = P (reject H0 / H0 is false)
Illustration:
Q.1 A hypothesis H0 : P = 1/2 vs H1 : P = 2/3 is to be tested. For this a coin is tossed 10 times, and H0 is rejected if the number of heads x is 8 or more. Find α, β and the power of the test.
Info: Here, n = 10, q = 1 – p
α = P (x ≥ 8 / P = 1/2)
  = P (x = 8) + P (x = 9) + P (x = 10)
  = 10C8 (1/2)^8 (1/2)^(10−8) + 10C9 (1/2)^9 (1/2)^(10−9) + 10C10 (1/2)^10 (1/2)^(10−10)
  = 45 (1/2)^10 + 10 (1/2)^10 + 1 (1/2)^10
  = (1/2)^10 (45 + 10 + 1)
  = 56/1024 = 7/128 = 0.0547
β = P (accept H0 / H1 is true)
  = P (x < 8 / P = 2/3)
  = 1 – P (x ≥ 8 / P = 2/3)
  = 1 – [P (x = 8) + P (x = 9) + P (x = 10)]
  = 1 – [10C8 (2/3)^8 (1/3)^2 + 10C9 (2/3)^9 (1/3)^1 + 10C10 (2/3)^10 (1/3)^0]
  = 1 – [45 × 2^8/3^10 + 10 × 2^9/3^10 + 2^10/3^10]
  = 1 – (2^8/3^10) [45 + (10 × 2) + 2^2]
  = 1 – 256/(243 × 243) × (45 + 20 + 4)
  = 1 – 0.2991
  = 0.7009
Power of test = 1 – β = 1 – 0.7009 = 0.2991
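The binomial computations in Q.1 can be cross-checked with a short Python sketch (the helper name `binom_pmf` is illustrative, not part of the notes):

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 10
# alpha: reject H0 (x >= 8) when P = 1/2 is actually true
alpha = sum(binom_pmf(x, n, 0.5) for x in range(8, n + 1))
# beta: accept H0 (x < 8) when P = 2/3 is actually true
beta = sum(binom_pmf(x, n, 2/3) for x in range(0, 8))
power = 1 - beta
print(round(alpha, 4), round(beta, 4), round(power, 4))  # 0.0547 0.7009 0.2991
```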
Q.2 A coin is tossed 6 times and the hypothesis H0 : P = 1/2 is rejected in favour of H1 : p = 3/4 if the number of heads is greater than 4. Find α and β.
Here, n = 6
α = P (x > 4 / P = 1/2)
  = P (x = 5) + P (x = 6)
  = 6C5 (1/2)^5 (1/2) + 6C6 (1/2)^6
  = 6 (1/2)^6 + (1/2)^6
  = (1/2)^6 (6 + 1)
  = 7/64 = 0.1094
β = P (x ≤ 4 / P = 3/4)
  = 1 – [P (x = 5) + P (x = 6)]
  = 1 – [6C5 (3/4)^5 (1/4) + 6C6 (3/4)^6]
  = 1 – [6 × 3^5/4^6 + 3^6/4^6]
  = 1 – (3^5/4^6) (6 + 3)
  = 1 – (243 × 9)/(64 × 64)
  = 1 – 2187/4096
  = 1 – 0.534
  = 0.466
Q.3 It is claimed that on an average 3 accidents per month occur on a particular road. For this a test of H0 : m = 3 v/s H1 : m = 2 is conducted. If fewer than 3 accidents occur during the last month, the null hypothesis is rejected. Find the probabilities of Type – I and Type – II errors and also find the power of the test.
Info: p(x) = e^(−m) m^x / x!, where x = 0, 1, 2, .....
(e^(−1) = 0.3679)
Here, m = 3 and m = 2.
α = P (x < 3 / m = 3)
  = P (x = 0) + P (x = 1) + P (x = 2)
  = e^(−3) [1 + 3 + 4.5]
  = 0.0498 (8.5)
  = 0.4233
β = P (x ≥ 3 / m = 2)
  = 1 – P (x < 3 / m = 2)
  = 1 – [P (x = 0) + P (x = 1) + P (x = 2)]
  = 1 – e^(−2) [1 + 2 + 2]
  = 1 – 0.1354 (5)
  = 1 – 0.6768
  = 0.3232
Power of test = 1 – β = 1 – 0.3232 = 0.6768
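The Poisson computations of Q.3 can be reproduced with a similar Python sketch; the last digit differs slightly from the hand calculation because e^(−3) and e^(−2) are not rounded here (`pois_pmf` is an illustrative name):

```python
from math import exp, factorial

def pois_pmf(x, m):
    """P(X = x) for a Poisson variable with mean m."""
    return exp(-m) * m**x / factorial(x)

# alpha: reject H0 (x < 3) when m = 3 is actually true
alpha = sum(pois_pmf(x, 3) for x in range(3))
# beta: accept H0 (x >= 3) when m = 2 is actually true
beta = 1 - sum(pois_pmf(x, 2) for x in range(3))
power = 1 - beta
print(round(alpha, 4), round(beta, 4), round(power, 4))
```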
Q.4 It is observed that there are 0.5 misprints per page of a book. It is necessary to test H0 : m = 0.5 vs H1 : m = 1. For this 10 pages are observed, and if they contain more than 3 misprints, then H0 is rejected. Find α, β and the power of the test.
α = 1 – P (x ≤ 3 / m = 0.5)
  = 1 – e^(−0.5) [1 + 0.5 + 0.125 + 0.0208]
  = 1 – 0.6065 (1.6458)
  = 1 – 0.9982
  = 0.0018
β = P (accept H0 / H1 is true)
  = P (x ≤ 3 / m = 1)
  = P (x = 0) + P (x = 1) + P (x = 2) + P (x = 3)
  = e^(−1) (1)^0/0! + e^(−1) (1)^1/1! + e^(−1) (1)^2/2! + e^(−1) (1)^3/3!
  = 0.3679 (1 + 1 + 0.5 + 0.1667)
  = 0.9811
Power of test = 1 – β = 1 – 0.9811 = 0.0189
(1) In an experiment of tossing a coin, p denotes the probability of getting a head. In order to test the hypothesis H0 : P = 1/2 against H1 : p = 3/4, the coin is tossed 5 times and if a head is obtained more than 3 times then H0 is rejected. Find the probabilities of Type – I and Type – II errors. Also find the power of the test. (Ans: 0.1875, 0.3672, 0.6328)
(2) It is desired to test the hypothesis that a coin is unbiased. It is agreed to reject the hypothesis if the number of heads (x) in 9 different tosses is x ≤ 2 or x ≥ 7. What is the probability of committing a Type – I error? (Ans: 0.3594)
(3) It is observed that on an average 3 items are defective in a lot of items. A sample is
selected from the lot and if it shows 3 or more defective items then the lot is rejected.
Find probability of committing type – I error. The test is Ho: m = 3 v/s H1 : m ≠ 3.
(Ans: 0.5767)
(4) In order to test the hypothesis, that Ho: m = 2 v/s H1 : m = 3, a sample of size 100 units
is selected at random from the big lot. If the sample shows 2 or less defective items,
then the lot is accepted. Find probabilities for type – I and type – II errors. Also find
power of test.
(Ans:0.3235, 0.4233, 0.5767)
Large Sample Tests
For testing a given hypothesis a random sample is drawn from a population. If the number of units in the sample is greater than 30, it is generally regarded as a large sample. We shall study the tests of significance for large samples. The tests to be discussed in this chapter are
(1) Tests of Variables
(i) Test of significance of a mean
(ii) Test of significance of difference between two means
(iii) Test of significance of difference between two standard deviations
(2) Tests of Attributes
(iv) Test of significance of proportion of successes
(v) Test of significance of difference between two proportions
For reference, the critical values of Z at important levels of significance are given in the table below.
                     5%       1%
Two tailed test      1.96     2.58
One tailed test      1.645    2.33
(i) Test of significance of a mean:
Step 1: H0: µ = µ0 Vs H1: µ ≠ µ0
Step 2: Difference = |x̄ − µ0|
Step 3: Standard Error S.E. of x̄ = σ/√n
Step 4: Calculate Z = |x̄ − µ0| / (σ/√n)
Then compare Z with the critical value of Z. If the calculated value of Z > 1.96 at the 5% level of significance, then the null hypothesis is rejected and it may be concluded that the difference between the sample mean and the population mean is significant. If the calculated value of Z < 1.96, then the null hypothesis is accepted and it may be concluded that the difference between the sample mean and the population mean is not significant.
Example: The mean life time of 100 light tubes produced by a company is computed to be 1570
hours with a standard deviation of 120 hours. The company claims that the average life of the
tubes produced by the company is 1600 hours. Is the claim justified? Use 5 % level of
significance.
Solution:
Step 1: H0: 𝜇 = 1600 Vs H1: 𝜇 ≠ 1600
Step 2: Difference = |𝑥̅ − 𝜇0|
⇒ |𝑥̅ − 𝜇0| = |1570 − 1600|
⇒ |𝑥̅ − 𝜇0| = 30
Step 3: Standard Error S.E. of x̄ = σ/√n
⇒ σ/√n = 120/√100
⇒ σ/√n = 120/10
⇒ σ/√n = 12
Step 4: Calculate Z = |x̄ − µ0| / (σ/√n)
⇒ Z = 30/12
⇒ Z = 2.5
Here, Z = 2.5 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ The company's claim is not justified.
⇒ The mean life time of the light tubes is not 1600 hours.
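The Z statistic of this example can be computed with a small Python helper (`z_mean` is an illustrative name, not from the notes):

```python
from math import sqrt

def z_mean(xbar, mu0, sigma, n):
    """Large-sample Z statistic for a single mean: |xbar - mu0| / (sigma / sqrt(n))."""
    return abs(xbar - mu0) / (sigma / sqrt(n))

z = z_mean(xbar=1570, mu0=1600, sigma=120, n=100)
print(z)  # 2.5, which exceeds 1.96, so H0 is rejected at the 5 % level
```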
(ii) Test of significance of difference between two means:
Suppose two independent random samples of sizes n1 and n2 are drawn from two different populations, with sample means x̄1 and x̄2 respectively. If we want to test the hypothesis that the population means are equal, i.e. H0: µ1 = µ2, we can use the following steps.
Step 1: H0: 𝜇1 = 𝜇2. Vs H1: 𝜇1 ≠ 𝜇2.
Step 2: Difference = |𝑥̅1 − 𝑥̅2|
Step 3: Standard Error S.E. of (x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
Step 4: Calculate Z = |x̄1 − x̄2| / √(σ1²/n1 + σ2²/n2)
1 2
Then compare Z with the critical value of Z. at required level of significance and decide whether
to accept or to reject null hypothesis.
Example: The following information is about the height of students of two colleges.
College A College B
Mean Height (in inches) 67.42 67.25
S.D. (in inches) 2.58 2.50
Sample size 1000 1200
Solution:
Step 1: H0: µ1 = µ2 Vs H1: µ1 ≠ µ2
Step 2: Difference = |x̄1 − x̄2| = |67.42 − 67.25| = 0.17
Step 3: Standard Error S.E. of (x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
⇒ √(σ1²/n1 + σ2²/n2) = √(2.58²/1000 + 2.50²/1200)
⇒ √(σ1²/n1 + σ2²/n2) = √(0.0066 + 0.0052)
⇒ √(σ1²/n1 + σ2²/n2) = √0.0118
⇒ √(σ1²/n1 + σ2²/n2) = 0.1086
Step 4: Calculate Z = |x̄1 − x̄2| / √(σ1²/n1 + σ2²/n2)
⇒ Z = 0.17/0.1086
⇒ Z = 1.5654
Here, Z = 1.5654 < 1.96 at 5 % level of significance.
⇒ H0 is accepted.
⇒ There is no significant difference in mean height of students of two colleges.
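The same Z value can be obtained in Python (`z_two_means` is an illustrative name; computed without intermediate rounding it gives 1.5607 rather than 1.5654, and the conclusion is unchanged):

```python
from math import sqrt

def z_two_means(x1, x2, s1, s2, n1, n2):
    """Large-sample Z statistic for the difference between two means."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of (x1 - x2)
    return abs(x1 - x2) / se

z = z_two_means(67.42, 67.25, 2.58, 2.50, 1000, 1200)
print(round(z, 4))
```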
(iii) Test of significance of difference between two standard deviations
Here, we want to test the hypothesis that the standard deviations of the two populations do not
differ significantly. So we can apply the following steps.
Step 1: H0: 𝜎1 = 𝜎2 Vs H1: 𝜎1 ≠ 𝜎2
Step 2: Difference = |S1 − S2|
Step 3: Standard Error S.E. of (S1 − S2) = √(σ1²/2n1 + σ2²/2n2) = √(S1²/2n1 + S2²/2n2)
Step 4: Calculate Z = |S1 − S2| / √(S1²/2n1 + S2²/2n2)
Then compare Z with the critical value of Z. at required level of significance and decide whether
to accept or to reject null hypothesis.
Example: In a sample of 1000 the mean is 17.5 and the S.D. 2.5. In another sample of 800 the
mean is 18 and S.D. 2.7. Assuming that the samples are independent, discuss whether the two
samples can have come from a population which have the same S.D.
Solution:
Step 1: H0: 𝜎1 = 𝜎2 i.e. Two samples have come from a population which have the
same S.D..
Step 2: Difference = |S1 − S2|
⇒ |S1 − S2| = |2.5 − 2.7|
⇒ |S1 − S2| = 0.2
Step 3: Standard Error S.E. of (S1 − S2) = √(S1²/2n1 + S2²/2n2)
⇒ √(S1²/2n1 + S2²/2n2) = √(2.5²/2000 + 2.7²/1600)
⇒ √(S1²/2n1 + S2²/2n2) = √0.0077
⇒ √(S1²/2n1 + S2²/2n2) = 0.0877
Step 4: Calculate Z = |S1 − S2| / √(S1²/2n1 + S2²/2n2)
⇒ Z = 0.2/0.0877
⇒ Z = 2.28
Here, Z = 2.28 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ The two samples have not come from populations with the same S.D.
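This test, too, reduces to a few lines of Python (`z_two_sds` is an illustrative name):

```python
from math import sqrt

def z_two_sds(s1, s2, n1, n2):
    """Large-sample Z statistic for the difference between two standard deviations."""
    se = sqrt(s1**2 / (2 * n1) + s2**2 / (2 * n2))   # S.E. of (S1 - S2)
    return abs(s1 - s2) / se

z = z_two_sds(2.5, 2.7, 1000, 800)
print(round(z, 2))  # 2.28
```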
(iv) Test of significance of proportion of successes
Suppose a random sample of n units is drawn from a population and x units of them possess a particular characteristic. The sample proportion of the attribute is p, given by p = x/n. In order to test the null hypothesis that the population proportion of the attribute is P, we apply the following steps:
Step 1: H0: Population Proportion = P Vs H1: Population Proportion ≠ P
Step 2: Difference = |p − P| = |x/n − P|
Step 3: Standard Error S.E. of p = √(PQ/n), where Q = 1 − P
Step 4: Calculate Z = |p − P| / √(PQ/n)
Then compare Z with the critical value of Z at the required level of significance and decide whether to accept or to reject the null hypothesis.
Example: In a certain city 380 men out of 800 men were found to be smokers. Discuss whether
this information support the view that the majority of men in the city are smokers?
Solution:
Step 1: H0: Population Proportion P = 1/2 Vs H1: Population Proportion P > 1/2
Step 2: Difference = |p − P| = |x/n − P|
⇒ |x/n − P| = |380/800 − 0.5|
⇒ |x/n − P| = |0.475 − 0.5|
⇒ |x/n − P| = 0.025
Step 3: √(PQ/n) = √(0.5 × 0.5/800) = √0.0003125
⇒ √(PQ/n) = 0.0177
Step 4: Calculate Z = |x/n − P| / √(PQ/n)
⇒ Z = 0.025/0.0177
⇒ Z = 1.4069
Here, Z = 1.4069 < 1.645 at 5 % level of significance.
⇒ H0 is accepted.
⇒ Majority of the men in the city are not smokers.
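Without rounding the standard error to 0.0177, Z comes out as 1.4142; either way it is below 1.645 and the conclusion stands. A Python sketch (`z_prop` is an illustrative name):

```python
from math import sqrt

def z_prop(x, n, P):
    """Large-sample Z statistic for a single proportion against a hypothesised P."""
    p = x / n                    # sample proportion
    se = sqrt(P * (1 - P) / n)   # standard error under H0
    return abs(p - P) / se

z = z_prop(x=380, n=800, P=0.5)
print(round(z, 4))
```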
(v) Test of significance of difference between two proportions
Suppose a random sample of size n1 is taken from one population and x1 units of them possess some attribute, so that p1 = x1/n1 is the proportion of units possessing the attribute in the sample. Suppose another independent random sample of size n2 is taken from another population and x2 units of them possess the same attribute, so that p2 = x2/n2. Here we want to test the hypothesis that the population proportions of the attribute are equal. We apply the following steps:
Step 1: H0: P1 = P2 Vs H1: P1 ≠ P2
Step 2: Difference = |p1 − p2|
Step 3: Standard Error S.E. of (p1 − p2) = √(P1Q1/n1 + P2Q2/n2)
Step 4: Calculate Z = |p1 − p2| / √(P1Q1/n1 + P2Q2/n2)
If the population proportions P1 & P2 are unknown, their estimates are obtained by combining the two sample proportions. The pooled estimate is P = (x1 + x2)/(n1 + n2) with Q = 1 − P, and then S.E. = √(PQ(1/n1 + 1/n2)).
Then compare Z with the critical value of Z at the required level of significance and decide whether to accept or to reject the null hypothesis.
Example: In a large city A, 20 % of a random sample of 900 school boys had defective eye
sight. In another large city B, 15.5 % of a random sample of 1600 school boys has the same
defect. Is the difference between two proportions significant?
Solution:
Step 1: H0: P1 = P2 i.e. there is no significant difference between two proportions.
Step 2: Difference = |p1 − p2|
⇒ |p1 − p2| = |180/900 − 248/1600|
⇒ |p1 − p2| = |0.20 − 0.155| = 0.045
Step 3: Pooled estimate P = (x1 + x2)/(n1 + n2)
⇒ P = (180 + 248)/(900 + 1600)
⇒ P = 428/2500
⇒ P = 0.1712
So, Q = 1 − P = 0.8288
Now, √(PQ(1/n1 + 1/n2)) = √(0.1712 × 0.8288 × (1/900 + 1/1600))
⇒ √(PQ(1/n1 + 1/n2)) = 0.0157
Step 4: Calculate Z = |p1 − p2| / √(PQ(1/n1 + 1/n2))
⇒ Z = 0.045/0.0157
⇒ Z = 2.8662
Here, Z = 2.8662 > 1.96 at the 5 % level of significance.
⇒ H0 is rejected.
⇒ There is a significant difference between the two proportions.
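The pooled two-proportion computation can be checked in Python (`z_two_props` is an illustrative name; the small difference from 2.8662 is the rounding of the S.E. in the hand calculation):

```python
from math import sqrt

def z_two_props(x1, n1, x2, n2):
    """Large-sample Z statistic for two proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)                   # pooled proportion
    se = sqrt(P * (1 - P) * (1 / n1 + 1 / n2))  # pooled standard error
    return abs(p1 - p2) / se

z = z_two_props(x1=180, n1=900, x2=248, n2=1600)
print(round(z, 2))
```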
Exercise
Question 1. A sample of 400 students have a mean height of 171.38 cm. Can it be regarded as
a random sample from a large population with mean height 171.17 cm and
standard deviation 3.3 cm? (Ans: Z = 1.27)
Question 2. The mean of a random sample of 1000 units is 17.6 and the mean of another sample
of 800 units is 18. Can it be concluded that both the samples come from the same
population with S.D. = 2.6? (Ans: Z = 3.24)
Question 3. The information regarding marks of boys and girls of a college is given below:
Sample Mean S.D. n
Boys 83 10 121
Girls 81 12 81
Test whether the difference in standard deviation is significant.
(Ans: Z = 1.75)
Question 4. In a large consignment of fruits, 64 fruits out of sample of 400 fruits are found to
be bad. Test the hypothesis that the population proportion of bad fruits in the
consignment is 20% at 1% level of significance. (Ans: Z = 2)
Question 5. A machine produced 16 defective articles in a batch of 500 articles. After overhauling, it produced 3 defective articles in a sample of 100 articles. Has the machine improved? (Ans: Z = 0.104)
Chi Square Test
Definition of χ² :
If x1, x2, x3, ....., xn is a random sample of size n from a normal population with mean 0 and S.D. 1, then the distribution of ∑xi² is called the χ² distribution with n degrees of freedom. Similarly, if x1, x2, x3, ....., xn is a random sample of size n from a normal population with mean µ and S.D. σ, then the distribution of ∑[(xi − µ)/σ]² is a χ² distribution with n degrees of freedom.
The probability density function of the χ² distribution is
f(χ²) = [1 / (2^(n/2) Γ(n/2))] e^(−χ²/2) (χ²)^((n/2) − 1), 0 < χ² < ∞
χ² is a continuous distribution and the form of the distribution depends upon the degrees of freedom n. The mean of the distribution is n and its variance is 2n.
Uses of χ²
The χ² distribution has a large number of applications in statistics. We shall discuss the following three main uses of χ²:
1. To test goodness of fit
2. To test independence of attributes
3. To test a specified value of the variance of the population.
Goodness of Fit Test :
Suppose we have obtained an observed frequency distribution and we are interested in knowing whether the observed frequency distribution supports a particular hypothesis. For this, a very powerful test for testing the significance of the discrepancy between an observed frequency distribution and an expected frequency distribution was given by Karl Pearson in 1900. The test is known as the χ² test of goodness of fit.
Under the null hypothesis that there is no significant difference between the observed and expected frequencies, the value of χ² is calculated by the formula:
χ² = ∑ (oi − ei)² / ei
If all the observed frequencies and expected frequencies are equal, the value of χ² will be zero. This signifies a perfect agreement of observations with expectations. The greater the value of χ², the greater the divergence between the observed and expected frequencies.
The value of χ² is calculated from the given data and compared with the table value of χ² on n − 1 degrees of freedom at the required level of significance. If the calculated value of χ² is less than the table value of χ², the null hypothesis may be accepted and it may be concluded that the given frequency distribution fits the hypothesis. If the calculated value of χ² is greater than the table value of χ², the hypothesis may be rejected and it may be concluded that the observed frequency distribution does not fit the hypothesis.
Note : The degrees of freedom in applying the goodness of fit test are n − k − 1, where k is the number of parameters estimated.
Limitations of the χ² Test
(1) The observations of the sample should be independent.
(2) Absolute frequencies should always be used.
(3) If there are any constraints on class frequencies, then they must be linear.
(4) The frequency of any class should not be less than 5. If any class frequency is less than five, it should be combined with the frequency of the adjoining class or classes, so that the total frequency of the combined classes is more than 5.
(5) The class frequencies should be combined in such a way that the degrees of freedom are more than 0.
Illustration 1 : A die is thrown 300 times and the following distribution is obtained. Can the die be regarded as unbiased?
Number on the die    1    2    3    4    5    6
Frequency            41   44   49   53   57   56
Ans : H0 : The die is unbiased, i.e. the probability of obtaining any number is 1/6.
Number    Observed oi    Expected ei    (oi − ei)²/ei
1         41             50             1.62
2         44             50             0.72
3         49             50             0.02
4         53             50             0.18
5         57             50             0.98
6         56             50             0.72
Total     300            300            4.24
χ² = ∑ (oi − ei)²/ei = 4.24
D.f. = n − 1 = 6 − 1 = 5
The table value of χ² on 5 d.f. and at the 5 % level of significance = 11.07
χ²cal < χ²tab
⇒ H0 may be accepted.
⇒ The die may be regarded as unbiased.
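The goodness-of-fit statistic of Illustration 1 takes only a few lines of Python (`chi_square` is an illustrative name):

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (o - e)^2 / e over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [41, 44, 49, 53, 57, 56]   # die frequencies from Illustration 1
expected = [300 / 6] * 6              # 50 per face under H0: the die is unbiased
chi2 = chi_square(observed, expected)
print(round(chi2, 2))  # 4.24, below the table value 11.07 on 5 d.f.
```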
Illustration 2: The units produced by a plant are classified into four grades. The past performance of the plant shows that the respective proportions are 8 : 4 : 2 : 1. To check the run of the plant, 600 parts were examined and classified as follows. Is there any evidence of a change in production standards?
Grade    First    Second    Third    Fourth    Total
Units    340      130       100      30        600
Ans : H0 : There is no change in production standards.
Grade     Observed oi    Expected ei         (oi − ei)²/ei
First     340            600 × 8/15 = 320    1.25
Second    130            600 × 4/15 = 160    5.625
Third     100            600 × 2/15 = 80     5.00
Fourth    30             600 × 1/15 = 40     2.50
Total     600            600                 14.375
χ² = ∑ (oi − ei)²/ei = 14.375
D.f. = n − 1 = 4 − 1 = 3
The table value of χ² on 3 d.f. and at the 5 % level of significance = 7.81
χ²cal > χ²tab
⇒ H0 may be rejected.
⇒ There is evidence of a change in production standards.
Illustration 3 : Five coins are tossed 320 times and the following distribution of the number of heads is obtained.
Number of heads    0    1    2    3    4    5
Frequency          8    42   116  90   52   12
Test the hypothesis that the coins are unbiased.
Ans : H0 : The coins are unbiased, i.e. P = 1/2
The expected frequencies are 320 × 5Cx (1/2)^5, i.e. 10, 50, 100, 100, 50, 10.
χ² = ∑ (oi − ei)²/ei = 5.72
D.f. = n − 1 = 6 − 1 = 5
The table value of χ² on 5 d.f. and at the 5 % level of significance = 11.07
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the coins may be regarded as unbiased.
Illustration 4 : The following data show the number of mistakes per page observed in 100 pages of a book. Fit a Poisson distribution and test the goodness of fit.
Number of      Number of    fx     P(xi)    Expected frequency    (oi − ei)²/ei
mistakes x     pages f                      ei = N × P(xi)
0              11           0      0.135    13.5                  0.4630
1              31           31     0.270    27.0                  0.5926
2              26           52     0.270    27.0                  0.0370
3              17           51     0.180    18.0                  0.0556
4              10           40     0.090    9.0
5              4            20     0.036    3.6                   0.1043 (classes 4–6 combined)
6              1            6      0.012    1.2
Total          100          200                                   1.2525
Mean m = ∑fx/N = 200/100 = 2
For the Poisson distribution, P(x) = e^(−m) m^x / x!
P(0) = e^(−m) = e^(−2) = 0.135
P(1) = (m/1) P(0) = 2 (0.135) = 0.270
P(2) = (m/2) P(1) = (2/2) (0.270) = 0.270, and so on.
(The last three classes are combined so that no expected frequency is less than 5: observed 10 + 4 + 1 = 15 against expected 9.0 + 3.6 + 1.2 = 13.8, giving (15 − 13.8)²/13.8 = 0.1043.)
χ² = ∑ (oi − ei)²/ei = 1.2525
D.f. = n − 1 − 1 = 5 − 2 = 3 (one degree of freedom is lost because the parameter m is estimated)
The table value of χ² on 3 d.f. and at the 5 % level of significance = 7.81
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the Poisson distribution fits the data well.
Test of Independence of Two Attributes :
When the data are classified according to two attributes, 2 can also be used to test the
hypothesis that the two attributes are independent.
Suppose the data are classified into r classes A1, A2, A3, ..... Ar according to attribute A and into c classes B1, B2, B3, ..... Bc according to attribute B. The representation of the data in a cross-classified table, known as a contingency table, is given below. In the r × c contingency table the observed frequencies of the different cells are shown.
       B1     B2     B3   ...   Bj   ...   Bc
A1     O11    O12    O13  ...   O1j  ...   O1c    (A1)
A2     O21    O22    O23  ...   O2j  ...   O2c    (A2)
A3     O31    O32    O33  ...   O3j  ...   O3c    (A3)
...    ...
Ai     Oi1    Oi2    Oi3  ...   Oij  ...   Oic    (Ai)
...    ...
Ar     Or1    Or2    Or3  ...   Orj  ...   Orc    (Ar)
       (B1)   (B2)   (B3) ...   (Bj) ...   (Bc)   N
The total of the ith row is denoted by (Ai) and the total of the jth column is denoted by (Bj). Oij denotes the frequency of the cell common to the ith row and jth column. The total frequency is N, i.e.
∑(Ai) = ∑(Bj) = N
Under the null hypothesis that the two attributes A and B are independent, we shall find the expected frequency of the (i, j)th cell.
The probability that any observation will fall in the ith row = (Ai)/N
Similarly, the probability that any observation will fall in the jth column = (Bj)/N
Under the hypothesis of independence, the probability that any observation will fall in the ith row and jth column = (Ai)/N × (Bj)/N
Hence the expected frequency of the (i, j)th cell is eij = N × (Ai)/N × (Bj)/N = (Ai)(Bj)/N
29
Thus, we can find the expected frequencies of all the cells. From the observed frequencies oij and the expected frequencies eij, the value of χ² can be obtained by the following formula:
χ² = ∑i ∑j (oij − eij)² / eij
The number of independent cells in an r × c contingency table is (r − 1)(c − 1). Hence the degrees of freedom in an r × c table are (r − 1)(c − 1).
For testing the hypothesis of independence of two attributes A and B, the value of χ² is found and compared with the table value of χ² on (r − 1)(c − 1) d.f. at the required level of significance. If the calculated χ² is greater than the table value of χ², the hypothesis may be rejected, i.e. the two attributes may not be regarded as independent. If the calculated χ² is less than the table value of χ², the hypothesis that the attributes are independent may be accepted.
Illustration 5: In an industry, 200 workers employed for a specific job were classified according
to their performance and training received / not received. Test independence of training and
performance. The data are summarized as follows.
             Performance
             Good    Not Good    Total
Trained      100     50          150
Untrained    20      30          50
Total        120     80          200
Under the hypothesis of independence, the expected frequencies are
e11 = 150 × 120/200 = 90, e12 = 150 × 80/200 = 60, e21 = 50 × 120/200 = 30, e22 = 50 × 80/200 = 20
χ² = (100 − 90)²/90 + (50 − 60)²/60 + (20 − 30)²/30 + (30 − 20)²/20
   = 1.11 + 1.67 + 3.33 + 5
   = 11.11
On 1 d.f. and at the 5 % level of significance the table value of χ² = 3.84
χ²cal > χ²tab
⇒ H0 may be rejected.
⇒ Performance depends upon training.
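The independence test of Illustration 5 can be written generically for any r × c table (`chi_square_independence` is an illustrative name):

```python
def chi_square_independence(table):
    """Chi-square statistic for an r x c contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    N = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / N   # expected frequency (Ai)(Bj)/N
            chi2 += (o - e) ** 2 / e
    return chi2

table = [[100, 50],   # trained:   good, not good
         [20, 30]]    # untrained: good, not good
print(round(chi_square_independence(table), 2))  # 11.11 > 3.84, so H0 is rejected
```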
Yates’ Correction :
When a 2 × 2 contingency table contains a small expected cell frequency, the continuity correction suggested by Yates is applied: χ² = ∑ (|oij − eij| − 0.5)² / eij. The expected frequencies under the hypothesis of independence are found and shown in brackets in the cells.
Test of a specified value of the variance of the population :
To test the hypothesis H0 : σ² = σ0² about the population variance, the statistic
χ² = ∑ (xi − x̄)² / σ0²
is calculated from the given data and compared with the table value of χ² on n − 1 d.f. at the required level of significance. The decision regarding acceptance or rejection of the hypothesis is then taken.
Illustration 7 : Ten observations drawn randomly from a normal population are given
below 68, 72, 68, 74, 77, 61, 63, 69, 73, 75
Test the hypothesis that the population variance is 32.
Ans : H0 : 𝜎2 = 32
x       xi − x̄    (xi − x̄)²
68      −2        4
72      2         4
68      −2        4
74      4         16
77      7         49
61      −9        81
63      −7        49
69      −1        1
73      3         9
75      5         25
Total   700       242
x̄ = ∑xi/n = 700/10 = 70
χ² = ∑(xi − x̄)²/σ² = 242/32 = 7.5625
D.f. = n − 1 = 10 − 1 = 9
The table value of χ² on 9 d.f. and at the 5 % level of significance = 16.92
χ²cal < χ²tab
⇒ H0 may be accepted, i.e. the population variance may be regarded as 32.
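The variance-test statistic of Illustration 7 is easy to verify in Python (`chi_square_variance` is an illustrative name):

```python
def chi_square_variance(data, sigma0_sq):
    """Chi-square statistic for testing a specified population variance sigma0_sq."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / sigma0_sq

data = [68, 72, 68, 74, 77, 61, 63, 69, 73, 75]
print(chi_square_variance(data, 32))  # 7.5625, compared on n - 1 = 9 d.f.
```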
Exercise
Question -1 The number of road accidents on a high way during a week is given below. Can it
be concluded that the proportion of accidents are equal for all days.
Day Mon. Tue. Wed. Thurs. Fri. Sat. Sun.
Number of accidents 14 16 8 12 11 9 14
(Ans: χ² = 4.17)
Question – 4 A sample of size 20 drawn from a normal population gives mean and S.D. as 42 and 6 respectively. Test the hypothesis that the population S.D. is 9. (Ans: χ² = 8.89)
Small Sample Tests
Degrees of Freedom
The number of degrees of freedom is the number of independent observations available in a set of data, i.e. the total number of observations minus the number of independent constraints (or estimated quantities) imposed on them.
t Distribution or Student’s t Distribution
Introduction
• The t distribution was given by W. S. Gosset in 1908.
• He published his work under the pen name “Student”, so this distribution is known as Student’s t distribution.
• Let x1, x2, x3, ....., xn be a random sample of n observations from a normal population with mean µ and S.D. σ. Then the distribution of
t = (x̄ − µ) / (S/√(n − 1))
is defined as the t distribution with n − 1 degrees of freedom. Here x̄ = (1/n) ∑ xi and S² = (1/n) ∑ (xi − x̄)².
The probability density function of the t distribution is
f(t) = 1 / [√n · B(1/2, n/2) · (1 + t²/n)^((n+1)/2)], −∞ < t < ∞
Assumptions
1) The population from which the sample is drawn is normal.
2) The sample is random.
3) The population standard deviation 𝜎 is not known.
Properties
1) The probability curve of this distribution is symmetrical.
2) The tails of the curve are asymptotic to X axis.
3) When n is very large then t distribution tends to normal distribution.
4) The form of this distribution varies with the degrees of freedom.
Test the significance of the difference between sample mean and population mean
• Suppose, a random sample x1, x2, x3.........xn is drawn from a normal population and the
mean and the variance of the sample are 𝑥̅ and 𝑆2 respectively. If we want to test the
hypothesis that there is no significant difference between sample mean 𝑥̅ and
assumed mean 𝜇 of the population then we apply t test.
• Step – 1 Define H0: Population Mean = 𝜇 Vs
H1: Population Mean ≠ 𝜇
• Step – 2 Find positive difference of 𝑥̅ and 𝜇, i.e. |𝑥̅ − 𝜇|
• Step – 3 Find the S.E., i.e. S/√(n − 1), where S² = (1/n) ∑(x − x̄)²
• Step – 4 Find the t statistic, i.e. tcal = |x̄ − µ| / (S/√(n − 1)) = |x̄ − µ|√(n − 1) / S
Example: A sample of 10 observations gives ∑xi = 1683; taking deviations di = xi − 168, ∑di = 3 and ∑di² = 43. Test the hypothesis that the population mean is 170.
x̄ = ∑xi/n = 1683/10 = 168.3
S² = (1/n) ∑(xi − x̄)² = (∑di²/n) − (∑di/n)²
⇒ S² = (43/10) − (3/10)²
⇒ S² = 4.3 − 0.09
⇒ S² = 4.21
⇒ S = √4.21
⇒ S = 2.0518
tcal = |x̄ − µ|√(n − 1) / S
tcal = |168.3 − 170|√(10 − 1) / 2.0518
tcal = (1.7 × 3)/2.0518 = 2.4856
The t table value at the 5% level of significance and 9 d.f.: ttab = 2.26
Here, tcal > ttab
⇒ H0 is rejected, i.e. the population mean cannot be regarded as 170.
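The t value can be reproduced from the summary figures (`t_from_summary` is an illustrative name; S here is the divisor-n standard deviation, as used throughout these notes):

```python
from math import sqrt

def t_from_summary(xbar, mu, S, n):
    """t = |xbar - mu| * sqrt(n - 1) / S, with S the divisor-n sample S.D."""
    return abs(xbar - mu) * sqrt(n - 1) / S

t = t_from_summary(xbar=168.3, mu=170, S=2.0518, n=10)
print(round(t, 4))  # 2.4856 > 2.26, so H0 is rejected
```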
Test of the significance of the difference between two sample means
• Suppose two independent small samples of sizes n1 and n2, with means x̄1 and x̄2, are drawn from two normal populations. If we want to test the hypothesis that the population means are equal, we apply the t test.
Step – 1 Define H0: µ1 = µ2 Vs H1: µ1 ≠ µ2
Step – 2 Find |x̄1 − x̄2|
Step – 3 Find the pooled S.D. S, where
S² = [1/(n1 + n2 − 2)] {∑(x1 − x̄1)² + ∑(x2 − x̄2)²}
S² = [1/(n1 + n2 − 2)] {n1S1² + n2S2²}, with S1² = (1/n1) ∑(x1 − x̄1)² and S2² = (1/n2) ∑(x2 − x̄2)²
Step – 4 tcal = |x̄1 − x̄2| / (S √(1/n1 + 1/n2)) = (|x̄1 − x̄2|/S) √(n1n2/(n1 + n2))
x̄1 = ∑x1/n1 = 219/7 = 31.29
x̄2 = ∑x2/n2 = 169/6 = 28.17
S² = [1/(n1 + n2 − 2)] {∑d1² − (∑d1)²/n1 + ∑d2² − (∑d2)²/n2}
⇒ S² = [1/(7 + 6 − 2)] {32 − (2)²/7 + 27 − (1)²/6}
⇒ S² = (1/11) {32 − (2)²/7 + 27 − (1)²/6}
⇒ S² = 5.2965
⇒ S = √5.2965
⇒ S = 2.3014
Now, tcal = (|x̄1 − x̄2|/S) √(n1n2/(n1 + n2))
⇒ tcal = (|31.29 − 28.17|/2.3014) √(7 × 6/(7 + 6))
⇒ tcal = 2.4368
The t table value at the 5% level of significance and (7 + 6 − 2) = 11 d.f.: ttab = 1.796 (one tailed)
Here, tcal > ttab
⇒ H0 is rejected
⇒ Horse A is faster than horse B.
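The pooled two-sample t of this example follows from the summary figures (`t_two_sample` is an illustrative name; S is the pooled S.D. computed above):

```python
from math import sqrt

def t_two_sample(x1bar, x2bar, S, n1, n2):
    """Pooled two-sample t = (|x1 - x2| / S) * sqrt(n1*n2 / (n1 + n2))."""
    return abs(x1bar - x2bar) / S * sqrt(n1 * n2 / (n1 + n2))

t = t_two_sample(x1bar=31.29, x2bar=28.17, S=2.3014, n1=7, n2=6)
print(round(t, 4))
```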
Paired t test for the difference of two means
• Suppose n pairs of observations are obtained from the same n units under two different conditions (e.g. before and after a treatment), and let d denote the difference of each pair.
• If we want to test the hypothesis that the two population means are equal, i.e. that the mean difference is zero, then we apply the paired t test.
Step – 1 Define H0: µd = 0 Vs H1: µd ≠ 0
Step – 2 Find d̄ = ∑di/n
Step – 3 Find the S.E., i.e. S/√(n − 1), where S² = (1/n) ∑(d − d̄)²
Step – 4 Find the t statistic, i.e. tcal = |d̄| / (S/√(n − 1)) = |d̄|√(n − 1) / S
Shops 1 2 3 4 5 6 7 8 9 10
Sales before AD. 9 4 3 5 7 9 6 9 8 10
Sales after AD. 8 6 8 4 10 6 6 11 7 11
n = 10, d = sales before − sales after
d̄ = Σdi/n
⇒ d̄ = −7/10
⇒ d̄ = −0.7
S² = Σdi²/n − (Σdi/n)²
⇒ S² = 55/10 − (−7/10)²
⇒ S² = 5.5 − 0.49
⇒ S² = 5.01
⇒ S = √5.01 = 2.2383
t_cal = |d̄| √(n − 1) / S
⇒ t_cal = 0.7 √(10 − 1) / 2.2383
⇒ t_cal = 0.9382
t table value at 5% level of significance and 9 d.f.: t_tab = 2.26
Here, t_cal < t_tab
⇒ H0 is accepted
⇒ The advertisement is not effective.
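Because the full shop data are given above, the whole paired calculation can be reproduced in a short sketch (not part of the original notes), using the same divide-by-n variance as the worked example:

```python
from math import sqrt

before = [9, 4, 3, 5, 7, 9, 6, 9, 8, 10]
after  = [8, 6, 8, 4, 10, 6, 6, 11, 7, 11]

d = [b - a for b, a in zip(before, after)]   # differences, before − after
n = len(d)
dbar = sum(d) / n                             # −0.7
S2 = sum(x * x for x in d) / n - dbar ** 2    # S² = Σd²/n − d̄² = 5.01
t_cal = abs(dbar) * sqrt(n - 1) / sqrt(S2)
print(round(t_cal, 4))  # 0.9382
```

Since 0.9382 < t_tab = 2.26, the sketch agrees with the notes' conclusion that H0 is accepted.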
Exercise
Question 1. A company is producing steel tubes of mean inner diameter of 2.00 cm. A
sample of 10 tubes gives a mean inner diameter of 2.01 cm and a variance of
0.004 cm². Is the difference in the means significant? (Ans: t_cal = 0.4747)
Question 2. Ten persons are chosen at random from a population and their heights are found
to be in inches as 63, 63, 66, 67, 68, 69, 70, 70, 71, 71. Test the hypothesis that
the mean height of the population is 66. (Ans: t𝑐𝑎𝑙 =1.89)
Question 3. A sample of 8 observations gives sample mean 1134 and SD 35 units whereas
another sample of 7 observations gives sample mean 1024 and SD 40 units. At
5% level of significance, test the significant difference between two sample
means. (Ans: t𝑐𝑎𝑙 = 32.3831)
Question 4. Two random samples of sizes 9 and 7 respectively are drawn from two different
populations. The means of the samples are 196.4 and 198.8 respectively. The sums
of the squares of the deviations from their respective means are 26.94 and 18.73.
Test the hypothesis that the population means are equal. (Ans: t_cal = 2.637)
Question 5. The sales data of an index in six shops before and after a special promotion
campaign are as under:
Shops A B C D E F
Before campaign 53 28 32 48 50 42
After campaign 58 32 30 50 56 45
Test whether the campaign was effective. (Ans: t_cal = 2.6)
Question 6. An IQ test was conducted to 5 persons before and after they were trained. The
results are given below:
Students 1 2 3 4 5
Before training 110 120 123 132 125
After training 120 118 125 136 121
Test whether there is any change in IQ after the training. (Ans: t𝑐𝑎𝑙 = 2.6)
F test and ANOVA
Introduction
To test the hypothesis of equality of means of two small samples we use the t-test. In applying the t test,
it is assumed that the populations from which the samples are drawn have equal variances. If this
assumption is not correct, the results obtained may not be reliable. Hence, before applying the t test, it
is necessary to test that the population variances are equal, i.e. σ1² = σ2².
Snedecor's F test can be used for testing the hypothesis that the variances of the populations are
equal. The statistic F is defined as
F = ŝ1²/ŝ2²
where ŝ1² = n1S1²/(n1 − 1) and ŝ2² = n2S2²/(n2 − 1) are the unbiased estimates of the population
variances, with
S1² = (1/n1) Σ(x1 − x̄1)², S2² = (1/n2) Σ(x2 − x̄2)²
F is thus the ratio of two independent unbiased estimates of the population variances, and is based
on (n1 − 1, n2 − 1) degrees of freedom. It should be noted that F is defined so that the numerator is
greater than the denominator. Accordingly,
F = [n1S1²/(n1 − 1)] / [n2S2²/(n2 − 1)] on (n1 − 1, n2 − 1) d.f.
or
F = [n2S2²/(n2 − 1)] / [n1S1²/(n1 − 1)] on (n2 − 1, n1 − 1) d.f.
From the given data, the value of F is computed and it is compared with the table value of F on
appropriate degrees of freedom and at a required level of significance. The decision regarding
acceptance or rejection of the hypothesis is then taken.
Example 1. The following figures give the weights of items produced by two machines. Test the
hypothesis that there is no significant variation in the products of the two machines.
Machine X 3 7 5 6 5 4 4 5 3 3
Machine Y 8 5 7 8 3 2 7 6 5 7
Solution :
H0: σ1² = σ2², i.e. there is no significant variation in the products of the two machines.
x1    x2    d1 = x1 − 4    d1²    d2 = x2 − 6    d2²
3     8     −1             1      2              4
7     5     3              9      −1             1
5     7     1              1      1              1
6     8     2              4      2              4
5     3     1              1      −3             9
4     2     0              0      −4             16
4     7     0              0      1              1
5     6     1              1      0              0
3     5     −1             1      −1             1
3     7     −1             1      1              1
45    58    5              19     −2             38
x̄1 = Σx1/n1 = 45/10 = 4.5
x̄2 = Σx2/n2 = 58/10 = 5.8
S1² = (1/n1) Σ(x1 − x̄1)² = Σd1²/n1 − (Σd1/n1)²
⇒ S1² = 19/10 − (5/10)²
⇒ S1² = 1.9 − 0.25
⇒ S1² = 1.65
⇒ ŝ1² = n1S1²/(n1 − 1) = (10/9)(1.65) = 1.8333
S2² = (1/n2) Σ(x2 − x̄2)² = Σd2²/n2 − (Σd2/n2)²
⇒ S2² = 38/10 − (−2/10)²
⇒ S2² = 3.8 − 0.04
⇒ S2² = 3.76
⇒ ŝ2² = n2S2²/(n2 − 1) = (10/9)(3.76) = 4.1778
Now, F_cal = ŝ2²/ŝ1² (larger estimate in the numerator)
⇒ F_cal = 4.1778/1.8333
⇒ F_cal = 2.2792
F table value at 5% level of significance and (9, 9) d.f.: F_tab = 3.18
Here, F_cal < F_tab
⇒ H0 is accepted, i.e. there is no significant variation in the products of the two machines.
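The F computation above can be sketched in Python (not part of the original notes; the function name is illustrative). Working from the raw data gives ≈ 2.2788 rather than 2.2792, because the notes round ŝ1² and ŝ2² before dividing.

```python
def f_statistic(x1, x2):
    """Snedecor's F as defined in the notes: unbiased variance estimates
    ŝ² = n·S²/(n−1), with S² = Σ(x − x̄)²/n; larger estimate on top."""
    def unbiased_var(x):
        n = len(x)
        xbar = sum(x) / n
        S2 = sum((v - xbar) ** 2 for v in x) / n
        return n * S2 / (n - 1)
    v1, v2 = unbiased_var(x1), unbiased_var(x2)
    return max(v1, v2) / min(v1, v2)

machine_x = [3, 7, 5, 6, 5, 4, 4, 5, 3, 3]
machine_y = [8, 5, 7, 8, 3, 2, 7, 6, 5, 7]
print(round(f_statistic(machine_x, machine_y), 4))  # 2.2788
```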
Assumptions of ANOVA
1. Populations from which samples are drawn are normal.
2. Each of the populations has the same variance σ².
3. Samples are drawn randomly.
4. Samples are independent.
5. Errors are normally distributed with mean 0 and variance 𝜎2.
One Way ANOVA
We will understand one-way ANOVA through a simple example.
Example: Prices of a commodity were collected from six different cities. Examine whether there
is significant difference in prices in the cities.
City Price
A 20 26 24 26
B 22 24 22 24
C 20 20 22 24
D 22 24 26 30
E 24 22 24 26
F 24 22 22 22
Solution:
H0: μA = μB = μC = μD = μE = μF, i.e. there is no significant difference in the prices in the cities.
Subtract 22 from each observation (coding the data does not change the sums of squares). The coded row totals for the six cities are 8, 4, −2, 14, 8 and 2, so the grand total is T = 34.
Correction Factor: C.F. = T²/N, where T = 34 = grand total and N = 24 = total number of observations
⇒ C.F. = (34)²/24 = 48.1667
Total Sum of Squares: T.S.S. = ΣΣx² − C.F. = 172 − 48.1667 = 123.8333
Between-cities Sum of Squares: R.S.S. = Σ(Ri²/ki) − C.F.
⇒ Σ(Ri²/ki) = (8)²/4 + (4)²/4 + (−2)²/4 + (14)²/4 + (8)²/4 + (2)²/4 = 16 + 4 + 1 + 49 + 16 + 1 = 87
⇒ R.S.S. = 87 − 48.1667 = 38.8333
Error Sum of Squares: E.S.S. = T.S.S. − R.S.S. = 123.8333 − 38.8333 = 85
ANOVA TABLE
Source of Variation    S.S.        d.f.            M.S.S. = S.S./d.f.    F_cal     F_tab
Due to cities          38.8333     6 − 1 = 5       7.7667                1.6447    F(5, 18) = 2.7729
Due to error           85          23 − 5 = 18     4.7222
Total                  123.8333    24 − 1 = 23
Here, F_cal < F_tab ⇒ H0 is accepted, i.e. there is no significant difference in the prices in the cities.
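The one-way calculation can be verified with a short sketch (not part of the original notes), working directly from the uncoded prices; coding by subtracting 22 would give the same sums of squares.

```python
groups = {
    "A": [20, 26, 24, 26], "B": [22, 24, 22, 24], "C": [20, 20, 22, 24],
    "D": [22, 24, 26, 30], "E": [24, 22, 24, 26], "F": [24, 22, 22, 22],
}

all_obs = [x for g in groups.values() for x in g]
N = len(all_obs)
CF = sum(all_obs) ** 2 / N                                      # correction factor
TSS = sum(x * x for x in all_obs) - CF                          # total S.S. = 123.8333
RSS = sum(sum(g) ** 2 / len(g) for g in groups.values()) - CF   # between-cities S.S. = 38.8333
ESS = TSS - RSS                                                 # error S.S. = 85
F_cal = (RSS / (len(groups) - 1)) / (ESS / (N - len(groups)))
print(round(F_cal, 4))  # 1.6447
```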
R 39 39 41 41
Solution:
H01: There is no significant difference between rows.
H02: There is no significant difference between columns.
Let’s subtract 40 from all the entries.
Treatment
Field A B C D 𝑹𝒊
P 5 0 –2 –3 0
Q 3 1 5 –2 7
R –1 –1 1 1 0
𝑪𝒋 7 0 4 –4 T=7
Correction Factor: C.F. = T²/N, where T = 7 = grand total and N = 12 = total number of observations
⇒ C.F. = (7)²/12 = 4.0833
Total Sum of Squares: T.S.S. = ΣΣx² − C.F. = 81 − 4.0833 = 76.9167
Row Sum of Squares: R.S.S. = Σ(Ri²/ki) − C.F. = {(0)² + (7)² + (0)²}/4 − 4.0833 = 12.25 − 4.0833 = 8.1667
Column Sum of Squares: C.S.S. = Σ(Cj²/kj) − C.F.
⇒ Σ(Cj²/kj) = 16.3333 + 0 + 4 + 4 = 24.3333
⇒ C.S.S. = 24.3333 − 4.0833 = 20.25
ANOVA TABLE
Source of Variation    S.S.       d.f.              M.S.S. = S.S./d.f.    F_cal     F_tab
Due to rows            8.1667     3 − 1 = 2         4.0833                1.9796    F(2, 6) = 5.1433
Due to columns         20.25      4 − 1 = 3         6.75                  1.1975    F(3, 6) = 4.7571
Due to error           48.5       11 − 2 − 3 = 6    8.0833
Total                  76.9167    12 − 1 = 11
In both cases F_cal < F_tab ⇒ H01 and H02 are accepted, i.e. there is no significant difference between rows or between columns.
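The correction factor and the total and row sums of squares from the coded table can be checked with a short sketch (not part of the original notes):

```python
# Coded two-way table (40 already subtracted, as in the notes)
table = [
    [5, 0, -2, -3],   # field P
    [3, 1, 5, -2],    # field Q
    [-1, -1, 1, 1],   # field R
]

all_obs = [x for row in table for x in row]
N = len(all_obs)
CF = sum(all_obs) ** 2 / N                           # (7)²/12 = 4.0833
TSS = sum(x * x for x in all_obs) - CF               # 81 − 4.0833 = 76.9167
RSS = sum(sum(r) ** 2 / len(r) for r in table) - CF  # 12.25 − 4.0833 = 8.1667
print(round(CF, 4), round(TSS, 4), round(RSS, 4))
```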
Exercise
Question 1. The following samples are drawn from two normal populations. Test the
hypothesis that the population variances are equal.
Sample 1 8 10 14 10 13
Sample 2 12 15 11 16 14 14 16
(Ans: F𝑐𝑎𝑙 = 1.63)
Question 2. It is known that the mean diameters of rivets produced by two firms A and B are
practically the same but the S.D.s may differ. For 22 rivets produced by firm A, the
S.D. is 2.9 mm, while for 16 rivets manufactured by firm B, the S.D. is 3.8 mm.
Compute the statistic you would use to test whether the products of firm A have
the same variability as those of firm B. Test its significance.
(Ans: F_cal = 1.75)
Question 3. Prices of a commodity were collected from four different cities. Examine whether
there is significant difference in prices in the cities.
City Price
A 12 16 16
B 15 14 14 15
C 17 16 15 14
D 15 12 15 16 16
(Ans: F𝑐𝑎𝑙 = 3.01)
Question 4. The performance of 3 operators on 4 machines is given below. Analyse the data.
Machines
Operators A B C D
I 560 540 580 560
II 580 550 600 590
III 570 560 560 590
(Ans: F_cal,row = 2.4, F_cal,col = 3.6)
Multiple Choice Questions
1. Rejecting the null hypothesis when it is true is considered as
(a) Type – I error
(b) Type – II error
(c) Probable error
(d) Random error
2. Accepting the null hypothesis when it is false is considered as
(a) Type – I error
(b) Type – II error
(c) Probable error
(d) Random error
3. Which test is used to test the goodness of fit?
(a) t – test
(b) F - test
(c) Z - test
(d) Chi square test
4. To test the significance of independence of attributes, which of the following tests is used?
(a) t – test
(b) F - test
(c) Chi square test
(d) Z - test
5. The hypothesis under consideration is called ______.
(a) alternative hypothesis
(b) simple hypothesis
(c) null hypothesis
(d) composite hypothesis
6. The hypothesis which is opposite of the null hypothesis is called ______.
(a) alternative hypothesis
(b) simple hypothesis
(c) null hypothesis
(d) composite hypothesis
7. Which test is used in analysis of variance?
(a) t – test
(b) F - test
(c) Z - test
(d) Chi square test
8. Variance ratio test is also known as ______.
(a) t – test
(b) Z - test
(c) Chi square test
(d) F – test
9. What is your conclusion of the test if χ²cal > χ²tab?
13. Which distributions are used in small sample tests?
(a) t and F distributions
(b) Normal and t distributions
(c) Normal and F distributions
(d) Normal and chi square distributions
14. Which test is used in analysis of variance?
(a) t – test
(b) Z - test
(c) Chi square test
(d) F – test
15. What is the full form of d.f. in t test?
(a) degrees of fire
(b) degrees of freedom
(c) defective fraction
(d) discrete function
16. Which of the following test is used to compare two variances?
(a) t – test
(b) F - test
(c) Chi square test
(d) Z – test
17. Analysis of variance is a statistical method of comparing the ______ of several
populations.
(a) standard deviations
(b) variances
(c) means
(d) proportions
18. A t-test is a significance test that assesses ______.
(a) the means of two independent groups
(b) the medians of two dependent groups
(c) the modes of two independent variables
(d) the standard deviation of three independent variables
19. The degrees of freedom for Chi square test statistics when testing for independence in a
contingency table with 4 rows and 4 columns would be
(a) 9
(b) 7
(c) 5
(d) 12
20. When σ is known, the hypothesis about population mean is tested by
(a) t – test
(b) F - test
(c) Chi square test
(d) Z – test
21. An advertising agency wants to test the hypothesis that the proportion of adults in
America who read a Sunday Magazine is 25 percent. The null hypothesis is that the
proportion reading the Sunday Magazine is:
(a) Not equal to 25 %
(b) Less than 25 %
(c) Equal to 25 %
(d) More than 25 %
22. The number of independent values in a set of values is called ______.
(a) test-statistic
(b) degree of freedom
(c) level of significance
(d) level of confidence
23. If the critical region is located equally in both sides of the sampling distribution of test-
statistic, the test is called
(a) One tailed
(b) Two tailed
(c) Right tailed
(d) Left tailed
24. The degrees of freedom for the paired t-test based on n pairs of observations is ______.
(a) 2n – 1
(b) n – 2
(c) 2 (n – 1)
(d) n – 1
25. What is the range of the test statistic Z?
(a) 0 to 1
(b) –1 to +1
(c) –∞ to +∞
(d) 0 to ∞
References:
1. Business Research Methods – Sudhir Prakashan
2. Research Methodology – C.R. Kothari
3. Fundamentals of Statistics – S. P. Gupta
4. Testing Statistical Hypotheses – Lehmann & Romano
5. Statistical Methods – S. P. Gupta