
Chapter 3

Hypothesis Development and Testing


The development of any statistical hypothesis depends on the nature and type of the research. We first discuss the fundamental concepts that are essential for hypothesis development and testing.
Testing Statistical Hypotheses
One of the most important techniques for making statistical inferences about population parameter(s), or about the form of the population distribution, is the testing of statistical hypotheses. Here, different techniques for testing statistical hypotheses about population parameter(s) are discussed.

Test
A test of a statistical hypothesis is a statistical technique or method by which we arrive at a decision as to whether the null hypothesis should be accepted or rejected on the basis of the sample observations. It is also called a two-action decision problem: after the experimental sample values are obtained, the two possible actions are acceptance or rejection of the null hypothesis under consideration. Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². If the null hypothesis is that the population mean is equal to 50, we can arrive at a decision about this statement on the basis of the sample observations by using a test. There are two types of tests, namely (i) the one-tailed test and (ii) the two-tailed test.

One Tailed Test


A test for which the rejection region (critical region) is located wholly at one end, either the left-hand end or the right-hand end, of the sampling distribution of the test statistic is called a one-tailed or one-sided test. When the rejection region is located at the left-hand end of the sampling distribution of the test statistic, the test is called a left-tailed or lower-tailed test; when the rejection region is located at the right-hand end of the sampling distribution of the test statistic, it is called a right-tailed or upper-tailed test.
Example 1: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². The null hypothesis to be tested is
H₀: μ = μ₀
against the following two alternatives:
Case (i) H₁: μ < μ₀
Case (ii) H₁: μ > μ₀.
For case (i), the rejection region of the test statistic is located at the left-hand end of the sampling distribution of the test statistic. If the test statistic follows a normal distribution, the rejection (critical) region is shown by shading the appropriate area under the sampling distribution.

Figure: Rejection region corresponding to a one-tailed (lower-tailed) test; the critical region is the shaded left tail of the sampling distribution f(Z), with the acceptance region to its right (H₁: μ < μ₀).


For case (ii), the rejection region of the test statistic is located at the right-hand end of the sampling distribution of the test statistic. If the test statistic follows a normal distribution, the rejection (critical) region is shown by shading the appropriate area under the sampling distribution.

Figure: Rejection region corresponding to a one-tailed (upper-tailed) test; the critical region is the shaded right tail of the sampling distribution f(Z), with the acceptance region to its left (H₁: μ > μ₀).

Two Tailed Test


A test for which the rejection region is located at both ends (tails) of the sampling distribution of the test statistic is called a two-tailed or two-sided test.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². The null hypothesis to be tested is H₀: μ = μ₀ against the alternative hypothesis H₁: μ ≠ μ₀. To test this null hypothesis, the rejection region is located at both ends of the sampling distribution of the test statistic. If the test statistic follows a normal distribution, the rejection (critical) regions are shown by shading the appropriate areas under the sampling distribution.

Figure: Rejection regions corresponding to a two-tailed test; the critical regions are the shaded tails beyond −Z(α/2) and Z(α/2) of the sampling distribution f(Z), with the acceptance region between them (H₁: μ ≠ μ₀).

Hypothesis
Any statement or assumption regarding either the population parameter(s) or the form of the population distribution is called a hypothesis, and it is denoted by H.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². A statistical hypothesis is that the population mean μ is equal to 50.

Null Hypothesis:
A statistical hypothesis which is picked up for testing on the basis of the sample observations is called a null hypothesis, and it is denoted by H₀.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². If we wish to test the hypothesis that the population mean μ is equal to a specific value μ₀, this hypothesis is a null hypothesis and is written as H₀: μ = μ₀.
Alternative Hypothesis:
A null hypothesis is testable only against a counter statement or proposition. This counter statement or proposition is called an alternative hypothesis. Thus an alternative hypothesis is any hypothesis other than the null hypothesis, and it is denoted by H₁.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². If we wish to test the hypothesis that the population mean μ is equal to a specific value μ₀ against the hypothesis that it is not equal to μ₀, the latter is the alternative hypothesis and is written as H₁: μ ≠ μ₀.

Simple Hypothesis:
A statistical hypothesis which completely and uniquely specifies the whole distribution, along with the assumptions involved, is called a simple hypothesis.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². The null hypothesis H₀: μ = μ₀, σ² = σ₀² is a simple hypothesis.

Composite Hypothesis:
Any statistical hypothesis which does not completely specify the whole distribution, along with the assumptions involved, is called a composite hypothesis.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². Null hypotheses in which at least one parameter is left unspecified, or is specified only through an inequality, are composite; for example,
H₀: μ = μ₀ (σ² unspecified),
H₀: μ ≤ μ₀, σ² = σ₀²,
H₀: μ = μ₀, σ² ≥ σ₀².

Parametric Hypothesis:
If the form of the population distribution from which the sample is drawn at random is known, and the hypothesis concerns only the population parameter(s), the hypothesis is called a parametric hypothesis.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². The null hypothesis H₀: μ = μ₀, tested against an appropriate alternative hypothesis, is a parametric hypothesis.

Non-Parametric Hypothesis:
Any statistical hypothesis made about the form of the population distribution from which the sample is drawn at random, with the population parameter(s) either specified or unspecified, is called a non-parametric hypothesis.
Example: The hypothesis that the sample x₁, x₂, ..., xₙ is drawn from a binomial distribution with known or unknown probability of success is a non-parametric hypothesis.

Errors
In testing a null hypothesis against an appropriate alternative hypothesis, the following outcomes can occur:
(i) We reject the null hypothesis when H₀ is true, i.e. we accept the alternative hypothesis when H₁ is false.
(ii) We accept the null hypothesis when H₀ is false, i.e. we reject the alternative hypothesis when H₁ is true.
(iii) We reject the null hypothesis when H₀ is false, i.e. we accept the alternative hypothesis when H₁ is true.
(iv) We accept the null hypothesis when H₀ is true, i.e. we reject the alternative hypothesis when H₁ is false.
Clearly, the first two cases lead to errors in hypothesis testing: H₀ is true but we reject it, and H₀ is false but we accept it. The first is called a type I error and the second a type II error.
Type I Error:
The error which occurs when we reject the null hypothesis H₀ although H₀ is true, or equivalently when we accept the alternative hypothesis H₁ although H₁ is false, is called a type I error.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². If the null hypothesis H₀: μ = 60, tested against the alternative hypothesis H₁: μ ≠ 60, is true but we reject it after analysing the data, a type I error has been committed.
Type II Error:
The error which occurs when we accept the null hypothesis H₀ although H₀ is false, or equivalently when we reject the alternative hypothesis H₁ although H₁ is true, is called a type II error.
Example: Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². If the null hypothesis H₀: μ = 65, tested against the alternative hypothesis H₁: μ ≠ 65, is false but we accept it after analysing the data, a type II error has been committed.
In hypothesis testing the decisions and the two kinds of errors are shown in the table below.
Table: Decisions and type I and II errors
Decision                | H₀ is true (H₁ is false) | H₀ is false (H₁ is true)
Accept H₀ (reject H₁)   | Correct decision         | Type II error
Reject H₀ (accept H₁)   | Type I error             | Correct decision

Level of Significance: In hypothesis testing, the probability of a type I error is called the level of significance of the test. It is denoted by α and is given by
α = Probability of a type I error
  = Probability of rejecting H₀ when H₀ is true
  = Probability[(x₁, x₂, ..., xₙ) ∈ critical region | H₀ is true]
  = Probability[X ∈ critical region | H₀ is true]
and therefore
1 − α = Probability of accepting H₀ when H₀ is true
      = Probability[(x₁, x₂, ..., xₙ) ∈ acceptance region | H₀ is true]
      = Probability[X ∈ acceptance region | H₀ is true].
The probability (1 − α) is called the confidence coefficient, and 100(1 − α)% is the corresponding confidence level. In hypothesis testing we always try to keep the probability of making a type I error small; hence one of the main objectives in constructing a test is to control the level of significance α at a small value.
Similarly, the probability of a type II error is called the beta risk. It is denoted by β and is given by
β = Probability of a type II error
  = Probability of accepting H₀ when H₀ is false
  = Probability[(x₁, x₂, ..., xₙ) ∈ acceptance region | H₀ is false]
  = Probability[X ∈ acceptance region | H₀ is false].
Power of a Test:
In hypothesis testing, the power of a test is the probability of rejecting the null hypothesis H₀ when H₀ is false, or equivalently the probability of accepting H₁ when H₁ is true. It is denoted by (1 − β), where β is the probability of a type II error, and it is given by
Power of a test = Probability of rejecting H₀ when H₀ is false
                = 1 − Probability of accepting H₀ when H₀ is false
                = 1 − Probability of a type II error
                = 1 − β.
The probabilities associated with correct decisions and with type I and II errors are shown in the table below.
Table: Probabilities of correct decisions and of type I and II errors
Decision                | H₀ is true (H₁ is false)           | H₀ is false (H₁ is true)
Accept H₀ (reject H₁)   | 1 − α (confidence coefficient)     | β (probability of type II error)
Reject H₀ (accept H₁)   | α (probability of type I error)    | 1 − β (power of the test)
Meaning of p-Value:
In hypothesis testing, the p-value of a statistical test is the smallest significance level at which the observed sample leads to rejection of the null hypothesis.
Example: Let x₁, x₂, ..., x₂₀ be a random sample of size 20 from a normal distribution with mean μ and variance σ² = 5. Suppose we wish to test the null hypothesis
H₀: μ = 50
against the alternative hypothesis
H₁: μ > 50.
Suppose the sample mean is x̄ = 51. If a critical value of 50.5 is used, the observed significance level (p-value) for this test is
p-value = Prob(x̄ > 51) = Prob[ (x̄ − μ)/(σ/√n) > (51 − 50)/√(5/20) ]
        = Prob(Z > 2)
        = 0.5 − 0.4772
        = 0.0228.

Figure: p-value for an upper-tailed test when Z = 2 (the shaded area under f(Z) to the right of 2).

This is the smallest significance level at which the null hypothesis will be rejected when the most powerful test Z is used and the null hypothesis H₀ is true. If we select a significance level of, say, α = 0.05 for this test, we reject the null hypothesis, because the p-value 0.0228 is smaller than 0.05. In contrast, if we select a significance level of, say, α = 0.01, we do not reject the null hypothesis, because the p-value 0.0228 is larger than 0.01. Thus, if the p-value of a test statistic is reported when testing a null hypothesis, we can quickly decide whether the null hypothesis should be accepted or rejected at different a priori significance levels α. Reporting the p-value along with the result of the test statistic is therefore a useful device for quick decision making about the null hypothesis.
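As a quick check, the p-value in this example can be computed directly. The following sketch is not part of the text; it reproduces the calculation with scipy and applies the decision rule for two choices of α.

```python
# Minimal sketch (not from the text) reproducing the upper-tailed p-value of the
# example above: n = 20, sigma^2 = 5, H0: mu = 50, observed mean 51.
from math import sqrt
from scipy.stats import norm

n, sigma2, mu0, xbar = 20, 5.0, 50.0, 51.0
z = (xbar - mu0) / sqrt(sigma2 / n)      # = 2.0
p_value = norm.sf(z)                     # upper-tail area, ~ 0.0228

for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"z = {z:.2f}, p-value = {p_value:.4f}, alpha = {alpha}: {decision}")
```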

Different Steps for calculating the p-Value of a Test Statistic for Testing a Null Hypothesis
The following steps are involved in calculating the p-value of a test statistic for testing a null hypothesis about a population parameter.
Step One: The first step is to set up the null hypothesis H₀, which specifies the numerical value of the unknown population parameter, against an appropriate alternative hypothesis H₁.

Step Two: At the second step we choose an appropriate test statistic under the null hypothesis and calculate its value on the basis of the sample observations, which are drawn at random from the population. The test statistic is thus purely a function of the sample observations.

Step Three: If the test is one-tailed, the p-value is equal to the tail area beyond the test statistic in the same direction as the alternative hypothesis. Thus, if the alternative hypothesis is of the form ">", the p-value is the area to the right of (above) the observed value of the test statistic. Conversely, if the alternative hypothesis is of the form "<", the p-value is the area to the left of (below) the observed value of the test statistic.

Step Four: If the test is two-tailed, the p-value is equal to twice the tail area beyond the observed
value of the test statistic in the direction of the sign of the test statistic. That is, if the value of the
test statistic is positive, the p-value is twice the area to the right of or above the observed value of
the test statistic. Conversely, if the value of the test statistic is negative, the p-value is twice the area
to the left of or below the observed value of the test statistic.
Example: Let x₁, x₂, ..., x₁₀₀ be a random sample of size 100 from a normal population with mean μ and variance σ² = 4. Suppose we wish to test the null hypothesis
H₀: μ = 50
against the alternative hypothesis
H₁: μ ≠ 50.
Suppose the sample mean is x̄ = 49.5. The value of the test statistic is
Z = (x̄ − E(x̄))/√var(x̄) ~ N(0, 1)
  = (49.5 − 50)/√(4/100)
  = −2.5.
Here the observed value of the test statistic is Z = −2.5. Since this is a two-tailed test, any value of Z less than −2.5 or greater than +2.5 would be contradictory to H₀. Therefore the observed significance level for this test is
p-value = Prob(Z < −2.5 or Z > 2.5).
Here we find that
Prob(Z > 2.5) = 0.5 − 0.4938 = 0.0062.
Therefore the p-value for this two-tailed test is
p-value = 2 Prob(Z > 2.5) = 2 × 0.0062 = 0.0124.

Decision to Reject or Accept H₀ Using the Reported p-Value


The following steps are involved in deciding whether the null hypothesis H₀ is to be rejected using the reported p-value of a test statistic.
Step One: At the first step we set up a null hypothesis H₀ against an appropriate alternative hypothesis H₁.
Step Two: Under the null hypothesis choose a suitable test statistic and calculate its value on the basis of the sample observations, which are drawn at random from the population.
Step Three: Choose the value of the significance level α that you are willing to tolerate.
Step Four: The null hypothesis is rejected if the observed significance level (p-value) of the test statistic is smaller than the chosen value of α; otherwise we accept the null hypothesis. For example, if we choose α = 5% and the p-value < 0.05, we reject the null hypothesis and accept the alternative hypothesis.

Different Steps to Perform a Test


The following steps are involved in performing a test.
Step One: In hypothesis testing the first step is to set up the null hypothesis H₀ against an appropriate alternative hypothesis H₁.
Step Two: At the second step we choose an appropriate test statistic under the null hypothesis and calculate its value on the basis of the sample observations drawn from the population.
Step Three: Choose the value of the level of significance α that you are willing to tolerate.
Step Four: The sampling distribution of the test statistic is divided into two disjoint sets: the critical region (rejection region), denoted by C.R., and the acceptance region, denoted by A.R. Then make the following decision on the basis of the calculated value of the test statistic:
(i) Reject H₀ if the calculated value of the test statistic falls in the critical region.
(ii) Accept H₀ if the calculated value of the test statistic falls in the acceptance region.
Different Tests for Testing Null Hypotheses
The following tests are most popular and widely used for testing a null hypothesis
(1) Normal test
(2) Student t-test
(3) Chi-square (χ²) test
(4) F-test
Nonparametric Tests
The most popular and widely applicable nonparametric tests for testing the hypothesis of
population means are given below:
(1) One sample sign test for mean or median.

(2) Two sample sign test of paired observations for the difference of two means or medians.
(3) One sample Wilcoxon signed-rank test for mean or median.
(4) Wilcoxon signed-rank test of paired observations for the difference of two means or medians.
(5) Wilcoxon rank-sum test for independent samples.
(6) Mann-Whitney U test for independent samples.
(7) Kruskal-Wallis H test.
(8) Test of randomness.
(9) Median test.
Normal Test:
Let X be a random variable which is normally distributed with mean E(X) = μ and variance var(X) = σ². Then the test statistic is given by
Z = (X − E(X))/√var(X) ~ N(0, 1)
  = (X − μ)/σ ~ N(0, 1),
where E(X) = μ is specified by the null hypothesis H₀ and var(X) = σ² is either known or estimated from a large sample. For this reason it is sometimes called a large-sample test. The test is usually two-tailed, but in some situations it can be applied as a one-tailed test.

Figure: Probability density function of the standard normal distribution.


Uses of Normal Test:
The normal test is most powerful and is widely used in testing hypotheses regarding
(i) population mean
(ii) population proportions and
(iii) population correlation coefficients.

Student t-Test:
In the normal test it is assumed that the variance of the random variable X is either known or estimated from a large sample (i.e. n > 30). When the sample size is small (i.e. n < 30) and var(X) is unknown, the Z test can no longer be used directly. For this situation W. S. Gosset derived a test statistic, known as the Student t-test, and obtained its sampling distribution. It is given by
t = Z / √(χ²(n)/n) ~ t with n d.f.,
where Z is a standardized normal variate defined as
Z = (X − E(X))/√var(X) ~ N(0, 1)
and χ²(n) is a chi-square variate with n d.f.
Now, the square of a standard normal variate is a chi-square variate with 1 d.f. If we define
Zᵢ = (Xᵢ − μ)/σ,
then Zᵢ² is a chi-square variate with 1 d.f.
So
Σᵢ₌₁ⁿ ((Xᵢ − μ)/σ)² ~ χ²(n)
and
Σᵢ₌₁ⁿ ((xᵢ − x̄)/σ)² ~ χ²(n−1).
We know that
s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1),
so that
Σᵢ₌₁ⁿ (xᵢ − x̄)² = (n − 1)s².
Since Σᵢ₌₁ⁿ ((xᵢ − x̄)/σ)² ~ χ²(n−1), we therefore have
(n − 1)s²/σ² ~ χ²(n−1).
Thus the Student t-test statistic can be written as
t = [(x̄ − E(x̄))/√var(x̄)] / √[ ((n − 1)s²/σ²) / (n − 1) ] ~ t with (n − 1) d.f.
  = (x̄ − μ)/√(s²/n),
which has the t-distribution with (n − 1) d.f. If n is very large the t-test becomes the normal test. Therefore the t-test is called a small-sample test and can be considered a special case of the normal test. Like the normal test, the t-test is a two-tailed test, but in some cases it can be applied as a one-tailed test.

Figure: Probability density function of Student's t-distribution with (n − 1) degrees of freedom.
Uses of t-test:
The t-test is used for small samples. It is widely used in testing hypotheses regarding
(i) the population means,
(ii) population regression coefficients, and
(iii) population correlation coefficients.
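For reference, the one-sample form of the Student t-test described above is available directly in scipy. The following sketch is not part of the text; the data values and the hypothesised mean are made-up assumptions used only to illustrate the call.

```python
# Minimal sketch (not from the text) of a one-sample Student t-test with scipy.
# The data values are hypothetical; H0: mu = 50 against a two-sided alternative.
import numpy as np
from scipy import stats

x = np.array([48.2, 51.0, 49.5, 52.3, 47.8, 50.6, 49.1, 51.7])  # hypothetical sample
t_stat, p_value = stats.ttest_1samp(x, popmean=50.0)            # two-tailed by default

print(f"t = {t_stat:.3f} with {len(x) - 1} d.f., p-value = {p_value:.4f}")
# For a one-tailed alternative, scipy >= 1.6 also accepts alternative="greater" or "less".
```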

χ² Test
Let x₁, x₂, ..., xₙ be a random sample of size n from a normal distribution with mean μ and variance σ². Then
Zᵢ = (xᵢ − μ)/σ ~ N(0, 1)
and Zᵢ² is a chi-square variate with 1 d.f.
So
Σᵢ₌₁ⁿ Zᵢ² = Σᵢ₌₁ⁿ ((xᵢ − μ)/σ)² ~ χ²(n),
which is a chi-square variate with n d.f.
The  -test is one of the most popular, simple and widely applicable non-parametric tests in the
2

field of business, economics, banking, finance, management, medical sciences etc. The quantity
chi-square gives the magnitude of discrepancy between theory and experimental observation. Thus
using the chi-square test we can know whether a given discrepancy between theory and
experimental observation can be attributed to chance or whether it results from the inadequacy of
the theory to fit the observed data. If  is zero means there is no descrepency between theory and
2

observation that it fits data very well i.e. the observed and expected frequencies are same. The
greater the value of  2 , indicates discrepancy between observed and expected frequencies is very
high and implies that does not fit the observed data well.
Let Oi (i = 1, 2, …....,m) be the observed frequency and corresponding expected frequency is
E i of the ith cell or ith group. Then the formula of the  2 test is given by;
χ² = Σᵢ₌₁ᵐ (Oᵢ − Eᵢ)²/Eᵢ,
which is distributed as chi-square with (m − 1) d.f. This formula is for a single variable of classification. For two variables of classification, the chi-square test statistic is given by
χ² = Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ (Oᵢⱼ − Eᵢⱼ)²/Eᵢⱼ,
which is distributed as chi-square with (m − 1)(n − 1) d.f.
Let the level of significance be α. At the α level of significance, with (m − 1) d.f. for the one-way classification or (m − 1)(n − 1) d.f. for the two-way classification, we find the table value of the χ² statistic. We compare the calculated value of the test statistic with the table value: if the calculated value is greater than the table value, we reject the null hypothesis, which indicates a significant difference between theory and observation. On the other hand, if the calculated value of χ² is less than the table value, we accept the null hypothesis, which simply means that the theory fits the data well. It may be noted that the χ²-test depends only on the set of observed and expected frequencies and on the degrees of freedom. It does not make any assumption regarding the form of the population from which the sample observations are drawn. Since the χ²-test does not involve any population parameters, it is called a non-parametric or distribution-free test.

Figure: Probability density function of the chi-square distribution with (m − 1) degrees of freedom; the critical region is the shaded upper tail beyond the critical value χ²(m−1)(α).

Conditions for Validity of the χ² Test:

The χ²-test is an approximate test for large values of n. The following conditions must be satisfied for the validity of the chi-square test of "goodness of fit" between theory and experiment.
(i) The sample observations are drawn at random and independently from a population.
(ii) The sum of the observed frequencies must be equal to the sum of the expected frequencies, i.e. Σᵢ₌₁ᵐ Oᵢ = Σᵢ₌₁ᵐ Eᵢ, or Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ Oᵢⱼ = Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ Eᵢⱼ.
(iii) The total frequency should be reasonably large, i.e. greater than 50.
(iv) The expected frequency of each cell should be greater than or equal to 5. If any expected cell frequency is less than 5, we pool it with the preceding or succeeding frequency before applying the χ²-test, so that the pooled frequency is greater than or equal to 5, and we then adjust the degrees of freedom for the loss due to pooling.

Uses of the χ²-test


The χ²-test is one of the most popular, simple and widely applicable non-parametric tests in the fields of business, economics, banking, finance, management, medical sciences, etc. The most important applications of the χ²-test are:
(i) testing the homogeneity of a set of several population variances;
(ii) testing the homogeneity of several population correlation coefficients;
(iii) testing the homogeneity of several population proportions;
(iv) testing the independence of several attributes or variates;
(v) testing goodness of fit;
(vi) testing whether two or more populations are identical.
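As an illustration of the goodness-of-fit use of the χ²-test, the sketch below is not part of the text: it tests whether a die is fair from made-up roll frequencies, with the observed counts chosen so that every expected frequency is at least 5 and the total frequency exceeds 50.

```python
# Minimal sketch (not from the text) of a chi-square goodness-of-fit test.
# The observed die-roll frequencies are hypothetical; H0: the die is fair,
# so every expected frequency is 60/6 = 10.
import numpy as np
from scipy import stats

observed = np.array([8, 12, 9, 11, 14, 6])      # hypothetical counts, total 60
expected = np.full(6, observed.sum() / 6)        # equal expected frequencies

chi2_stat, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2_stat:.3f} with {len(observed) - 1} d.f., p-value = {p_value:.4f}")
```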

F-Test:
Prof. R. A. Fisher originally derived this test statistic, and in his honour it is called the F-test. Let χ²(n₁) be a chi-square variate with n₁ d.f. and χ²(n₂) be another chi-square variate with n₂ d.f. Then the F-test statistic is given by
F = [χ²(n₁)/n₁] / [χ²(n₂)/n₂],
which is distributed as F with n₁ and n₂ d.f.
Let x₁₁, x₁₂, ..., x₁ₙ₁ be a random sample of size n₁ from a normal population with mean μ₁ and variance σ₁², and let x₂₁, x₂₂, ..., x₂ₙ₂ be another random sample of size n₂ drawn from another normal population with mean μ₂ and variance σ₂².
Now,
Σᵢ (x₁ᵢ − x̄₁)² / σ₁² is a chi-square variate with (n₁ − 1) d.f., and
Σᵢ (x₂ᵢ − x̄₂)² / σ₂² is a chi-square variate with (n₂ − 1) d.f.
The sample variance for the first sample is given by
s₁² = Σᵢ (x₁ᵢ − x̄₁)²/(n₁ − 1), so that Σᵢ (x₁ᵢ − x̄₁)² = (n₁ − 1)s₁²,
and the sample variance for the second sample is given by
s₂² = Σᵢ (x₂ᵢ − x̄₂)²/(n₂ − 1), so that Σᵢ (x₂ᵢ − x̄₂)² = (n₂ − 1)s₂².
Thus
(n₁ − 1)s₁² / σ₁² is a chi-square variate with (n₁ − 1) d.f., and
(n₂ − 1)s₂² / σ₂² is a chi-square variate with (n₂ − 1) d.f.
Therefore the F-test statistic is given by
F = [ (n₁ − 1)s₁² / ((n₁ − 1)σ₁²) ] / [ (n₂ − 1)s₂² / ((n₂ − 1)σ₂²) ]
  = (s₁²/σ₁²) / (s₂²/σ₂²) ~ F((n₁ − 1), (n₂ − 1)),
which is distributed as F with (n₁ − 1) and (n₂ − 1) d.f.

Figure: Probability density function of the F distribution with (n₁ − 1) and (n₂ − 1) degrees of freedom; the critical region is the shaded upper tail beyond the critical value F(n₁−1),(n₂−1)(α).
Uses of F-test
The F-test is used for the following purposes:
(i) testing the equality of several population means;
(ii) testing the equality of two population variances (see the sketch following this list);
(iii) testing the linearity of regression equations;
(iv) testing a joint null hypothesis about the regression parameters.
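As an illustration of use (ii), the sketch below is not part of the text: it carries out an F-test for the equality of two population variances on two made-up samples, using a conventional two-sided p-value.

```python
# Minimal sketch (not from the text) of an F-test for equality of two population
# variances. The two samples are hypothetical; H0: sigma1^2 = sigma2^2.
import numpy as np
from scipy import stats

x1 = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.2, 11.8])   # hypothetical sample 1
x2 = np.array([10.9, 12.8, 14.1, 11.2, 13.6, 10.5, 12.9, 14.4])   # hypothetical sample 2

s1, s2 = x1.var(ddof=1), x2.var(ddof=1)        # unbiased sample variances
F = s1 / s2
df1, df2 = len(x1) - 1, len(x2) - 1

# Two-sided p-value: twice the smaller tail area of the F(df1, df2) distribution.
p_value = 2 * min(stats.f.cdf(F, df1, df2), stats.f.sf(F, df1, df2))
print(f"F = {F:.3f} with ({df1}, {df2}) d.f., p-value = {p_value:.4f}")
```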
Application of Different Tests Concerning Population Means, Population Proportions, Population Variances, Population Correlation Coefficients, and Regression Coefficients
We now discuss the applications of the Z-test, Student's t-test, χ²-test and F-test. For ease of understanding, the applications of these tests are classified into the following groups:
(i) testing hypotheses about population means;
(ii) testing hypotheses about population proportions;
(iii) testing hypotheses about population variances;
(iv) testing hypotheses about population correlation coefficients;
(v) testing hypotheses about population regression coefficients.

Different Tests Concerning Population Means


In this section we consider the following:
(i) testing whether the population mean is equal to a specified value;
(ii) testing the difference between two population means;
(iii) testing for comparison of k (k > 2) population means.
To test whether the population mean μ is equal to a specified value, say μ₀, two cases are considered, depending on whether the population variance is known or unknown.
Z-Test for a Population Mean Equal to a Specified Value, Say μ₀ (Variance Known)
Here the null hypothesis to be tested is
H₀: μ = μ₀
against the following three alternative hypotheses:
Case (1): H₁: μ > μ₀
Case (2): H₁: μ < μ₀
Case (3): H₁: μ ≠ μ₀
Assumptions:
(i) Let x₁, x₂, ..., xₙ be a random sample of size n drawn from a normal distribution with mean μ and variance σ².
(ii) The population variance σ² is known (whether n > 30 or n < 30) and we assume that it is equal to σ₀².
Method: Let x̄ be the sample mean, given by
x̄ = (Σᵢ₌₁ⁿ xᵢ)/n.
Under the null hypothesis the test statistic is given by
Z = (x̄ − E(x̄))/√var(x̄) ~ N(0, 1)
  = (x̄ − μ₀)/√(σ²/n) ~ N(0, 1)
  = (x̄ − μ₀)/√(σ₀²/n) ~ N(0, 1).
Let the level of significance be α.
Example: A random sample of the marks obtained in Business Statistics by 100 BBA students of a public university A is drawn. From the sample observations it is found that the average mark obtained in Business Statistics is 68. The population of marks of the BBA students is assumed to be normal with mean μ and variance 9.
Test the following null hypothesis:
H₀: μ = 70
against the following three alternative hypotheses:
Case (1): H₁: μ > 70
Case (2): H₁: μ < 70
Case (3): H₁: μ ≠ 70
Solution: Here it is given that x̄ = 68 and σ² = 9.
Under the null hypothesis the test statistic is given by
Z = (68 − 70)/√(9/100) ~ N(0, 1)
  = −6.67.
Let the level of significance be α = 0.05.
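The calculation in this example can be reproduced in a few lines. The sketch below is not part of the text; it simply recomputes Z and reports p-values for the three alternatives considered above.

```python
# Minimal sketch (not from the text) reproducing the example above:
# n = 100, known variance sigma^2 = 9, H0: mu = 70, observed mean 68.
from math import sqrt
from scipy.stats import norm

n, sigma2, mu0, xbar, alpha = 100, 9.0, 70.0, 68.0, 0.05
z = (xbar - mu0) / sqrt(sigma2 / n)             # = -6.67

# p-values for the three alternatives considered in the text.
p_upper = norm.sf(z)            # H1: mu > 70
p_lower = norm.cdf(z)           # H1: mu < 70
p_two   = 2 * norm.sf(abs(z))   # H1: mu != 70
print(f"Z = {z:.2f}, p(>)= {p_upper:.4f}, p(<)= {p_lower:.4g}, p(two-sided)= {p_two:.4g}")
```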

Z-Test for a Population Mean Equal to a Specified Value, Say μ₀ (Variance Unknown)
Here the null hypothesis to be tested is
H₀: μ = μ₀
against the following three alternative hypotheses:
Case (1): H₁: μ > μ₀
Case (2): H₁: μ < μ₀
Case (3): H₁: μ ≠ μ₀
Assumptions:
(i) Let x₁, x₂, ..., xₙ be a random sample of size n (where n > 30) from a normal distribution with mean μ and variance σ².
(ii) The population variance σ² is unknown.
Method: Let x̄ be the sample mean, given by
x̄ = (Σᵢ₌₁ⁿ xᵢ)/n.
Since the population variance σ² is unknown, we replace it by its estimated value. The estimate of σ², obtained from the sample observations, is denoted by s² and given by
s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1).
Under the null hypothesis the test statistic is given by
Z = (x̄ − E(x̄))/√var(x̄) ~ N(0, 1)
  = (x̄ − μ₀)/√(σ²/n) ~ N(0, 1)
  = (x̄ − μ₀)/√(s²/n) ~ N(0, 1).
Let the level of significance be α.
Example: A random sample of the marks obtained in Economics by 60 students of university B is drawn, and the sample observations are given below:
58, 56, 62, 65, 60, 64, 58, 59, 66, 65, 75, 59, 60, 69, 67, 75, 70, 68, 62, 62,
50, 54, 59, 65, 68, 66, 62, 60, 61, 78, 50, 55, 45, 55, 75, 66, 58, 52, 47, 49,
43, 59, 62, 65, 63, 61, 45, 58, 63, 62, 66, 58, 55, 65, 60, 52, 68, 55, 61, 72.
The population of marks in Economics of the students of university B is assumed to be normal with mean μ and variance σ².
Test the null hypothesis
H₀: μ = 65
against the three alternative hypotheses given below:
Case (1): H₁: μ > 65
Case (2): H₁: μ < 65
Case (3): H₁: μ ≠ 65
Solution: (i) Let x₁, x₂, ..., x₆₀ be a random sample of size 60 from a normal distribution with mean μ and variance σ².
(ii) Let x̄ be the sample mean, given by
x̄ = (Σᵢ₌₁ⁿ xᵢ)/n = 3648/60 = 60.80.
(iii) The population variance σ² is unknown, so we estimate it from the sample observations. The estimated value of σ² is
s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1) = 57.45.
Under the null hypothesis the test statistic is given by
Z = (60.8 − 65)/√(57.45/60) ~ N(0, 1)
  = −4.292.
Let the level of significance be α = 0.05.
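The same computation can be carried out directly from the 60 raw marks. The sketch below is not part of the text; it recomputes x̄, s² and Z for H₀: μ = 65.

```python
# Minimal sketch (not from the text) reproducing the computation in the example
# above from the 60 raw marks: H0: mu = 65, variance estimated by s^2.
import numpy as np
from scipy.stats import norm

marks = np.array([
    58, 56, 62, 65, 60, 64, 58, 59, 66, 65, 75, 59, 60, 69, 67, 75, 70, 68, 62, 62,
    50, 54, 59, 65, 68, 66, 62, 60, 61, 78, 50, 55, 45, 55, 75, 66, 58, 52, 47, 49,
    43, 59, 62, 65, 63, 61, 45, 58, 63, 62, 66, 58, 55, 65, 60, 52, 68, 55, 61, 72,
])
n, mu0 = len(marks), 65.0
xbar = marks.mean()                       # 60.80
s2 = marks.var(ddof=1)                    # ~ 57.45
z = (xbar - mu0) / np.sqrt(s2 / n)        # ~ -4.29
p_two_sided = 2 * norm.sf(abs(z))
print(f"xbar = {xbar:.2f}, s^2 = {s2:.2f}, Z = {z:.3f}, two-sided p = {p_two_sided:.2e}")
```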

Z-Test for Comparison of Two Population Means (Variances Known, or Unknown but Unequal)
Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ₁ − μ₂ = 0,
against the following three alternatives:
Case (1): H₁: μ₁ > μ₂, or equivalently H₁: μ₁ − μ₂ > 0
Case (2): H₁: μ₁ < μ₂, or equivalently H₁: μ₁ − μ₂ < 0
Case (3): H₁: μ₁ ≠ μ₂, or equivalently H₁: μ₁ − μ₂ ≠ 0
Assumptions:
(i) Let x₁₁, x₁₂, ..., x₁ₙ₁ be a sample of size n₁ and x₂₁, x₂₂, ..., x₂ₙ₂ be another sample of size n₂, drawn randomly and independently from two different normal populations with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively.
(ii) The sample sizes are large, i.e. n₁ > 30 and n₂ > 30.
(iii) The population variances σ₁² and σ₂² are known (whether the sample sizes are large or small) and are equal to σ₁₀² and σ₂₀² respectively.
Method:
Let x̄₁ be the first sample mean and x̄₂ the second sample mean, given by
x̄₁ = (1/n₁) Σᵢ x₁ᵢ and x̄₂ = (1/n₂) Σᵢ x₂ᵢ.
Under the null hypothesis the test statistic is given by
Z = [(x̄₁ − x̄₂) − E(x̄₁ − x̄₂)] / √var(x̄₁ − x̄₂) ~ N(0, 1)
  = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1)
  = (x̄₁ − x̄₂) / √(σ₁₀²/n₁ + σ₂₀²/n₂) ~ N(0, 1).
If the population variances are unknown, we have to use their estimated values. The estimates of σ₁² and σ₂² are
σ̂₁² = s₁² = Σᵢ (x₁ᵢ − x̄₁)²/(n₁ − 1) and σ̂₂² = s₂² = Σᵢ (x₂ᵢ − x̄₂)²/(n₂ − 1).
If the variances are unknown and the sample sizes are large, the test statistic under the null hypothesis is
Z = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) ~ N(0, 1) = Z_cal.
Let the level of significance be α.
Example: A random sample of the marks obtained in Business Statistics by 50 BBA students of university A is drawn from a normal population with mean μ₁ and variance 25; the average mark in this sample is 68. Another random sample of the marks obtained in Business Statistics by 60 BBA students of university B is drawn from another normal population with mean μ₂ and variance 16; the average mark in this sample is 65. Test the significance of the difference between the two population means.
Solution: Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ₁ − μ₂ = 0,
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂, or equivalently H₁: μ₁ − μ₂ > 0
Case (2): H₁: μ₁ < μ₂, or equivalently H₁: μ₁ − μ₂ < 0
Case (3): H₁: μ₁ ≠ μ₂, or equivalently H₁: μ₁ − μ₂ ≠ 0
Here it is given that n₁ = 50, n₂ = 60, x̄₁ = 68, x̄₂ = 65, σ₁² = 25 and σ₂² = 16.
Under the null hypothesis the test statistic is given by
Z = (68 − 65)/√(25/50 + 16/60) ~ N(0, 1)
  = 3.426.
Let the level of significance be α = 5%.
Z-Test for Comparison of Two Population Means (Variances Known and Equal)
Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ₁ − μ₂ = 0,
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂, or equivalently H₁: μ₁ − μ₂ > 0
Case (2): H₁: μ₁ < μ₂, or equivalently H₁: μ₁ − μ₂ < 0
Case (3): H₁: μ₁ ≠ μ₂, or equivalently H₁: μ₁ − μ₂ ≠ 0
Assumptions:
(i) Let x₁₁, x₁₂, ..., x₁ₙ₁ be a sample of size n₁ and x₂₁, x₂₂, ..., x₂ₙ₂ be another sample of size n₂, drawn randomly and independently from two different normal populations with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively.
(ii) The population variances σ₁² and σ₂² are known and equal (whether n₁ > 30, n₂ > 30 or n₁ < 30, n₂ < 30); their common value is σ².
Method: Let x̄₁ be the first sample mean and x̄₂ the second sample mean, given by
x̄₁ = (1/n₁) Σᵢ x₁ᵢ and x̄₂ = (1/n₂) Σᵢ x₂ᵢ.
Under the null hypothesis the test statistic is given by
Z = [(x̄₁ − x̄₂) − E(x̄₁ − x̄₂)] / √var(x̄₁ − x̄₂) ~ N(0, 1)
  = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1)
  = (x̄₁ − x̄₂) / √(σ²/n₁ + σ²/n₂) ~ N(0, 1)
  = (x̄₁ − x̄₂) / [σ √(1/n₁ + 1/n₂)] ~ N(0, 1).
Let the level of significance be α.
Example: A sample of the lifetimes of 20 bulbs of brand A and another sample of the lifetimes of 15 bulbs of brand B are selected randomly and independently from two different populations. The average lifetime of brand A is 1500 hours and that of brand B is 1570 hours. The two populations of bulb lifetimes are assumed to be normally distributed with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively. Test the difference between the two population means if σ₁² = σ₂² = 144.
Solution: Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ₁ − μ₂ = 0,
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂, or equivalently H₁: μ₁ − μ₂ > 0
Case (2): H₁: μ₁ < μ₂, or equivalently H₁: μ₁ − μ₂ < 0
Case (3): H₁: μ₁ ≠ μ₂, or equivalently H₁: μ₁ − μ₂ ≠ 0
Here it is given that n₁ = 20, n₂ = 15, x̄₁ = 1500, x̄₂ = 1570, and σ₁² = σ₂² = 144.
Under the null hypothesis the test statistic is given by
Z = (1500 − 1570)/[12 √(1/20 + 1/15)] ~ N(0, 1)
  = −17.0783.
Let the level of significance be α = 5%.

t-Test for a Population Mean Equal to a Specified Value, Say μ₀ (Variance Unknown)
Here the null hypothesis to be tested is
H₀: μ = μ₀
against the following three alternative hypotheses:
Case (1): H₁: μ > μ₀
Case (2): H₁: μ < μ₀
Case (3): H₁: μ ≠ μ₀
Assumptions:
(i) Let x₁, x₂, ..., xₙ be a random sample of size n (where n < 30) from a normal population with mean μ and variance σ².
(ii) The population variance σ² is unknown.
Method: Let x̄ be the sample mean, given by
x̄ = (Σᵢ₌₁ⁿ xᵢ)/n.
Under the null hypothesis the test statistic is given by
t = (x̄ − E(x̄))/√var(x̄) ~ t with (n − 1) d.f.
  = (x̄ − μ₀)/√(σ²/n) ~ t with (n − 1) d.f.
Since the population variance σ² is unknown, we replace it by its estimated value. The estimate of σ², obtained from the sample observations, is denoted by s² and given by
σ̂² = s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1).
Thus we have
t = (x̄ − μ₀)/√(s²/n) ~ t with (n − 1) d.f.
Let the level of significance be α.
Example: A large number of customers complain to the sales manager of a Coca-Cola company that each 2-litre bottle contains less than the labelled amount. A quality control inspector of the company is therefore interested in testing whether the mean content of the 2-litre bottles differs from the labelled amount of 2 litres. The inspector draws a random sample of 15 bottles from different markets and finds that the average content is 1.85 litres with a standard deviation of 0.5 litres. Does the sample evidence indicate that the customers' claim is true at the 5% level of significance?

Solution: The null hypothesis to be tested is


H₀: μ = 2
against the alternative hypothesis
H₁: μ < 2.
Let x̄ be the sample mean and s² the estimated value of σ², given by
x̄ = (Σᵢ₌₁ⁿ xᵢ)/n = 1.85 and s² = Σᵢ₌₁ⁿ (xᵢ − x̄)²/(n − 1) = 0.25.
Under the null hypothesis the test statistic is given by
t = (1.85 − 2)/√(0.25/15) ~ t with 14 d.f.
  = −1.16190.
Let the level of significance be α = 0.05.
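The sketch below is not part of the text; it reproduces this lower-tailed t-test from the reported summary statistics and compares the statistic with the 5% critical value.

```python
# Minimal sketch (not from the text) reproducing the one-sample t-test above from
# summary statistics: n = 15, xbar = 1.85, s = 0.5, H0: mu = 2, H1: mu < 2.
from math import sqrt
from scipy.stats import t

n, xbar, s, mu0, alpha = 15, 1.85, 0.5, 2.0, 0.05
t_stat = (xbar - mu0) / (s / sqrt(n))      # ~ -1.162
p_value = t.cdf(t_stat, df=n - 1)          # lower-tail area
t_crit = t.ppf(alpha, df=n - 1)            # ~ -1.761

decision = "reject H0" if t_stat < t_crit else "do not reject H0"
print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}, p-value = {p_value:.4f}: {decision}")
```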
t-Test for Testing Equality of Two Population Means (Population Variances Unknown and Equal)
Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ₁ − μ₂ = 0,
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂, or equivalently H₁: μ₁ − μ₂ > 0
Case (2): H₁: μ₁ < μ₂, or equivalently H₁: μ₁ − μ₂ < 0
Case (3): H₁: μ₁ ≠ μ₂, or equivalently H₁: μ₁ − μ₂ ≠ 0
Assumptions:
(i) Let x₁₁, x₁₂, ..., x₁ₙ₁ be a sample of size n₁ and x₂₁, x₂₂, ..., x₂ₙ₂ be another sample of size n₂, drawn randomly and independently from two different normal populations with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively.
(ii) The sample sizes are small, i.e. n₁ < 30 and n₂ < 30.
(iii) The population variances σ₁² and σ₂² are unknown but equal.
Method: Let x̄₁ be the first sample mean and x̄₂ the second sample mean, given by
x̄₁ = (1/n₁) Σᵢ x₁ᵢ and x̄₂ = (1/n₂) Σᵢ x₂ᵢ.
Since the population variances σ₁² and σ₂² are unknown, we replace them by their estimated values, obtained from the sample observations and denoted by s₁² and s₂²:
σ̂₁² = s₁² = Σᵢ (x₁ᵢ − x̄₁)²/(n₁ − 1) and σ̂₂² = s₂² = Σᵢ (x₂ᵢ − x̄₂)²/(n₂ − 1).
Since the population variances are equal, let σ₁² = σ₂² = σ². The best estimate of the common population variance σ² is
s² = [Σᵢ (x₁ᵢ − x̄₁)² + Σᵢ (x₂ᵢ − x̄₂)²]/(n₁ + n₂ − 2) = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2).
Under the null hypothesis the test statistic is given by
t = [(x̄₁ − x̄₂) − E(x̄₁ − x̄₂)] / √var(x̄₁ − x̄₂) ~ t with (n₁ + n₂ − 2) d.f.
  = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂) ~ t with (n₁ + n₂ − 2) d.f.
  = (x̄₁ − x̄₂) / √(s²/n₁ + s²/n₂) ~ t with (n₁ + n₂ − 2) d.f.
  = (x̄₁ − x̄₂) / [s √(1/n₁ + 1/n₂)] ~ t with (n₁ + n₂ − 2) d.f.
Let the level of significance be α.
Example: The Business Studies Faculty of a public university A offers both day-shift and night-shift MBA classes. The results of a random sample of 15 day-shift students are x̄₁ = 78 and s₁² = 16. The results of a random sample of 16 night-shift students are x̄₂ = 85 and s₂² = 25. Suppose both populations are normally distributed with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively. Test the significance of the difference of the population means if the population variances are equal.
Solution: Here the null hypothesis to be tested is
H₀: μ₁ = μ₂
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂
Case (2): H₁: μ₁ < μ₂
Case (3): H₁: μ₁ ≠ μ₂
Here it is given that n₁ = 15, n₂ = 16, x̄₁ = 78, x̄₂ = 85, s₁² = 16 and s₂² = 25.
Since the population variances are equal, the best estimate of the common population variance is
s² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = (14 × 16 + 15 × 25)/(15 + 16 − 2) = 20.65.
Under the null hypothesis the test statistic is given by
t = (78 − 85)/[4.54 √(1/15 + 1/16)] ~ t with 29 d.f.
  = −4.2901.
Let the level of significance be 5%.
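The pooled two-sample t-test above can be reproduced from the summary statistics with scipy's ttest_ind_from_stats. The sketch below is not part of the text.

```python
# Minimal sketch (not from the text) reproducing the pooled two-sample t-test
# above from the reported summary statistics (equal but unknown variances).
from math import sqrt
from scipy.stats import ttest_ind_from_stats

n1, xbar1, s1 = 15, 78.0, sqrt(16.0)
n2, xbar2, s2 = 16, 85.0, sqrt(25.0)

t_stat, p_value = ttest_ind_from_stats(xbar1, s1, n1, xbar2, s2, n2, equal_var=True)
print(f"t = {t_stat:.4f} with {n1 + n2 - 2} d.f., two-sided p-value = {p_value:.5f}")
```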

t-Test for Testing Equality of Two Population Means (Population Variances Unknown and Unequal)
Here the null hypothesis to be tested is
H₀: μ₁ = μ₂
against the following three alternative hypotheses:
Case (1): H₁: μ₁ > μ₂
Case (2): H₁: μ₁ < μ₂
Case (3): H₁: μ₁ ≠ μ₂
Assumptions:
(i) Let x₁₁, x₁₂, ..., x₁ₙ₁ and x₂₁, x₂₂, ..., x₂ₙ₂ be two samples of sizes n₁ and n₂, drawn randomly and independently from two normal populations with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively.
(ii) The sample sizes are small, i.e. n₁ < 30 and n₂ < 30.
(iii) The population variances σ₁² and σ₂² are unknown and are not equal.
Method: Let x̄₁ be the first sample mean and x̄₂ the second sample mean, given by
x̄₁ = (1/n₁) Σᵢ x₁ᵢ and x̄₂ = (1/n₂) Σᵢ x₂ᵢ.
Under the null hypothesis the test statistic is given by
t = [(x̄₁ − x̄₂) − E(x̄₁ − x̄₂)] / √var(x̄₁ − x̄₂) ~ t with m d.f.
  = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂) ~ t with m d.f.
Since the population variances σ₁² and σ₂² are unknown and unequal, we replace them by their estimated values, obtained from the sample observations and denoted by s₁² and s₂²:
σ̂₁² = s₁² = Σᵢ (x₁ᵢ − x̄₁)²/(n₁ − 1) and σ̂₂² = s₂² = Σᵢ (x₂ᵢ − x̄₂)²/(n₂ − 1).
Thus we have
t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) ~ t with m d.f.,
which is approximately distributed as Student's t with m degrees of freedom, where
m = (s₁²/n₁ + s₂²/n₂)² / [ (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) ].
In general the value of m given by this formula is not an integer, and we may use the nearest integer value for m.
Let the level of significance be α.
Example: Suppose the manager of a textile factory suspects that the mean time lost due to sickness by night-shift workers exceeds the mean time lost by day-shift workers. To check this, the manager randomly selects 12 workers in each shift category and records the number of days lost due to sickness within the past year.
Night shift: 12, 10, 20, 15, 18, 9, 12, 10, 21, 25, 13, 8.
Day shift: 8, 10, 15, 9, 12, 16, 15, 20, 5, 18, 12, 7.
If the numbers of days per year lost due to sickness by the night-shift and day-shift workers are normally distributed with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively, test the significance of the difference of the population means if the population variances are not equal.
Solution: Here the null hypothesis to be tested is
H₀: μ₁ = μ₂
against the alternative hypothesis
H₁: μ₁ > μ₂.
Let x̄₁ be the sample mean for the night-shift workers and x̄₂ the sample mean for the day-shift workers, given by
x̄₁ = (1/12) Σᵢ x₁ᵢ = 173/12 = 14.42 and x̄₂ = (1/12) Σᵢ x₂ᵢ = 147/12 = 12.25.
The sample variances are
s₁² = Σᵢ (x₁ᵢ − x̄₁)²/(n₁ − 1) = 29.36 and s₂² = Σᵢ (x₂ᵢ − x̄₂)²/(n₂ − 1) = 21.48.
Under the null hypothesis the test statistic is given by
t = (14.42 − 12.25)/√(29.36/12 + 21.48/12) ~ t with m d.f.
  = 1.0543,
where
m = (29.36/12 + 21.48/12)² / [ (29.36/12)²/11 + (21.48/12)²/11 ] ≈ 21.
Let the level of significance be 5%.
At the 5% level of significance with 21 degrees of freedom the critical value of the test statistic is 1.721. Since the calculated value 1.0543 is less than 1.721, we do not reject the null hypothesis.
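Welch's unequal-variance form of the t-test used in this example is what scipy computes with equal_var=False. The sketch below is not part of the text; it applies the test to the sickness-days data with the one-sided alternative H₁: μ₁ > μ₂.

```python
# Minimal sketch (not from the text) of Welch's unequal-variance t-test on the
# sickness-days data above, with the one-sided alternative H1: mu1 > mu2.
import numpy as np
from scipy import stats

night = np.array([12, 10, 20, 15, 18, 9, 12, 10, 21, 25, 13, 8])
day   = np.array([8, 10, 15, 9, 12, 16, 15, 20, 5, 18, 12, 7])

# equal_var=False gives the Welch statistic with the approximate d.f. used in the
# text; the alternative keyword requires scipy >= 1.6.
t_stat, p_value = stats.ttest_ind(night, day, equal_var=False, alternative="greater")
print(f"t = {t_stat:.4f}, one-sided p-value = {p_value:.4f}")
```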

t-Test for Comparison of Two Correlated Population Means (Paired t-Test)


Here the null hypothesis to be tested is
H₀: μ_x = μ_y, or equivalently H₀: μ_d = μ_x − μ_y = 0,
against the following three alternative hypotheses:
Case (1): H₁: μ_x > μ_y, or equivalently H₁: μ_d = μ_x − μ_y > 0
Case (2): H₁: μ_x < μ_y, or equivalently H₁: μ_d = μ_x − μ_y < 0
Case (3): H₁: μ_x ≠ μ_y, or equivalently H₁: μ_d = μ_x − μ_y ≠ 0
Assumptions:
(i) Let (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ) be a random sample of n independent pairs of observations drawn from a bivariate normal population.
(ii) Define dᵢ = xᵢ − yᵢ, i = 1, 2, ..., n.
(iii) Then d₁, d₂, ..., dₙ constitute a random sample of size n from a normal population with mean μ_d and variance σ_d².
Method: The sample mean d̄ and the sample variance s_d² are given by
d̄ = (Σᵢ₌₁ⁿ dᵢ)/n and s_d² = Σᵢ₌₁ⁿ (dᵢ − d̄)²/(n − 1).
Under the null hypothesis the test statistic is given by
t = d̄ / √(s_d²/n) ~ t with (n − 1) d.f.,
which is distributed as Student's t with (n − 1) degrees of freedom.
Let the level of significance be α.
Example: An experiment is conducted to compare the ages of husbands and wives at the time of their marriage. A random sample of 15 couples is drawn, and the ages of the husbands and wives at the time of marriage are recorded below.
No. of Couples   Age of Husband   Age of Wife
1 25 20
2 30 25
3 23 18
4 22 18
5 35 22
6 24 20
7 36 22
8 24 21
9 40 35
10 45 30
11 36 25
12 22 18
13 20 22
14 44 30
15 40 32
Test whether there is evidence that the mean age of the husbands, μ₁, exceeds the mean age of the wives, μ₂.
Solution: Here the null hypothesis to be tested is
H₀: μ₁ = μ₂, or equivalently H₀: μ_d = μ₁ − μ₂ = 0,
against the alternative hypothesis
H₁: μ₁ > μ₂, or equivalently H₁: μ_d = μ₁ − μ₂ > 0.
Let us define dᵢ (i = 1, 2, ..., 15) as the difference between the husband's and wife's age for the ith couple; the values are shown in the table below.
Table: Calculation of dᵢ = xᵢ − yᵢ, the difference between husband's and wife's age
No. of Couples   Age of Husband (xᵢ)   Age of Wife (yᵢ)   dᵢ = xᵢ − yᵢ
1 25 20 5
2 30 25 5
3 23 18 5
4 22 18 4
5 35 22 13
6 24 20 4
7 36 22 14
8 24 21 3
9 40 35 5
10 45 30 15
11 36 25 11
12 22 18 4
13 20 22 2
14 44 30 14
15 40 32 8
Here d₁, d₂, ..., d₁₅ constitute a random sample of size 15 from a normal population with mean μ_d and variance σ_d². Let d̄ be the estimated value of μ_d and s_d² the estimated value of σ_d², given by
d̄ = (Σᵢ₌₁¹⁵ dᵢ)/15 = 112/15 = 7.467 and s_d² = Σᵢ₌₁¹⁵ (dᵢ − d̄)²/14 = 21.124.
Under the null hypothesis the test statistic is given by
t = 7.467/√(21.124/15) ~ t with 14 d.f.
  = 6.284,
which is distributed as Student's t with 14 degrees of freedom.
Let the level of significance be α = 5%.
At the 5% level of significance with 14 degrees of freedom the critical value of the test statistic is 1.761. Since the calculated value 6.284 exceeds 1.761, we reject the null hypothesis and conclude that the mean age of husbands exceeds that of wives.
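The paired t-test above can be reproduced from the differences dᵢ listed in the table. The sketch below is not part of the text and simply repeats the calculation; the scipy call mentioned in the final comment is an equivalent way to run the test from the raw paired observations.

```python
# Minimal sketch (not from the text) reproducing the paired t-test above from the
# d_i values listed in the table (one-sided H1: mu_d > 0).
import numpy as np
from scipy.stats import t

d = np.array([5, 5, 5, 4, 13, 4, 14, 3, 5, 15, 11, 4, 2, 14, 8])
n = len(d)
dbar, s2d = d.mean(), d.var(ddof=1)
t_stat = dbar / np.sqrt(s2d / n)        # ~ 6.29
t_crit = t.ppf(0.95, df=n - 1)          # ~ 1.761
print(f"dbar = {dbar:.3f}, s_d^2 = {s2d:.3f}, t = {t_stat:.3f}, critical value = {t_crit:.3f}")
# scipy.stats.ttest_rel(husbands, wives, alternative="greater") runs the same test
# directly from the paired raw observations.
```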
t-Test for a Population Correlation Coefficient ρ Equal to Zero
Here the null hypothesis to be tested is
H₀: ρ = 0
against the following three alternative hypotheses:
Case 1: H₁: ρ > 0
Case 2: H₁: ρ < 0
Case 3: H₁: ρ ≠ 0
where ρ is the population correlation coefficient.
Assumptions
(i) Let (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ) be a random sample of n pairs of observations.
(ii) These sample observations are drawn from a bivariate normal population.
(iii) The relationship between y and x is linear.
Method: The estimated value of ρ is given by
r = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / √[ Σᵢ₌₁ⁿ (xᵢ − x̄)² Σᵢ₌₁ⁿ (yᵢ − ȳ)² ]
  = (Σᵢ₌₁ⁿ xᵢyᵢ − n x̄ ȳ) / √[ (Σᵢ₌₁ⁿ xᵢ² − n x̄²)(Σᵢ₌₁ⁿ yᵢ² − n ȳ²) ],
where x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ and ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ.
Under the null hypothesis the test statistic is given by
t = r √(n − 2) / √(1 − r²) ~ t with (n − 2) d.f.
Let the level of significance be α.
Case 1: At the α level of significance with (n − 2) degrees of freedom, the critical value of the test statistic is t(n−2); α, shown by the shaded area of the corresponding figure.

Figure: Critical value of the t-test statistic for testing H₀: ρ = 0 against H₁: ρ > 0 at significance level α.
Comment: If the calculated value of the test statistic is greater than the critical value, we reject the null hypothesis, which means there is a positive relationship between the two variables x and y; otherwise we accept the null hypothesis.
Case 2: At the α level of significance with (n − 2) degrees of freedom, the critical value of the test statistic is −t(n−2); α, shown by the shaded area of the corresponding figure.

Figure: Critical value of the t-test statistic for testing H₀: ρ = 0 against H₁: ρ < 0 at significance level α.
Comment: If the calculated value of the test statistic is greater than the critical value, we accept the null hypothesis, which means there is no relationship between the two variables x and y; otherwise we reject the null hypothesis, which implies that x and y are negatively related.
Case 3: At the α level of significance with (n − 2) degrees of freedom, the critical values are −t(n−2); α/2 and t(n−2); α/2, shown by the shaded areas of the corresponding figure.

Figure: Critical values of the t-test statistic for testing H₀: ρ = 0 against H₁: ρ ≠ 0 at significance level α.
Comment: If the calculated value of the test statistic does not fall between −t(n−2); α/2 and t(n−2); α/2, we reject the null hypothesis, which means the two variables are related; otherwise we accept the null hypothesis.
Example: The data on sales revenue and advertising expenditure of 12 factories are given in the table below.
Table: Sales revenue (in Lac Taka) and advertising expenditure (in Lac Taka) of 12 factories
Factories   Sales Revenue (in Lac Taka)   Advertising Expenditure (in Lac Taka)
1 35 2
2 30 1.5
3 45 2.5
4 50 3
5 80 5
6 85 5.5
7 65 4.5
8 70 6
9 45 4
10 48 3.5
11 52 5
12 49 6
Test the relationship between sales revenue and advertising expenditure at the 5% level of significance.
Solution: Here the null hypothesis to be tested is
H₀: ρ = 0
against the alternative hypothesis
H₁: ρ ≠ 0,
where ρ is the population correlation coefficient.
Assumptions
(i) Let (x₁, y₁), (x₂, y₂), ..., (x₁₂, y₁₂) be a random sample of 12 pairs of observations of sales revenue and advertising expenditure from a bivariate normal population.
(ii) The relationship between y and x is linear.
Method: The estimated value of ρ is
r = (Σᵢ₌₁¹² xᵢyᵢ − 12 x̄ ȳ) / √[ (Σᵢ₌₁¹² xᵢ² − 12 x̄²)(Σᵢ₌₁¹² yᵢ² − 12 ȳ²) ].
Now x̄ = 654/12 = 54.5, ȳ = 48.5/12 = 4.0417, Σ xᵢ² = 38834, Σ xᵢyᵢ = 2859.5 and Σ yᵢ² = 222.25.
Putting these values into the formula for r, we have
r = (2859.5 − 12 × 54.5 × 4.0417) / √[ (38834 − 12 × 54.5²)(222.25 − 12 × 4.0417²) ]
  = 0.7475.
Under the null hypothesis the test statistic is given by
t = 0.7475 √10 / √(1 − 0.7475²) ~ t with 10 d.f.
  = 3.5583.
Let the level of significance be 5%.
At the 5% level of significance with 10 degrees of freedom the critical values of the test statistic are 2.2281 and −2.2281.
Decision: Since the calculated value of the test statistic does not fall in the acceptance region, we reject the null hypothesis, which means that the two variables, sales revenue and advertising expenditure, are (positively) correlated.
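The correlation test in this example can be reproduced with scipy. The sketch below is not part of the text; pearsonr returns the same r together with a two-sided p-value, and the t statistic used in the text is recomputed from r.

```python
# Minimal sketch (not from the text) reproducing the correlation test above:
# H0: rho = 0 against a two-sided alternative, using the 12 pairs of observations.
import numpy as np
from scipy import stats

sales = np.array([35, 30, 45, 50, 80, 85, 65, 70, 45, 48, 52, 49])   # Lac Taka
adver = np.array([2, 1.5, 2.5, 3, 5, 5.5, 4.5, 6, 4, 3.5, 5, 6])     # Lac Taka

r, p_value = stats.pearsonr(sales, adver)        # r ~ 0.7475
n = len(sales)
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)  # ~ 3.56, the t form used in the text
print(f"r = {r:.4f}, t = {t_stat:.4f} with {n - 2} d.f., two-sided p-value = {p_value:.4f}")
```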
Z-Test for a Population Correlation Coefficient ρ Equal to a Specified Value, Say ρ₀
The null hypothesis to be tested is
H₀: ρ = ρ₀
against the following three alternative hypotheses:
Case 1: H₁: ρ > ρ₀
Case 2: H₁: ρ < ρ₀
Case 3: H₁: ρ ≠ ρ₀
Assumptions
(i) Let (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ) be a random sample of n pairs of observations.
(ii) These sample observations are drawn from a bivariate normal population.
(iii) The relationship between the two variables x and y is linear.
Method: The estimated value of ρ is
r = (Σᵢ₌₁ⁿ xᵢyᵢ − n x̄ ȳ) / √[ (Σᵢ₌₁ⁿ xᵢ² − n x̄²)(Σᵢ₌₁ⁿ yᵢ² − n ȳ²) ],
where x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ and ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ.
To test this null hypothesis we need a transformation known as Fisher's Z-transformation of the correlation coefficient, defined as
Z = (1/2) ln[(1 + r)/(1 − r)].
There are three main reasons for this transformation:
(1) The variance of r depends on ρ, which is an unknown constant, whereas the variance of Z does not depend on ρ; in fact the variance of Z is approximately 1/(n − 3).
(2) The distribution of r is far from normal, whereas the distribution of Z is approximately normal.
(3) The distribution of r changes its form rapidly as ρ changes, whereas the form of the distribution of Z is nearly constant.
The value of Z corresponding to a given value of r can be obtained directly from the Fisher-Yates tables.
Now, to test the null hypothesis we find
Z = (1/2) ln[(1 + r)/(1 − r)].
The expected value of Z is
E(Z) = (1/2) ln[(1 + ρ)/(1 − ρ)]
     = (1/2) ln[(1 + ρ₀)/(1 − ρ₀)] (under H₀).
Under the null hypothesis the test statistic is given by
d = [Z − E(Z)] / √var(Z) ~ N(0, 1)
  = { (1/2) ln[(1 + r)/(1 − r)] − (1/2) ln[(1 + ρ₀)/(1 − ρ₀)] } / √(1/(n − 3)).
Let the level of significance be α.
Case 1: At the α level of significance the critical value of the test statistic is z(α), shown by the shaded area of the corresponding figure.
Comment: If the calculated value of the test statistic is greater than the critical value, we reject the null hypothesis; otherwise we accept it.
Case 2: At the α level of significance the critical value of the test statistic is −z(α), shown by the shaded area of the corresponding figure.
Comment: If the calculated value of the test statistic is less than the critical value, we reject the null hypothesis; otherwise we accept it.
Case 3: At the α level of significance the critical values of the test statistic are −z(α/2) and z(α/2), shown by the shaded areas of the corresponding figure.
Comment: If the calculated value of the test statistic falls between these two critical values, we accept the null hypothesis; otherwise we reject it.
Example: A random sample of 15 pairs of observations on two kinds of assessment, the internal assessment and the external assessment of the performance of BBA students in Business Statistics at a public university, is drawn, and the results are given below.
Table: Internal and external assessment of the performance of BBA students in Business Statistics
Student No.   Internal Assessment   External Assessment
1 85 74
2 65 55
3 85 82
4 50 40
5 76 56
6 85 76
7 65 48
8 70 68
9 45 50
10 75 69
11 52 47
12 95 86
13 48 34
14 73 68
15 56 50
Test the hypothesis that the population correlation coefficient between the two kinds of assessment is 0.80 at the 5% level of significance.
Solution: Here the null hypothesis to be tested is
H₀: ρ = 0.80
against the alternative hypothesis
H₁: ρ ≠ 0.80.
Assumptions
(i) Let (x₁, y₁), (x₂, y₂), ..., (x₁₅, y₁₅) be a random sample of 15 pairs of observations.
(ii) These sample observations are drawn from a bivariate normal population.
(iii) The relationship between the two variables y and x is linear.
Method: The estimated value of ρ is
r = (Σᵢ₌₁¹⁵ xᵢyᵢ − 15 x̄ ȳ) / √[ (Σᵢ₌₁¹⁵ xᵢ² − 15 x̄²)(Σᵢ₌₁¹⁵ yᵢ² − 15 ȳ²) ] = 0.9219.
The Fisher Z-transformation of the correlation coefficient gives
Z = (1/2) ln[(1 + 0.9219)/(1 − 0.9219)] = 1.6015.
The expected value of Z is
E(Z) = (1/2) ln[(1 + 0.80)/(1 − 0.80)] = 1.0986.
Under the null hypothesis the test statistic is given by
d = [Z − E(Z)]/√var(Z) ~ N(0, 1)
  = (1.6015 − 1.0986)/√(1/12)
  = 1.7421.
Here the level of significance is 5%.
At the 5% level of significance the critical values of the test statistic are −1.96 and 1.96.
Decision: Since the calculated value of the test statistic falls in the acceptance region, we accept the null hypothesis, which implies that the population correlation coefficient between the internal and external assessments of the BBA students in Business Statistics is about 0.80.
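The Fisher Z-transformation test in this example can be reproduced in a few lines. The sketch below is not part of the text; it recomputes the transformed values and the standard normal statistic d from the reported r = 0.9219.

```python
# Minimal sketch (not from the text) reproducing the Fisher Z-transformation test
# above: H0: rho = 0.80 with n = 15 and observed r = 0.9219.
import math
from scipy.stats import norm

n, r, rho0, alpha = 15, 0.9219, 0.80, 0.05
z_r   = 0.5 * math.log((1 + r) / (1 - r))        # ~ 1.6015 (math.atanh(r) is equivalent)
z_rho = 0.5 * math.log((1 + rho0) / (1 - rho0))  # ~ 1.0986
d = (z_r - z_rho) * math.sqrt(n - 3)             # ~ 1.742
z_crit = norm.ppf(1 - alpha / 2)                 # ~ 1.96

decision = "reject H0" if abs(d) > z_crit else "accept H0"
print(f"d = {d:.4f}, critical values = ±{z_crit:.2f}: {decision}")
```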



Problem 1 A sample of 100 bulbs of brand A has an average lifetime of 1600 hours with a standard deviation of 16 hours, and another sample of 80 bulbs of brand B has an average lifetime of 1500 hours with a standard deviation of 20 hours. If the two populations are normally distributed with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively, test the difference between the two population means.
Problem 2 A random sample of size 40 of the marks obtained in Economics is drawn from a group of students of section A; the average mark is 76. The average mark in Economics of a random sample of size 60 from another group of students of section B is 84. The marks in Economics of these two groups are normally distributed with means μ₁ and μ₂ and variances 25 and 16 respectively. Test the null hypothesis that the average marks in Economics of these two groups are the same.
Problem 3 A researcher of BRI wants to compare two varieties of paddy, A and B, in terms of their yields. He cultivated 10 acres of land for each variety under similar conditions. The average yield per acre of variety A is 35 quintals with a standard deviation of 5 quintals, and for variety B the average yield per acre is 32 quintals with a standard deviation of 4 quintals. If the yields of the two varieties of paddy are normally distributed with means μ₁ and μ₂ and variances σ₁² and σ₂² respectively, test the null hypothesis that the average yields of the two varieties of paddy are the same.
Problem 4 The average points obtained in Business Statistics by a random sample of 10 students from a public university A is 85 with a standard deviation of 5 points, and by another random sample of 15 students from another public university B is 90 with a standard deviation of 4 points. If the points obtained by the students of universities A and B are normally distributed with means μ₁ and μ₂ and common variance σ², test the null hypothesis that the average points of university A are less than the average points of university B.

Problem 5 The manager of a large branch of a public bank selected a random sample of the waiting times of 25 clients from a large number of clients in the Motijheel area; the waiting times (in minutes) are: 25.0, 23.5, 18.46, 22.0, 15.40, 16.50, 20.40, 25.02, 23.25, 22.45, 21.05, 18.50, 17.22, 20.45, 19.25, 20.0, 18.5, 18.46, 15.0, 15.40, 16.50, 14.40, 15.02, 14.25, 20.45. He also selected another random sample of the waiting times of 20 clients from a large number of clients in the Gulshan area; the waiting times (in minutes) are: 12.50, 12.22, 15.45, 14.25, 15.05, 13.50, 12.20, 14.50, 12.25, 18.0, 18.5, 18.46, 18.0, 19.40, 18.50, 20.40, 20.02, 18.25, 17.45, 12.05. Is there a significant difference between the mean waiting times of the clients in the Motijheel and Gulshan areas, assuming that the waiting times of clients in both areas are normally distributed?

Problem 6 The following data give the marks out of 50 obtained by some students in Business Statistics in two tests, one before and one after special coaching.
First Test (Before Coaching)   Second Test (After Coaching)
33 34
30 30
29 31
32 28
28 30
31 34
25 30
27 32
33 33
26 30
29 32
30 29
28 33
27 30
26 35
22 29
24 29
Test the null hypothesis that the correlation coefficient between the marks obtained in the first and second tests is significant at the 5% level of significance.
Problem 7. The following data indicate the R&D expenditure (in million TK) and annual profits (in million TK) of a company over a period of time.
R&D Expenditure   Annual Profits
5 22
8 25
9 35
10 50
7 34
9 40
8 32
10 45
12 50
8 32
Test the null hypothesis that the correlation coefficient between R & D expenditure and annual
profits is statistically significant at 5% level.
Problem 8. The following data indicate the points obtained by the students of KIMEP in Economics and Econometrics.
Student No. Economics Econometrics
1 91 84
2 85 45
3 54 67
4 43 52
5 92 76
6 66 59
7 89 90
8 77 84
9 58 34
10 41 36
11 65 45
12 60 48
Test the null hypothesis that the correlation coefficient between the points obtained by the students of KIMEP in Economics and Econometrics is greater than 0.85.
Problem 9 The following data give the marks out of 50 obtained by some students in Business Statistics in two tests, one before and one after special coaching.
First Test (Before Coaching)   Second Test (After Coaching)
33 34
30 30
29 31
32 28
28 30
31 34
25 30
27 32
33 33
Test the null hypothesis that the correlation coefficient between the marks obtained in the first and second tests is significant at the 5% level of significance.

