
UNIT-IV

1. Testing of Hypothesis

1.1 Introduction
One objective of statistical investigation is to evaluate, after estimating a population
mean or proportion, whether there is a significant difference between the estimated parameter
and the true parameter. Naturally a question arises: “Does the estimated parameter conform to
the real parameter, or is there a considerable difference between them?” The answer leads us to
the evaluation of that difference. There are tests to assess the significance of such a
difference, which are called significance tests.

The basis of statistical tests is the hypothesis. First we form a hypothesis regarding the
population. Then we conduct a test to assess whether there is any significant difference. The
hypothesis will be accepted or rejected according to the significance of the difference revealed
by the test. Therefore, significance tests are also called hypothesis tests.

In the social sciences, where direct knowledge of population parameters is rare, significance
(or hypothesis) testing is the strategy most often used for deciding whether sample data
support hypothesised population characteristics.

1.2 Some Basic Concepts


Significance tests rest on several basic theoretical concepts. In order to
conduct the tests, knowledge of the following basic concepts is essential.

Population Parameters and Sample Statistic


The process of selecting a sample from a population is called sampling. In sampling, a
representative sample of elements of a population is selected and then analysed. A measure
(or value) which describes a characteristic of the entire population or process is called a
parameter. For example, quantities such as the mean (μ), standard deviation (σ), variance (σ²)
and proportion (p) are called parameters of the population. A measure (or value) computed from
the sample data is called a sample statistic. Sample statistics are usually denoted by x̄ (mean), s
(standard deviation), s² (variance), p̄ (proportion), etc.

Parametric and non - parametric tests


On the basis of the focus of the test, tests can be classified as parametric and non-parametric.
In certain tests, an assumption about the population distribution can be made. For example, in a
large sample test or Z test we assume that samples are drawn from a population following the normal
distribution. Tests which are based on such assumptions about the population are called
parametric tests. These tests focus on the means, proportions, variances or standard deviations
of samples or populations; accordingly, all mean tests, proportion tests and variance tests are
parametric tests. But in certain situations it is not possible to make any assumption about the
population distribution from which samples are drawn, and the tests do not focus on
parameters like the mean, proportion or variance. Such tests are called non-parametric tests.
Non-parametric tests include the Chi-square test, Rank test, Sign test, Runs test, etc.

Types of significance tests


There are numerous types of significance tests, according to the situation and the criteria of
testing. Tests may be parametric or non-parametric, one-tailed or two-tailed, small sample or
large sample, etc.
Small sample and large sample tests
According to the number of items included in a sample, tests can be divided into small sample
tests and large sample tests. If the test uses a sample of size less than 30, it is a small
sample test. When the size is 30 or more, it is called a large sample test. Small sample tests
follow Student’s t distribution; large sample tests follow the normal distribution (the Z-test).
Mean tests may be conducted as either small or large sample tests, but proportion tests are
generally conducted as large sample tests only.

1.3 Sampling distribution


From a population, several samples may be collected, and from each sample some statistic
like the mean, median, range or standard deviation can be computed. The distribution of such a
statistic over all possible samples is called a sampling distribution; it is a probability
distribution. A sample statistic is a random variable, and its probability distribution is
called the sampling distribution of that statistic. Accordingly, there will be a sampling
distribution of means, a sampling distribution of standard deviations, and so on.

Standard error
A sampling distribution contains many values of a sample statistic, such as sample means. A grand
mean of such a sampling distribution can be ascertained, and deviations from it calculated. The
standard deviation of the sampling distribution of a statistic is called the standard error of
that statistic. The standard error is a very useful quantitative tool in hypothesis testing. For
the sample mean, the standard error is σ/√n, where σ is the population standard deviation and n
is the sample size.

Interval Estimation
• Estimation uses sample statistics to estimate population parameters.
• Statistical inference is based on estimation and hypothesis testing.
• Types of estimates:
- Point estimate: a single number that is used to estimate an unknown
population parameter.
- Interval estimate: a range of values used to estimate a population parameter.
• Any sample statistic that is used to estimate a population parameter is called an
estimator. An estimate is a specific observed value of a statistic.
• Properties of a good estimator:
- Unbiasedness
- Efficiency
- Consistency
- Sufficiency
Point Estimates
• Point estimator of the population mean (μ): sample mean (x̄).
• Point estimator of the population variance (σ²): sample variance (s²).
• Point estimator of the population standard deviation (σ): sample standard deviation (s).
• Point estimator of the population proportion (p): sample proportion (p̄).
Interval Estimates
• An interval estimate describes a range of values within which a population parameter
is likely to appear.
• Interval estimate is constructed using point estimate, standard error and corresponding
probability.
• The probability that we associate with an interval estimate is called the confidence
level. The confidence interval is the range of the estimate we are making. Confidence
limits are the upper and lower limits of the confidence interval.
• A high confidence level seems to signify a high degree of accuracy in the estimate but
high confidence levels will produce large confidence intervals, and such large
intervals are not precise; they give very fuzzy estimates.
• A 95% confidence interval means that if we select many random samples of the same
size and calculate a confidence interval for each of these samples, then in about 95
per cent of these cases the population parameter will lie within that interval.
• Interval estimate for the mean, large sample (population s.d. known):

x̄ ± zα/2 (σ/√n)

• Interval estimate for the mean, large sample (population s.d. unknown):

x̄ ± zα/2 (s/√n)
• When the population standard deviation is not known and the sample size is 30 or
less, t-distribution is used.
• Assumption for using t-distribution: The population is normal or approximately
normal.
• The t-distribution is symmetrical but flatter than the normal distribution, and there is a
different t-distribution for different sample sizes (or degrees of freedom). As the
sample size gets larger, the shape of the t-distribution becomes approximately equal to
the normal distribution.
• Interval estimate for the mean using the t-distribution:

x̄ ± tα/2 (s/√n)

Note: (i) A larger sample provides a smaller margin of error.
(ii) The interval is constructed in essentially the same way regardless of whether z or t is used.
(iii) The interval width must increase if we want to make a statement about μ with
greater confidence.
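The z-based interval above can be sketched in a few lines of Python (a minimal illustration with hypothetical summary figures — n, x̄ and σ below are made up; the standard library's `statistics.NormalDist` supplies the z value):

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary data: n = 64 observations, sample mean 50, known sigma = 8
n, x_bar, sigma = 64, 50.0, 8.0
conf = 0.95

# z critical value z_{alpha/2} for a two-sided interval (about 1.96 for 95%)
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)

margin = z * sigma / sqrt(n)                 # margin of error: z_{alpha/2} * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
```

Raising `conf` to 0.99 widens the interval, matching note (iii); a larger `n` shrinks the margin, matching note (i).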
1.4 Statistical Hypothesis
Statistical Hypothesis is the basis of all significance tests. For a researcher, statistical
hypothesis is a claim (or an assertion, or a statement, or a formal question) that he intends to
resolve. Usually we begin with some assumptions about the population from which the sample is
drawn. These assumptions may be about the form of the population or about the parameters of
the population. Such an assumption is called a hypothesis. A hypothesis may be defined as “a
tentative conclusion logically drawn concerning the parameter or the form of the distribution
of the population.” As an example, consider the IQs of all MBA students at a leading
university. Then the hypothesis may be “the mean of the population is 110” or “the
population proportion is the same as the sample proportion.”

Null and alternative hypothesis


A null hypothesis is a statistical hypothesis which states that the difference between the
sample statistic and the population parameter is nil, or statistically insignificant. Usually
null hypotheses are formed for significance testing. Any hypothesis other than the null
hypothesis is called an alternative hypothesis. The null hypothesis is denoted by H0 and the
alternative hypothesis by H1.
One-tailed hypothesis testing specifies a direction of the statistical test. For example to test
whether cloud seeding increases the average annual rainfall in an area which usually has an
average annual rainfall of 20 cm, we define the null and alternative hypotheses as follows,
where μ represents the average rainfall after cloud seeding.
H0: µ ≤ 20 (i.e. average rainfall does not increase after cloud seeding)
H1: µ > 20 (i.e. average rainfall increases after cloud seeding)
Here the experimenters are quite sure that the cloud seeding will not significantly reduce
rainfall, and so a one-tailed test is used where the critical region is as in the shaded area in
Figure 1. The null hypothesis is rejected only if the test statistic falls in the critical region, i.e.
the test statistic has a value larger than the critical value.

Figure 1 – Critical region is the right tail

The critical region here is the right (or upper) tail. It is quite possible to have one-sided
tests where the critical region is the left (or lower) tail. For example, suppose the cloud
seeding is expected to decrease rainfall. Then the hypotheses could be as follows:
H0: µ ≥ 20 (i.e. average rainfall does not decrease after cloud seeding)
H1: µ < 20 (i.e. average rain decreases after cloud seeding)

Figure 2 – Critical region is the left tail

Two-tailed hypothesis testing doesn’t specify a direction of the test. For the cloud seeding
example, it is more common to use a two-tailed test. Here the null and alternative hypotheses
are as follows.
H0: µ = 20
H1: µ ≠ 20
The reason for using a two-tailed test is that even though the experimenters expect cloud
seeding to increase rainfall, it is possible that the reverse occurs and, in fact, a significant
decrease in rainfall results. To take care of this possibility, a two-tailed test is used, with
the critical region consisting of both the upper and lower tails.
Figure 3 – Two-tailed hypothesis testing

In this case we reject the null hypothesis if the test statistic falls in either side of the critical
region. To achieve a significance level of α, the critical region in each tail must have size α/2.

Since our sample usually only contains a subset of the data in the population, we cannot be
absolutely certain as to whether the null hypothesis is true or not. We can merely gather
information (via statistical tests) to determine whether it is likely or not. We therefore speak
about rejecting or not rejecting the null hypothesis on the basis of some test, but not
of accepting the null hypothesis or the alternative hypothesis. Often in an experiment we are
actually testing the validity of the alternative hypothesis by testing whether to reject the null
hypothesis.

Errors in Hypothesis Testing


When performing such tests, there is some chance that we will reach the wrong conclusion.
There are two types of errors:
• Type I – H0 is rejected even though it is true (false positive)
• Type II – H0 is not rejected even though it is false (false negative)
The acceptable level of a Type I error is designated by alpha (α), while the acceptable level
of a Type II error is designated by beta (β).

Significance level is the acceptable level of type I error, denoted α. Typically, a significance
level of α = .05 is used (although sometimes other levels such as α = .01 may be employed).
This means that we are willing to tolerate up to 5% of type I errors, i.e. we are willing to
accept the fact that in 1 out of every 20 samples we reject the null hypothesis even though it
is true.

The p-value (probability value) is the probability, computed assuming the null hypothesis is
true, of obtaining a test statistic at least as extreme as the one observed.
If p < α then we reject the null hypothesis.

The general procedure for null hypothesis testing is as follows:


• State the null and alternative hypotheses
• Specify α and the sample size
• Select an appropriate statistical test
• Collect data (note that the previous steps should be done prior to collecting data)
• Compute the test statistic based on the sample data
• Determine the p-value associated with the statistic
• Decide whether to reject the null hypothesis by comparing the p-value to α (i.e. reject
the null hypothesis if p < α)
• Report your results, including effect sizes
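The steps above can be sketched for a one-sample Z test (a hedged illustration: the function and all figures below are made up, not taken from the text):

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, tail="two"):
    """One-sample Z test from summary statistics.

    tail is 'two', 'left' or 'right'; returns (z, p_value)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))   # compute the test statistic
    cdf = NormalDist().cdf
    if tail == "two":
        p = 2 * (1 - cdf(abs(z)))           # probability in both tails
    elif tail == "left":
        p = cdf(z)
    else:
        p = 1 - cdf(z)
    return z, p

# Hypothetical sample: x̄ = 52, μ0 = 50, σ = 8, n = 64
z, p = z_test(52, 50, 8, 64)
reject = p < 0.05                           # final step: compare p-value with α
```

Here z = 2.0 and p ≈ 0.046 < 0.05, so this hypothetical null hypothesis would be rejected.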
2. Parametric Tests for Means and Proportions
2.1 Introduction
Parametric tests such as mean tests, proportion tests and variance tests are of great importance
in business and economic applications. Standard test procedures are available to test various
hypotheses regarding these parameters, which makes significance testing powerful and reliable.
Many business and economic variables tend to conform to the normal probability distribution,
which enables scientific testing of hypotheses relating to industry and commerce.

The Central Limit Theorem is the basis of parametric tests, since it enables accurate estimation
of the range within which population parameters will lie. The level of confidence and the level
of significance play a crucial role in making parametric tests useful and handy.

Population parameters like the mean, proportion and variance are of great importance in
significance tests and economic applications. Tests based on such parameters are called
parametric tests. Parametric tests require us to specify the parameters of the population and
the form of the relevant sampling distribution. For example, when we want to test a given
population mean or population proportion, the test applied is a parametric test.

Features of parametric tests


Parametric tests are widely recognized as the most useful and reliable hypothesis testing
technique. They exhibit the following features.
1. Parametric – the test is based on population parameters like the mean, variance, proportion
or standard deviation. Parametric tests make use of one or more statistics obtained from
sample data to arrive at a probabilistic statement about the hypothesis.
2. Distribution – a parametric test is based on the assumption that the population from
which samples are drawn follows the normal or some other known probability distribution.
Normality of the distribution makes statistical inference possible.
3. Randomness – it is also assumed that samples drawn from a population are random
samples. Randomness makes the sampling technique and the testing powerful and reliable.
4. Level of measurement – parametric tests require higher levels of measurement
such as the interval scale or ratio scale. Nominal or ordinal measurement levels do not
apply in the case of parametric tests.

Assumptions in Parametric Tests


1. Sample observations are independent.
2. Observations are drawn from a population with a known (usually normal) distribution.
3. Samples drawn are random samples.
4. Observations are made at least on an interval scale.

2.2 Hypothesis Testing for Single Population Mean – Large Samples

Two-Tailed Test
Let μ0 be the hypothesized value of the population mean to be tested. Then the null and
alternative hypotheses for two-tailed test are defined as
𝐻0 : 𝜇 = 𝜇0

and 𝐻1 : 𝜇 ≠ 𝜇0
If the standard deviation σ of the population is known, then by the Central Limit Theorem the
sampling distribution of the mean x̄ follows the standard normal distribution for a large
sample size. The Z-test statistic is given by

Z = (x̄ − μ0)/σx̄ = (x̄ − μ0)/(σ/√n)        (1)

If the standard deviation σ of the population is not known, then the sample standard deviation
s is used to estimate σ. The Z-statistic is then given by

Z = (x̄ − μ0)/sx̄ = (x̄ − μ0)/(s/√n)        (2)

Now the decision rule based on the sample mean for the two-tailed test is as follows:
• Reject H0 if Zcal ≤ −Zα/2 or Zcal ≥ Zα/2
• Accept H0 if −Zα/2 < Zcal < Zα/2

where Zα/2 is the table value (critical value) of Z.

Left Tailed Test


Large sample hypothesis testing about population mean for a left-tailed test is of the form

𝐻0 : 𝜇 ≥ 𝜇0 and 𝐻1 : 𝜇 < 𝜇0
The Z-test:

Z = (x̄ − μ0)/σx̄ = (x̄ − μ0)/(σ/√n)

Decision rule:
• Reject H0 if Zcal ≤ −Zα
• Accept H0 if Zcal > −Zα

Right Tailed Test


Large sample hypothesis testing about population mean for a right-tailed test is of the form

𝐻0 : 𝜇 ≤ 𝜇0 and 𝐻1 : 𝜇 > 𝜇0
The Z-test:

Z = (x̄ − μ0)/σx̄ = (x̄ − μ0)/(σ/√n)

Decision rule:
• Reject H0 if Zcal ≥ Zα
• Accept H0 if Zcal < Zα

Example 1: A cell phone battery company claims that its batteries have an average life of
170 hrs. A consumer tested 49 batteries and found that they have an average life 163 hours,
with standard deviation 16 hours. Is the claim valid at 5 per cent significance level?

Solution: Let us assume the null hypothesis H0 that there is no difference between company’s
claim and the sample mean. Then we have
Ho: μ = 170
H1: μ ≠ 170
Given n = 49, sample mean x̄ = 163, hypothesised mean μ0 = 170, σ = 16. Using equation (1),
we obtain

Z = (x̄ − μ0)/(σ/√n) = (163 − 170)/(16/√49) = −7/2.29 = −3.06

Since this is a two-tailed test and Zcal = −3.06 lies below the critical value −Zα/2 = −1.96
at the α = 0.05 level of significance, the null hypothesis is rejected. Therefore the claim is
not valid.
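As a quick check of Example 1's arithmetic (a sketch; since H1: μ ≠ 170 is two-sided, the two-tailed critical value 1.96 is hard-coded from the normal table):

```python
from math import sqrt

# Summary figures from Example 1
n, x_bar, mu0, sigma = 49, 163.0, 170.0, 16.0

z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic, about -3.06
reject = abs(z) >= 1.96                 # two-tailed rule at alpha = 0.05
```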

Example 2: The mean lifetime of a sample of 250 fluorescent light bulbs produced by a
company is found to be 1400 hours with a standard deviation of 160 hours. Test the
hypothesis that the mean lifetime of bulbs produced in general is higher than the mean
lifetime of 1390 hours at α = 0.01 level of significance.

Solution: Let us assume that the null hypothesis that the mean lifetime of bulbs is not more
than 1390 hours, i.e.,

Ho: μ ≤ 1390
H1: μ > 1390
Given n = 250, sample mean x̄ = 1400, hypothesised mean μ0 = 1390, sample standard
deviation s = 160. Using equation (2), we obtain

Z = (x̄ − μ0)/(s/√n) = (1400 − 1390)/(160/√250) = 10/10.12 = 0.99

Since Zcal = 0.99 is less than the critical value Zα = 2.33 at the α = 0.01 level of
significance, the null hypothesis is not rejected. Hence the data do not establish that the
mean lifetime of bulbs produced in general is higher than 1390 hours.

Example 3: A packing device is set to fill detergent powder packets with a mean weight of 4
kg with a standard deviation of 0.20 kg. The weight of packets is known to drift upwards over
a period of time due to machine fault. A random sample of 80 packets is taken and weighed.
This sample has a mean weight of 4.03 kg. Can we conclude that the mean weight produced
by the machine has increased? Use a 5 per cent significance level.

Solution: Since we wish to test whether the mean weight has increased, we use a right-tailed
test:
H0: μ ≤ 4
H1: μ > 4
Given n = 80, sample mean x̄ = 4.03, hypothesised mean μ0 = 4, σ = 0.20. Using equation (1),
we obtain

Z = (x̄ − μ0)/(σ/√n) = (4.03 − 4)/(0.20/√80) = 0.03/0.0224 = 1.34

Since Zcal = 1.34 is less than the critical value Zα = 1.645 at the α = 0.05 level of
significance, the null hypothesis is not rejected. Therefore we cannot conclude that the mean
weight produced by the machine has increased.
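Checking Example 3 numerically (the question asks whether the mean weight has increased, i.e. H1: μ > 4; the right-tailed critical value 1.645 is hard-coded from the normal table):

```python
from math import sqrt

# Summary figures from Example 3
n, x_bar, mu0, sigma = 80, 4.03, 4.0, 0.20

z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic, about 1.34
reject = z >= 1.645                     # right-tailed rule at alpha = 0.05
```

Since 1.34 falls short of 1.645, the sample does not give significant evidence of an upward drift.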

2.2.1 Hypothesis Testing for Single Population Proportion – Large Samples


Just as means can be compared and their differences examined for significance, proportions can
be subjected to significance testing. By testing the difference between two population
proportions, or between a sample proportion and a population proportion, we can decide whether
a sample proportion differs significantly from the hypothesised population proportion, or
whether two samples come from populations having the same proportion of success.

In parametric tests for proportions, the null hypothesis is that there is no significant
difference between the sample proportion and the population proportion. Proportion tests follow
the normal distribution and are generally conducted as large sample tests.

To conduct the test of hypothesis, it is assumed that the sampling distribution of the
proportion follows a standard normal distribution. Let p̄ be the sample proportion and σp̄ its
standard error. Then the Z statistic is defined as

Z = (p̄ − p0)/σp̄ = (p̄ − p0)/√(p0(1 − p0)/n)        (5)

where n is the sample size.

The three forms of null and alternative hypotheses pertaining to the hypothesised proportion
p0 are as follows:
Null hypothesis Alternative hypothesis
𝐻0 ∶ 𝑝 = 𝑝0 𝐻1 : 𝑝 ≠ 𝑝0 (two-tailed test)
𝐻0 ∶ 𝑝 ≥ 𝑝0 𝐻1 : 𝑝 < 𝑝0 (left-tailed test)
𝐻0 ∶ 𝑝 ≤ 𝑝0 𝐻1 : 𝑝 > 𝑝0 (right-tailed test)

Decision Rule:
• Reject H0 if Zcal ≤ −Zα/2 or Zcal ≥ Zα/2 (two-tailed)
• Reject H0 if Zcal ≥ Zα (right-tailed)
• Reject H0 if Zcal ≤ −Zα (left-tailed)

Example 6: A manufacturer claims that at least 95 per cent of the equipment he supplied to a
factory conformed to specifications. An examination of a sample of 200 pieces of equipment
revealed that 18 were faulty. Test the claim of the manufacturer at α = 0.05.

Solution: We now define the null and alternative hypotheses as


𝐻0 : 𝑝 ≥ 0.95
𝐻1 : 𝑝 < 0.95
This shows that we are using a left-tailed test.
Given that n = 200, p̄ = 1 − 18/200 = 0.91. From equation (5), we get

Z = (p̄ − p0)/√(p0(1 − p0)/n) = (0.91 − 0.95)/√(0.95 × 0.05/200) = −0.04/0.0154 = −2.60

Since Zcal = −2.60 is less than its critical value −Zα = −1.645 at the given level of
significance, the null hypothesis H0 is rejected. Hence we conclude that the proportion of
equipment conforming to specifications is less than 95 per cent.
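Example 6 can be verified directly from equation (5) (a sketch; carrying the standard error to more decimal places gives Z ≈ −2.60, with −1.645 the left-tailed critical value from the normal table):

```python
from math import sqrt

# Example 6: 18 faulty pieces in a sample of 200; claimed p0 = 0.95
n, p0 = 200, 0.95
p_bar = 1 - 18 / n                  # observed proportion conforming = 0.91

se = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_bar - p0) / se               # test statistic
reject = z <= -1.645                # left-tailed rule at alpha = 0.05
```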

2.2.2 Hypothesis Testing for Single Population Mean - Small Samples

Let μ0 be the hypothesized value of the population mean to be tested and 𝑥̅ be the sample
mean. Then the null and alternative hypotheses for two-tailed test are defined as

𝐻0 : 𝜇 = 𝜇0 and 𝐻1 : 𝜇 ≠ 𝜇0 Two-Tailed Test

𝐻0 : 𝜇 ≥ 𝜇0 and 𝐻1 : 𝜇 < 𝜇0 Left-Tailed Test

𝐻0 : 𝜇 ≤ 𝜇0 and 𝐻1 : 𝜇 > 𝜇0 Right -Tailed Test

If the standard deviation σ of the population is not known, then σ is estimated by the sample
standard deviation s. The test statistic follows the t-distribution with n − 1 degrees of
freedom. The t-test statistic is given by

t = (x̄ − μ0)/sx̄ = (x̄ − μ0)/(s/√n)        (7)

Here the sample standard deviation is calculated by

s = √( Σ(x − x̄)² / (n − 1) )        (8)
Decision rule:
Now the decision rule based on sample mean for the two-tailed test will be as follows:
• Reject H0 if tcal ≤ −tα/2 or tcal ≥ tα/2

Left Tailed Test
• Reject H0 if tcal ≤ −tα

Right Tailed Test
• Reject H0 if tcal ≥ tα

where tα and tα/2 are the table values with n-1 degrees of freedom (critical value of t).
Note: For large samples of 30 or more the t distribution is similar to a normal distribution.

Example 8: Maxwell’s hot Chocolate is concerned about the effect of the recent year long
coffee advertising campaign on hot chocolate sales. Maxwell’s has randomly selected 26
weeks from the past year and found average sales of 912 pounds with a standard deviation of
72 pounds. The average weekly hot chocolate sales two years ago was 980 pounds. Define
suitable hypotheses for testing whether hot chocolate sales have decreased. Use α = 0.05 to
test these hypotheses.
Solution: Given that
Population mean, μ = 980, Sample mean, 𝑥̅ = 912,
Sample standard deviation, s = 72 Sample size, n = 26
We now define the null and alternative hypotheses as
𝐻0 : 𝜇 ≥ 980
𝐻1 : 𝜇 < 980
Since the sample size is small, we use the t-test statistic to test the above hypothesis. From
equation (7), we get

t = (x̄ − μ0)/(s/√n) = (912 − 980)/(72/√26) = −68/14.12 = −4.82

The degrees of freedom = n − 1 = 25, so the critical value is tα = −1.708. Since tcal ≤ −tα,
H0 is rejected. Thus we conclude that hot chocolate sales have decreased.
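Example 8's t statistic can be reproduced from the summary figures (a sketch; the one-tailed critical value t0.05 = 1.708 for 25 degrees of freedom is hard-coded from a t table, since Python's standard library has no t-distribution):

```python
from math import sqrt

# Example 8: n = 26 weeks, mean 912, s = 72, hypothesised mean 980
n, x_bar, mu0, s = 26, 912.0, 980.0, 72.0

t = (x_bar - mu0) / (s / sqrt(n))   # t statistic with n - 1 = 25 d.f., about -4.82
reject = t <= -1.708                # left-tailed rule at alpha = 0.05
```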

Example 9: A sample of 24 observations reveals a sample mean of 68 and a sample standard
deviation of 4.4. The population mean is known to be 62. Define suitable hypotheses to test
whether the population mean is significantly different from 62, identify a suitable test
statistic to test the hypotheses at α = 0.05, and compute its value.
Solution: We know that
Population mean, μ = 62 Sample mean, 𝑥̅ = 68,
Sample standard deviation, s = 4.4 Sample size, n = 24
We now define the null and alternative hypotheses as
𝐻0 : 𝜇 = 62
𝐻1 : 𝜇 ≠ 62
Since the sample size is small and the test is two-tailed, we use the t-test statistic. From
equation (7), we get

t = (x̄ − μ0)/(s/√n) = (68 − 62)/(4.4/√24) = 6/0.90 = 6.67

The calculated test statistic value is 6.67. The degrees of freedom = n − 1 = 23 and the
critical value is tα/2 = 2.069 at α = 0.05. Since tcal ≥ tα/2, H0 is rejected. Thus we conclude
that the population mean is significantly different from 62.

3. Hypothesis Tests-Two Samples


3.1 Hypothesis Testing between Two Population Means
Let two independent random samples of large sizes n1 and n2 be drawn from two populations,
and let x̄1 and x̄2 denote the sample means. The Z-test statistic used to examine the
difference between the population means (μ1 − μ2) is based on the difference between the
sample means. The Z-test statistic is calculated by

Z = ((x̄1 − x̄2) − (μ1 − μ2))/σ(x̄1 − x̄2) = ((x̄1 − x̄2) − (μ1 − μ2))/√(σ1²/n1 + σ2²/n2)        (1)

where σ(x̄1 − x̄2) denotes the standard error of the statistic, (x̄1 − x̄2) is the difference
between the two sample means, and (μ1 − μ2) is the hypothesised difference between the
population means.

If σ1 = σ2 = σ (say), equation (1) reduces to

Z = ((x̄1 − x̄2) − (μ1 − μ2))/(σ √(1/n1 + 1/n2))        (2)

If the standard deviations σ1 and σ2 are not known, then we may estimate the standard error of
the sampling distribution by substituting the sample standard deviations s1 and s2 as estimates
of the population standard deviations.

Two-Tailed Test
The null and alternative hypotheses are
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0

𝐻1 : 𝜇1 − 𝜇2 ≠ 𝑑0

Right-Tailed Test
In this case the null and alternative hypotheses are as follows:
H0: μ1 − μ2 = d0
H1: μ1 − μ2 > d0

Left-Tailed Test
In this case the null and alternative hypotheses are as follows:
H0: μ1 − μ2 = d0
H1: μ1 − μ2 < d0

Here d0 is some specified difference that is to be tested. If there is no difference between two
population means, then d0 = 0.
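Equation (1), with sample standard deviations substituted for σ1 and σ2, can be sketched as a small function (a hedged illustration; the demo figures are hypothetical):

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, x2, s1, s2, n1, n2, d0=0.0):
    """Z statistic for H0: mu1 - mu2 = d0, large independent samples,
    with sample SDs s1, s2 estimating the population SDs."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)      # standard error of (x̄1 - x̄2)
    z = (x1 - x2 - d0) / se
    p_two = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value
    return z, p_two

# Hypothetical summary data
z, p = two_sample_z(x1=105, x2=100, s1=12, s2=10, n1=50, n2=60)
reject = abs(z) >= 1.96                         # two-tailed rule at alpha = 0.05
```

The same function serves one-tailed tests by comparing z against Zα (or −Zα) instead of ±Zα/2.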

Example 1: Independent random samples taken at two local malls provided the following
information regarding purchases by patrons of the two malls.
                        Hamilton Place    Eastgate
Sample Size                   80             75
Average Purchase             $43            $40
Standard Deviation           $8             $6

At 95% confidence, test to determine whether or not there is a significant difference between
the average purchases by the patrons of the two malls.

Solution: We assume as the null hypothesis that there is no significant difference between the
average purchases by patrons of the two malls. Then we have
H0: μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0
Note: The alternate hypothesis tells you that this is a two-tail test.

Here n1 = 80, n2 = 75, x̄1 = 43, x̄2 = 40, s1 = 8 and s2 = 6. Using equation (1) with the
sample standard deviations in place of σ1 and σ2, the Z statistic is obtained as

Z = (x̄1 − x̄2)/√(s1²/n1 + s2²/n2) = (43 − 40)/√(64/80 + 36/75) = 3/√1.28 = 2.65

Since Zcal = 2.65 is greater than Zα/2 = 1.96, there is no statistical evidence to accept H0;
hence we reject H0, which indicates that there is a difference in the average purchases at the
two malls.

Example 2: An experiment was conducted to compare the mean time in days required to recover
from the common cold for persons given a daily dose of 4 mg of vitamin C versus those not
given a vitamin supplement. Suppose that 35 adults were randomly selected for each treatment
category and the following information was observed:

                      Vitamin C    No Vitamin supplement
Sample size               35               35
Sample mean              5.8              6.9
Standard Deviation       1.2              2.9

Test the hypothesis that the use of vitamin C reduces the mean time required to recover from
the common cold and its complications, at a 0.05 level of significance.

Solution: We now define the null and alternative hypotheses as

H0: μ1 − μ2 ≥ 0
H1: μ1 − μ2 < 0

Note: The alternative hypothesis tells you that this is a one-tailed (left) test.

Here n1 = n2 = 35, x̄1 = 5.8, x̄2 = 6.9, s1 = 1.2 and s2 = 2.9. Now from equation (1),

Z = (x̄1 − x̄2)/√(s1²/n1 + s2²/n2) = (5.8 − 6.9)/√((1.2)²/35 + (2.9)²/35) = −1.1/0.53 = −2.07

Since we are using a left-tailed test, the critical value is −Zα = −1.645. As Zcal = −2.07 is
less than −1.645, the null hypothesis is rejected. Thus we conclude that the use of vitamin C
does reduce the mean time required to recover from the common cold.
3.2 Hypothesis Testing for Two Population Proportions
The hypothesis testing concepts developed in previous section can be extended to test
whether there is any difference between two population proportions. In this case the null
hypothesis is stated as
𝐻0 ∶ 𝑝1 = 𝑝2

and the alternative hypothesis will be


𝐻1 ∶ 𝑝1 ≠ 𝑝2.
The sampling distribution of the difference in sample proportions (p̄1 − p̄2) is based on the
assumption that the difference between the two population proportions, (p1 − p2), is normally
distributed. Thus the Z-test statistic is given by

Z = (p̄1 − p̄2)/σ(p̄1 − p̄2) = (p̄1 − p̄2)/√(p1(1 − p1)/n1 + p2(1 − p2)/n2)        (3)

Example 3: Out of a sample of 600 men from one city, 450 were smokers. In another sample of
900 men from another city, 450 were smokers. Do the data indicate that the two cities are
significantly different in their smoking habits? Test the hypothesis at α = 0.05.
Solution: We define the null and alternative hypotheses as
𝐻0 ∶ 𝑝1 = 𝑝2

𝐻1 ∶ 𝑝1 ≠ 𝑝2.
Here we are using a two-tailed test. Given that

n1 = 600, n2 = 900, p̄1 = 450/600 = 0.75, p̄2 = 450/900 = 0.50

Since H0 assumes p1 = p2, the two sample proportions are pooled, and we use the Z-test
statistic

Z = (p̄1 − p̄2)/√(p̄(1 − p̄)(1/n1 + 1/n2))

where p̄ = (n1 p̄1 + n2 p̄2)/(n1 + n2) = (600 × 0.75 + 900 × 0.50)/(600 + 900) = 900/1500 = 0.60.
Thus we have

Z = (0.75 − 0.50)/√(0.60 × 0.40 × (1/600 + 1/900)) = 0.25/0.0258 = 9.68

Since Zcal = 9.68 is greater than its critical value Z0.025 = 1.96 at the given level of
significance, the null hypothesis H0 is rejected. Hence we conclude that there is a
significant difference in the smoking habits of the two cities.
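Recomputing the pooled two-proportion test of Example 3 numerically (a sketch; carrying the standard error to more decimal places gives Z ≈ 9.68, and 1.96 is the two-tailed critical value at α = 0.05):

```python
from math import sqrt

# Example 3: 450 smokers out of 600 men, and 450 out of 900 men
n1, n2 = 600, 900
p1, p2 = 450 / n1, 450 / n2                 # 0.75 and 0.50

p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)    # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                          # test statistic, about 9.68
reject = abs(z) >= 1.96                     # two-tailed rule at alpha = 0.05
```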

3.3 Hypothesis Testing for Single Population Mean – Small sample size
Let μ0 be the hypothesized value of the population mean to be tested and 𝑥̅ be the sample
mean. Then the null and alternative hypotheses for two-tailed test are defined as

𝐻0 : 𝜇 = 𝜇0 and 𝐻1 : 𝜇 ≠ 𝜇0 Two-Tailed Test


𝐻0 : 𝜇 ≥ 𝜇0 and 𝐻1 : 𝜇 < 𝜇0 Left-Tailed Test

𝐻0 : 𝜇 ≤ 𝜇0 and 𝐻1 : 𝜇 > 𝜇0 Right -Tailed Test

If the standard deviation σ of the population is not known, then σ is estimated by the sample
standard deviation‘s’. The test statistic follows t-distribution with n-1 degrees of freedom.
Then the t-test statistic is given by

𝑥̅ −𝜇 𝑥̅ −𝜇
𝑡= = 𝑠 (4)
𝑠𝑥̅ ( ⁄ )
√𝑛

Here the sample standard deviation is calculated by


√∑(𝑥−𝑥̅ ) 2
𝑠= (5)
√(𝑛−1)
Decision rule:
Now the decision rule based on sample mean for the two-tailed test will be as follows:
 Reject H0 if tcal ≤ - tα/2 or tcal ≥ tα/2

Left Tailed Test

 Reject H0 if tcal ≤ - tα

Right Tailed Test


 Reject H0 if tcal ≥ tα

where tα and tα/2 are the table values with n-1 degrees of freedom (critical value of t).

Note: For large samples of 30 or more the t distribution is similar to a normal distribution.

Example 4: Maxwell’s Hot Chocolate is concerned about the effect of the recent year-long
coffee advertising campaign on hot chocolate sales. Maxwell’s randomly selected 26 weeks
from the past year and found average sales of 912 pounds with a standard deviation of 72
pounds. The average weekly hot chocolate sales two years ago were 980 pounds. Define
suitable hypotheses for testing whether hot chocolate sales have decreased. Use α = 0.05 to
test this hypothesis.
Solution: Given that
Population mean, μ = 980, Sample mean, 𝑥̅ = 912,
Sample standard deviation, s = 72 Sample size, n = 26
We now define the null and alternative hypotheses as
𝐻0 : 𝜇 ≥ 980
𝐻1 : 𝜇 < 980
Since the sample size is small, we use the t-test statistic to test the above hypothesis. From
equation (4), we get

t = (x̄ - μ)/(s⁄√n) = (912 - 980)/(72⁄√26) = -68/14.12 = -4.82

The degrees of freedom = (n - 1) = 25, so tcri = -1.708. Since tcal ≤ -tα, H0 is rejected.
Thus we conclude that hot chocolate sales have decreased.
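The calculation in Example 4 can be sketched directly from equation (4). The helper name is illustrative; only the figures stated in the example are used.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """One-sample t statistic of equation (4): (x-bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# Example 4: weekly hot-chocolate sales
t_stat = one_sample_t(912, 980, 72, 26)  # about -4.82
```

Since the statistic falls well below the one-tailed critical value of roughly -1.708 at df = 25, the left-tailed null hypothesis is rejected.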

Example 5: A sample of 24 observations reveals a sample mean of 68 and a sample standard
deviation of 4.4. The hypothesized population mean is 62. Define suitable hypotheses to test
whether the population mean is significantly different from 62. Identify a suitable test
statistic to test the hypotheses at α = 0.05. What is the value of the statistic?
Solution: We know that
Population mean, μ = 62 Sample mean, 𝑥̅ = 68,
Sample standard deviation, s = 4.4 Sample size, n = 24
We now define the null and alternative hypotheses as
𝐻0 : 𝜇 = 62
𝐻1 : 𝜇 ≠ 62
Since the sample size is small, we use the t-test statistic (two-tailed test) to test the above
hypothesis. From equation (4), we get

t = (x̄ - μ)/(s⁄√n) = (68 - 62)/(4.4⁄√24) = 6/0.9 = 6.67

The calculated test statistic value is 6.67. The degrees of freedom = (n - 1) = 23, and tcri = 2.069
at α = 0.05. Since tcal ≥ tα/2, H0 is rejected. Thus we conclude that the population mean is
significantly different from 62.

3.4 Hypothesis Testing for Two Independent Population Means


In the case of a t-test for two independent samples, the point of interest is the difference between
the two population means, i.e., (μ1 - μ2). The sample means x̄1 and x̄2 serve as point estimators
for drawing inferences about (μ1 - μ2).
The two populations are sampled and the means and variances are computed from samples of sizes n1
and n2. If both populations have the same variance, a pooled variance estimate is computed from the
two sample variances.

Now the t-test statistic is defined as

t = ((x̄1 - x̄2) - (μ1 - μ2))/s(x̄1 - x̄2) = ((x̄1 - x̄2) - (μ1 - μ2))/(s√(1⁄n1 + 1⁄n2)) (5)
where ‘s²’ is the weighted pooled variance and is obtained by

s² = (∑(x1 - x̄1)² + ∑(x2 - x̄2)²)/(n1 + n2 - 2) (6)

or

s² = ((n1 - 1)s1² + (n2 - 1)s2²)/(n1 + n2 - 2) (7)
The t statistic computed under the null hypothesis has df = n1 + n2 - 2. Hence, the probability
associated with t can be computed. If the variances are unequal, an approximation to t is computed
instead, and its degrees of freedom are obtained from the Welch–Satterthwaite formula, rounded to
the nearest integer.
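The unequal-variance approximation mentioned above is, in most treatments, the Welch–Satterthwaite formula for the degrees of freedom; the sketch below assumes that formula. The figures reuse the sample sizes and standard deviations of Example 6 purely for illustration.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch-Satterthwaite degrees of freedom for unequal variances."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Illustrative figures: s1 = 0.10 with n1 = 25, s2 = 0.08 with n2 = 30
df = welch_df(0.10 ** 2, 25, 0.08 ** 2, 30)  # about 45.7, rounded to 46
```

Note that the approximate df (about 46 here) is always at most n1 + n2 - 2, the pooled-variance value.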
Two-Tailed Test
The null and alternative hypotheses are
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 ≠ 𝑑0

Right-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 > 𝑑0

Left-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 < 𝑑0

Here d0 is some specified difference that is to be tested. If there is no difference between two
population means, then d0 = 0.

Example 6: As per the Wall Street Journal, gasoline prices reached record high levels in 16
states during 2003. Two of the affected states were California and Florida. The American
Automobile Association reported a sample mean price of $1.72 per gallon in Florida and
$2.16 per gallon in California, based on a sample of 25 for California and 30 for Florida,
with standard deviations of 0.10 in California and 0.08 in Florida. Define suitable
hypotheses and test at α = 0.05.

Solution: Here we have

n1 = 25 and n2 = 30, x̄1 = 2.16 and x̄2 = 1.72, s1 = 0.10 and s2 = 0.08.

Since we are comparing two populations, the null hypothesis is that there is no significant
difference between the average prices in the two states:

H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

Now the weighted pooled variance ‘s²’ is obtained from equation (7). Thus we have

s² = ((n1 - 1)s1² + (n2 - 1)s2²)/(n1 + n2 - 2) = ((25 - 1)(0.10²) + (30 - 1)(0.08²))/(25 + 30 - 2)
   = (0.24 + 0.1856)/53 = 0.008

So the value of the pooled standard deviation is s = 0.0896.

Now using equation (5), the t statistic is calculated as

t = ((x̄1 - x̄2) - (μ1 - μ2))/(s√(1⁄n1 + 1⁄n2))
  = (2.16 - 1.72 - 0)/(0.0896 x √(1/25 + 1/30)) = 0.44/0.0243 = 18.13

The degrees of freedom = (25 + 30 - 2) = 53. Now tcri = 2.006 at α = 0.05 (two-tailed).
Since tcal ≥ tα/2, H0 is rejected. Thus we conclude that there is a significant difference
in the average gasoline prices of the two states.
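Example 6 can be recomputed directly from its summary statistics with unrounded intermediates; the sketch below implements the pooled-variance t of equations (5) and (7). The function name is illustrative.

```python
import math

def pooled_t(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Pooled-variance t statistic for two independent samples."""
    # Weighted pooled variance, equation (7)
    sp_sq = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq) * math.sqrt(1 / n1 + 1 / n2)
    return ((x1_bar - x2_bar) - d0) / se

# Example 6: California vs. Florida gasoline prices
t_stat = pooled_t(2.16, 1.72, 0.10, 0.08, 25, 30)  # about 18.13
```

With small standard deviations around each mean, a $0.44 gap between the means yields a very large t, so the difference is clearly significant at df = 53.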

3.5 Hypothesis Testing for Paired (dependent) Samples


Let μd be the mean of the difference values for the population. This value is compared with
zero, or some hypothesised value, using the t-test statistic for a single sample. Since the
population standard deviation of the differences is not known, we use the t statistic to test
the hypothesis. To compute t for paired samples, the paired difference variable is formed and
its mean and variance are calculated; the t statistic is then computed from them. This is also
called the ‘paired t-test’. If n is the number of pairs, the degrees of freedom are n - 1. The
t-test statistic is given by

t = (d̄ - μd)/(sd⁄√n) (8)

where d̄ is the mean of the differences of the paired sample data and sd is the standard
deviation of the differences, given by

sd = √(∑(d - d̄)²⁄(n - 1)) (9)

Example 7: The daily production rates for a sample of factory workers before and after a training
program are shown below.

Worker 1 2 3 4 5
Before 6 10 9 8 7
After 9 12 10 11 9
At the 95% confidence level, test whether the training program was effective. That is, did the
training program actually increase the production rates?
Solution: In this case we define the following hypotheses:

H0: d  0
Ha: d > 0
We now prepare the following table to test the hypothesis:

Worker   Before   After   Difference (d)   (d - d̄)²
1          6        9          3             0.64
2         10       12          2             0.04
3          9       10          1             1.44
4          8       11          3             0.64
5          7        9          2             0.04

From the above table, ∑d = 11, d̄ = 2.2, n = 5, ∑(d - d̄)² = 2.8 and the standard deviation is

sd = √(∑(d - d̄)²⁄(n - 1)) = √(2.8⁄4) = 0.84

Now the t-statistic is obtained from equation (8). Thus we have

t = (d̄ - μd)/(sd⁄√n) = (2.2 - 0)/(0.84⁄√5) = 5.86

The degrees of freedom= (5-1 = 4). Now tcri= 2.132 at α = 0.05. Since tcal ≥ tα, H0 is
rejected. Thus we conclude that the training programme was effective.
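The paired computation of Example 7 can be sketched as follows, using equation (8); the helper name is illustrative, and `statistics.stdev` applies the n - 1 denominator of equation (9).

```python
import math
import statistics

def paired_t(before, after, mu_d=0.0):
    """Paired-sample t statistic of equation (8)."""
    d = [a - b for a, b in zip(after, before)]  # difference per pair
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)                   # n-1 denominator, as in equation (9)
    return (d_bar - mu_d) / (s_d / math.sqrt(len(d)))

# Example 7: production rates before and after training
t_stat = paired_t([6, 10, 9, 8, 7], [9, 12, 10, 11, 9])  # about 5.88
```

With unrounded intermediates the statistic is about 5.88; the worked value differs only because sd was rounded to 0.84. Either way it exceeds the critical value 2.132.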

Example 8: To determine the effectiveness of a new weight control diet, six randomly selected
students observed the diet for 4 weeks with the results shown below.

Dieter A B C D E F
Weight Before 138 151 129 125 152 140
Weight After 136 149 136 127 146 144
a. Set up the null and alternative hypotheses to see if this diet is effective.
b. Find the mean and standard deviation of the differences.
c. Test the hypothesis stated in Part a.
d. State your conclusion.
Solution: In this case we define the following hypotheses:

H0: d = 0
Ha: d ≠ 0
We now prepare the following table to test the hypothesis:

Dieter   Before   After   Difference (d)   (d - d̄)²
A         138      136        -2             6.25
B         151      149        -2             6.25
C         129      136         7            42.25
D         125      127         2             2.25
E         152      146        -6            42.25
F         140      144         4            12.25

From the above table, ∑d = 3, d̄ = 0.5, n = 6, ∑(d - d̄)² = 111.5 and the standard deviation is

sd = √(∑(d - d̄)²⁄(n - 1)) = √(111.5⁄5) = 4.72

Now the t-statistic is obtained from equation (8). Thus we have

t = (d̄ - μd)/(sd⁄√n) = (0.5 - 0)/(4.72⁄√6) = 0.26

The degrees of freedom = (6 - 1) = 5. Now tcri = 2.571 at α/2 = 0.025. Since |tcal| < tα/2, H0 is
not rejected. Thus we conclude that the new weight control diet is not effective.

3.6 Merits of parametric tests


Parametric tests are widely applied to solve decision-making problems relating to business
and industry, for the following reasons:
1. Simple – parametric tests are the simplest to understand, explain and prove. Test results
have a direct bearing on the procedure.
2. Efficient – parametric tests are considered efficient and powerful, because they are based
on valid assumptions about the population.
3. Realistic – since parametric tests are based on parameters and statistics, which are true
representatives of a large mass of data, they are considered realistic and more or less accurate.
4. Prediction – parametric tests are based on the Central Limit Theorem and the form of the
distribution. This enables valid predictions and projections.
5. Sharp – parametric tests do not make use of ranks or signs, unlike non-parametric tests,
so the results are sharper.

4. Chi square Test

4.1 Introduction
In earlier lessons, differences between sampling distributions have been studied through
parameters like the mean, standard deviation and proportion, which are estimates of the
parameters of the populations; but generally these do not capture all the features of the
distributions. This created the need for an index which can measure the degree of difference
between the actual frequencies of various groups and thus compare all their necessary
features. Such tests are easier to explain and easier to understand, which is why they have
become popular. But one should not forget that they are usually less efficient or powerful,
as they are based on fewer assumptions, and the less one assumes, the less one can infer from
a set of data. On the other hand, the more one assumes, the more one limits the applicability
of one’s methods. Non-parametric tests are quantitative techniques designed for such situations.

A statistical test is a formal technique, based on some probability distributions, for arriving at
a decision about the reasonableness of an assertion or hypothesis. The test technique makes
use of one or more values, obtained from sample data to arrive at a probability statement
about the hypothesis. But such a test technique also makes use of some more assertions about
the population from which the sample is drawn. For instance, it may be assumed that
population is normally distributed, sample drawn is a random sample and similar other
assumptions. The normality of the population distribution forms the basis for making
statistical inferences about the sample drawn from the population. But no such assumptions
are made in the case of non-parametric tests. The Chi-Square test, signed-rank tests, rank-sum
tests, the Wilcoxon Matched-Pairs Signed-Ranks Test and the Mann-Whitney Test are some popular
non-parametric tests. Here we shall discuss only the Chi-Square test.

4.2 Features of Non parametric tests


Non-parametric tests do not assume any particular distribution and are not based on any
statistic or parameter. They exhibit the following features:
1- Non-parametric tests are not based on the standard error concept; they deal directly with
the observations.
2- They do not assume any particular distribution or its consequential assumptions.
3- They are rather quick and easy to use, i.e., they do not require laborious computations,
since in many cases the observations are replaced by their rank or order and in many cases we
simply use signs.
4- They are often not as efficient or sharp as parametric tests of significance. An interval
estimate with 95% confidence may be twice as large with non-parametric tests as with regular
standard methods.
5- When our measurements are not as accurate as is necessary for standard tests of
significance, non-parametric methods come to our rescue and can be used fairly satisfactorily.
6- Parametric tests cannot be applied to ordinal or nominal scale data, but non-parametric
tests do not suffer from any such limitation.
7- Parametric tests of difference, like the t-test or F-test, make assumptions about the
homogeneity of variance, whereas this is not necessary for non-parametric tests of difference.

4.3 Difference between Parametric and Non-Parametric Testing


Both parametric and non-parametric tests are significance tests, because they examine the
significance of differences between given values. But they are distinguished on the following
basis:

Parametric tests                                    Non-Parametric tests
1. Based on assumptions about the distribution      1. No assumptions about the distribution
2. Based on statistics and parameters               2. No statistics and parameters
3. Focus on the SE concept and level of             3. Do not focus on the SE concept and
   significance                                        level of significance
4. Mostly interval-scaled or ratio data             4. Mostly nominal-scaled or ordinal data
5. Precise mathematical analysis                    5. No precise mathematical analysis
6. Handle variables                                 6. Handle mostly attributes

4.4 Assumptions in Non - Parametric tests


Non-parametric tests are significance tests designed for special situations. They come to our
rescue, subject to the following assumptions:
1. Sample observations are independent.
2. Samples drawn are randomly selected.
3. Observations are measured at least on ordinal scale.

4.5 Chi Square Test


In this section we discuss the most important such test, the Chi Square test. From a series of
observations, different statistics are constructed to estimate population parameters. In general,
the sampling distribution of a statistic depends on the parameter and form of the population. The
difference between distributions has been studied through constants such as the mean, proportion,
etc., but these may not truly represent a distribution. This created the need for an index which
can measure the degree of difference between actual frequencies and expected frequencies directly,
without any representative value. An index of this type is Karl Pearson’s Chi Square, which is
used to measure the deviations of observed frequencies in an experiment from the expected
frequencies obtained from some hypothetical universe. In this lesson, we study the chi square
distribution, which enables us to compare a whole set of sample values with a corresponding set
of hypothetical values. The chi square distribution was discovered by Helmert in 1875 and was
rediscovered independently in 1900 by Karl Pearson, who applied it as a test of goodness of fit.

The mean of a Chi Square distribution is its degrees of freedom. Chi Square distributions are
positively skewed, with the degree of skew decreasing with increasing degrees of freedom. As
the degrees of freedom increases, the Chi Square distribution approaches a normal distribution.
Figure 1 shows density functions for three Chi Square distributions. Notice how the skew
decreases as the degrees of freedom increases.

Figure 1. Chi Square distributions with 2, 4, and 6 degrees of freedom.


The Chi Square distribution is very important because many test statistics are approximately
distributed as Chi Square. Two of the more common tests using the Chi Square distribution
are tests of deviations of differences between theoretically expected and observed frequencies
(one-way tables) and the relationship between categorical variables (contingency tables).
Numerous other tests beyond the scope of this work are based on the Chi Square distribution.
The Chi-square test is a significance test which is not based on any parameter like the mean,
variance or proportion. Therefore such tests are distinguished as non-parametric tests.

Note: On the basis of situation, nature and purpose of test, chi-square test may be classified
as – test of independence of attributes, test of goodness of fit , test of homogeneity, and test
for variance.

4.6 Chi-Square Test for Cross-Tabulation Tables (Test of Independence):


One of the simplest techniques for describing sets of relationships is the cross-tabulation. A
cross-tabulation, or contingency table, is a joint frequency distribution of observations on two
or more sets of variables. The chi-square distribution provides a means for testing the
statistical significance of contingency tables. This allows us to test for differences in two
groups’ distributions across categories. The logic behind the chi-square test is that of
comparing the observed frequencies (Oi) with the expected frequencies (Ei).
The chi-square test allows us to conduct tests for significance in the analysis of an R x C
contingency table (where R = row and C = column).

To calculate the chi-square statistic, the following formula is used:

χ² = ∑ (Oi - Ei)²⁄Ei (10)

where χ² = chi-square test statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell

As in the univariate chi-square test, a frequency count of data that nominally identifies or
categorically ranks groups is acceptable for the chi-square test for contingency tables.

We begin, as in all hypothesis-testing procedures, by formulating the null hypothesis and
selecting the level of significance for the particular problem. As an example, suppose that we
wish to test the null hypothesis that an equal proportion of men and women are aware of the
brand of tyre of an automobile, and that the hypothesis test will be made at the 0.05 level of
significance.

To compute the chi-square value the researcher must first identify an expected distribution for
the table. Under the null hypothesis, the same proportion of positive answers should come from
both groups.

There is an easy way to calculate the expected frequencies for the cells in a cross-tabulation
table. To compute an expected number for each cell use the formula

Eij = RiCj⁄n (11)

where Ri = total observed frequency in the ith row
Cj = total observed frequency in the jth column
n = sample size

To compute a chi-square statistic the same formula as before is used, except that we calculate
degrees of freedom as the number of rows minus one (R - 1) times the number of columns minus
one (C - 1). As an example, for a 2X2 table, the number of degrees of freedom equals 1:

(R - 1)(C - 1) = (2 - 1)(2 - 1) = 1

Note: Proper use of the chi-square test requires that each expected cell frequency (Eij) have a
value of at least five. If this sample size requirement is not met, the researcher may take a larger
sample or may combine (“collapse”) response categories.

Example 9: A financial consultant is interested in the differences in capital structure within


different firm sizes in a certain industry. The consultant surveys a group of firms with assets
of different amounts and divides the firms into three groups. Each firm is classified according
to whether its total debt is greater than stockholders’ equity or whether its total debt is less
than stockholders’ equity. The results of the survey are as follows:
                            Firm size (in $ thousands)
                            <500    500-2000    >2000
Debt less than equity         7        10          8
Debt greater than equity     10        16          9
Do the three firm sizes have the same capital structure? Use the 0.10 significance level.

Solution: We assume the null hypothesis that capital structure is independent of firm size.
H0: Capital structure is independent of firm size
Ha: Capital structure is not independent of firm size
Since we are looking for a relationship between company size and debt, the most appropriate
test is the chi-square, so we use equation (10).

                            Firm size (in $ thousands)          Row
                            <500    500-2000    >2000          Total
Debt less than equity         7        10          8             25
Debt greater than equity     10        16          9             35
Column Total                 17        26         17
Here the total sample size is n = 60. Now the expected frequencies, according to equation (11), are
E11 = (25 x 17)/60 = 7.08
E12 = (25 x 26)/60 = 10.83
E13 = (25 x 17)/60 = 7.08
E21 = (35 x 17)/60 = 9.92
E22 = (35 x 26)/60 = 15.17
E23 = (35 x 17)/60 = 9.92
The following table is prepared:
Oi Ei (Oi- Ei) (Oi- Ei)2 (Oi - Ei)2/ Ei
7 7.08 -0.08 0.01 0.00
10 10.83 -0.83 0.69 0.06
8 7.08 0.92 0.85 0.12
10 9.92 0.08 0.01 0.00
16 15.17 0.83 0.69 0.05
9 9.92 -0.92 0.85 0.09
From the table, we have

χ² = ∑ (Oi - Ei)²⁄Ei = 0.32

Here r = 2 and c = 3, so the degrees of freedom = (r-1)(c-1) = 2 and the critical value is
χ²cri = 4.61. Since χ²cal < χ²cri, we accept the null hypothesis. Thus we conclude that the
capital structure does not depend on the firm size.
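The full chi-square procedure of equations (10) and (11) can be sketched as a small function; the name is illustrative. Using unrounded expected frequencies the statistic for Example 9 comes to about 0.31 rather than the 0.32 obtained from the rounded per-cell values, which does not change the conclusion.

```python
def chi_square_independence(table):
    """Chi-square statistic and df for an R x C contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # equation (11)
            chi_sq += (obs - expected) ** 2 / expected    # equation (10)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df

# Example 9: capital structure vs. firm size
chi_sq, df = chi_square_independence([[7, 10, 8], [10, 16, 9]])

# Example 10: intelligence vs. economic conditions (statistic 22.5, df 1)
chi_sq2, df2 = chi_square_independence([[125, 100], [75, 150]])
```

The same function reproduces Example 10 exactly, since its expected frequencies are whole numbers.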

Example 10: Four hundred and fifty students in a school were graded according to their
intelligence and the economic conditions at their homes. Examine whether there is any
association between economic conditions at home and intelligence:

Economic        Intelligence
Conditions      Good    Bad
Rich             125    100
Poor              75    150

Define suitable hypotheses and test at the 5% level of significance.

Solution: We now define the following


H0: There is no relation between Intelligence & economic condition at homes
Ha: There is a relation between Intelligence & economic condition at homes
Here the total sample size is n = 450, r = 2, c = 2; the row totals are 225 and 225 and the
column totals are 200 and 250. The expected frequencies are

E11 = (225 x 200)/450 = 100    E12 = (225 x 250)/450 = 125
E21 = (225 x 200)/450 = 100    E22 = (225 x 250)/450 = 125

The following table is prepared:


Oi Ei (Oi- Ei) (Oi- Ei)2 (Oi - Ei)2/ Ei
125 100 25.00 625.00 6.25
75 100 -25.00 625.00 6.25
100 125 -25.00 625.00 5.00
150 125 25.00 625.00 5.00
From the table, we have

χ² = ∑ (Oi - Ei)²⁄Ei = 22.50

Since r = 2 and c = 2, the degrees of freedom = (r-1)(c-1) = 1, so the critical value is
χ²cri = 3.84. Since χ²cal > χ²cri, we do not accept the null hypothesis. Thus we conclude
that there is a relation between intelligence and economic conditions at homes.

5. Hypothesis Testing for F-Distribution


The F-Test is a procedure for comparing one sample variance with another sample variance. The
key question is whether the two sample variances are different from each other or if they are
from the same population. The F-test utilizes measures of sample variance rather than the sample
standard deviation because summation is allowable with the sample variance.
When independent random samples of sizes n1 and n2 are drawn from two normally distributed
populations, then the ratio
F = (s1²⁄σ1²)⁄(s2²⁄σ2²) (14)

follows the F distribution with df1 = n1 - 1 and df2 = n2 - 1. Here s1² and s2² are the
variances of the two samples and are given by

s1² = ∑(x1 - x̄1)²⁄(n1 - 1) and s2² = ∑(x2 - x̄2)²⁄(n2 - 1) (15)

If the two populations have equal variances, i.e., σ1² = σ2², then the ratio becomes

F = s1²⁄s2²  (s1 > s2) (16)

which has an F distribution with df1 = n1 - 1 for the numerator and df2 = n2 - 1 for the
denominator.
Note: 1. The larger the ratio, the greater the value of F; and if the F value is large, the
results are likely to be statistically significant.
2. To test the null hypothesis of no difference between the sample variances, a table of the
F-distribution is necessary. Use of the F-table is much like using the tables of the Z- and t-
distributions. These tables indicate that the distribution of F is actually a family of
distributions that changes quite drastically with changes in sample sizes. Thus the degrees of
freedom must be specified. Inspection of an F-table allows the researcher to determine the
probability of finding an F as large as the calculated F.
Example 13: The following data relate to the number of units of an item produced per shift by
two workers A and B for a number of days:
A 17 22 25 27 23 20 26
B 29 35 32 33 28 27 25 28

Can it be inferred that worker A is more stable compared to worker B? Define the hypotheses and
test at a significance level of 0.05.
Solution: We take the null hypothesis that the two workers are equally stable (i.e., equal
variability in production rates):
H0: 𝜎𝐴2 = 𝜎𝐵2
H1: 𝜎𝐴2 ≠ 𝜎𝐵2
Prepare the following table to calculate the two sample variances:
Worker A Worker B
X1 X2 (𝑋1 − 𝑋̅1 )2 (𝑋2 − 𝑋̅2 )2
17 29 34.31 0.39
22 35 0.73 28.89
25 32 4.59 5.64
27 33 17.16 11.39
23 28 0.02 2.64
20 27 8.16 6.89
26 25 9.88 21.39
28 2.64
From the above table, we have

n1 = 7 and n2 = 8, x̄1 = 22.86 and x̄2 = 29.63,
∑(X1 - X̄1)² = 74.86, ∑(X2 - X̄2)² = 79.87

Now from equation (15),

s1² = ∑(x1 - x̄1)²⁄(n1 - 1) = 74.86/6 = 12.48

s2² = ∑(x2 - x̄2)²⁄(n2 - 1) = 79.87/7 = 11.41

The F-test statistic is given by

F = s1²⁄s2² = 12.48/11.41 = 1.094

Since the critical value F6,7,0.05 = 3.87 is greater than Fcal = 1.094, the null hypothesis is
accepted. Thus we conclude that the two workers are equally stable in terms of their
production rates.
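The variance ratio of Example 13 can be sketched from the raw data; the helper name is illustrative, and `statistics.variance` uses the n - 1 denominator of equation (15). Following equation (16), the larger sample variance is placed in the numerator.

```python
import statistics

def f_statistic(sample1, sample2):
    """F ratio of equation (16): larger sample variance over the smaller."""
    v1 = statistics.variance(sample1)  # n-1 denominator, equation (15)
    v2 = statistics.variance(sample2)
    return max(v1, v2) / min(v1, v2)

# Example 13: daily production of workers A and B
F = f_statistic([17, 22, 25, 27, 23, 20, 26],
                [29, 35, 32, 33, 28, 27, 25, 28])  # about 1.09
```

An F near 1 means the two sample variances are practically equal, so the null hypothesis of equal stability is retained.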

Practice Problems
1. Define sampling distribution.
2. Define standard error of estimate
3. Discuss various types of estimates.
4. Define Type-I and Type-II errors.
5. What is known as parametric test? What are various parametric tests?
6. What are null and alternate hypotheses?
7. What are the components of hypothesis testing?
8. A sample of 16 elements from a normally distributed population is selected. The sample
mean is 10 with a standard deviation of 4. Find the 95% confidence interval for μ.
9. In order to estimate the average time spent on the computer terminals per student at a
local university, data were collected for a sample of 81 business students over a one week
period. Assume the population standard deviation is 1.2 hours. If the sample mean is 9 hours,
then find the 95% confidence interval.
10. You are given the following information:
n = 49, x̄ = 54.8, s = 28 and H0: μ = 50, Ha: μ ≠ 50
11. For the data in Problem 10, what is the appropriate statistic to test the hypotheses?
Calculate the test statistic.
12. A soft drink filling machine, when in perfect adjustment, fills the bottles with 12 ounces
of soft drink. A random sample of 25 bottles is selected, and the contents are measured. The
sample yielded a mean content of 11.88 ounces, with a standard deviation of 0.24 ounces.
With a 0.05 level of significance, test to see if the machine is in perfect adjustment.
13. Maxwell’s hot Chocolate is concerned about the effect of the recent year long coffee
advertising campaign on hot chocolate sales. The average weekly hot chocolate sales two
years ago was 984.7 pounds and the standard deviation was 72.6 pounds. Maxwell’s has
randomly selected 30 weeks from the past year and found average sales of 912.1 pounds.
Define suitable hypotheses for testing whether hot chocolate sales have decreased. Use
α = 0.02 to test this hypothesis.
14. An article about driving practices in Strathcona County, Alberta, Canada claimed that
46% of the drivers did not stop at stop sign intersections on county roads (Edmonton Journal,
July 19, 2000). Two months later, a follow-up study collected data in order to see whether
this percentage had changed and found 420 out of 820 drivers did not stop at stop sign
intersections. Formulate the hypothesis to determine whether the proportion of drivers who
did not stop at stop sign intersections had changed. Test the hypotheses at α=0.05.
15. According to Bureau of labour statistics, the average weekly pay for US production
worker was $ 441.84 with a standard deviation of $90. A sample of 56 workers revealed that
the average weekly pay was $420.64. Test the hypotheses whether the average weekly pay of
production worker is significantly changed. Use α= 0.05.
16. Determine a statistical hypothesis and perform a chi-square test on the following survey
data: Easy-to-listen music should be played on the office intercom.
Agree 40
Neutral 35
Disagree 25
100
17. An advertising firm is trying to determine the demographics for a new product. They have
randomly selected 75 people in each of 5 different age groups and introduced the product to
them. The results of the survey are given below:
Age Groups
Future Activity 18-29 30-39 40-49 50-59 60-69
Purchase frequently 12 18 17 22 32
Seldom purchase 18 25 29 24 30
Never purchase 45 32 29 29 13
(a) State the null and alternative hypotheses.
(b) Calculate the sample chi-square value.
(c) If the level of significance is 0.02, should the null hypothesis be rejected?

***
