1. Testing of Hypothesis
1.1 Introduction
One of the objectives of statistical investigation is to evaluate whether there is a significant
difference between an estimated parameter and the true parameter after estimating a population
mean or proportion. Naturally a question arises: “Does the estimated parameter conform to
the real parameter, or is there a considerable difference between them?” Answering it leads us to
the evaluation of that difference. The tests that assess the significance of such a difference
are called significance tests.
The basis of statistical tests is the hypothesis. First we form a hypothesis regarding the
population. Then we conduct a test to assess whether there is any significant difference. The
hypothesis will be accepted or rejected according to the significance of the difference revealed by
the test. Therefore, significance tests are also called hypothesis tests.
Standard error
A sampling distribution is made up of the sample means (or other statistics) obtained from
repeated samples. A grand mean of such a sampling distribution can be ascertained, and
deviations from it calculated. The standard deviation of the
sampling distribution of a statistic is called the standard error of that statistic. The standard error
is a very useful quantitative tool in hypothesis testing. For the sample mean, the standard error is σ/√n,
where σ is the population standard deviation and n is the sample size.
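As a rough illustration of the σ/√n formula, the following Python sketch computes the standard error of a sample mean. The function name `standard_error` and the figures used (σ = 16, n = 49) are chosen only for illustration.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Illustrative figures: standard deviation 16, sample of 49 observations.
se = standard_error(16, 49)
print(f"standard error = {se:.2f}")
```

Quadrupling the sample size halves the standard error, which is why larger samples give tighter estimates.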
Interval Estimation
• To use sample statistics to estimate population parameters.
• Statistical inference is based on estimation and hypothesis testing.
• Types of estimates:
  - Point estimate: A single number that is used to estimate an unknown
    population parameter.
  - Interval estimate: A range of values used to estimate a population parameter.
• Any sample statistic that is used to estimate a population parameter is called an
estimator. An estimate is a specific observed value of a statistic.
• Properties of a good estimator:
  - Unbiasedness
  - Efficiency
  - Consistency
  - Sufficiency
Point Estimates
• Point estimator of population mean (μ): Sample mean (x̄).
• Point estimator of population variance (σ²): Sample variance (s²).
• Point estimator of population standard deviation (σ): Sample standard deviation (s).
• Point estimator of population proportion (p): Sample proportion (p̄).
Interval Estimates
• An interval estimate describes a range of values within which a population parameter
is likely to appear.
• Interval estimate is constructed using point estimate, standard error and corresponding
probability.
• The probability that we associate with an interval estimate is called the confidence
level. The confidence interval is the range of the estimate we are making. Confidence
limits are the upper and lower limits of the confidence interval.
• A high confidence level seems to signify a high degree of accuracy in the estimate but
high confidence levels will produce large confidence intervals, and such large
intervals are not precise; they give very fuzzy estimates.
• 95% confidence interval means: “That if we select many random samples of the same
size and calculate a confidence interval for each of these samples, then in about 95
percent of these cases, the population parameter will lie within that interval.”
• Interval estimate for mean of large sample (Population s.d. is known):
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
• Interval estimate for mean for large sample (Population s.d. is unknown):
$$\bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
• When the population standard deviation is not known and the sample size is 30 or
less, t-distribution is used.
• Assumption for using t-distribution: The population is normal or approximately
normal.
• The t-distribution is symmetrical but flatter than the normal distribution, and there is a
different t-distribution for different sample sizes (or degrees of freedom). As the
sample size gets larger, the shape of the t-distribution becomes approximately equal to
the normal distribution.
• Interval estimate for mean using t-distribution:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
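The large-sample interval with known population standard deviation can be sketched in Python as follows. This is a minimal sketch, assuming the standard library's `statistics.NormalDist` for the z value; the function name `z_interval` is illustrative, and the figures (sample mean 9, σ = 1.2, n = 81) are those of Practice Problem 9 at the end of these notes.

```python
import math
from statistics import NormalDist

def z_interval(mean, sd, n, confidence=0.95):
    """Confidence interval for a mean: mean +/- z_(alpha/2) * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Practice Problem 9: n = 81, population s.d. 1.2 hours, sample mean 9 hours.
lower, upper = z_interval(mean=9, sd=1.2, n=81, confidence=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

For a small sample with unknown σ, the same structure applies with the z value replaced by the t-table value for n − 1 degrees of freedom.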
The critical value here is the right (or upper) tail. It is quite possible to have one sided tests
where the critical value is the left (or lower) tail. For example, suppose the cloud seeding is
expected to decrease rainfall. Then the null hypothesis could be as follows:
H0: µ ≥ 20 (i.e. average rainfall does not decrease after cloud seeding)
H1: µ < 20 (i.e. average rain decreases after cloud seeding)
Two-tailed hypothesis testing doesn’t specify a direction of the test. For the cloud seeding
example, it is more common to use a two-tailed test. Here the null and alternative hypotheses
are as follows.
H0: µ = 20
H1: µ ≠ 20
The reason for using a two-tailed test is that even though the experimenters expect cloud
seeding to increase rainfall, it is possible that the reverse occurs and, in fact, a significant
decrease in rainfall results. To take care of this possibility, a two-tailed test is used with the
critical region consisting of both the upper and lower tails.
Figure 3 – Two-tailed hypothesis testing
In this case we reject the null hypothesis if the test statistic falls in either tail of the critical
region. To achieve a significance level of α, the critical region in each tail must have size α/2.
Since our sample usually only contains a subset of the data in the population, we cannot be
absolutely certain as to whether the null hypothesis is true or not. We can merely gather
information (via statistical tests) to determine whether it is likely or not. We therefore speak
about rejecting or not rejecting the null hypothesis on the basis of some test, but not
of accepting the null hypothesis or the alternative hypothesis. Often in an experiment we are
actually testing the validity of the alternative hypothesis by testing whether to reject the null
hypothesis.
Significance level is the acceptable level of type I error, denoted α. Typically, a significance
level of α = .05 is used (although sometimes other levels such as α = .01 may be employed).
This means that we are willing to tolerate up to 5% of type I errors, i.e. we are willing to
accept the fact that in 1 out of every 20 samples we reject the null hypothesis even though it
is true.
P-value (the probability value) is the probability, computed on the assumption that the null
hypothesis is true, of obtaining a test statistic at least as extreme as the one actually observed.
If p < α then we reject the null hypothesis.
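As a hedged sketch of how the rule "reject if p < α" works for a Z statistic, the function below converts an observed Z value into a p-value, taking the p-value to be the tail probability of the standard normal distribution beyond the observed statistic. The name `p_value` and the example value Z = 1.96 are illustrative only.

```python
from statistics import NormalDist

def p_value(z: float, tail: str = "two") -> float:
    """Tail probability of the standard normal beyond the observed z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

alpha = 0.05
p = p_value(1.96, tail="two")
print(f"p = {p:.4f}, reject H0: {p < alpha}")
```

Comparing p with α gives the same decision as comparing the calculated Z with the critical value.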
Standard test procedures are available to test various hypotheses regarding these parameters.
This makes significance testing powerful and reliable. The Central Limit Theorem is the basis of
parametric tests; it enables estimation of the range of values within which population
parameters will lie. The level of confidence and the level of significance play a crucial role in
making parametric tests useful and handy.
Population parameters like the mean, proportion, variance etc. are of great importance in
significance tests and economic applications. Tests based on such parameters are called
parametric tests. Parametric tests make it possible to specify the parameters of the population and the form
of the relevant sampling distribution. For example,
when we want to test a given population mean or population proportion,
the test applied is called a parametric test.
Two-Tailed Test
Let μ0 be the hypothesized value of the population mean to be tested. Then the null and
alternative hypotheses for two-tailed test are defined as
𝐻0 : 𝜇 = 𝜇0
and 𝐻1 : 𝜇 ≠ 𝜇0
If the standard deviation σ of the population is known, then by the Central Limit Theorem
the sampling distribution of the mean x̄ is approximately normal for a large
sample size. Then the Z-test statistic is given by
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad (1)$$
If the standard deviation σ of the population is not known, then the sample standard deviation s
is used to estimate σ. Then the Z-statistic is given by
$$Z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \qquad (2)$$
Now the decision rule based on sample mean for the two-tailed test will be as follows:
Reject H0 if Zcal ≤ - Zα/2 or Zcal ≥ Zα/2
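Left-Tailed Test
H0: μ ≥ μ0 and H1: μ < μ0
The Z-test:
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Decision rule:
Reject H0 if Zcal ≤ −Zα
Right-Tailed Test
H0: μ ≤ μ0 and H1: μ > μ0
The Z-test:
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Decision rule:
Reject H0 if Zcal ≥ Zα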
Example 1: A cell phone battery company claims that its batteries have an average life of
170 hrs. A consumer tested 49 batteries and found that they have an average life 163 hours,
with standard deviation 16 hours. Is the claim valid at 5 per cent significance level?
Solution: Let us assume the null hypothesis H0 that there is no difference between the company’s
claim and the sample mean. Then we have
H0: μ = 170
H1: μ ≠ 170
Given n = 49, sample mean x̄ = 163, hypothesized mean μ = 170 and σ = 16, equation
(1) gives
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{163 - 170}{16/\sqrt{49}} = \frac{-7}{2.29} = -3.06$$
Since the test is two-tailed, the critical value at the α = 0.05 level of significance is Zα/2 = 1.96.
As Zcal = −3.06 is less than −1.96, the null
hypothesis is rejected. Therefore the claim is not valid.
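The decision rules for the one-sample Z-test can be sketched in Python as below. This is an illustrative sketch rather than a prescribed procedure: the function name `z_test` and its `tail` argument are assumptions, and the critical value is taken from the standard library's `statistics.NormalDist` instead of a printed table. It is applied to the battery data of Example 1.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sd, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject H0?) for a one-sample Z-test."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    std_normal = NormalDist()  # standard normal distribution
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= crit          # either tail
    elif tail == "left":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z <= -crit              # lower tail only
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z >= crit               # upper tail only
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, crit, reject

# Example 1: n = 49, sample mean 163, claimed mean 170, s.d. 16.
z, crit, reject = z_test(163, 170, 16, 49, alpha=0.05, tail="two")
print(f"Z = {z:.2f}, critical value = {crit:.2f}, reject H0: {reject}")
```

The same function covers the one-tailed cases by passing `tail="left"` or `tail="right"`.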
Example 2: The mean lifetime of a sample of 250 fluorescent light bulbs produced by a
company is found to be 1400 hours with a standard deviation of 160 hours. Test the
hypothesis that the mean lifetime of bulbs produced in general is higher than the mean
lifetime of 1390 hours at α = 0.01 level of significance.
Solution: Let the null hypothesis be that the mean lifetime of bulbs is not more
than 1390 hours, i.e.,
H0: μ ≤ 1390
H1: μ > 1390
Given n = 250, sample mean x̄ = 1400, hypothesized mean μ = 1390 and sample standard
deviation s = 160, equation (2) gives
$$Z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{1400 - 1390}{160/\sqrt{250}} = \frac{10}{10.12} = 0.99$$
Since Zcal = 0.99 is less than the critical value Zα = 2.33 at the α = 0.01 level of significance, the
null hypothesis is not rejected. The sample therefore does not provide sufficient evidence that the
mean lifetime of bulbs produced in general is higher than 1390 hours.
Example 3: A packing device is set to fill detergent powder packets with a mean weight of 4
kg with a standard deviation of 0.20 kg. The weight of packets is known to drift upwards over
a period of time due to machine fault. A random sample of 80 packets is taken and weighed.
This sample has a mean weight of 4.03 kg. Can we conclude that the mean weight produced
by the machine has increased? Use a 5 per cent significance level.
Solution: Since we are testing whether the mean weight has increased, the hypotheses are
H0: μ ≤ 4
H1: μ > 4
Given n = 80, sample mean x̄ = 4.03, hypothesized mean μ = 4 and σ = 0.20, equation
(1) gives
$$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{4.03 - 4}{0.20/\sqrt{80}} = \frac{0.03}{0.0224} = 1.34$$
Since Zcal = 1.34 is less than the critical value Zα = 1.645 at the α = 0.05 level of significance, the null
hypothesis is not rejected. Therefore we cannot conclude that the mean weight has increased
above 4 kg.
In parametric tests for proportions, the null hypothesis is that there is no significant difference
between the sample proportion and the population proportion. Proportion tests are based on the normal
distribution and are generally conducted as large sample tests.
To conduct a test of hypothesis, it is assumed that the sampling distribution of the proportion is
approximately normal. Now let p̄ be the sample proportion and σp̄ its standard deviation;
then the Z statistic is defined as
$$Z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}, \qquad \sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}}$$
The three forms of null and alternative hypotheses pertaining to the hypothesised proportion
p0 are as follows:
Null hypothesis Alternative hypothesis
𝐻0 ∶ 𝑝 = 𝑝0 𝐻1 : 𝑝 ≠ 𝑝0 (two-tailed test)
𝐻0 ∶ 𝑝 ≥ 𝑝0 𝐻1 : 𝑝 < 𝑝0 (left-tailed test)
𝐻0 ∶ 𝑝 ≤ 𝑝0 𝐻1 : 𝑝 > 𝑝0 (right-tailed test)
Decision Rule:
Reject H0 if Zcal ≤ - Zα/2 or Zcal ≥ Zα/2 (two-tailed)
Reject H0 if Zcal ≥ Zα (right-tailed)
Reject H0 if Zcal ≤ - Zα (left-tailed)
Example 6: A manufacturer claims that at least 95 per cent of the equipment which he
supplied to a factory conformed to the specifications. An examination of a sample of 200
pieces of equipment revealed that 18 were faulty. Test the claim of the manufacturer at α =
0.05.
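One way to work Example 6 is sketched below in Python. Several things here are assumptions rather than part of the notes: the claim "at least 95 per cent" is read as H0: p ≥ 0.95 against H1: p < 0.95 (a left-tailed test), the standard error is taken as √(p0(1 − p0)/n), and the function name `proportion_z_test` is illustrative.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha=0.05):
    """Left-tailed Z-test for a single proportion (H0: p >= p0)."""
    p_hat = successes / n                    # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)        # standard error under H0
    z = (p_hat - p0) / se
    crit = NormalDist().inv_cdf(1 - alpha)   # Z_alpha for a one-tailed test
    return p_hat, z, crit, z <= -crit

# Example 6: 200 pieces sampled, 18 faulty, so 182 conform to specification.
p_hat, z, crit, reject = proportion_z_test(successes=200 - 18, n=200, p0=0.95)
print(f"p = {p_hat:.2f}, Z = {z:.2f}, -Z_alpha = {-crit:.3f}, reject H0: {reject}")
```

Under these assumptions the calculated Z falls below −Zα, so the manufacturer's claim would be rejected at the 5 per cent level.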
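$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{x}_1 - \bar{x}_2)}} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \qquad (1)$$
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{(\bar{x}_1 - \bar{x}_2)}} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \qquad (2)$$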
If the standard deviations σ1 and σ2 are not known, then we may estimate the standard error of
the sampling distribution by substituting the sample standard deviations s1 and s2 as estimates
of the population standard deviations.
Two-Tailed Test
The null and alternative hypotheses are
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 ≠ 𝑑0
Right-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 > 𝑑0
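Left-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 < 𝑑0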
Here d0 is some specified difference that is to be tested. If there is no difference between two
population means, then d0 = 0.
Example 1: Independent random samples taken at two local malls provided the following
information regarding purchases by patrons of the two malls.
                      Hamilton Place    Eastgate
Sample Size                 80             75
Average Purchase           $43            $40
Standard Deviation          $8             $6
At 95% confidence, test to determine whether or not there is a significant difference between the
average purchases by the patrons of the two malls.
Solution: We take as the null hypothesis that there is no significant difference between the average
purchases by patrons of the two malls. Then we have
H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0
Note: The alternate hypothesis tells you that this is a two-tail test.
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{43 - 40}{\sqrt{\frac{8^2}{80} + \frac{6^2}{75}}} = \frac{3}{1.13} = 2.65$$
Since Zcal = 2.65 is more than Zα/2 = 1.96, H0 is rejected, which indicates
that there is a difference in average purchases at the two malls.
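The two-sample Z calculation for the mall data can be checked with a short Python sketch. The function name `two_sample_z` is illustrative, and the sample standard deviations are substituted for the unknown population values, as the notes describe.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Z statistic for the difference of two means (large samples)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # s.e. of (x1bar - x2bar)
    return ((mean1 - mean2) - d0) / se

# Example 1: Hamilton Place (n = 80, mean 43, s.d. 8) vs Eastgate (n = 75, mean 40, s.d. 6).
z = two_sample_z(43, 8, 80, 40, 6, 75)
crit = NormalDist().inv_cdf(1 - 0.05 / 2)          # two-tailed, alpha = 0.05
print(f"Z = {z:.2f}, critical value = {crit:.2f}, reject H0: {abs(z) >= crit}")
```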
Example 2: An experiment was conducted to compare the mean time in days required to recover
from the common cold for persons who are given a daily dose of 4 mg of vitamin C versus those who were
not given a vitamin supplement. Suppose that 35 adults were randomly selected for each treatment
category and the following information was observed:
Note: The alternate hypothesis tells you that this is a single-tailed (left) test.
$$Z = \frac{5.8 - 6.9}{\sqrt{\frac{(1.2)^2}{35} + \frac{(2.9)^2}{35}}} = \frac{-1.1}{0.53} = -2.07$$
Since we are using a one-tailed (left) test, the critical value is −Zα = −1.65. As Zcal = −2.07 is less
than −1.65, the null hypothesis is
rejected. Thus we conclude that the use of vitamin C reduces the mean time required to
recover from the common cold.
3.2 Hypothesis Testing for Two Population Proportions
The hypothesis testing concepts developed in previous section can be extended to test
whether there is any difference between two population proportions. In this case the null
hypothesis is stated as
𝐻0 ∶ 𝑝1 = 𝑝2
Example 3: Out of a sample of 600 men from a city, 450 were smokers. In another sample of 900
men from another city, 450 were smokers. Do the data indicate that the two cities are significantly different
in smoking habit? Test the hypothesis at α = 0.05.
Solution: We define the null and alternative hypotheses as
H0: p1 = p2
H1: p1 ≠ p2
Here we are using a two-tailed test. Given that
n1 = 600, n2 = 900, p̄1 = 450/600 = 0.75, p̄2 = 450/900 = 0.50
Since Zcal = 9.26 is greater than the critical value Z0.025 = 1.96 at the given level of
significance, the null hypothesis H0 is rejected. Hence we conclude that there is a
significant difference in the smoking habit of the two cities.
3.3 Hypothesis Testing for Single Population Mean – Small sample size
Let μ0 be the hypothesized value of the population mean to be tested and x̄ be the sample
mean. Then the null and alternative hypotheses for the two-tailed test are defined as
H0: μ = μ0 and H1: μ ≠ μ0
If the standard deviation σ of the population is not known, then σ is estimated by the sample
standard deviation s. The test statistic follows the t-distribution with n − 1 degrees of freedom.
Then the t-test statistic is given by
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \qquad (4)$$
Decision rule:
Reject H0 if tcal ≤ −tα/2 or tcal ≥ tα/2 (two-tailed)
Reject H0 if tcal ≤ −tα (left-tailed)
Reject H0 if tcal ≥ tα (right-tailed)
where tα and tα/2 are the table values with n − 1 degrees of freedom (critical values of t).
Note: For large samples of 30 or more the t distribution is similar to a normal distribution.
Example 4: Maxwell’s hot Chocolate is concerned about the effect of the recent year long
coffee advertising campaign on hot chocolate sales. Maxwell’s has randomly selected 26
weeks from the past year and found average sales of 912 pounds with a standard deviation of
72 pounds. The average weekly hot chocolate sales two years ago were 980 pounds. Define
suitable hypotheses for testing whether hot chocolate sales have decreased. Use α = 0.05 to test
these hypotheses.
Solution: Given that
Population mean, μ = 980, Sample mean, 𝑥̅ = 912,
Sample standard deviation, s = 72 Sample size, n = 26
We now define the null and alternative hypotheses as
𝐻0 : 𝜇 ≥ 980
𝐻1 : 𝜇 < 980
Since the sample size is small, we use the t-test statistic to test the above hypothesis. From
equation (4), we get
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{912 - 980}{72/\sqrt{26}} = \frac{-68}{14.12} = -4.82$$
The degrees of freedom = (n − 1) = 25. The critical value is −tα = −1.708 at α = 0.05. Since tcal ≤ −tα, H0 is rejected.
Thus we conclude that the hot chocolate sales have decreased.
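The small-sample calculation can be sketched in Python as follows. The standard library has no t-distribution, so the critical value is entered by hand from a t-table (1.708 for 25 degrees of freedom at a one-tailed α of 0.05); the function name `t_statistic` is an illustrative choice.

```python
import math

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic and its degrees of freedom."""
    t = (sample_mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hot chocolate example: n = 26 weeks, mean 912, s = 72, earlier mean 980.
t, df = t_statistic(912, 980, 72, 26)

T_CRIT = 1.708  # t-table value: df = 25, one-tailed alpha = 0.05
reject = t <= -T_CRIT  # left-tailed test (H1: mu < 980)
print(f"t = {t:.2f}, df = {df}, reject H0: {reject}")
```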
or
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \qquad (7)$$
The t statistic computed under the null hypothesis has a df = n1 + n2 – 2. Hence, the probability
associated with t can be computed. If the variances are unequal, an approximation to t is computed.
The number of degrees of freedom is obtained by rounding to the nearest integer.
Two-Tailed Test
The null and alternative hypotheses are
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 ≠ 𝑑0
Right-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 > 𝑑0
Left-Tailed Test
In this case the null and alternative hypotheses are as follows:
𝐻0 : 𝜇1 − 𝜇2 = 𝑑0
𝐻1 : 𝜇1 − 𝜇2 < 𝑑0
Here d0 is some specified difference that is to be tested. If there is no difference between two
population means, then d0 = 0.
Example 6: As per Wall Street Journal, Gasoline prices reached record high levels in 16
states during 2003. Two of the affected states were California and Florida. The American
Automobile Association reported a sample mean price of $1.72 per gallon in Florida and $
2.16 per gallon in California. They used a sample size of 25 for the California and 30 for the
Florida and found the standard deviation as 0.10 in California and 0.08 in Florida. Define
suitable hypothesis and test at α =0.05.
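Solution: Since we are comparing two populations, we take as the null hypothesis that there is
no significant difference between the average prices in the two states. Then we have
H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0
Now the weighted pooled variance s² is obtained from equation (7). Thus we have
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(25 - 1)(0.10^2) + (30 - 1)(0.08^2)}{25 + 30 - 2} = \frac{0.24 + 0.19}{53} = 0.008$$
The test statistic is then
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{2.16 - 1.72}{\sqrt{0.008\left(\frac{1}{25} + \frac{1}{30}\right)}} = \frac{0.44}{0.0243} \approx 18.1$$
The degrees of freedom = (25 + 30 − 2) = 53. The two-tailed critical value at α = 0.05 is approximately 2.01.
Since tcal > tα/2, H0 is rejected. Thus we conclude that there is a significant difference in the
prices of the two states.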
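The pooled-variance arithmetic for the gasoline data can be reproduced with the Python sketch below. It assumes the usual pooled form of the statistic, t = ((x̄1 − x̄2) − d0)/√(s²(1/n1 + 1/n2)), which these notes do not print explicitly; the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled-variance t statistic for two independent small samples."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - d0) / se
    return t, df, pooled_var

# California: n = 25, mean 2.16, s.d. 0.10; Florida: n = 30, mean 1.72, s.d. 0.08.
t, df, pooled_var = pooled_t(2.16, 0.10, 25, 1.72, 0.08, 30)
print(f"pooled variance = {pooled_var:.4f}, t = {t:.2f}, df = {df}")
```

The calculated t is then compared with the t-table value for the resulting degrees of freedom.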
where d̄ is the mean of the differences of the paired sample data and s_d is the standard deviation of the
differences, given by
$$s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} \qquad (9)$$
Example 7: The daily production rates for a sample of factory workers before and after a training
program are shown below.
Worker 1 2 3 4 5
Before 6 10 9 8 7
After 9 12 10 11 9
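At 95% confidence, test whether the training program was effective. That is, did the training program
actually increase the production rates?
Solution: In this case we define the following hypotheses:
H0: μd ≤ 0
Ha: μd > 0
We now prepare the following table to test the hypothesis:
Worker   Before   After   Difference (d)   (d − d̄)²
1           6       9           3            0.64
2          10      12           2            0.04
3           9      10           1            1.44
4           8      11           3            0.64
5           7       9           2            0.04
From the above table, Σd = 11, d̄ = 2.2, n = 5, Σ(d − d̄)² = 2.8 and the standard deviation is
$$s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{2.8}{4}} = 0.84$$
The test statistic is
$$t = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{2.2}{0.374} = 5.88$$
The degrees of freedom = (5 − 1) = 4. Now tcri = 2.132 at α = 0.05. Since tcal ≥ tα, H0 is
rejected. Thus we conclude that the training programme was effective.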
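The paired-sample working for Example 7 can be sketched in Python as below. It assumes the paired statistic t = d̄/(s_d/√n) with differences taken as after minus before; `paired_t` is an illustrative name, and the critical value 2.132 is the t-table figure quoted in the example.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Mean difference, s.d. of differences, t statistic and df for paired data."""
    diffs = [a - b for a, b in zip(after, before)]  # d = after - before
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                 # sample s.d. (n - 1 in the denominator)
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1

# Example 7: production rates of five workers before and after training.
before = [6, 10, 9, 8, 7]
after = [9, 12, 10, 11, 9]
d_bar, s_d, t, df = paired_t(before, after)

T_CRIT = 2.132  # t-table value: df = 4, one-tailed alpha = 0.05
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t:.2f}, reject H0: {t >= T_CRIT}")
```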
Example 8: To determine the effectiveness of a new weight control diet, six randomly selected
students observed the diet for 4 weeks with the results shown below.
Dieter A B C D E F
Weight Before 138 151 129 125 152 140
Weight After 136 149 136 127 146 144
a. Set up the null and alternative hypotheses to see if this diet is effective.
b. Find the mean and standard deviation of the differences.
c. Test the hypothesis stated in Part a.
d. State your conclusion.
Solution: In this case we define the following hypotheses:
H0: μd = 0
Ha: μd ≠ 0
We now prepare the following table to test the hypothesis:
Dieter   Before   After   Difference (d)   (d − d̄)²
A         138      136         −2            6.25
B         151      149         −2            6.25
C         129      136          7           42.25
D         125      127          2            2.25
E         152      146         −6           42.25
F         140      144          4           12.25
From the above table, Σd = 3, d̄ = 0.5, n = 6, Σ(d − d̄)² = 111.5 and the standard deviation is
$$s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{111.5}{5}} = 4.72$$
The test statistic is
$$t = \frac{\bar{d}}{s_d/\sqrt{n}} = \frac{0.5}{1.93} = 0.26$$
The degrees of freedom = (6 − 1) = 5. Now tcri = 2.571 at α/2 = 0.025. Since tcal < tα/2, H0 is not
rejected. Thus we conclude that there is no evidence that the new weight control diet is effective.
4.1 Introduction
In earlier lessons, differences between sampling distributions were
studied through statistics like the mean, standard deviation, proportion etc., which are
estimates of the parameters of the populations, but generally these do not capture all the features
of these distributions. This creates the need for an index which can measure the
degree of difference between the actual frequencies of various groups and can thus compare
all necessary features of them. Tests built on such an index are easier to explain and easier to understand, which
is the reason why such tests have become popular. But one should not forget that they
are usually less efficient or powerful, as they rest on few assumptions, and
the less one assumes, the less one can infer from a set of data. On the other hand,
the more one assumes, the more one limits the applicability of
one’s methods. Non parametric tests are quantitative techniques designed for such situations.
A statistical test is a formal technique, based on some probability distributions, for arriving at
a decision about the reasonableness of an assertion or hypothesis. The test technique makes
use of one or more values, obtained from sample data to arrive at a probability statement
about the hypothesis. But such a test technique also makes use of some more assertions about
the population from which the sample is drawn. For instance, it may be assumed that
population is normally distributed, sample drawn is a random sample and similar other
assumptions. The normality of the population distribution forms the basis for making
statistical inferences about the sample drawn from the population. But no such assumptions
are made in case of non parametric tests. Chi-Square test, Signed Rank tests, Rank sum tests,
Wilcoxon Matched-Pairs Signed-Ranks Test, Mann-Whitney Test are some popular non
parametric tests. But we shall discuss here only Chi-Square test.
The mean of a Chi Square distribution is its degrees of freedom. Chi Square distributions are
positively skewed, with the degree of skew decreasing with increasing degrees of freedom. As
the degrees of freedom increases, the Chi Square distribution approaches a normal distribution.
Figure 1 shows density functions for three Chi Square distributions. Notice how the skew
decreases as the degrees of freedom increases.
Note: On the basis of situation, nature and purpose of test, chi-square test may be classified
as – test of independence of attributes, test of goodness of fit , test of homogeneity, and test
for variance.
As in univariate chi-square test a frequency count of data that nominally identify or categorically
rank groups is acceptable for the chi-square test for contingency tables.
To compute the chi-square value the researcher must first identify an expected distribution for
that table. Under the null hypothesis the same proportion of positive answers (60 percent) should
come from both groups.
There is an easy way to calculate the expected frequencies for the cells in a cross-tabulation
table. To compute an expected number for each cell use the formula
$$E_{ij} = \frac{R_i C_j}{n} \qquad (11)$$
where Ri = total observed frequency in the ith row
Cj = total observed frequency in the jth column
n = sample size
To compute a chi-square statistic the same formula as before is used, except that we calculate
degrees of freedom as the number of rows minus one (R - 1) times the number of columns minus
one (C - 1). As an example, for a 2X2 table, the number of degrees of freedom equals 1:
(R - 1)(C - 1) = (2 - 1)(2 - 1) = 1
Note: Proper use of the chi-square test requires that each expected cell frequency (Eij) have a
value of at least five. If this sample size requirement is not met, the researcher may take a larger
sample or may combine (“collapse”) response categories.
Solution: Let the null hypothesis be that firm size is independent of debt.
H0: Firm size is independent of debt
Ha: Firm size is not independent of debt
Since we are looking for a relationship between company size and debt, the most appropriate
test is the Chi-square. So we use equation (17).
                              Firm size (in $ thousands)
                            <500    500-2000    >2000    Row Total
Debt less than equity         7        10          8        25
Debt greater than equity     10        16          9        35
Column Total                 17        26         17        60
Here the total sample size is n = 60. Using E_ij = (R_i × C_j)/n, the expected frequencies are
E11 = (25 × 17)/60 = 7.08
E12 = (25 × 26)/60 = 10.83
E13 = (25 × 17)/60 = 7.08
E21 = (35 × 17)/60 = 9.92
E22 = (35 × 26)/60 = 15.17
E23 = (35 × 17)/60 = 9.92
The following table is prepared:
Oi Ei (Oi- Ei) (Oi- Ei)2 (Oi - Ei)2/ Ei
7 7.08 -0.08 0.01 0.00
10 10.83 -0.83 0.69 0.06
8 7.08 0.92 0.85 0.12
10 9.92 0.08 0.01 0.00
16 15.17 0.83 0.69 0.05
9 9.92 -0.92 0.85 0.09
From the table, we have
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 0.32$$
Here r = 2, c = 3, so the degrees of freedom = (r − 1)(c − 1) = 2 and the critical value is χ²cri =
4.61. Since χ²cal < χ²cri, we do not reject the null hypothesis. Thus we conclude that firm size
does not depend on the debt of the company.
Example 10: Five hundred students in a school were graded according to their intelligence
and the economic conditions at their homes. Examine whether there is any association
between economic conditions at home and intelligence:
Economic Intelligence
Conditions Good Bad
Rich 125 100
Poor 75 150
Define suitable hypotheses and test at 5% level of significance.
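A possible way to work Example 10 is sketched below in Python, with H0 taken to be that economic condition and intelligence are independent. The function name `chi_square_independence` is illustrative; expected frequencies follow E_ij = R_i C_j / n, and the critical value 3.841 is the chi-square table figure for 1 degree of freedom at the 5% level. The counts are used exactly as printed in the table (they total 450).

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = R_i * C_j / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Example 10: rows = Rich, Poor; columns = Good, Bad intelligence.
observed = [[125, 100],
            [75, 150]]
chi2, df = chi_square_independence(observed)

CHI2_CRIT = 3.841  # chi-square table value: df = 1, alpha = 0.05
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > CHI2_CRIT}")
```

With these counts the calculated value is well above the table value, so independence would be rejected.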
follows the F distribution with df1 = n1 − 1 and df2 = n2 − 1. Here s1² and s2² are the variances of the two
samples and are given by
$$s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} \quad \text{and} \quad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} \qquad (15)$$
If the two populations have equal variances, i.e., σ1² = σ2², then the ratio becomes
$$F = \frac{s_1^2}{s_2^2} \quad (s_1 > s_2) \qquad (16)$$
which has an F distribution with df1 = n1 − 1 for the numerator and df2 = n2 − 1 for the denominator.
Note: 1.The larger the ratio the greater the value of F and if the F value is large, the results are
likely to be statistically significant.
2. To test the null hypothesis of no difference between the sample variances, a table of the F-
distribution is necessary. Use of F-table is much like using the tables of the Z- and t-
distributions. These tables indicate that the distribution of F is actually a family of distributions
that changes quite drastically with changes in sample sizes. Thus degrees of freedom must be
specified. Inspection of an F table allows the researcher to determine the probability of finding
an F as large as the calculated F.
Example 13: The following data relate to the number of units of an item produced per shift by
two workers A and B for a number of days:
A 17 22 25 27 23 20 26
B 29 35 32 33 28 27 25 28
Can it be inferred that worker A is more stable compared to worker B? Define the hypothesis and
test at a significance level of 0.05.
Solution: We take as the null hypothesis that the two workers are equally stable (i.e., have the same variability in production rate):
H0: σA² = σB²
H1: σA² ≠ σB²
Prepare the following table to calculate the two sample variances:
Worker A   Worker B
   X1         X2      (X1 − X̄1)²   (X2 − X̄2)²
   17         29         34.31         0.39
   22         35          0.73        28.89
   25         32          4.59         5.64
   27         33         17.16        11.39
   23         28          0.02         2.64
   20         27          8.16         6.89
   26         25          9.88        21.39
              28                       2.64
From the above table, we have
sA² = 74.86/6 = 12.48, sB² = 79.88/7 = 11.41 and Fcal = 12.48/11.41 = 1.094
Since the critical value F6,7,0.05 = 3.87 is more than Fcal = 1.094, the null
hypothesis is not rejected. Thus we conclude that there is no significant difference in the
variability of production of the two workers.
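The variance-ratio arithmetic of Example 13 can be checked with the Python sketch below. The function name `variance_ratio` is illustrative; following the convention in the notes, the larger sample variance is placed in the numerator. The critical value still has to be read from an F-table for the resulting degrees of freedom.

```python
from statistics import variance

def variance_ratio(sample_1, sample_2):
    """F ratio with the larger sample variance in the numerator."""
    var_1, var_2 = variance(sample_1), variance(sample_2)  # n - 1 denominators
    if var_1 >= var_2:
        return var_1 / var_2, len(sample_1) - 1, len(sample_2) - 1
    return var_2 / var_1, len(sample_2) - 1, len(sample_1) - 1

# Example 13: units produced per shift by workers A and B.
worker_a = [17, 22, 25, 27, 23, 20, 26]
worker_b = [29, 35, 32, 33, 28, 27, 25, 28]
f_cal, df1, df2 = variance_ratio(worker_a, worker_b)
print(f"F = {f_cal:.3f} with df1 = {df1}, df2 = {df2}")
```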
Practice Problems
1. Define sampling distribution.
2. Define standard error of estimate
3. Discuss various types of estimates.
4. Define Type-I and Type-II errors.
5. What is known as parametric test? What are various parametric tests?
6. What are null and alternate hypotheses?
7. What are the components of hypothesis testing?
8. A sample of 16 elements from a normally distributed population is selected. The sample
mean is 10 with a standard deviation of 4. Find the 95% confidence interval for μ.
9. In order to estimate the average time spent on the computer terminals per student at a
local university, data were collected for a sample of 81 business students over a one week
period. Assume the population standard deviation is 1.2 hours. If the sample mean is 9 hours,
then find the 95% confidence interval.
10. You are given the following information:
n = 49, x̄ = 54.8, s = 28 and H0: μ = 50, Ha: μ ≠ 50
11. What is the appropriate statistic to test the hypotheses? Calculate the test statistic.
12. A soft drink filling machine, when in perfect adjustment, fills the bottles with 12 ounces
of soft drink. A random sample of 25 bottles is selected, and the contents are measured. The
sample yielded a mean content of 11.88 ounces, with a standard deviation of 0.24 ounces.
With a 0.05 level of significance, test to see if the machine is in perfect adjustment.
13. Maxwell’s hot Chocolate is concerned about the effect of the recent year long coffee
advertising campaign on hot chocolate sales. The average weekly hot chocolate sales two
years ago was 984.7 pounds and the standard deviation was 72.6 pounds. Maxwell’s has
randomly selected 30 weeks from the past year and found average sales of 912.1 pounds.
Define suitable hypotheses for testing whether hot chocolate sales have decreased. Use
α=0.02 to test this hypotheses.
14. An article about driving practices in Strathcona County, Alberta, Canada claimed that
46% of the drivers did not stop at stop sign intersections on county roads (Edmonton Journal,
July 19, 2000). Two months later, a follow-up study collected data in order to see whether
this percentage had changed and found 420 out of 820 drivers did not stop at stop sign
intersections. Formulate the hypothesis to determine whether the proportion of drivers who
did not stop at stop sign intersections had changed. Test the hypotheses at α=0.05.
15. According to Bureau of labour statistics, the average weekly pay for US production
worker was $ 441.84 with a standard deviation of $90. A sample of 56 workers revealed that
the average weekly pay was $420.64. Test the hypotheses whether the average weekly pay of
production worker is significantly changed. Use α= 0.05.
16. Determine a statistical hypothesis and perform a chi-square test on the following survey
data: Easy-to-listen music should be played on the office intercom.
Agree 40
Neutral 35
Disagree 25
100
17. An advertising firm is trying to determine the demographics for a new product. They have
randomly selected 75 people in each of 5 different age groups and introduced the product to
them. The results of the survey are given below:
Age Groups
Future Activity 18-29 30-39 40-49 50-59 60-69
Purchase frequently 12 18 17 22 32
Seldom purchase 18 25 29 24 30
Never purchase 45 32 29 29 13
(a) State the null and alternative hypotheses.
(b) Calculate the sample chi-square value.
(c) If the level of significance is 0.02, should the null hypothesis be rejected?
***