Shri Ramdeobaba College of Engineering and Management, Nagpur

Hypothesis Testing
A population is an aggregate of objects under study. More precisely, a population
consists of the numerical values associated with these objects. A population containing a finite
number of objects is called a finite population, while a population with an infinite number of
objects is called an infinite population.

For most statistical investigations, complete enumeration of even a finite population is not practicable.
For example, to calculate the average monthly income of the people of a country, we would have to
enumerate all earning individuals in the country, which is a very difficult task. In such cases we take
the help of sampling.

A sample is a finite subset of the statistical individuals of a population. The number of individuals in a
sample is called the sample size. A sample is said to be large if the number of objects in the
sample is at least 30; otherwise it is called small. The process of selecting a sample from a
population is called sampling.

Sampling in which the objects are chosen in such a manner that each object has as good a chance
of being selected as any other is called random sampling.

The statistical constants of the population, namely mean, variance, proportion, etc., are denoted
by μ, σ², etc., respectively, and are called parameters, whereas the statistical measures computed
from the sample observations alone, namely mean, variance, etc., are denoted by x̄, s², etc., and
are called statistics.

Suppose that we draw all possible samples of size n from a population at random. For each sample,
we compute the mean. The means of the samples are not identical. The frequency distribution
obtained by grouping the different means according to their frequencies is called the sampling
distribution of the mean. Similarly, the frequency distribution obtained by grouping the different
variances according to their frequencies is called the sampling distribution of the variance.

The sampling distribution of a statistic for large samples is assumed to be normal. The standard
deviation of the sampling distribution of a statistic is called the standard error of that statistic.

Some important relations between parameters and statistics:

1. E(statistic) = parameter
2. The relationship between the standard deviation of the sample mean, σx̄, and the population
   standard deviation σ is given by
   $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}},$$
   where N is the size of the population and n is the sample size.
3. If the population size N is large and the sample size n is relatively small, then the above
   formula reduces to
   $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$
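
The two standard-error relations, σ/√n with and without the finite population factor √((N−n)/(N−1)), can be checked numerically. The following is a minimal Python sketch; the function name `standard_error_of_mean` and the numbers used are illustrative assumptions, not part of these notes.

```python
import math

def standard_error_of_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; None means the population is treated as infinite
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        # finite population correction factor sqrt((N - n) / (N - 1))
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical numbers, chosen only for illustration.
se_infinite = standard_error_of_mean(sigma=8.0, n=64)
se_finite = standard_error_of_mean(sigma=8.0, n=64, N=1000)
print(se_infinite)          # 1.0
print(round(se_finite, 3))  # 0.968
```

When N is much larger than n the correction factor is close to 1, which is why relation 3 drops it.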

Null Hypothesis: A hypothesis which is a definite statement about the population parameter is
called the null hypothesis and is denoted by H0. The null hypothesis is the one which is tested
for possible rejection under the assumption that it is true.

For example: Let us take the hypothesis that a coin is unbiased (true). Thus H0 is that p = 1/2,
where p is the probability of a head. We toss this coin 10 times and observe the number of times
heads appears. If heads appears too often, then we reject the hypothesis H0 and decide that the
coin is biased.

A hypothesis which is complementary to the null hypothesis is called the alternative hypothesis and
is denoted by H1.

For example, if H0 : p = 1/2 then the alternative hypothesis H1 can be

i. H1 : p ≠ 1/2
ii. H1 : p > 1/2
iii. H1 : p < 1/2
The alternative hypothesis in (i) is called a two-tailed alternative, in (ii) a right-tailed
alternative, and in (iii) a left-tailed alternative.

Level of Significance and Critical Region:


Normally, we aim to reject the null if it is false. However, as with any test, there is a small
chance that we could get it wrong and reject a null hypothesis that is true.
The significance level is denoted by α and is the probability of rejecting the null hypothesis, if
it is true. Typical values for α are 0.01, 0.05 and 0.1. It is a value that we select based on the
certainty we need. In most cases, the choice of α is determined by the context we are operating
in, but 0.05 is the most commonly used value.
For example: Suppose a university wants to carry out an analysis of how students are performing
on average.

The university dean believes that on average students have a CGPA of 70%.

The null hypothesis is: The population mean grade is 70%.

The alternative hypothesis is: The population mean grade is not 70%

Assuming that the population of grades is normally distributed, the distribution of all grades
received by students is a bell-shaped curve centred at μ, the true population mean.

Performing a Z-test: Now, a test we would normally perform is the Z-test. The formula is
$$z = \frac{\bar{x}-\mu}{\sigma_{\bar{x}}} = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}},$$
where x̄ is the sample mean and μ is the hypothesized mean. The idea of this test is the
following:

We are standardizing or scaling the sample mean we got. If the sample mean is close enough to
the hypothesized mean, then Z will be close to 0 and we would accept the null hypothesis.

The question here is the following:


How big should Z be for us to reject the null hypothesis?

So there is a cut-off line. Since we are conducting a two-sided or a two-tailed test, there are two
cut-off lines, one on each side.

When we calculate Z, we will get a value. If this value falls into the middle part, between the two
cut-off lines, then we cannot reject the null. If it falls outside, in either tail, then we reject the
null hypothesis. That is why the two tails beyond the cut-off lines are called the rejection region.

The area that is cut-off actually depends on the significance level.

Say the level of significance, α, is 0.05. Then we have α divided by 2, or 0.025, on the left side
and 0.025 on the right side.
Now these are values we can check from the z-table. When the tail area is 0.025, Z is 1.96. So, 1.96 on the
right side and minus 1.96 on the left side. Therefore, if the value we get for Z from the test is
lower than -1.96, or higher than 1.96, we will reject the null hypothesis. Otherwise, we
will accept it.

Example of a One-Sided Test:

Suppose Paul says data scientists earn more than $125,000. So, H0 is: μ0 is bigger than $125,000.

The alternative is that μ0 is lower or equal to 125,000.

Using the same significance level, this time, the whole rejection region is on the left. So, the
rejection region has an area of α. Looking at the z-table, that corresponds to a Z-score of 1.645.
Since it is on the left, it is with a minus sign.

Now, when calculating our test statistic Z, if we get a value lower than -1.645, we would reject
the null hypothesis. We do that because we have statistical evidence that the data scientist
salary is less than $125,000. Otherwise, we would accept it.

Another One-Tailed Test: Suppose the university dean told us that the average CGPA students
get is lower than 70%. In that case, the null hypothesis is: μ0 is lower than 70%.

While the alternative is: μ0 is bigger than or equal to 70%.

In this situation, the rejection region is on the right side. So, if the test statistic is bigger than the
cut-off z-score, we would reject the null.
To sum up, the significance level and the rejection region are crucial in the process of
hypothesis testing. The level of significance sets how much risk of rejecting a true null hypothesis
we are willing to tolerate. We choose it depending on how big a difference a possible error could
make. On the other hand, the rejection region helps us decide whether or not to reject the null
hypothesis.

Critical Values Zα of Z

Critical value Zα      Level of significance (α)
                       1%              5%              10%
Two-tailed test        |Zα| = 2.58     |Zα| = 1.96     |Zα| = 1.645
Right-tailed test      Zα = 2.33       Zα = 1.65       Zα = 1.28
Left-tailed test       Zα = -2.33      Zα = -1.65      Zα = -1.28
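
The entries of this table are quantiles of the standard normal distribution, so they can be recomputed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_value` is an assumption made for illustration. The computed values (for example 2.576 and 2.326) agree with the table after rounding.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def critical_value(alpha, tail):
    """Critical value of Z at significance level alpha.

    tail is "two", "right" or "left". For a two-tailed test the returned
    value is the bound on |Z|.
    """
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")

# Reproduce the table row by row for the 1%, 5% and 10% levels.
for tail in ("two", "right", "left"):
    row = [round(critical_value(a, tail), 3) for a in (0.01, 0.05, 0.10)]
    print(tail, row)
```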

Procedure for Testing of Hypothesis:

1. Null Hypothesis: Set up the null hypothesis H0.
2. Alternative Hypothesis: Set up the alternative hypothesis H1. This will enable us
   to decide whether we have to use a single-tailed test or a two-tailed test.
3. Level of Significance: Choose the appropriate level of significance α.
4. Test Statistic: Compute
   $$z = \frac{\bar{x}-\mu}{\sigma_{\bar{x}}} = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$$
   under the null hypothesis.
   Remark: If the population standard deviation σ is unknown, then we use its estimate
   provided by the sample standard deviation, i.e. σ = s.
5. Conclusion: Compare the value of z computed in step 4 with the significant
   value Zα at the given level of significance α. If the computed z falls in the rejection
   region, reject H0; otherwise H0 is not rejected.
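
The five steps can be sketched as one small Python function. This is an illustrative sketch, not a prescribed implementation: the name `one_sample_z_test` is hypothetical, and the numbers come from the light-bulb exercise in the Examples list (claimed mean 1600 hours, n = 100, sample mean 1570, s = 120, α = 0.01), treated here as a two-tailed test, which is an assumption about how that exercise is meant to be set up.

```python
import math
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Large-sample test of a mean.

    Returns (z, critical value, True if H0 is rejected).
    sigma may be the sample standard deviation s when sigma is unknown.
    """
    z = (x_bar - mu0) / (sigma / math.sqrt(n))   # step 4: test statistic
    nd = NormalDist()
    if tail == "two":
        crit = nd.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "right":
        crit = nd.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "left":
        crit = nd.inv_cdf(alpha)
        reject = z < crit
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, crit, reject                       # step 5: conclusion

# H0: mu = 1600 against H1: mu != 1600 at alpha = 0.01.
z, crit, reject = one_sample_z_test(1570, 1600, 120, 100, alpha=0.01, tail="two")
print(round(z, 2), round(crit, 2), reject)  # -2.5 2.58 False
```

With these numbers |z| = 2.5 is below 2.58, so under the two-tailed assumption H0 is not rejected at the 1% level.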

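Confidence limits for μ: The 95% confidence interval for μ is given by
$$|Z| \le 1.96, \quad \text{i.e.} \quad \left| \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \right| \le 1.96.$$
This implies
$$\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}},$$
and $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ are known as the 95% confidence limits for μ. Similarly, the 99%
confidence limits for μ are $\bar{x} \pm 2.58\,\sigma/\sqrt{n}$.

However, in sampling from a finite population of size N, the corresponding 95% and 99%
confidence limits for μ are respectively
$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{and} \quad \bar{x} \pm 2.58\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}.$$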
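
As a rough check of these limits, the sketch below computes a confidence interval in Python. The function name `confidence_limits` is an assumption for illustration; the numbers (n = 900, sample mean 3.4, s.d. 2.61) are taken from the last exercise in the Examples list, whose stated answer is 3.2295 to 3.5705.

```python
import math
from statistics import NormalDist

def confidence_limits(x_bar, sigma, n, level=0.95, N=None):
    """Confidence limits x_bar -/+ z * standard error for the mean.

    N is the population size; None means no finite population correction.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%, 2.58 for 99%
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return x_bar - z * se, x_bar + z * se

low, high = confidence_limits(3.4, 2.61, 900)
print(round(low, 4), round(high, 4))  # 3.2295 3.5705
```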

Examples

1. The president of a chain of stores selling dairy products asserts that the mean content of a milk
container is at least 32 ounces. Perform a hypothesis test at the 1% level of significance if the
mean content of a random sample of 60 containers is 31.98 ounces and the sample standard
deviation is 0.10 ounces.
2. The manufacturer of light bulbs claims that his light bulbs last on average 1600
hours. We want to test his claim. A sample of 100 light bulbs was taken and the average bulb
life of this sample was computed to be 1570 hours with a standard deviation of 120 hours. At
α = 0.01, test the claim of this manufacturer.
3. When properly adjusted, an automatic machine should produce parts that have a mean
diameter of 25 mm. Part diameters are normally distributed. The mean diameter of a sample
of 10 parts is to be used to check whether or not the machine is running properly. Perform a
hypothesis test at the 5% level if the mean of the sample is 25.02 mm and the standard deviation
is 0.024 mm.
4. For all children taking an examination, the mean mark was 60% with a standard deviation of
8%. A particular class of 30 children achieved an average of 63%. Is this unusual? [Ans: z = 2.05; null
hypothesis rejected at 5% level of significance and accepted at 1% level of significance]
5. In IIT joint entrance test, the score showed μ = 64 & σ = 8. How large a sample of candidates
appearing in the test must be taken in order that there be 10% chance that its mean score is
less than 62%?[ Ans: n=26]
6. If the mean breaking strength of copper wire is 575kg with standard deviation of 8.3kg, how
large a sample must be used so that there be one chance in 100 that the mean breaking
strength of the sample is less than 572 kg?[Ans: sample size n=42]
7. A normal population has a mean of 6.8 and standard deviation of 1.5. A sample of 400
members gave a mean of 6.75. Is the difference between the means significant? [Ans: No
significant difference, i.e. the null hypothesis is accepted]
8. A research worker wishes to estimate the mean of a population by using a sufficiently large
sample. The probability is 95% that the sample mean will not differ from the true mean by more
than 25% of the standard deviation. How large a sample should be taken? [Ans: the sample
size should be 62]
9. A sample of 900 members has a mean of 3.4 cms and s.d. 2.61 cms. Is the sample from a large
population of mean 3.25 cms and s.d. 2.61 cms? If the population is normal and its mean is
unknown, find the 95% confidence interval for the true mean. [Ans: Null hypothesis accepted and
confidence interval is 3.2295 to 3.5705.]
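
Two of the exercises above ask for a sample size rather than a test. Setting the standardized deviation z·σ/√n equal to the allowed deviation d and solving gives n = (z·σ/d)², rounded up. The Python sketch below applies this to the copper-wire exercise and to the research-worker exercise; both function names are hypothetical, and the rounding-up convention is an assumption.

```python
import math
from statistics import NormalDist

def sample_size_for_tail(sigma, shortfall, tail_prob):
    """Smallest n with P(sample mean < mu - shortfall) <= tail_prob."""
    z = -NormalDist().inv_cdf(tail_prob)        # positive when tail_prob < 0.5
    return math.ceil((z * sigma / shortfall) ** 2)

def sample_size_for_margin(margin_in_sd, level=0.95):
    """Smallest n so that the sample mean is within margin_in_sd standard
    deviations of the true mean with probability `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil((z / margin_in_sd) ** 2)

# Copper wire: sigma = 8.3 kg, shortfall 575 - 572 = 3 kg, one chance in 100.
n_wire = sample_size_for_tail(8.3, 575 - 572, 0.01)
# Research worker: within 25% of the standard deviation with probability 95%.
n_margin = sample_size_for_margin(0.25, 0.95)
print(n_wire, n_margin)  # 42 62
```
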

Test of Significance for Difference of Means

Let $\bar{x}_1$ be the mean of a random sample of size $n_1$ from a population with mean $\mu_1$ and
variance $\sigma_1^2$, and let $\bar{x}_2$ be the mean of an independent random sample of size $n_2$ from
another population with mean $\mu_2$ and variance $\sigma_2^2$. Since the sample sizes are large, both $\bar{x}_1$ and
$\bar{x}_2$ are normally distributed. Also $\bar{x}_1 - \bar{x}_2$, being the difference of two independent normal
variates, is also a normal variate. The Z (standard normal variate, s.n.v.) corresponding to $\bar{x}_1 - \bar{x}_2$ is given by
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - E(\bar{x}_1 - \bar{x}_2)}{\mathrm{S.D.}(\bar{x}_1 - \bar{x}_2)}.$$
Under the null hypothesis $H_0 : \mu_1 = \mu_2$, i.e. there is no significant difference between the
population means, we get
$$E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 = 0.$$
The variance of $\bar{x}_1 - \bar{x}_2$ is given by
$$V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.$$
Thus, under $H_0$, the test statistic becomes
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}.$$
Remarks
1. If $\sigma_1^2 = \sigma_2^2 = \sigma^2$, i.e. if the samples have been drawn from populations with
common S.D. $\sigma$, then under $H_0 : \mu_1 = \mu_2$,
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$$
2. If $\sigma_1$ and $\sigma_2$ are not known, then they are estimated from the sample values
$s_1$ (S.D. of first sample) and $s_2$ (S.D. of second sample), hence
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}.$$
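
The statistic with sample standard deviations can be evaluated with a few lines of Python. The sketch below is illustrative only: the function name `two_sample_z` is an assumption, and the data are those of the electric-bulb exercise in the Examples that follow (manufacturer A: n = 100, mean 1190, s.d. 90; manufacturer B: n = 75, mean 1230, s.d. 120), whose stated answer is Z = -2.42.

```python
import math

def two_sample_z(x1, s1, n1, x2, s2, n2):
    """Z statistic for the difference of two large-sample means under H0: mu1 = mu2."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) / standard_error

z = two_sample_z(1190, 90, 100, 1230, 120, 75)
print(round(z, 2))              # -2.42

# Two-tailed decisions at the 5% and 1% levels (critical values 1.96 and 2.58).
print(abs(z) > 1.96, abs(z) > 2.58)   # True False
```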

Examples:
1. A sample of 100 electric bulbs produced by manufacturer A showed a mean life time
of 1,190 hours and a standard deviation of 90 hours. A sample of 75 bulbs produced by
manufacturer B showed a mean life time of 1,230 hours with a standard deviation of 120
hours. Is there a difference between the mean life times of the two brands at significance
levels of 5% and 1%? [Ans: Z=-2.42, rejected at 5% level of significance and accepted at
1% level of significance]
2. The means of samples of sizes 1000 and 2000 are 67.5 and 68 respectively. Can the samples
be regarded as drawn from the same population of standard deviation 2.5? [Ans:
|Z|=5.16, samples cannot be regarded as drawn from the same population]
3. The mean height of 50 male students who showed above average participation in college
athletics was 68.2 inches with a standard deviation of 2.5 inches, whereas 50 male
students who showed no interest in such participation had a mean height of 67.5 inches
with standard deviation of 2.8 inches. Test the hypothesis that male students who
participate in college athletics are taller than other male students.[Ans:Z=1.32, Null
hypothesis accepted]
4. The yield of wheat in a random sample of 1000 farms in a certain area has a standard
deviation of 192 kg. Another random sample of 1000 farms gives a standard deviation of
224 kg. Are the standard deviations significantly different? [Ans: Z=-3.43, standard
deviations significantly different.]
5. Random samples drawn from two countries gave the following data relating to heights
of adult males:

                                    Country A    Country B
Mean height (in inches)             67.42        67.25
Standard deviation (in inches)      2.58         2.50
Numbers in samples                  1000         1000

i. Is the difference between the means significant?
[Ans: Z=1.56, so no significant difference between the means.]
6. In a certain factory there are two independent processes manufacturing the same item.
The average weight in a sample of 250 items produced from one process is found to be
120 kg with a standard deviation of 12 kg, while the corresponding figures in a sample
of 400 items from the other process are 124 and 14. Obtain the standard error of the
difference between the two sample means. Is this difference significant? Also find the
99% confidence limits for the difference in the average weights of items produced by the
two processes. [Ans: difference between the means lies between 1.33 and 6.67]
7. The mean height of 50 male students who showed above average participation in college
athletics was 68.2 inches with a standard deviation of 2.5 inches, while 50 male students
who showed no interest in such participation had a mean height of 67.5 inches with a
standard deviation of 2.8 inches.
(i) Test the hypothesis that male students who participate in college athletics are
taller than other male students.
(ii) By how much should the sample size of each of the two groups be increased in
order that the observed difference of 0.7 inches in the mean heights be significant
at the 5% level of significance.
8. A normal population has a mean of 6.8 and standard deviation of 1.5. A sample of 400
members gave a mean of 6.75. Is the difference between the means significant?
9. The mean of simple samples of sizes 1000 and 2000 are 67.5 and 68 respectively. Can the
samples be regarded as drawn from the same population of standard deviation 2.5?

Single proportion:

Consider the population of five numbers (N = 5) 0, 3, 6, 3, 18. It contains three even numbers, 0, 6, 18,
and two odd numbers, 3 and 3.
The population proportion of even numbers (successes) is P = 3/5 = 0.6.
Now consider a sample of size n = 3, say 0, 3, 6. Here two numbers are even. Hence the sample
proportion of even numbers is p = 2/3. This p represents the sample proportion; it can take
different values in different samples.
For example,
the sample 0, 3, 3 has p = 1/3,
the sample 0, 6, 18 has p = 1, and so on.
In all, 10 samples of size n = 3 can be drawn from a population of N = 5. The p values for all
10 samples of size 3 are
2/3, 1/3, 2/3, 2/3, 1, 2/3, 1/3, 2/3, 1/3, 2/3.
Here three of the 10 are 1/3, so the probability of p = 1/3 is 3/10 = 0.3; similarly, for p = 2/3
the probability is 6/10 = 0.6 and for p = 1 the probability is 1/10 = 0.1. Here we have the
sampling distribution of the proportion as follows:

Sample proportion p    Probability of p
1/3                    0.3
2/3                    0.6
1                      0.1

It is easy to see that the mean of p is E(p) = 0.6 = P.
Thus, if P is a given population proportion and Q = 1 − P, then for samples of size n
from a population of size N the following relations hold:
1. Mean of the sampling distribution of the proportion = population proportion, i.e. E(p) = P.
2. $$\sigma_p = \sqrt{\frac{PQ}{n}}\,\sqrt{\frac{N-n}{N-1}}$$
3. If N is large (N → ∞), $$\sigma_p = \sqrt{\frac{PQ}{n}}$$
4. If P is not known, then taking p as an estimate of P (with q = 1 − p), $$\sigma_p = \sqrt{\frac{pq}{n}}$$
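
The small population 0, 3, 6, 3, 18 makes it possible to enumerate every sample and check relations 1 and 2 exactly. The Python sketch below does this with exact fractions; the variable names are illustrative assumptions. The two 3s are treated as distinct objects, which is what gives 10 samples.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [0, 3, 6, 3, 18]
N, n = len(population), 3

# Count how often each sample proportion of even numbers occurs.
counts = Counter()
for sample in combinations(population, n):
    evens = sum(1 for x in sample if x % 2 == 0)
    counts[Fraction(evens, n)] += 1

total = sum(counts.values())                     # number of samples
distribution = {p: Fraction(c, total) for p, c in sorted(counts.items())}
for p, prob in distribution.items():
    print(p, prob)

# Mean and variance of the sampling distribution of p.
expected_p = sum(p * prob for p, prob in distribution.items())
variance_p = sum(p * p * prob for p, prob in distribution.items()) - expected_p**2

# Relation 2 squared: (PQ / n) * (N - n) / (N - 1).
P = Fraction(3, 5)
formula_variance = P * (1 - P) / n * Fraction(N - n, N - 1)
print(expected_p, variance_p, formula_variance)  # 3/5 1/25 1/25
```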

Test for Single proportion:

The test statistic $Z_p$ is given by
$$Z_p = \frac{p - P}{\sigma_p}.$$
The probable limits for the proportion in the population are:
a. $p \pm 3\sqrt{pq/n}$
b. The limits for P at level of significance α are given by $p \pm z_\alpha \sqrt{pq/n}$
(for the 5% significance level put $z_\alpha = 1.96$ and for the 1% level of significance put $z_\alpha = 2.58$).
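
A minimal Python sketch of the test statistic and the probable limits follows. The function names are hypothetical, and the data are those of the dice exercise in the Examples below (9,000 throws, 3,240 throws of 3 or 4, P = 1/3 for an unbiased dice). The computed Z is about 5.4, in line with the stated answer of 5.36 up to rounding.

```python
import math

def proportion_z(p, P, n):
    """Z statistic for a sample proportion p against a hypothesized proportion P."""
    return (p - P) / math.sqrt(P * (1 - P) / n)

def proportion_limits(p, n, z=3.0):
    """Limits p -/+ z * sqrt(pq/n); z = 3 gives the probable limits."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

n = 9000
p = 3240 / n                        # observed proportion, 0.36
z = proportion_z(p, 1 / 3, n)
low, high = proportion_limits(p, n)
print(round(z, 1))
print(round(low, 3), round(high, 3))   # 0.345 0.375
```
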
Examples:
1. A dice is thrown 9,000 times and a throw of 3 or 4 is observed 3,240 times. Show that the
dice cannot be regarded as an unbiased one and find the limits between which the
probability of a throw of 3 or 4 lies.
[Ans: Dice is biased, Z=5.36, probability of getting 3 or 4 lies between 0.345 and 0.375]
2. A random sample of 500 pineapples was taken from a large consignment and 65 were
found to be bad. Show that the S.E. of the proportion of bad ones in a sample of this size
is 0.015 and deduce that the percentage of bad pineapples in the consignment almost
certainly lies between 8.5 and 17.5.
3. A random sample of 500 apples was taken from a large consignment and 60 were found
to be bad. Obtain the 98% confidence limits for the percentage of bad apples in
the consignment.
[Ans: 8.61 to 15.38]
4. In a sample of 1,000 people in Maharashtra, 540 are rice eaters and the rest are wheat
eaters. Can we assume that both rice and wheat are equally popular in this State at 1%
level of significance?
[Ans: Z=2.532, null hypothesis accepted]
5. Twenty people were attacked by a disease and only 18 survived. Will you reject the
hypothesis that the survival rate, if attacked by this disease, is 85% in favor of the
hypothesis that it is more, at 5% level?
[Hint: The alternative hypothesis is one-sided (right-tailed), so apply the right-tailed test for testing
the significance of Z. Z=0.633, accept the null hypothesis. Here the null hypothesis is
H0 : P = 0.85]

Test of Significance for difference of Proportions

Suppose we want to compare two distinct populations with respect to the prevalence of a
certain characteristic, say A, among their members. Let $X_1$ and $X_2$ be the numbers of persons
possessing the characteristic A in random samples of sizes $n_1$ and $n_2$ from the two
populations respectively. Then the sample proportions are given by $p_1 = X_1/n_1$ and $p_2 = X_2/n_2$.
If $P_1$ and $P_2$ are the population proportions (with $Q_1 = 1 - P_1$ and $Q_2 = 1 - P_2$), then
$$E(p_1) = P_1, \quad E(p_2) = P_2; \quad V(p_1) = \frac{P_1 Q_1}{n_1} \quad \text{and} \quad V(p_2) = \frac{P_2 Q_2}{n_2}.$$
Since for large samples $p_1$ and $p_2$ are normally distributed, $(p_1 - p_2)$ is also normally
distributed. The standard variable corresponding to the difference $(p_1 - p_2)$ is given by
$$Z = \frac{(p_1 - p_2) - E(p_1 - p_2)}{\sqrt{V(p_1 - p_2)}}.$$
Under the null hypothesis $H_0 : P_1 = P_2$, we have
$$E(p_1 - p_2) = E(p_1) - E(p_2) = 0; \quad V(p_1 - p_2) = V(p_1) + V(p_2) = \frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}.$$

Hence under the null hypothesis $H_0 : P_1 = P_2$ the test statistic for the difference of proportions
becomes
$$Z = \frac{p_1 - p_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}}.$$

In general, we do not have any information as to the proportion of A's in the populations from
which the samples have been taken. Under $H_0 : P_1 = P_2 = P$ (say), an unbiased estimate of the
population proportion P, based on both the samples, is given by
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2}.$$
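
Substituting the pooled estimate for both P1 and P2 in the variance gives the usual working form Z = (p1 − p2) / √(P̂Q̂(1/n1 + 1/n2)) with Q̂ = 1 − P̂. That substitution is the standard pooled form and is stated here as an assumption, since the notes stop at the estimate itself. The Python sketch below uses a hypothetical function name and the Calcutta/Bombay exercise from the Examples (62% of 500 in favour; 41% of 400 against, so 59% in favour).

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: P1 = P2, from success counts x1, x2.

    Returns (z, pooled estimate of P).
    """
    p1, p2 = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)                # pooled estimate of P
    standard_error = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error, p_hat

# 62% of 500 is 310 in favour; 59% of 400 is 236 in favour.
z, p_hat = two_proportion_z(310, 500, 236, 400)
print(round(z, 2), round(p_hat, 4))   # 0.92 0.6067
```

Since |Z| is well below 1.96, the difference is not significant at the 5% level, in agreement with the answer quoted for that exercise.
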
Examples

1. Random samples of 400 men and 600 women were asked whether they would like to
have a flyover near their residence. 200 men and 325 women were in favor of the
proposal. Test the hypothesis that proportions of men and women in favor of the
proposal, are same against that they are not, at 5% level.[Ans: Z=-1.269, we may
conclude that men and women do not differ significantly as regards proposal of flyover
concerned]
2. A company has its head office at Calcutta and a branch at Bombay. The personnel
director wanted to know if the workers at the two places would like the introduction of
a new plan of work, and a survey was conducted for this purpose. Out of a sample of
500 workers at Calcutta, 62% favoured the new plan. At Bombay, out of a sample of 400
workers, 41% were against the new plan. Is there any significant difference between the
two groups in their attitude towards the new plan at 5% level? [Ans: Z=0.917, we
conclude that there is no significant difference between the two groups in their
attitudes toward the new plan]
3. Before an increase in excise duty on tea, 800 persons out of a sample of 1,000 persons
were found to be tea drinkers. After an increase in duty, 800 people were tea drinkers in
a sample of 200 people. Using the standard error of proportion, state whether there is a
significant decrease in the consumption of tea after the increase in excise duty. [Ans:
Z=6.842, there is a significant decrease in consumption of tea after the increase in the excise
duty.]
4. A cigarette manufacturing firm claims that its brand A of cigarettes outsells its brand
B by 8%. It is found that 42 out of a sample of 100 smokers prefer brand A and 18 out
of another random sample of 100 smokers prefer brand B. Test whether the 8% difference
is a valid claim. [Ans: Z=-1.02, we may conclude that a difference of 8% in the sale of the two
brands of cigarettes is a valid claim]
5. On the basis of their total scores, 200 candidates of a civil service examination are
divided into two groups, the upper 30 per cent and the remaining 70 per cent. Consider
the first question of this examination. Among the first group, 40 had the correct answer,
whereas among the second group, 80 had the correct answer. On the basis of these
results, can one conclude that the first question is no good at discriminating ability of the
type being examined here? [Ans: Z=1.258. Here we conclude that the first question is not
good enough to distinguish between the ability of the two groups of candidates]
6. In a year there are 956 births in a town A, of which 52.5% were males, while in towns A
and B combined, this proportion in a total of 1406 births was 0.496. Is there any
significant difference in the proportion of male births in the two towns? [Ans: Z=3.368,
we conclude that there is a significant difference in the proportion of male births in the
towns A and B]
7. In two large populations, there are 30 and 25 per cent respectively of blue-eyed people.
Is this difference likely to be hidden in samples of 1,200 and 900 respectively from the
two populations? [Ans: Z=2.56, we conclude that the difference in population proportions is
unlikely to be hidden in sampling]
