
Hypothesis Testing

Proposition: A statement about observable phenomena (concepts) that may be judged as true or false.

Hypothesis: A proposition formulated for empirical testing.

A hypothesis is tentative and conjectural in nature.

Null hypothesis (H0): A statement in which no difference or effect is expected.

If the null hypothesis is not rejected, no change will be made.

Alternative hypothesis (H1): A statement that some difference or effect is expected.

Accepting the alternative hypothesis will lead to changes in opinions or actions.
One-tailed test: A test of the null hypothesis
where the alternative hypothesis is expressed
directionally.
Right-tail test
H0 : P ≤ 0.4
H1 : P > 0.4

Left-tail test
H0 : P ≥ 0.4
H1 : P < 0.4
Two-tailed test: A test of the null hypothesis
where the alternative hypothesis is not
expressed directionally.
H0 : P = 0.4
H1 : P ≠ 0.4
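Test statistic
A measure of how close the sample has come to
the null hypothesis:
$Z = \frac{t - \theta}{SE(t)}$
where t is the sample statistic, θ is the parameter value under the null hypothesis, and SE(t) is the standard error of t.
It often follows a well-known distribution, such
as the normal, t, or chi-square distribution.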
Test statistic     Used for
(i) Z-test         Tests of hypotheses involving a large sample, i.e. n > 30
(ii) t-test        Tests of hypotheses involving a small sample, i.e. n ≤ 30, and if σ is unknown
(iii) χ²-test      Testing the discrepancy between observed frequencies and expected frequencies, without any reference to a population parameter
(iv) F-test        Testing sample variances
Statistical Decision of the Test
                 True situation
Decision         H0 is true                   H0 is false
Accept H0        Correct decision (1 − α)     Type II error (β)
Reject H0        Type I error (α)             Correct decision (1 − β)

Type I error: Also known as alpha error, it occurs when the sample results lead to the rejection of a null hypothesis that is in fact true.

Type II error: Also known as beta error, it occurs when the sample results lead to the acceptance of a null hypothesis that is in fact false.
α = Level of significance
1 − α = Confidence level
α = The probability of committing a Type I error
β = The probability of committing a Type II error

Level of significance (α): The probability of committing a Type I error.

Power of a test (1 − β): The probability of rejecting the null hypothesis when it is in fact false and should be rejected.
Critical value Z_α at the α% level of significance
Type of test        1% level    5% level
1. Two-tailed       2.58        1.96
2. One-tailed
   - Right tail     +2.33       +1.645
   - Left tail      −2.33       −1.645
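The tabulated critical values can be reproduced from the standard normal distribution. The sketch below is only an illustration using Python's standard library; the function name `critical_value` and its `tail` argument are assumptions introduced here, not part of the slides.

```python
from statistics import NormalDist


def critical_value(alpha, tail="two"):
    """Return the standard normal critical value for significance level alpha.

    tail is "two", "right", or "left" (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


for alpha in (0.01, 0.05):
    print(alpha,
          round(critical_value(alpha, "two"), 3),
          round(critical_value(alpha, "right"), 3),
          round(critical_value(alpha, "left"), 3))
```

Rounded to the precision used in the table, these values agree with 2.58, 1.96, ±2.33 and ±1.645.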
A General Procedure for Hypothesis Testing
1. Formulate the null hypothesis and the alternative hypothesis.
2. Select an appropriate test.
3. Choose the level of significance.
4. Collect data and calculate the test statistic (Z_calculated).
5. Determine the critical value of the test (Z_tabular).
6. Make a statistical decision:
   If |Z_calculated| < Z_tabular, then H0 is accepted.
   If |Z_calculated| > Z_tabular, then H0 is rejected.
   For a one-tailed test, compare the signed Z_calculated with the signed critical value in the direction of H1.
7. Translate the statistical decision into a research decision.
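Steps 4 to 6 of the procedure can be sketched in code for the case of a mean tested with a Z statistic. This is a minimal illustration, not a definitive implementation: the function name `z_test_mean`, its parameters, and the numbers in the usage lines are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sd, n, alpha=0.05, tail="two"):
    """Return (z, reject_h0) for a Z-test of a mean.

    sd is the standard deviation used in the standard error.
    tail is "two", "right", or "left" (labels assumed for this sketch).
    """
    se = sd / sqrt(n)                   # standard error of the mean
    z = (sample_mean - mu0) / se        # step 4: calculated statistic
    std_normal = NormalDist()
    if tail == "two":                   # steps 5 and 6: compare with critical value
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, reject


# Hypothetical data: sample mean 52, hypothesized mean 50, s.d. 10, n = 100
print(z_test_mean(52, 50, 10, 100, alpha=0.05, tail="two"))
print(z_test_mean(52, 50, 10, 100, alpha=0.01, tail="right"))
```

With these illustrative numbers Z = 2, so H0 is rejected in the two-tailed test at 5% but not in the right-tailed test at 1%.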
1. Test for the mean of any random sample if the population standard deviation (σ) is known

$Z = \frac{\bar{X} - \mu}{SE(\bar{X})}$

$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$ or $SE(\bar{X}) = \frac{S}{\sqrt{n}}$

Where, $\bar{X}$ = sample mean
μ = population mean
σ = population S.D.
S = sample S.D.
n = sample size

Note: In case of a finite population of size N, the standard error has to be multiplied by the finite population multiplier, $\sqrt{\frac{N - n}{N - 1}}$.
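The standard error, with and without the finite population multiplier, can be sketched as follows. The function name `standard_error` and the optional `population_size` argument are assumptions made for this illustration, and N = 500 in the usage line is a made-up value.

```python
from math import sqrt


def standard_error(sd, n, population_size=None):
    """Standard error of the sample mean.

    If population_size (N) is given, the result is multiplied by the
    finite population multiplier sqrt((N - n) / (N - 1)).
    """
    se = sd / sqrt(n)
    if population_size is not None:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se


print(standard_error(30, 25))                       # infinite population
print(standard_error(30, 25, population_size=500))  # hypothetical N = 500
```

The multiplier is always at most 1, so the finite-population standard error never exceeds the infinite-population one.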
Case Study - 1. The Phillips company claims that the
length of life of its electric bulbs is 2000 hours with a
standard deviation of 30 hours. A random
sample of 25 showed an average life of 1940
hours with a standard deviation of 25 hours. At
the 5% level of significance, can we conclude that
the sample has come from a population with a
mean of 2000 hours?
Discussion of Case Study - 1.
Hint: H0 can be confirmed from the statement of the claim / conclusion. If the population standard deviation is given, then the sample standard deviation is redundant.

• Solution:
• Given α = 5%, $\bar{x} = 1940$, μ = 2000, σ = 30, S = 25, n = 25
• H0 : μ = 2000 hours
• H1 : μ ≠ 2000 hours (two-tailed test)
• Test statistic: $t = \bar{x}$
• $S.E.(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{30}{\sqrt{25}} = 6$
• $Z = \frac{\bar{x} - \mu}{S.E.(\bar{x})} = \frac{1940 - 2000}{6} = -10$
• $|Z_{calculated}| = |-10| = 10$
• $Z_{tabular} = Z_{0.05} = 1.96$
• $|Z_{cal}| > Z_{tab}$

• Since the calculated value of |Z| is greater than the tabular value, H0 is rejected.
• The sample has not come from a population with mean 2000.
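The arithmetic of Case Study 1 can be checked with a few lines of Python. This is only a verification sketch using the figures given in the case; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Case Study 1
mu0, sigma, n, sample_mean, alpha = 2000, 30, 25, 1940, 0.05

se = sigma / sqrt(n)                          # standard error of the mean
z = (sample_mean - mu0) / se                  # calculated Z
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value, about 1.96
reject_h0 = abs(z) > z_crit

print(se, z, round(z_crit, 2), reject_h0)
```

The sample standard deviation S = 25 does not appear because σ is known.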
Case Study - 2. A pharma company hypothesizes
that the effect of a certain sedative is 13 hours
with a known standard deviation of 2 hours.
From a sample of 16 patients, the sample mean
is found to be 12 hours. At the 0.01 level of
significance, should the company conclude that
the average effect of the sedative is less than or
equal to 13 hours?
Discussion of Case Study - 2.
Hint: Less than or equal is null hypothesis.
Given μ0 = 13, σ = 2, n = 16, $\bar{x} = 12$, α = 0.01
H0 : μ ≤ 13
H1 : μ > 13 (one-tailed, right tail)
$S.E.(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{2}{\sqrt{16}} = \frac{2}{4} = \frac{1}{2} = 0.5$
$Z = \frac{\bar{x} - \mu_0}{S.E.(\bar{x})} = \frac{12 - 13}{0.5} = -2$
$Z_{calculated} = -2$
$Z_{tabular} = Z_{0.01} = 2.33$ (one-tailed)
Since $Z_{cal} = -2 < Z_{0.01} = 2.33$, H0 is accepted. Hence the average effect of the sedative is less than or equal to 13 hours.
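For a right-tailed test such as Case Study 2, the signed statistic is compared with the positive critical value. The short sketch below repeats the case's arithmetic under that convention; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Case Study 2
mu0, sigma, n, sample_mean, alpha = 13, 2, 16, 12, 0.01

se = sigma / sqrt(n)                      # standard error of the mean
z = (sample_mean - mu0) / se              # signed calculated Z
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value, about 2.33
reject_h0 = z > z_crit                    # H1: mu > 13, so only large positive Z rejects

print(se, z, round(z_crit, 2), reject_h0)
```

Because the sample mean lies below 13, Z is negative and cannot fall in the right-tail rejection region.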
Case Study - 3. In order to test whether the
average weekly maintenance cost of a fleet of
buses is more than Rs.500, a random sample of
49 buses was taken. The mean and the standard
deviation were found to be Rs.506 and Rs.42.
Assume α = 0.05
Discussion of Case Study - 3.
Hint: The sample standard deviation is used in the S.E. as σ is unknown; "more than" goes in the alternative hypothesis H1.
Given μ0 = 500, n = 49, $\bar{x} = 506$, S = 42, α = 0.05
H0 : μ ≤ 500
H1 : μ > 500 (one-tailed)
$S.E.(\bar{x}) = \frac{S}{\sqrt{n}} = \frac{42}{\sqrt{49}} = 6$
$Z = \frac{\bar{x} - \mu_0}{S.E.(\bar{x})} = \frac{506 - 500}{6} = 1$
$Z_{tabular} = Z_{0.05} = 1.645$ (one-tailed)
Since $Z_{cal} = 1 < Z_{tab} = 1.645$, H0 is accepted.
Hence, the average weekly maintenance cost of a fleet of buses is not more than Rs.500.
Sample Size Determination based on
Precision Rate and Confidence
$Z = \frac{t - \theta}{SE(t)}$
where $\theta \in [t - Z \cdot SE(t),\ t + Z \cdot SE(t)]$,
i.e., the parameter θ has a chance of belonging to the confidence interval with lower limit $t - Z \cdot SE(t)$ and upper limit $t + Z \cdot SE(t)$.
i.e., there is a chance of finding θ if one searches within a distance of $Z \cdot SE(t)$ of t.
i.e., there is a confidence of finding θ within a distance of $Z \cdot SE(t)$ either to the left or to the right of t.
Hence $Z \cdot SE(t)$ is the acceptance error, which is known as the desired precision.
e = Error of acceptance = Precision = $Z_{\alpha} \cdot SE(t)$
HOW TO DETERMINE SAMPLE SIZE
If the sample is too small, it may not support the analysis; if it is too large, resources may be wasted. To strike a balance between the two, one should determine the optimum sample size for a specific level of precision.
The optimum size is the one that secures a compromise between the precision to be sacrificed and the effort involved in observing a sample of a given size.

Practical Steps in Determining Sample Size for Estimating A Mean


Step-1 : Specify the desired confidence level (say 95%, 99%)
Step-2 : Calculate the permissible sampling error (E)
Step-3 : Calculate the standard deviation, 𝝈
Step-4 : Calculate the sample size (n) for estimating a mean as follows:
Sample Size, $n = \frac{Z^2 \sigma^2}{E^2}$ or $\left(\frac{\sigma Z}{E}\right)^2$
Illustration
It is known that the population standard deviation in waiting time
for new gas connection in a particular town is 25 days. How large a
sample should be chosen to be 95% confident that the average
waiting time is within 6.125 days of the true average?

Sol:
σ = 25, Z corresponding to the 95% confidence level = 1.96,
E = 6.125

Sample Size, $n = \left(\frac{\sigma Z}{E}\right)^2 = \left(\frac{25 \times 1.96}{6.125}\right)^2 = 64$
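The sample size formula for a mean is easy to check numerically. The helper name `sample_size_for_mean` is an assumption introduced for this sketch; the inputs are the figures from the illustration.

```python
def sample_size_for_mean(sigma, error, z):
    """Sample size n = (z * sigma / error) ** 2 for estimating a mean."""
    return (z * sigma / error) ** 2


# sigma = 25 days, E = 6.125 days, Z = 1.96 for 95% confidence
n = sample_size_for_mean(sigma=25, error=6.125, z=1.96)
print(round(n))
```

The unrounded result is 64 up to floating-point error, which matches the illustration.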
PRACTICAL STEPS INVOLVED IN DETERMINING SAMPLE SIZE FOR ESTIMATING A PROPORTION

Step-1 : Specify the desired confidence level (say 95%, 99%)
Step-2 : Calculate the permissible sampling error (E)
Step-3 : Calculate the estimated proportion of success (P)
Step-4 : Calculate the sample size (n) for estimating a proportion as follows:
Sample Size, $n = \frac{Z^2 P(1 - P)}{E^2}$ or $\frac{Z^2 PQ}{E^2}$, where Q = 1 − P
Illustration
A firm wishes to estimate with an error of not more than 0.02 and a level
of confidence of 98%, the proportion of customers who prefer its brand of
household detergent. Sales reports indicate that about 0.10 of all
consumers prefer the firm’s brand. What is the requisite sample size?
Sol:
P = 0.10, 1 − P = 1 − 0.10 = 0.90, E = 0.02
Z corresponding to the 98% confidence level = 2.33
Sample Size, $n = \frac{Z^2 P(1 - P)}{E^2}$

$n = \frac{(2.33)^2 \times 0.1 \times 0.9}{(0.02)^2} = 1221.50 \approx 1222$ (approx.)

n = 1222.
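The same check can be made for the proportion formula. The helper name `sample_size_for_proportion` is hypothetical; the inputs are those of the detergent illustration.

```python
def sample_size_for_proportion(p, error, z):
    """Sample size n = z^2 * p * (1 - p) / error^2 for estimating a proportion."""
    return z ** 2 * p * (1 - p) / error ** 2


# P = 0.10, E = 0.02, Z = 2.33 for 98% confidence
n = sample_size_for_proportion(p=0.10, error=0.02, z=2.33)
print(round(n, 2), round(n))
```

The unrounded value is about 1221.50, which the illustration rounds to 1222.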
Case Study-I:
Determine the size of the sample for estimating
the true weight of the cereal containers for the
universe with N = 5000 on the basis of the
following information:
(1) The variance of weight = 4 ounces on the
basis of past records.
(2) The estimate should be within 0.8 ounces of the
true average weight with 99% probability.
Will there be a change in the size of the sample
if we assume an infinite population in the above
case?
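Discussion:
When N is known
$E = z \cdot \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N - n}{N - 1}}$
$\Rightarrow E^2 = z^2 \cdot \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$
$\Rightarrow (0.8)^2 = (2.58)^2 \cdot \frac{(2)^2}{n} \cdot \frac{5000 - n}{5000 - 1}$
$\Rightarrow \frac{4999(0.64)}{(2.58)^2 \cdot (2)^2} = \frac{5000 - n}{n}$
$\Rightarrow \frac{4999(0.64)}{(2.58)^2 \cdot (2)^2} = \frac{5000}{n} - 1$
$\Rightarrow \frac{3199.36}{26.6256} + 1 = \frac{5000}{n}$
$\Rightarrow \frac{3199.36 + 26.6256}{26.6256} = \frac{5000}{n}$
$\Rightarrow \frac{3225.9856}{26.6256} = \frac{5000}{n}$
$\Rightarrow 121.16105 = \frac{5000}{n}$
$\Rightarrow n = \frac{5000}{121.16105}$
$\Rightarrow n = 41.27 \approx 41$ (approx.)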


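Discussion:
When N is infinite / unknown
$E = z \cdot \frac{\sigma}{\sqrt{n}}$
$E^2 = z^2 \cdot \frac{\sigma^2}{n}$
$n = z^2 \cdot \frac{\sigma^2}{E^2}$
$n = (2.58)^2 \cdot \frac{(2)^2}{(0.8)^2}$
$n = \frac{6.6564 \cdot (4)}{0.64}$

n = 41.6025 ≈ 42 (approx.)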
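Solving the finite-population equation $E^2 = z^2 \cdot \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ for n gives the closed form $n = \frac{N n_0}{N - 1 + n_0}$, where $n_0 = \frac{z^2 \sigma^2}{E^2}$ is the infinite-population sample size. The sketch below compares the two for Case Study I; the function names are assumptions made for illustration.

```python
def infinite_population_n(sigma, error, z):
    """n0 = (z * sigma / error) ** 2, ignoring the population size."""
    return (z * sigma / error) ** 2


def finite_population_n(sigma, error, z, population_size):
    """Sample size after applying the finite population multiplier.

    Uses n = N * n0 / (N - 1 + n0), obtained by solving
    E^2 = z^2 * sigma^2 / n * (N - n) / (N - 1) for n.
    """
    n0 = infinite_population_n(sigma, error, z)
    return population_size * n0 / (population_size - 1 + n0)


# Case Study I: variance 4 (sigma = 2), E = 0.8, z = 2.58, N = 5000
n_infinite = infinite_population_n(2, 0.8, 2.58)
n_finite = finite_population_n(2, 0.8, 2.58, 5000)
print(round(n_finite, 2), round(n_infinite, 4))
```

The unrounded values are about 41.27 and 41.60, so the finite population multiplier changes the requirement only slightly when N is large relative to n.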
Case Study- II:
Suppose a certain hotel management is
interested in determining the percentage of the
hotel's guests who stay for more than 3 days. The
reservation manager wants to be 95 percent
confident that the percentage has been
estimated to be within ± 3% of the true value.
What is the most conservative sample size
needed for this problem?
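Discussion:
When N is unknown
$E = z \cdot \sqrt{\frac{pq}{n}}$
$\Rightarrow E^2 = z^2 \cdot \frac{pq}{n}$
$\Rightarrow n = z^2 \cdot \frac{pq}{E^2}$
$\Rightarrow n = (1.96)^2 \cdot \frac{0.5(1 - 0.5)}{(0.03)^2}$
$\Rightarrow n = \frac{3.8416(0.25)}{0.0009}$
$\Rightarrow n = \frac{0.9604}{0.0009}$
$\Rightarrow n = 1067.11 \approx 1067$ (approx.)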
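The choice p = 0.5 is the "most conservative" one because the product p(1 − p) is largest there, which makes the required n largest. The sketch below illustrates this by evaluating the formula over a grid of candidate proportions; the grid and the helper name are assumptions for this example only.

```python
def sample_size_for_proportion(p, error, z):
    """Sample size n = z^2 * p * (1 - p) / error^2 for estimating a proportion."""
    return z ** 2 * p * (1 - p) / error ** 2


# Candidate proportions 0.1, 0.2, ..., 0.9 (an illustrative grid)
candidates = [i / 10 for i in range(1, 10)]
sizes = {p: sample_size_for_proportion(p, error=0.03, z=1.96) for p in candidates}

worst_p = max(sizes, key=sizes.get)  # proportion requiring the largest sample
print(worst_p, round(sizes[worst_p], 2))
```

On this grid the largest requirement occurs at p = 0.5, with n of about 1067.11 as in the discussion.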
