Prof GRC Nair
A hypothesis is a statement, claim, assumption, or
assertion that is put to the test.
It is neither proved nor provable.
e.g. The accused is innocent, Earth is round, Vitamin C
prevents colds, Fiesta gives 20 km per liter, μ = 100, etc.
Every hypothesis implies its contradiction or alternative.
e.g. The accused is guilty, Earth is flat, Vitamin C has no
effect on colds, Fiesta gives < 20 km per liter, μ ≠ 100, etc.
What is a Hypothesis?
Testing is done on evidence from sample data.
A Null Hypothesis, denoted by H0, is an assertion.
This is the assertion we hold to be true until we have
sufficient statistical evidence to conclude otherwise.
H0: μ = 100
The Alternative Hypothesis, denoted by H1, is the
assertion of the situation not covered by the Null Hypothesis.
H1: μ ≠ 100, or H1: μ < 100, or H1: μ > 100
H0 and H1 are:
Mutually exclusive
- Only one can be true.
Collectively exhaustive
- Together they cover all possibilities, so one or the
other must be true.
Statistical Hypothesis Testing
One hypothesis is maintained to be true
until a decision is made to reject it as false:
Guilt is established "beyond reasonable doubt".
The alternative is highly improbable.
Decision Making
A Hypothesis is either true or false, and we
may reject it or we may fail to reject it on the
basis of information in the test.
Hypotheses about other parameters, such as
population proportions and population
variances, are also possible. For example:
H0: p ≥ 40%
H1: p < 40%
H0: σ² ≤ 50
H1: σ² > 50
Hypothesis about other Parameters
The Null Hypothesis:
Often represents the status quo
situation or an existing belief.
Is maintained, or held to be true, until
a test leads to its rejection in favor of
the Alternative Hypothesis.
Is rejected as false or not rejected
(continuing to take it as true) on the
basis of a test statistic.
The Null Hypothesis, H0
A test statistic is a sample statistic computed
from sample data. The value of the test
statistic is used to decide whether or not to reject
the null hypothesis.
The decision rule of a statistical hypothesis
test is a rule that specifies the conditions
under which the Null Hypothesis may be
rejected.
Consider H0: μ = 100. We may have a decision rule that
says: "Reject H0 if the sample mean is less than 95 or
more than 105."
In a courtroom we may say: "The accused is innocent
until proven guilty beyond a reasonable doubt."
The Concepts of Hypothesis Testing
There are two possible states of nature:
H0 is true
H0 is false
There are two possible decisions:
Reject H0, i.e. take H0 as false.
Fail to reject H0, i.e. take H0 as true.
Possible Decisions
Correct Decisions
A decision to reject or not to reject H0 may be:
Correct
A true hypothesis may not be rejected
» e.g. An innocent defendant may be acquitted
A false hypothesis may be rejected
» e.g. A guilty defendant may be convicted
A decision may be correct in two ways:
Fail to reject a true H0
Reject a false H0
A decision to reject or not to reject H0 may be:
Incorrect
A true hypothesis may be rejected
» e.g. An innocent defendant may be convicted
A false hypothesis may not be rejected
» e.g. A guilty defendant may be acquitted
A decision may be incorrect in two ways:
Reject a true H0: Type I Error
Fail to reject a false H0: Type II Error
Incorrect Decisions - Errors
A contingency table illustrates the possible
outcomes of a statistical hypothesis test.
Type I and Type II Errors
The probability of a Type I error is denoted
by α.
α is called the level of significance of
the test. (1 - α) is the confidence level.
The probability of a Type II error is
denoted by β.
1 - β is called the power of the test.
α and β are conditional probabilities:
α = P(Reject H0 | H0 is true)
β = P(Fail to reject H0 | H0 is false)
Significance & Power of a Test
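As a rough illustration of α, β and power, the sketch below applies the earlier decision rule ("reject H0: μ = 100 if the sample mean is below 95 or above 105"). The population standard deviation (15), the sample size (36) and the alternative mean (105) are assumed values chosen only for illustration; they do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical setup: sigma, n and the alternative mean are NOT from the slides.
mu0, sigma, n = 100, 15, 36
se = sigma / n ** 0.5            # standard error of the sample mean (2.5)
lower, upper = 95, 105           # decision rule: reject H0 outside [95, 105]


def prob_not_rejected(true_mu):
    """P(lower <= sample mean <= upper) when the population mean is true_mu."""
    dist = NormalDist(mu=true_mu, sigma=se)
    return dist.cdf(upper) - dist.cdf(lower)


alpha = 1 - prob_not_rejected(mu0)   # P(reject H0 | H0 is true)
beta = prob_not_rejected(105)        # P(fail to reject H0 | true mean is 105)
power = 1 - beta

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Widening the interval [95, 105] would lower α but raise β for the same alternative, which is the trade-off the next slide describes.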
For a given sample size, when α increases, β decreases.
Hence keep lower whichever error is more
costly for the decision at hand,
considering the consequences. e.g.
Keep α very low in a murder case.
The innocent should not be punished. (β will be high.)
H0: Food is free of poison. Keep β very low.
Wrongly rejecting safe food (a Type I error) causes
only the inconvenience of transportation, delay, etc.
If action is to be taken when a parameter is
either too high or too low compared with
some specific value 'a', then H1 is that the
parameter ≠ 'a', and such a test is a
two-tailed test. e.g. Identifying Indians by height.
H0: μ = 5'6"   H1: μ ≠ 5'6"
1-Tailed and 2-Tailed Tests
The number of tails of a statistical test is
determined by the need for an action.
If action is to be taken when a parameter is
less than some value a, then the
alternative hypothesis is that the
parameter is less than a, and such a test is a
left-tailed test. e.g. effect of vitamin C in
reducing common cold and sneezing.
H0: μ = 15   H1: μ < 15
If action is to be taken when a parameter is
greater than some value a, then the
alternative hypothesis is that the parameter
is greater than a, and such a test is a
right-tailed test. e.g. effect of tuition in increasing
marks.
H0: μ = 45   H1: μ > 45
The rejection region of a statistical
hypothesis test is the range of numbers that
will lead us to reject the null hypothesis in
case the test statistic falls within this range.
The rejection region, also called the critical
region, is defined by the critical points.
The rejection region is defined so that,
before the sampling takes place, our test
statistic will have a probability α of falling
within the rejection region if the null
hypothesis is true.
Rejection Region
[Figure: Standard normal distribution, two-tailed test at 5% significance. Lower rejection region (area 0.025) below z = -1.96; nonrejection region (area 0.95) between -1.96 and 1.96; upper rejection region (area 0.025) above z = 1.96.]
Critical values of 'z' for:
Two-tailed test
90%: 1.645, 95%: 1.96, 99%: 2.58
One-tailed test
90%: 1.28, 95%: 1.645, 99%: 2.33
Some Critical Values for Tests
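The critical values listed above can be reproduced from the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_z` is only illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def critical_z(confidence, tails):
    """Positive critical z for a confidence level and a number of tails (1 or 2)."""
    alpha = 1 - confidence
    return std_normal.inv_cdf(1 - alpha / tails)


for conf in (0.90, 0.95, 0.99):
    two = critical_z(conf, 2)
    one = critical_z(conf, 1)
    print(f"{conf:.0%}: two-tailed {two:.3f}, one-tailed {one:.3f}")
```

For a left-tailed test the critical value is the negative of the one-tailed figure.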
Steps
Identify the Null Hypothesis.
Decide whether the test is two-tailed or single-tailed.
Form the Alternative Hypothesis.
Select the appropriate distribution.
Set the critical value for the significance level
specified and mark the rejection area.
Calculate the standard error of the statistic.
Convert the observed values to standardized
values.
Mark the position of the sample value on the
graph.
Compare the position vis-a-vis the critical value
and decide whether or not to reject H0.
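The steps above can be sketched as one small function for the simplest case, a test of a mean with a z statistic. The function name, its parameters and the sample figures in the usage line are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, reject) for H0: mu = mu0; tail is 'two', 'left' or 'right'."""
    std_error = sigma / n ** 0.5               # standard error of the mean
    z = (sample_mean - mu0) / std_error        # standardized sample value
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)      # critical value is negative
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, reject


# Hypothetical data: H0: mu = 100, sample mean 103, sigma 15, n = 100, 5% level.
z, reject = one_sample_z_test(103, 100, 15, 100, alpha=0.05, tail="two")
print(f"z = {z:.2f}, reject H0: {reject}")
```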
We will see different types of hypothesis tests, namely
One Sample Tests
Tests of population means - Z and t
Test of proportion - Z
Two Sample tests
Tests for Means
Tests for Proportion
Type of Hypothesis Tests
Cases in which the test statistic is Z
σ is known and the population is normal.
σ is known or unknown and the sample size is
large (n > 30). The population need not be normal
for a large sample.
One Sample Tests - Means
The formula for calculating Z is:
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
Cases in which the test statistic is t
σ is unknown, the sample is small, and the population
is normal.
One Sample Tests - Means
The formula for calculating t is:
$$ t = \frac{\bar{x} - \mu}{S / \sqrt{n}} $$
A group of 36 men are suspected to be
Indians. Their average height was found to
be 5' 8". The mean height of Indians is
known to be 5' 6" with a standard deviation of 3".
Test the hypothesis that they are Indians at the
99% confidence level (1% significance
level).
H0: μ = 5' 6", H1: μ ≠ 5' 6". Large sample.
Example - z test 1
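The slide leaves this example unsolved. A minimal sketch of the calculation follows, with heights converted to inches (5' 8" = 68, 5' 6" = 66) and a two-tailed test at the 1% level; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 68, 66, 3, 36   # heights in inches
alpha = 0.01

z = (x_bar - mu0) / (sigma / sqrt(n))             # standard error is 3/6 = 0.5
critical = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
reject_h0 = abs(z) > critical

print(f"z = {z:.2f}, critical = {critical:.2f}, reject H0: {reject_h0}")
```

Under these figures z works out to 4, well beyond the 2.58 critical value, so H0 would be rejected.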
The average mark for a subject in a college
is 55 with std deviation 10. A group of 36
students from this college, who were
undergoing special tuition, was found to
have an average mark of 60. Test whether
the tuition was effective at 1% significance
level.
(Hint - one-tailed.) Large sample. Use z test.
H0: μ = 55   H1: μ > 55
Example - 2
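This example is also left unsolved on the slide. The sketch below works it with a p-value rather than a critical value, which is an equivalent way of reaching the same right-tailed decision; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 60, 55, 10, 36
alpha = 0.01

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z)      # right-tailed: P(Z >= z) under H0
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```

Here z is 3, which also exceeds the one-tailed 1% critical value of 2.33, so the data support the tuition being effective.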
An automatic bottling machine fills cola into
two-liter (2000 cc) bottles. A consumer
advocate wants to test the null hypothesis that
the average amount filled by the machine into
a bottle is at least 2000 cc. A random sample of
40 bottles coming out of the machine was
selected and the exact content of each selected
bottle was recorded. The sample mean was
1999.6 cc. The population standard deviation
is known from past experience to be 1.30 cc.
Test the null hypothesis at the 5% significance
level.
Example - 3
H0: μ ≥ 2000
H1: μ < 2000
n = 40. Large sample. Use z test. One-tailed.
For α = 0.05, the critical value of z is -1.645.
The test statistic is:
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$
Do not reject H0 if: z ≥ -1.645
Reject H0 if: z < -1.645
n = 40, x̄ = 1999.6, σ = 1.3
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{1999.6 - 2000}{1.3 / \sqrt{40}} = -1.95 \;\Rightarrow\; \text{Reject } H_0 $$
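The bottling calculation can be checked with a few lines of Python. This is only a sketch of the left-tailed z test worked above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 1999.6, 2000, 1.3, 40
alpha = 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))
critical = NormalDist().inv_cdf(alpha)   # left-tailed, so the value is negative
reject_h0 = z < critical

print(f"z = {z:.2f}, critical = {critical:.3f}, reject H0: {reject_h0}")
```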
From a large population of unemployed youth,
a random sample of 25 was taken and their IQ
measured. It averaged 97 with a standard deviation of
12. Could it be inferred that their IQ is lower
than that of the average population (100)? Test at the 5%
significance level. Ans:
H0: μ = 100   H1: μ < 100.
n = 25 (small sample), σ is unknown. So use the t distribution.
S.E = 12/root(25) = 2.4. t = (x̄ - μ)/S.E = (97 - 100)/2.4 = -1.25.
Table value at d.f = 24 for 5% (one tail) = 1.711. Since -1.25 is
greater than -1.711, it lies within the acceptance region, so H0 is not rejected.
Example - 4 - t test
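A short check of the arithmetic in this t test. The Python standard library has no t distribution, so the sketch simply reuses the table value 1.711 quoted on the slide rather than computing it.

```python
from math import sqrt

x_bar, mu0, s, n = 97, 100, 12, 25
t_table = 1.711                  # table value from the slide (d.f. 24, 5%, one tail)

std_error = s / sqrt(n)
t = (x_bar - mu0) / std_error
reject_h0 = t < -t_table         # left-tailed test

print(f"S.E = {std_error:.1f}, t = {t:.2f}, reject H0: {reject_h0}")
```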
Testing Proportion - Large Sample
H0: p = p0
H1: p ≠ p0, or p < p0, or p > p0
For a large sample (if both np0
and nq0 are > 5, where q0 = 1 - p0) use the Z test
SE = σp = root (p0 q0 / n)
z = (p̂ - p0)/σp, where p̂ is the sample proportion
A manufacturer claims that at least 95% of the
machinery he supplied was conforming to the
specifications. An examination of a sample of
200 machines showed that 16 were faulty. Test
his claim at α of i. 5% ii. 1%.
H0: p ≥ 0.95; H1: p < 0.95; p̂ = 184/200 = 0.92; n = 200;
S.E = root of (pq/n) = root of (0.95 × 0.05/200) = 0.0154;
z = (0.92 - 0.95)/0.0154 = -1.948, and |z| = 1.948 > 1.645,
so H0 is rejected at the 5% significance level.
But 1.948 < 2.33,
so H0 is not rejected at the 1% significance level.
Example - 5
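The proportion test above can be sketched as follows. With the unrounded standard error, z comes out near -1.947 rather than the slide's -1.948, which does not change either decision; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, faulty, p0 = 200, 16, 0.95
p_hat = (n - faulty) / n                   # sample proportion conforming, 0.92

std_error = sqrt(p0 * (1 - p0) / n)        # uses the hypothesized proportion
z = (p_hat - p0) / std_error

# Left-tailed decision at each significance level.
decisions = {alpha: z < NormalDist().inv_cdf(alpha) for alpha in (0.05, 0.01)}

print(f"S.E = {std_error:.4f}, z = {z:.3f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: reject H0 = {reject}")
```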
Two Sample Tests
Means of Large Samples
S.E of difference of 2 sample means
$$ \mathrm{S.E} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} $$
Use Z test
H0: μ1 = μ2; H1: as the case may be
$$ Z = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{S.E}} $$
A reading test is given to a class that consists of Indian
children and Pakistani children. The results of the test
on sample of size 100 Indians and 120 Pakistanis gave
the following data.
Indian : Mean marks = 74
Standard deviation = 8
Pakistani : Mean Marks = 70
Standard deviation = 10
Are the Indian children superior to the Pakistani children?
Test at a significance level of α = 0.01. Ans:
S.E = 1.214, z = 3.295. One-tailed. 3.3 > 2.33. H0 rejected,
i.e. there is statistical evidence of superiority.
Example - 6
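The two figures quoted in the answer (S.E = 1.214, z = 3.295) can be reproduced with the large-sample formula for the difference of two means. A minimal sketch, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mean1, sd1, n1 = 74, 8, 100      # first sample
mean2, sd2, n2 = 70, 10, 120     # second sample
alpha = 0.01

std_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
z = (mean1 - mean2) / std_error
critical = NormalDist().inv_cdf(1 - alpha)     # right-tailed
reject_h0 = z > critical

print(f"S.E = {std_error:.3f}, z = {z:.3f}, reject H0: {reject_h0}")
```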
Use t test
Find the combined standard deviation of the
samples, 's':
$$ s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$
$$ \mathrm{S.E} = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{S.E}}, \qquad \mathrm{d.f} = n_1 + n_2 - 2 $$
Two Sample Tests
Means of Small Samples
A sample of 15 children from Mumbai showed that
the mean time that they spend watching TV is 28.50
hours per week with a S.D of 4 hours. Another
sample of 16 children from Calcutta showed that the
mean time spent by them watching TV is 23.25
hours per week with a S.D of 5 hours. Using a 2.5%
significance level can you conclude that the mean
time spent watching TV by children in Mumbai is
greater than that for children in Calcutta?
s² = (14 × 16 + 15 × 25)/(15 + 16 - 2) = 20.655; s = 4.545
S.E = 4.545 × root of [(1/15) + (1/16)] = 1.6335
t = (28.50 - 23.25)/1.6335 = 3.214 > table value at d.f 29
for one side 2.5%, i.e. 2.0452.
Reject H0. Answer is yes.
Example - 7. t test
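The pooled-variance arithmetic in this example can be verified as below. The table value 2.0452 is taken from the slide, since the Python standard library does not provide t quantiles; the variable names are illustrative.

```python
from math import sqrt

mean1, sd1, n1 = 28.50, 4, 15    # Mumbai sample
mean2, sd2, n2 = 23.25, 5, 16    # Calcutta sample
t_table = 2.0452                 # slide's table value (d.f. 29, 2.5%, one tail)

pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
std_error = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
t = (mean1 - mean2) / std_error
reject_h0 = t > t_table          # right-tailed test

print(f"s^2 = {pooled_var:.3f}, S.E = {std_error:.4f}, t = {t:.3f}, reject H0: {reject_h0}")
```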
Similar elements of population but under
different conditions.
e.g. before and after a treatment
Paired samples consisting of the same/similar
elements before and after.
H0: μ1 = μ2.
Use t test.
$$ s^2 = \frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2 $$
$$ \mathrm{S.E} = \frac{s}{\sqrt{n - 1}} $$
$$ t = \frac{\bar{d}}{\mathrm{S.E}}, \qquad \mathrm{d.f} = n - 1 $$
Two Sample Tests
Means of Dependent Samples
IQ test was administered to 5 persons before and
after they were trained. The results are as given
below. Is there statistical evidence of improvement?
Test at 5% significance.
Candidate    I     II    III   IV    V
IQ before    110   120   123   132   125
IQ after     120   118   125   136   121
Example - 8. Dependent samples
S.No.   IQ (before)   IQ (after)   d     d²
1       110           120          10    100
2       120           118          -2    4
3       123           125          2     4
4       132           136          4     16
5       125           121          -4    16
Σd = 10, Σd² = 140
d̄ = 10/5 = 2; s² = 140/5 - 2² = 24, s = 4.899;
S.E = 4.899/root(4) = 2.449.
t = 2/2.449 = 0.82 < table value for d.f
4, one-tailed, which is 2.132. So there is no
evidence of improvement.
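The paired-sample calculation can be reproduced from the raw scores using the formulas from the "Means of Dependent Samples" slide. The table value 2.132 is the one quoted on the slide; the variable names are illustrative.

```python
from math import sqrt

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]
t_table = 2.132                          # slide's value (d.f. 4, 5%, one tail)

d = [a - b for a, b in zip(after, before)]   # improvement per candidate
n = len(d)
d_bar = sum(d) / n

# Slide's formulas: s^2 = sum(d^2)/n - (sum(d)/n)^2 and S.E = s / sqrt(n - 1)
s = sqrt(sum(x * x for x in d) / n - d_bar ** 2)
std_error = s / sqrt(n - 1)
t = d_bar / std_error
reject_h0 = t > t_table

print(f"d = {d}, mean d = {d_bar}, t = {t:.2f}, reject H0: {reject_h0}")
```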
S.E of difference of 2 proportions
$$ \mathrm{S.E} = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} $$
$$ p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad q = 1 - p $$
Use Z test
$$ Z = \frac{p_1 - p_2}{\mathrm{S.E}} $$
Two Sample Tests
Proportions of Large Samples
When the role of Tulsi in a popular
serial was played by Smriti Irani, it was
observed that 400 out of 500 viewers
interviewed used to watch that serial.
Later when the same character was
played by Gautmi Kapoor, the
viewership fell to 400 out of 600 viewers
interviewed. Is the drop in viewership
significant ?
Example - 9
H0: p1 = p2   H1: p1 > p2
p1 = 400/500 = 0.8
p2 = 400/600 = 0.667
p = (400 + 400)/(500 + 600) = 8/11
q = 3/11.
S.E = root {pq[(1/n1) + (1/n2)]} = 0.027
z = (p1 - p2)/S.E = 4.926 > 2.33, for 1%
significance level by one-tailed test.
There is a significant drop in viewership.
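A sketch of the same two-proportion test in Python follows. Carrying full precision gives z of about 4.94 instead of the 4.926 obtained from the rounded figures 0.667 and 0.027; the conclusion is unchanged. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 400, 500     # viewers watching, first period
x2, n2 = 400, 600     # viewers watching, second period
alpha = 0.01

p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)               # pooled proportion, 8/11
q = 1 - p

std_error = sqrt(p * q * (1 / n1 + 1 / n2))
z = (p1 - p2) / std_error
reject_h0 = z > NormalDist().inv_cdf(1 - alpha)   # right-tailed

print(f"S.E = {std_error:.4f}, z = {z:.3f}, reject H0: {reject_h0}")
```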