
STATISTICS FOR SCIENCE AND ENGINEERING CS/STA408/2020

3. Hypothesis Testing
Outline

3.0 Hypothesis Testing


3.1 Hypothesis Test for Population Mean
3.2 Hypothesis Test for Population Variance

Objectives

At the end of this chapter, students should be able to:

1. Formulate problems as statistical hypotheses and test them.
2. Perform hypothesis tests for population means and variances.
3. Understand Type I and Type II errors in hypothesis testing.

3.0 Hypothesis Testing

Inferential statistics is divided into two parts: the estimation of population parameters
(introduced in Chapter 2) and hypothesis testing. A hypothesis test is the process of testing
the validity of a claim made about a population parameter. It is used to decide which of two
contradictory claims about the parameter is supported by the sample data. There are five
basic steps in hypothesis testing:

Step 1: Hypothesis Statement


Step 2: Test Statistics
Step 3: State the Rejection Area/Determine the critical value
Step 4: Decision Making
Step 5: Conclusion

Step 1: Hypothesis Statement

• The null hypothesis (H0)
  - H0 specifies the value of the population parameter to be tested in a hypothesis test.
  - "Null" means "no difference".
  - It is a statistical hypothesis stating that there is no difference between a parameter
    and a specific value, or no difference between two parameters.
  - It always includes the equal sign (=).
  - The decision (reject or do not reject) is always made about the null hypothesis.

• The alternative hypothesis (H1)
  - It is a statistical hypothesis stating that there is a specific difference between a
    parameter and a specific value, or a specific difference between two parameters.
  - Its inequality sign is greater than (>), less than (<) or not equal to (≠).
  - It is the statement that is true if the null hypothesis is false.

Example 1
State the hypotheses H0 and H1 for each of the following situations:

SITUATION A
A researcher thinks that if expectant mothers use vitamin pills, the birth weight of the
babies will increase. The average of the birth weights of the population is 8.6 pounds.

Hypothesis Statement: H0 : __________ H1:__________

SITUATION B
A SCUBA instructor wants to record the collective depths each of his students dives
during their checkout. He is interested in how the depths vary, even though everyone should
have been at the same depth. He believes the standard deviation is less than three feet.

Hypothesis Statement: H0 : __________ H1:__________

SITUATION C
A medical researcher is interested in finding out whether a new medication will have
any undesirable side effects. He is particularly concerned with the pulse rate of the patients
who take the medication. He knows that the mean pulse rate for the population under study is
82 beats per minute

Hypothesis Statement: H0 : __________ H1:__________

Step 2: Determine the Test Statistic / Compute the Test Value

• The test value is the numerical value obtained from the statistical test.
• Refer to the formulas in Table 3.2 to calculate the test statistic of the hypothesis test.

Step 3: State the Rejection Area / Determine the Critical Value

• Critical value
  - It separates the rejection region from the non-rejection region.
  - To obtain the critical value, the significance level α must be chosen first.
  - It is obtained from statistical tables (for example: Z_0.01, t_0.05,df, χ²_0.05,df).

• Rejection region
  - The range of test values indicating that there is a significant difference.
  - The null hypothesis H0 should be rejected (H1 is supported).

• Non-rejection region
  - The range of test values indicating that there is no significant difference.
  - The null hypothesis H0 should not be rejected.

There are three types of test in hypothesis testing, depending on the direction of the
inequality sign in the alternative hypothesis. Refer to Figure 3.1 below:

One-tailed right test
• H0 should be rejected when the test value is in the rejection region on the right side
  of the mean. Consider Situation A.
• Based on H1: μ > μ0

One-tailed left test
• H0 should be rejected when the test value is in the rejection region on the left side
  of the mean. Consider Situation B.
• Based on H1: μ < μ0

Two-tailed test
• α is divided into two equal parts.
• H0 should be rejected when the test value is in either of the two rejection regions.
  Consider Situation C.
• Based on H1: μ ≠ μ0

Figure 3.1: Three types of rejection area in hypothesis testing
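The three rejection areas can be illustrated numerically for the z distribution. The sketch below is only an illustration, not part of the course tables: it assumes Python's standard-library `statistics.NormalDist`, and the function name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the z critical value(s) for a given significance level and tail type.

    tail is 'right', 'left' or 'two' (hypothetical labels used in this sketch).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        # All of alpha lies in the right tail.
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # All of alpha lies in the left tail, so the critical value is negative.
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        # Alpha is split into two equal parts, one in each tail.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left' or 'two'")


print(round(z_critical(0.01, "right"), 4))  # right-tailed test at alpha = 0.01
print(round(z_critical(0.05, "left"), 4))   # left-tailed test at alpha = 0.05
print(z_critical(0.05, "two"))              # two-tailed test at alpha = 0.05
```

Critical values for the t, chi-square and F distributions are not available in the standard library, so in this chapter they are read from the statistical tables.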

Example 2
Write down the null and alternative hypotheses for each statement. Determine whether it is a
one-tailed test or two-tailed test

a) An engineer hypothesizes that the mean number of defects can be decreased in a
manufacturing process of compact discs by using robots instead of humans for
certain tasks. The mean number of defective discs per 1000 is 18.

Hypothesis Statement: H0 : __________ H1:__________ Test : _____________

b) Suppose a math instructor believes that the standard deviation for his final exam is
five points. One of his best students thinks otherwise. The student claims that the
standard deviation is more than five points.

Hypothesis Statement: H0 : __________ H1:__________ Test : ______________

c) A psychologist feels that playing soft music during a test will change the results of the
test. The psychologist is not sure whether the grades will be higher or lower. In the
past, the mean of the score was 73.

Hypothesis Statement: H0 : __________ H1:__________ Test : ____________



Step 4: Decision Making

• Decide whether or not to reject the null hypothesis by comparing the test statistic
  with the critical value.
• Reject H0 when the value of the test statistic lies in the critical (rejection) region.
• Do not reject H0 when the value of the test statistic does not lie in the critical region.

Step 5: Conclusion

• Summarize the result based on the decision made in Step 4 (reject H0 or do not reject H0).
• Figure 3.2 shows the four possible outcomes and the summary statement for each decision.

Claim is at H0:
  Reject H0:         There is enough evidence to reject the claim.
  Do not reject H0:  There is not enough evidence to reject the claim.

Claim is at H1:
  Reject H0:         There is enough evidence to support the claim.
  Do not reject H0:  There is not enough evidence to support the claim.

Figure 3.2: The Four Possible Outcomes and the Summary Statement

In a hypothesis test, there are four possible outcomes. Refer to Table 3.1.

                     H0 true             H0 false
Reject H0            Type I error        Correct decision
Do not reject H0     Correct decision    Type II error
Table 3.1: Possible Outcomes in Hypothesis Testing

• Type I error: the error of rejecting H0 when H0 is true. The maximum probability of
  committing such an error is denoted by α.
• Type II error: the error of not rejecting H0 when H0 is false. The probability of
  committing such an error is denoted by β.
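The meaning of α as the probability of a Type I error can be checked with a small simulation. The sketch below is illustrative only: the population values (μ0 = 100, σ = 15, n = 25) are hypothetical, and the function name `type_i_error_rate` is not from the text. It repeatedly draws samples from a population where H0 is true and counts how often a right-tailed z-test rejects H0 anyway.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often a right-tailed z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)                  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose true mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_cal = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z_cal > z_crit:
            rejections += 1                    # H0 rejected although it is true
    return rejections / trials


# Hypothetical population values chosen only for illustration.
rate = type_i_error_rate(mu0=100, sigma=15, n=25, alpha=0.05, trials=4000)
print(rate)  # should be close to alpha = 0.05
```

The observed rejection rate fluctuates around α from run to run; it is not expected to equal α exactly.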

The following figure 3.3 shows an overview of the hypothesis testing for mean and variance.
It describes the entire topic to be discussed for each section.

Hypothesis Testing
  Mean
    One Population Mean
      σ Known
      σ Unknown
    Two Population Means
      Independent Samples
        σ1, σ2 Known
        σ1, σ2 Unknown
          σ1 = σ2 (Equal)
          σ1 ≠ σ2 (Unequal)
      Dependent Samples (Paired Samples)
  Variance
    One Population Variance
    Two Population Variances

Figure 3.3: An Overview of the Hypothesis Testing



3.1 Hypothesis Test for Population Mean

Table 3.2 shows the test statistic of the hypothesis test for one population mean, difference
between two population means and mean difference of paired samples.

Null hypothesis and test statistic:

1. One population mean of a normal distribution, variance σ² is known
   H0: μ = μ0
   $$ z_{cal} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

2. One population mean of a normal distribution, variance σ² is unknown
   H0: μ = μ0
   $$ t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \quad df = n - 1 $$

3. Difference between means of two normal distributions, variances σ1² and σ2² are known
   H0: μ1 − μ2 = 0
   $$ z_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} $$

4. Difference between means of two normal distributions, variances σ1² = σ2² and unknown
   H0: μ1 − μ2 = 0
   $$ t_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad df = n_1 + n_2 - 2 $$
   $$ s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} $$

5. Difference between means of two normal distributions, variances σ1² ≠ σ2² and unknown
   H0: μ1 − μ2 = 0
   $$ t_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$
   $$ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} $$

6. Mean difference for paired samples from normal distributions
   H0: μd = 0
   $$ t_{cal} = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}, \quad df = n - 1 \text{ where } n \text{ is the number of pairs} $$
Table 3.2: Test Statistic of the Hypothesis Test

3.1.1 One Population Mean

In this section, we consider methods of testing claims made about a population mean μ.
When testing a claim about the value of a population mean, the test statistic will depend on
whether the population standard deviation is known or unknown.

The z-test is used when σ² is known and the t-test is used when σ² is unknown. The general
method described in this section is also used in the sections that follow.

Is σ (or σ²) known?

  Yes: use $z_{cal} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$, where σ = population standard deviation

  No: use $t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, where s = sample standard deviation

Figure 3.4: Methods for Inferences About One Mean

The symbol μ0 is the value of μ that is assumed for purposes of the hypothesis test. This
topic will discuss in detail the procedure of the statistical hypothesis testing for one
population mean and also for two population means.

The following table shows the rejection area of the hypothesis test for one population mean
based on different alternative hypothesis (H1).

Hypothesis and rejection area:

Two-tailed test (H0: μ = μ0, H1: μ ≠ μ0)
  σ² is known (z-test):   Reject H0 if $Z_{cal} > Z_{\alpha/2}$ or $Z_{cal} < -Z_{\alpha/2}$
  σ² is unknown (t-test): Reject H0 if $t_{cal} > t_{\alpha/2,\,df}$ or $t_{cal} < -t_{\alpha/2,\,df}$

Right-tailed test (H0: μ = μ0, H1: μ > μ0)
  σ² is known (z-test):   Reject H0 if $Z_{cal} > Z_{\alpha}$
  σ² is unknown (t-test): Reject H0 if $t_{cal} > t_{\alpha,\,df}$

Left-tailed test (H0: μ = μ0, H1: μ < μ0)
  σ² is known (z-test):   Reject H0 if $Z_{cal} < -Z_{\alpha}$
  σ² is unknown (t-test): Reject H0 if $t_{cal} < -t_{\alpha,\,df}$


Table 3.3: Rejection Area of the Hypothesis Test for One Population Mean

Note: this guide to the rejection area can also be used for the hypothesis test for the
difference between two population means (independent samples) and for the mean difference
of paired samples (dependent samples).
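The decision rules for z- and t-tests of a mean can be written as one small function. This is a sketch under the assumption that the critical value is passed in as the positive table value (for example Z_α or t_α/2,df); the function name `reject_h0` and the tail labels are hypothetical.

```python
def reject_h0(test_value, critical_value, tail):
    """Apply the rejection rule for a z- or t-test of a mean.

    critical_value is the positive value read from the table
    (use the alpha/2 value for a two-tailed test).
    """
    if tail == "two":
        # Reject in either tail: test value beyond +critical or below -critical.
        return abs(test_value) > critical_value
    if tail == "right":
        return test_value > critical_value
    if tail == "left":
        return test_value < -critical_value
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Illustrative calls using test values and table values of the kind used in this chapter.
print(reject_h0(1.18, 2.3263, "right"))   # False: do not reject H0
print(reject_h0(-10.68, 2.479, "left"))   # True: reject H0
print(reject_h0(2.746, 3.250, "two"))     # False: do not reject H0
```

Passing the positive table value keeps the function aligned with how the critical values are printed in the statistical tables.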

One Population Mean - Variance σ² is Known (use z distribution)

Example 1
A random sample of 50 statistics professors has a mean IQ score of 120. Assuming that σ
is known to be 12, use α = 0.01 to test the claim that the mean IQ score of statistics
professors is greater than 118.

Step 1: Hypothesis

Ho : μ = 118
H1 : μ > 118 (claim)

Step 2: Test Statistic (Refer to Table 3.2)

σ² is known, so use the z distribution:

$$ Z_{cal} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{120 - 118}{12 / \sqrt{50}} = 1.18 $$

Step 3: Rejection area

Find the critical value from Table 4: $Z_{\alpha} = Z_{0.01} = 2.3263$

Reject H0 if $Z_{cal} > Z_{\alpha} = 2.3263$ (right-tailed test)

Step 4: Decision

Since $Z_{cal} = 1.18 < 2.3263$, do not reject H0 ($Z_{cal}$ falls in the noncritical region).

Step 5: Conclusion

There is not enough evidence to support the claim that the mean IQ score of statistics
professors is greater than 118.
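The arithmetic of this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_z` is hypothetical, and the critical value is computed with `statistics.NormalDist` instead of being read from Table 4.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(x_bar, mu0, sigma, n):
    """z test statistic for one population mean when sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Values from the example: n = 50, sample mean 120, sigma = 12, H0: mu = 118.
z_cal = one_sample_z(x_bar=120, mu0=118, sigma=12, n=50)
z_crit = NormalDist().inv_cdf(1 - 0.01)   # right-tailed test, alpha = 0.01
reject = z_cal > z_crit

print(round(z_cal, 2), round(z_crit, 4), reject)
```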

One Population Mean - Variance σ² is Unknown (use t distribution)

Example 2
The Jaya Tobacco Company advertised that its bestselling non-filtered cigarettes contain 40
mg of nicotine. Consumer Advocate magazine ran tests of 10 randomly selected cigarettes
and found the amounts (in mg) shown in the accompanying list. Using a 0.01 significance
level, test the editor's belief that the mean is equal to 40 mg of nicotine.

47.3 39.3 40.3 38.3 46.3 43.3 42.3 49.3 40.3 46.3

Step 1: Hypothesis

H0 : μ = 40 (claim)
H1 : μ ≠ 40

Step 2: Test Statistic (Refer to Table 3.2)

σ² is unknown, so use the t distribution. Use the sample data to find
$\bar{x} = 43.3$, $s = 3.8006$ and $n = 10$:

$$ t_{cal} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{43.3 - 40}{3.8006 / \sqrt{10}} = 2.746 $$

Step 3: Rejection area

Find the critical value from Table 7: $t_{\alpha/2,\,n-1} = t_{0.005,\,9} = 3.250$

Reject H0 if $t_{cal} > t_{\alpha/2,\,n-1} = 3.250$ or $t_{cal} < -t_{\alpha/2,\,n-1} = -3.250$ (two-tailed test)

Step 4: Decision

Since $-3.250 < t_{cal} = 2.746 < 3.250$, do not reject H0 ($t_{cal}$ falls in the noncritical region).

Step 5: Conclusion

There is not enough evidence to reject the claim that the mean nicotine content is 40 mg.
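The sample mean, sample standard deviation and t statistic of this example can be recomputed from the raw data. The sketch below uses only the Python standard library; the variable names are illustrative, and the critical value 3.250 is taken from the t table as quoted in the example, not computed.

```python
from math import sqrt
from statistics import mean, stdev

# Nicotine amounts (mg) of the 10 sampled cigarettes, from the example.
nicotine = [47.3, 39.3, 40.3, 38.3, 46.3, 43.3, 42.3, 49.3, 40.3, 46.3]
mu0 = 40                      # hypothesized mean under H0

n = len(nicotine)
x_bar = mean(nicotine)        # sample mean
s = stdev(nicotine)           # sample standard deviation (divides by n - 1)
t_cal = (x_bar - mu0) / (s / sqrt(n))

t_crit = 3.250                # t_{0.005, 9} from the t table (two-tailed, alpha = 0.01)
reject = abs(t_cal) > t_crit

print(round(x_bar, 1), round(s, 4), round(t_cal, 3), reject)
```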

3.1.2 Hypothesis Test for Two Population Means (Independent Samples)

In this section, we consider methods for using sample data from two independent samples to
test hypotheses made about the difference between two population means.

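Are σ1 and σ2 known?

  Yes: use the z distribution:
  $$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} $$

  No: can it be assumed that σ1 = σ2?

    Yes: use the t distribution with the POOLED standard error:
    $$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

    No: use the t distribution:
    $$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$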

Figure 3.5: Methods of Inferences about Two Independent Means

_________________________________________________________________________

Two-tailed test:       H0: μ1 − μ2 = 0     H1: μ1 − μ2 ≠ 0
Right-tailed test:     H0: μ1 − μ2 = 0     H1: μ1 − μ2 > 0
Left-tailed test:      H0: μ1 − μ2 = 0     H1: μ1 − μ2 < 0

_________________________________________________________________________
Figure 3.6: Hypothesis Test of Two Population Means (Independent Variables)

Two Population Means - σ1², σ2² Known (use z distribution)

Example 3
A researcher claims that students in a public primary school have exam marks that are
higher than those of students in private primary schools. Random samples of 50 students
from each type of school are selected and given a test. The results are shown. At α = 0.05,
test the claim.

Public Primary School     Private Primary School
x̄1 = 88                   x̄2 = 82
σ1 = 13                   σ2 = 13
n1 = 50                   n2 = 50

Step 1: Hypothesis

H0 : μ1 − μ2 = 0 or H0 : μ1 = μ2
H1 : μ1 − μ2 > 0 or H1 : μ1 > μ2 (claim)

Step 2: Test Statistic (Refer to Table 3.2)

σ1² and σ2² are known, so use the z distribution:

$$ Z_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(88 - 82) - 0}{\sqrt{\frac{13^2}{50} + \frac{13^2}{50}}} = 2.308 $$

Step 3: Rejection area

Find the critical value from Table 4: $Z_{\alpha} = Z_{0.05} = 1.6449$

Reject H0 if $Z_{cal} > Z_{\alpha} = 1.6449$ (right-tailed test)

Step 4: Decision

Since $Z_{cal} = 2.308 > 1.6449$, reject H0 ($Z_{cal}$ falls in the critical region).

Step 5: Conclusion

There is enough evidence to support the claim that students in a public primary school have
exam marks that are higher than those of students in private primary school.
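The two-sample z statistic in this example can be checked as follows. This is a sketch using the Python standard library; the function name `two_sample_z` and its parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference of two means when both sigmas are known."""
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((x1 - x2) - hypothesized_diff) / standard_error


# Values from the example: public vs. private primary schools.
z_cal = two_sample_z(x1=88, x2=82, sigma1=13, sigma2=13, n1=50, n2=50)
z_crit = NormalDist().inv_cdf(1 - 0.05)   # right-tailed test, alpha = 0.05
reject = z_cal > z_crit

print(round(z_cal, 3), round(z_crit, 4), reject)
```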

Two Population Means - σ1², σ2² Unknown, Equal Variances σ1² = σ2² (use t distribution)

Example 4
A researcher claims that there is a difference in average television watching times between
teens (ages 12-18) and adults (ages 19 – 30). Based on the sample statistics obtained
below, test the claim. Use α = 0.05. Assume that the data are approximately normally
distributed with same variances.

Teens (ages 12 – 18)      Adults (ages 19 – 30)
x̄1 = 21                   x̄2 = 18
s1 = 3.5                  s2 = 4.1
n1 = 15                   n2 = 15

Solution:
Step 1: Hypothesis

H0 : μ1 − μ2 = 0 or H0 : μ1 = μ2
H1 : μ1 − μ2 ≠ 0 or H1 : μ1 ≠ μ2 (claim)

Step 2: Test Statistic (Refer to Table 3.2)

Pooled standard deviation:

$$ s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{14(3.5)^2 + 14(4.1)^2}{15 + 15 - 2}} = 3.81 $$

$$ t_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(21 - 18) - 0}{3.81 \sqrt{\frac{1}{15} + \frac{1}{15}}} = 2.16 $$

Step 3: Rejection area

Find the critical value from Table 7: $t_{\alpha/2,\,n_1+n_2-2} = t_{0.025,\,28} = 2.048$

Reject H0 if $t_{cal} > t_{\alpha/2,\,n_1+n_2-2} = 2.048$ or $t_{cal} < -t_{\alpha/2,\,n_1+n_2-2} = -2.048$ (two-tailed test)

Step 4: Decision

Since $t_{cal} = 2.16 > 2.048$, reject H0 ($t_{cal}$ falls in the critical region).

Step 5: Conclusion

There is enough evidence to support the claim that there is a difference in average
television watching times between teens (ages 12-18) and adults (ages 19 – 30).
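The pooled standard deviation and the t statistic of this example can be recomputed from the summary statistics. The sketch below assumes equal population variances, as the example does; the function name `pooled_t` is hypothetical, and the critical value 2.048 is taken from the t table (two-tailed, α = 0.05, df = 28).

```python
from math import sqrt


def pooled_t(x1, s1, n1, x2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic for two independent samples.

    Returns (pooled standard deviation, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    t_cal = ((x1 - x2) - hypothesized_diff) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t_cal, df


# Summary statistics from the example: teens vs. adults.
sp, t_cal, df = pooled_t(x1=21, s1=3.5, n1=15, x2=18, s2=4.1, n2=15)

t_crit = 2.048                 # t_{0.025, 28} from the t table
reject = abs(t_cal) > t_crit   # two-tailed rejection rule

print(round(sp, 2), round(t_cal, 2), df, reject)
```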

Two Population Means - σ1², σ2² Unknown, Unequal Variances σ1² ≠ σ2² (use t distribution)

Example 5
Refer to the sample data listed below and use a 0.01 significance level to test the claim that
the mean amount of tar in filtered king-size cigarettes is less than the mean amount of tar in
non-filtered king-size cigarettes. All measurements are in milligrams. Assume that both
populations are normally distributed but the variances are not equal.

Filtered:      16 15 16 14 16 1 16 18 10 14 12
               11 14 13 13 13 16 16 8 16 11

Non-filtered:  23 23 24 26 25 26 21 24

(Source: Elementary Statistics, Mario F. Triola, Pearsons 2004: pg 464)

Solution:
Step 1: Hypothesis

Ho : μ1 − μ2 = 0 or Ho : μ1 = μ2

H1 : μ1 − μ2 < 0 or H1 : μ1 < μ2 (claim)

Step 2: Test statistic (Refer to Table 3.2)

Use the sample data to find:
$\bar{x}_1 = 13.38$, $s_1 = 3.735$, $n_1 = 21$
$\bar{x}_2 = 24$, $s_2 = 1.610$, $n_2 = 8$

$$ t_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(13.38 - 24) - 0}{\sqrt{\frac{3.735^2}{21} + \frac{1.610^2}{8}}} = -10.68 $$

$$ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} = \frac{\left( \frac{3.735^2}{21} + \frac{1.610^2}{8} \right)^2}{\frac{(3.735^2 / 21)^2}{20} + \frac{(1.610^2 / 8)^2}{7}} \approx 26.35 \approx 26 $$

Round the df down to the next smaller integer.

Step 3: Rejection area

Find the critical value from Table 7: $t_{\alpha,\,df} = t_{0.01,\,26} = 2.479$

Reject H0 if $t_{cal} < -t_{\alpha,\,df} = -2.479$ (left-tailed test)

Step 4: Decision

Since $t_{cal} = -10.68 < -2.479$, reject H0 ($t_{cal}$ falls in the critical region).

Step 5: Conclusion

There is enough evidence to support the claim that the mean amount of tar in filtered
king-size cigarettes is less than the mean amount of tar in non-filtered king-size cigarettes.
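The unequal-variance t statistic and its degrees of freedom can be computed from summary statistics as sketched below. The inputs are the summary values as printed in the example (recomputing them from the listed raw data may give slightly different figures); the function name `welch_t` is hypothetical, and the critical value 2.479 is taken from the t table.

```python
from math import floor, sqrt


def welch_t(x1, s1, n1, x2, s2, n2, hypothesized_diff=0.0):
    """t statistic and degrees of freedom for two samples with unequal variances."""
    v1 = s1 ** 2 / n1          # s1^2 / n1
    v2 = s2 ** 2 / n2          # s2^2 / n2
    t_cal = ((x1 - x2) - hypothesized_diff) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_cal, df


# Summary statistics as printed in the example (filtered vs. non-filtered).
t_cal, df = welch_t(x1=13.38, s1=3.735, n1=21, x2=24, s2=1.610, n2=8)
df_used = floor(df)            # round df down to the next smaller integer

t_crit = 2.479                 # t_{0.01, 26} from the t table
reject = t_cal < -t_crit       # left-tailed rejection rule

print(round(t_cal, 2), df_used, reject)
```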

3.1.3 Two Population Means (Dependent Samples) – Paired Samples

In this section, we focus on dependent samples, which we refer to as paired samples.

The variable under consideration in this case is d = (X1 – X2), where X1 and X2 are the before
and after measurements, respectively. With paired samples, there is some relationship so
that each value in one sample is paired with a corresponding value in the other sample.

_________________________________________________________________________

Two-tailed test:       H0: μd = 0     H1: μd ≠ 0
Right-tailed test:     H0: μd = 0     H1: μd > 0
Left-tailed test:      H0: μd = 0     H1: μd < 0

_________________________________________________________________________
Figure 3.7: Hypothesis Test of Two Population Means (Paired Samples)

Example 6
A study was conducted on the effects of a special class designed to aid students with verbal
skills. Each student was given a verbal skill test twice, both before and after completing a 4
week class. The results are given in the table below.

Score before 5 3 2 4
Score after 8 10 5 7
(Source: Final Exam Jun 2014 Q2b)

a) Can you conclude that the effect of a special class is positive at α = 0.05?
b) In the context of the problem, state the Type I error that may be made.

Solution:

a) Step 1: Hypothesis (Let d = before – after)

H0 : μd = 0
H1 : μd < 0 (claim)

Step 2: Test statistic (Refer to Table 3.2)

Score before      5    3    2    4
Score after       8   10    5    7
Difference, d    -3   -7   -3   -3

n = 4, mean of the differences $\bar{d} = -4$, standard deviation of the differences $s_d = 2$

$$ t_{cal} = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-4 - 0}{2 / \sqrt{4}} = -4 $$

Step 3: Rejection area

Find the critical value from Table 7: $t_{\alpha,\,df} = t_{0.05,\,3} = 2.353$

Reject H0 if $t_{cal} < -t_{\alpha,\,df} = -2.353$ (left-tailed test)

Step 4: Decision

Since $t_{cal} = -4 < -2.353$, reject H0 ($t_{cal}$ falls in the critical region).

Step 5: Conclusion

There is enough evidence to conclude that the effect of a special class is positive.

b) Type I Error: Error of rejecting Ho when Ho is true.

In the context of the problem, type I error is the error of stating that the effect of a special
class is positive when there is no effect of having a special class.
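The paired-sample computation in part (a) can be reproduced from the raw scores. This is a sketch using the Python standard library, with d defined as before minus after as in the solution; the variable names are illustrative and the critical value 2.353 is taken from the t table.

```python
from math import sqrt
from statistics import mean, stdev

# Verbal skill scores from the example.
before = [5, 3, 2, 4]
after = [8, 10, 5, 7]

# d = before - after, as defined in the solution.
d = [b - a for b, a in zip(before, after)]

n = len(d)                     # number of pairs
d_bar = mean(d)                # mean of the differences
s_d = stdev(d)                 # sample standard deviation of the differences
t_cal = (d_bar - 0) / (s_d / sqrt(n))

t_crit = 2.353                 # t_{0.05, 3} from the t table
reject = t_cal < -t_crit       # left-tailed rejection rule

print(d, d_bar, s_d, t_cal, reject)
```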

3.2 Hypothesis Test for Population Variance

Hypothesis tests about the variability of a normal population are divided into:

1) One population variance, σ² (or standard deviation, σ)
2) Ratio between two population variances, σ1²/σ2² (or standard deviations)

3.2.1 One Population Variance

Suppose that we wish to test a hypothesis about a population variance σ² or standard
deviation σ. The test statistic then follows a chi-square (χ²) distribution with n − 1 degrees
of freedom (df). Table 3.4 below presents the hypothesis statements, test statistic and
rejection areas:

One-sided right test
  Hypothesis statement: H0: σ² = σ0², H1: σ² > σ0² (or H0: σ = σ0, H1: σ > σ0)
  Test statistic: $\chi_0^2 = \frac{(n - 1) s^2}{\sigma_0^2}$
  Rejection area: Reject H0 if $\chi_0^2 > \chi^2_{\alpha,\,n-1}$

One-sided left test
  Hypothesis statement: H0: σ² = σ0², H1: σ² < σ0² (or H0: σ = σ0, H1: σ < σ0)
  Test statistic: $\chi_0^2 = \frac{(n - 1) s^2}{\sigma_0^2}$
  Rejection area: Reject H0 if $\chi_0^2 < \chi^2_{1-\alpha,\,n-1}$

Two-sided test
  Hypothesis statement: H0: σ² = σ0², H1: σ² ≠ σ0² (or H0: σ = σ0, H1: σ ≠ σ0)
  Test statistic: $\chi_0^2 = \frac{(n - 1) s^2}{\sigma_0^2}$
  Rejection area: Reject H0 if $\chi_0^2 > \chi^2_{\alpha/2,\,n-1}$ or if $\chi_0^2 < \chi^2_{1-(\alpha/2),\,n-1}$
Table3.4: The Hypothesis Statement, Test Statistics and the Rejection Area For One Population
Variance

Example 7
The weight of an object is measured using an electronic scale that reports the true weight
plus a random fluctuation that is normally distributed. The manufacturing company of the
electronic scale claims that the variance of the fluctuation is 4 mg². Assume that the
fluctuations are independent. The manufacturing company measured the weight of an object
8 times, and the following values were obtained:

100.8 98.2 102.5 105.1 102.2 99.4 93.6 98.5

Test at α = 0.05 that the variance of fluctuation is greater than what the company claims.
(Source: Final Exam Jun 2014)

Solution

Given $\sigma_0^2 = 4$, n = 8.

Calculate the sample variance, s², using a calculator or the formula:

$$ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{80144.55 - \frac{(800.3)^2}{8}}{7} = 12.08 $$

Step 1: Hypothesis Statement

H0 : σ² = 4
H1 : σ² > 4 (claim)

This is a one-tailed right test with α = 0.05.

Step 2: Test Statistic / Compute the Test Value

$$ \chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} = \frac{7(12.08)}{4} = 21.14 $$

Step 3: State the Rejection Area

Find the critical value from statistical Table 8 with α = 0.05 and df = n – 1 = 7:
$\chi^2_{0.05,\,7} = 14.067$

Reject H0 if $\chi^2 > \chi^2_{0.05,\,7} = 14.067$.

Step 4: Decision Making

Since $\chi^2 = 21.14 > 14.067$, reject H0. The value of the test statistic lies in the
rejection area.

Step 5: Conclusion

At α = 0.05, there is enough evidence to support that the variance of fluctuation is greater
than what the company claims
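The sample variance and chi-square statistic of this example can be recomputed from the eight measurements. The sketch below uses the Python standard library; the variable names are illustrative, and the critical value is the upper 5% point of the chi-square distribution with 7 degrees of freedom as listed in standard chi-square tables. Because the code does not round s² before use, its test value differs slightly in the second decimal place from a hand calculation with s² = 12.08.

```python
from statistics import variance

# The eight weight measurements from the example.
weights = [100.8, 98.2, 102.5, 105.1, 102.2, 99.4, 93.6, 98.5]
sigma0_sq = 4                          # variance claimed by the company (H0)

n = len(weights)
s_sq = variance(weights)               # sample variance (divides by n - 1)
chi2_cal = (n - 1) * s_sq / sigma0_sq  # chi-square test statistic

chi2_crit = 14.067                     # upper 5% point of chi-square with 7 df (table value)
reject = chi2_cal > chi2_crit          # right-tailed rejection rule

print(round(s_sq, 2), round(chi2_cal, 2), reject)
```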

Example 8
The standard deviation of scores on a statistics test for all students in semester four was 12
in 2015. A sample of scores for 25 students who took this test gave a variance of 165. Test,
at the 5% significance level, whether the standard deviation of the scores of all semester
four students on this test is different from 12. Assume that the scores on this test are
normally distributed.

Solution

Given: population standard deviation σ0 = 12, sample variance s² = 165, n = 25, α = 0.05

Step 1: Hypothesis Statement

H0 : σ = 12
H1 : σ ≠ 12 (claim)

This is a two-tailed test with α/2 = 0.025.

Step 2: Test Statistic / Compute the Test Value

$$ \chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} = \frac{24(165)}{144} = 27.5 $$

Step 3: State the Rejection Area

Find the critical values from statistical Table 8 with α/2 = 0.025 and df = n – 1 = 24:
left side: $\chi^2_{1-0.025,\,24} = \chi^2_{0.975,\,24} = 12.401$
right side: $\chi^2_{0.025,\,24} = 39.364$

Reject H0 if $\chi^2 < \chi^2_{0.975,\,24} = 12.401$ or $\chi^2 > \chi^2_{0.025,\,24} = 39.364$.

Step 4: Decision Making

Since $12.401 < \chi^2 = 27.5 < 39.364$, do not reject H0. The value of the test statistic
lies in the non-rejection area.

Step 5: Conclusion

At α = 0.05, there is not enough evidence to support that the standard deviation of all
semester four on this test is different from 12.
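The two-tailed chi-square decision in this example can be sketched as follows. The function name `chi_square_statistic` is hypothetical, and the two critical values are the table values quoted in the example, not computed.

```python
def chi_square_statistic(n, sample_variance, sigma0):
    """Chi-square test statistic for one population variance: (n - 1) s^2 / sigma0^2."""
    return (n - 1) * sample_variance / sigma0 ** 2


# Values from the example: n = 25, s^2 = 165, H0: sigma = 12.
chi2_cal = chi_square_statistic(n=25, sample_variance=165, sigma0=12)

# Table values for alpha/2 = 0.025 and df = 24, as quoted in the example.
lower_crit = 12.401
upper_crit = 39.364

# Two-tailed rule: reject if the statistic falls in either tail.
reject = chi2_cal < lower_crit or chi2_cal > upper_crit

print(chi2_cal, reject)
```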

3.2.2 Ratio between Two Population Variances

1
2
1   2 , which is equivalent to  1.
2 2
Two equal variances would satisfy the equation
 22
Since sample variances are related to chi-square distributions, and the ratio of chi-square
distributions is an F-distribution, we can use the F-distribution to test against a null
hypothesis of equal variances or standard deviation. The following table 3.5 shows the step
to test the ratio between two population variances.

One-sided right test
  Hypothesis statement: H0: σ1² = σ2², H1: σ1² > σ2² (or H0: σ1 = σ2, H1: σ1 > σ2)
  Test statistic: $F_0 = \frac{S_1^2}{S_2^2}$
  Rejection area: Reject H0 if $F_0 > F_{\alpha,\,n_1-1,\,n_2-1}$

One-sided left test
  Hypothesis statement: H0: σ1² = σ2², H1: σ1² < σ2² (or H0: σ1 = σ2, H1: σ1 < σ2)
  Test statistic: $F_0 = \frac{S_1^2}{S_2^2}$
  Rejection area: Reject H0 if $F_0 < F_{1-\alpha,\,n_1-1,\,n_2-1} = \frac{1}{F_{\alpha,\,n_2-1,\,n_1-1}}$

Two-sided test
  Hypothesis statement: H0: σ1² = σ2², H1: σ1² ≠ σ2² (or H0: σ1 = σ2, H1: σ1 ≠ σ2)
  Test statistic: $F_0 = \frac{S_1^2}{S_2^2}$
  Rejection area: Reject H0 if $F_0 > F_{\alpha/2,\,n_1-1,\,n_2-1}$ or if $F_0 < F_{1-\alpha/2,\,n_1-1,\,n_2-1}$

Table 3.5: The Hypothesis Statement, Test Statistics and the Rejection Area for Two Population
Variances

Example 9
It is found that the variances of the waiting time to see a doctor in the emergency room for
hospitals A and B are 750 minutes² and 1020 minutes² respectively. If a sample of 24 patients
was used in hospital A and 27 in hospital B, is there enough evidence to conclude, at the
1% significance level, that the variance of the waiting time in hospital A is less than
that in hospital B?
(Source: Final Exam Dec 201 Q4)
Solution

Given: $n_A = 24$, $S_A^2 = 750$, $n_B = 27$, $S_B^2 = 1020$

Step 1: Hypothesis Statement

H0 : σA²/σB² = 1 or σA² = σB²
H1 : σA² < σB² (claim)

This is a one-tailed left test with α = 0.01.

Step 2: Test Statistic / Compute the Test Value

$$ F_0 = \frac{S_A^2}{S_B^2} = \frac{750}{1020} = 0.7353 $$

Step 3: State the Rejection Area

Find the critical value from statistical Table 9 with α = 0.01 and df: nA – 1 = 23, nB – 1 = 26:

$$ F_{1-0.01,\,23,\,26} = \frac{1}{F_{0.01,\,26,\,23}} = \frac{1}{2.75} = 0.3636 $$

Reject H0 if $F_0 < F_{1-0.01,\,23,\,26} = 0.3636$.

Step 4: Decision Making

Since 0.7353 > 0.3636, do not reject H0. The value of the test statistic lies in the
non-rejection region.

Step 5: Conclusion

At α = 0.01, there is not enough evidence to conclude that the variance of the waiting time in
hospital A is less than that in hospital B.
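The left-tailed F-test of this example can be sketched in a few lines. The upper-tail table value 2.75 is the one quoted in the example for F with (26, 23) degrees of freedom at α = 0.01; the left-tail critical value is obtained from it by the reciprocal rule with the degrees of freedom swapped. Variable names are illustrative.

```python
# Sample variances from the example.
var_a = 750     # hospital A, n_A = 24
var_b = 1020    # hospital B, n_B = 27

f0 = var_a / var_b                 # F test statistic S_A^2 / S_B^2

# Upper-tail table value F_{0.01, 26, 23} as quoted in the example.
f_upper_swapped = 2.75
# Left-tail critical value: F_{1-0.01, 23, 26} = 1 / F_{0.01, 26, 23}.
f_crit_left = 1 / f_upper_swapped

reject = f0 < f_crit_left          # left-tailed rejection rule

print(round(f0, 4), round(f_crit_left, 4), reject)
```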

Example 10
The French listening test scores for students from a class are summarized in the following
table. Assume that the scores are normally distributed for both populations.

Sample size Sample variance


Male 20 25
Female 17 19

Test at 10% level of significance that the two populations have equal variances.

Solution

Step 1: Hypothesis Statement

H0 : σM²/σF² = 1 or σM² = σF² (claim)
H1 : σM² ≠ σF²

This is a two-tailed test with α/2 = 0.05.

Step 2: Test Statistic / Compute the Test Value

$$ F_0 = \frac{S_M^2}{S_F^2} = \frac{25}{19} = 1.3158 $$

Step 3: State the Rejection Area

Find the critical values from statistical Table 9 with α/2 = 0.05 and df: nM – 1 = 19, nF – 1 = 16:

right side: $F_{0.05,\,19,\,16} = 2.42$

left side: $F_{1-0.05,\,19,\,16} = \frac{1}{F_{0.05,\,16,\,19}} = \frac{1}{2.34} = 0.4274$

Reject H0 if $F_0 < 0.4274$ or $F_0 > 2.42$.

Step 4: Decision Making

Since 0.4274 < 1.3158 < 2.42, do not reject H0. The value of the test statistic lies in the
non-rejection area.

Step 5: Conclusion

At α = 0.1, there is not enough evidence to reject the claim that the two populations have
equal variances
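The two-tailed F-test decision can be wrapped in a small function. This is a sketch: the function name `f_two_tailed_reject` is hypothetical, and both table values (2.42 and 2.34) are the ones quoted in the example, not computed.

```python
def f_two_tailed_reject(var1, var2, f_upper, f_upper_swapped):
    """Two-tailed F-test for equal variances.

    f_upper         : table value F_{alpha/2, n1-1, n2-1}
    f_upper_swapped : table value F_{alpha/2, n2-1, n1-1}, used for the left tail
    Returns (F statistic, left critical value, right critical value, reject?).
    """
    f0 = var1 / var2
    f_left = 1 / f_upper_swapped   # reciprocal rule for the lower critical value
    reject = f0 < f_left or f0 > f_upper
    return f0, f_left, f_upper, reject


# Sample variances from the example: male 25 (n = 20), female 19 (n = 17).
f0, f_left, f_right, reject = f_two_tailed_reject(
    var1=25, var2=19, f_upper=2.42, f_upper_swapped=2.34
)

print(round(f0, 4), round(f_left, 4), f_right, reject)
```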

Exercises

Exercise 3.1 - Hypothesis Test for One Population Mean

1. A researcher wishes to claim that the average cost of tuition and fees at a four-year
public college is greater than RM5700. She selects a random sample of 36 four-year
public colleges and finds the mean to be RM5950. The population standard deviation
is RM659. Is there evidence to support the claim at α = 0.05?

[Answer: zcal = 2.28, reject Ho]

2. A doctor claims that the mean age at which babies start walking is 11 months. Azmi
wanted to check whether this claim is true. He took a random sample of 25 children and
found that the mean age at which these children started walking was 11.5 months. The
population standard deviation was 0.6 months. Using α = 0.05, test the claim that the
mean age at which all children start walking is 11 months.

[Answer: zcal = 4.17, reject Ho]

3. The following figures show the amount of coffee (in ounces) filled by a machine in six
randomly selected jars.

15.9 15.7 16.3 15.9 15.7 16.2

Assume a normal distribution for the amount of coffee in a jar with population standard
deviation 0.229 ounces. At the 5% level of significance, test the claim that the mean
amount of coffee filled in a jar is less than 16 ounces.
(Source: Dec 2016)
[Answer: zcal = -0.535, do not reject Ho]

4. A job placement director claims that the average starting salary for a CEO is RM24000. A
sample of 10 CEOs has a mean of RM23450 and a standard deviation of RM400. Is
there enough evidence to support the claim at α = 0.05?

[Answer: tcal = -4.35, reject Ho]

5. An educator claims that the average salary of substitute teachers in schools is less
than RM60 per day. A random sample of eight schools is selected, and the daily
salaries (in RM) are shown. Is there enough evidence to support the educator's
claim at α = 0.10?

60 56 60 55 70 55 60 55

[Answer: tcal = -0.624, do not reject Ho]

6. The principal of an elementary school thinks that the average IQ of students at his
school is more than 108. To prove his point, he administers an IQ test to 20 randomly
selected students. Among the sampled students, the average IQ is 115 with a
standard deviation of 10. Test the claim at α = 0.10.

[Answer: tcal = 3.13, reject Ho]

Exercise 3.2 - Hypothesis Test for Two Population Means

1. A survey found that the average hotel room rate in State I is RM88.22 and the average
room rate in State II is RM80.61. Assume that the data were obtained from two samples of
50 hotels each. The population standard deviations were RM5.62 and RM4.83,
respectively. At α = 0.05, can it be concluded that there is a significant difference in
the rates?

[Answer: zcal = 7.45, reject Ho]

2. These data were obtained in a study comparing persons with disabilities with persons
without disabilities. A scale known as the Barriers to Health Promotion Activities for
Disabled Persons (BHADP) Scale gave the data. At α = 0.01, test the claim that
persons with disabilities score higher than persons without disabilities.

Disabled Non-Disabled
Sample size 132 137
Sample mean 31.83 25.07
Population standard deviation 7.93 4.80

[Answer: zcal = 8.42, reject Ho]

3. A sample of 15 students from Maju College showed that the mean time they spend
revising the subject of construction is 28.50 hours per week with a standard deviation of 4
hours. Another sample of 17 students from Jaya College showed that the mean time
they spend revising the subject is 24.50 hours per week with a standard
deviation of 5 hours. Using a 2.5% significance level, can you conclude that the mean time
spent in revision by students from Maju College is greater than that of students from Jaya
College? Assume that the times spent in revision of the subject are normally
distributed for each of the two colleges and the standard deviations for the two
populations are equal.

[Answer: tcal = 2.476 , reject Ho]

4. The Khaf Company has developed a new battery. The engineer in charge claims that the
new battery will operate continuously longer than the old battery. Test the engineer’s
claim that the new batteries run longer than the old batteries. Assume that the operating
times of the new and old batteries are normally distributed and that the standard deviations of
the two populations are equal. (Use α = 0.05.)

New Battery    Old Battery
n1 = 34        n2 = 28
X̄1 = 200       X̄2 = 190
s1 = 20        s2 = 40

[Answer: tcal = 1.278 , do not reject Ho]

5. The average size of a farm in Pekan I is 185 acres. The average size of a farm in Pekan
II is 100 acres. Assume that the data were obtained from two samples with standard
deviations of 38 acres and 12 acres, and sample sizes of 8 and 9 respectively. Can it be
concluded at α = 0.05 that the average sizes of the farms in Pekan I and Pekan II are
different? (Assume equal variances)

[Answer: tcal = 6.385, reject Ho]

6. Within a school district, students were randomly assigned to one of two Mathematics
teachers. They administered the same test. Mr. Safwan had 30 students and Mr.
Syazwan had 25 students. Mr. Safwan’s students had an average test mark of 78 with a
standard deviation of 10 and Mr. Syazwan’s students had an average test mark of 85
with a standard deviation of 15. At α = 0.10, test the hypothesis that Mr. Safwan and Mr.
Syazwan are equally effective teachers. (Assume unequal variances)

[Answer: tcal = -1.993 , reject Ho]
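
When the variances are not assumed equal, the test statistic uses the separate standard errors. The sketch below also computes the Welch-Satterthwaite degrees of freedom, which is one common rule; the course may use a different rule for the degrees of freedom, so treat that part as an assumption. The critical value is copied from a standard t table (40 degrees of freedom, 0.05 in each tail).

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic and Welch-Satterthwaite df for H0: mu1 = mu2
    without assuming equal population variances."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Summary data from question 6 (Mr. Safwan vs Mr. Syazwan).
t_cal, df = welch_t(78, 10, 30, 85, 15, 25)

T_CRIT = 1.684  # assumed table value: t(0.05, 40)
reject_h0 = abs(t_cal) > T_CRIT  # two-tailed test at alpha = 0.10

print(f"t_cal = {t_cal:.3f}, df = {df:.1f}, reject H0: {reject_h0}")
```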

7. A researcher wishes to determine whether the salaries of professional nurses employed
by private hospitals are higher than those of nurses employed by government hospitals. At
α = 0.01, can she conclude that private hospitals pay more than government hospitals?
(Assume unequal variances)

Private            Government
X̄1 = RM 26800      X̄2 = RM 25400
s1 = RM 600        s2 = RM 450
n1 = 10            n2 = 8

[Answer: tcal = 5.654 , reject Ho]


8. A statistician is interested in estimating the difference between the mean household
incomes for two neighborhoods. Independent random samples of households in the
neighborhoods provide the following results:

Neighborhood 1     Neighborhood 2
n1 = 8             n2 = 12
X̄1 = RM 15700      X̄2 = RM 14500
s1 = RM 700        s2 = RM 850

At α = 0.05, can we conclude that the mean household incomes for Neighborhood 1
and Neighborhood 2 are different? (Assume unequal variances)

[Answer: tcal = 3.44, reject Ho]

Exercise 3.3 - Hypothesis Test for Paired Samples

1. In an effort to improve the performance of students in a Statistics course, a lecturer
provides a weekly one-hour tutorial for 10 selected students. Each student was given a
test twice, before and after completing a 14-week session. The test score results are
given below.

Student   1   2   3   4   5   6   7   8   9   10
Before    45  55  40  70  63  35  61  30  50  44
After     60  70  50  75  70  65  55  66  70  63

Test at 5% significance level whether attending the tutorial helped to improve the
students’ performance.
(Source: Dec 2016)
[Answer: tcal = -3.92, reject Ho]
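
The paired-sample statistic for this question can be reproduced as below. This sketch takes the differences as Before minus After, which is the sign convention that matches the stated answer; the function name `paired_t` is illustrative, and the critical value is copied from a standard t table (9 degrees of freedom, 0.05 in one tail).

```python
import math
from statistics import mean, stdev

before = [45, 55, 40, 70, 63, 35, 61, 30, 50, 44]
after = [60, 70, 50, 75, 70, 65, 55, 66, 70, 63]

def paired_t(x, y):
    """t statistic for H0: mean difference = 0, with d = x - y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    # stdev() is the sample standard deviation (divisor n - 1).
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1

t_cal, df = paired_t(before, after)

T_CRIT = 1.833  # assumed table value: t(0.05, 9)
# Left-tailed test: improvement means Before - After is negative.
reject_h0 = t_cal < -T_CRIT

print(f"t_cal = {t_cal:.2f}, df = {df}, reject H0: {reject_h0}")
```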

2. A computer scientist is investigating the usefulness of two different design languages in
improving programming tasks. Ten expert programmers who are familiar with both
languages are asked to code a standard function in both languages and the time (in
minutes) is recorded.

Programmer          1   2   3   4   5   6   7   8   9   10
Design Language 1   17  16  21  14  18  24  16  14  21  23
Design Language 2   18  14  19  11  23  21  10  13  19  24

Test at 5% significance level whether the mean coding time for Design Language 1 is
longer than Design Language 2.
(Source: Jun 2018)
[Answer: tcal = 1.28, do not reject Ho]

3. A sample of ten 13-year-old children was provided with a breakfast of low glycemic
index (GI) foods on the first day and high GI foods on the second day. The two
breakfasts contained the same quantities of carbohydrate, fat and protein. On each day
a buffet lunch was provided, and the number of calories eaten at lunchtime was
recorded. The objective is to determine whether the kind of breakfast eaten has an
effect on the mean calorie intake. A hypothesis test is needed to determine whether
these results show that there would be differences in the mean calorie intake for other
children who ate low and high GI breakfasts.

Student                                            1    2    3    4    5    6    7    8    9    10
Lunchtime calorie intake after low GI breakfast    300  315  330  400  290  310  315  340  350  300
Lunchtime calorie intake after high GI breakfast   350  370  450  490  500  330  400  470  340  410

Test at 5% level of significance, whether there is any difference in the mean calorie
intake during lunchtime among the ten children.
(Source: Dec 2018)
[Answer: tcal = -4.37, reject Ho]

4. A workshop was conducted to improve students’ knowledge of basic statistics. Data
were collected from 10 students, before and after the workshop.

Student     1   2   3   4   5   6   7   8   9   10
Pre-test    14  9   13  15  10  12  11  13  14  12
Post-test   14  8   13  16  11  13  12  13  15  13

Test the hypothesis that the workshop is effective in improving students’ knowledge of
basic statistics at α = 0.05.

[Answer: tcal = 2.236, reject Ho]

Exercise 3.4 - Hypothesis Test for One Population Variance

1. The shell thicknesses (in millimetres) recorded for 10 randomly selected bird eggs
are as follows: (Source: Dec 2016, Q4b)

0.15 0.29 0.32 0.39 0.25 0.18 0.26 0.37 0.35 0.20

Test at the 5% significance level whether the true standard deviation of shell thickness of the eggs
is less than 1 mm. Assume that the shell thickness is normally distributed.

[Answer: χ² = 0.061, reject Ho]

2. The television habits of 30 children were observed. The sample mean was found to be
48.2 hours per week, with a standard deviation of 12.4 hours per week. Test at the 10%
significance level the claim that the population standard deviation is more than 16 hours per week.
[Answer: χ² = 17.418, do not reject Ho]
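
The chi-square statistic for a single variance is (n − 1)s²/σ0². A minimal sketch for this question follows; the function name `chi_square_variance` is illustrative, and the critical value is copied from a standard chi-square table (29 degrees of freedom, 0.10 in the upper tail).

```python
def chi_square_variance(s, n, sigma0):
    """Chi-square statistic for H0: sigma = sigma0, given the
    sample standard deviation s from a sample of size n."""
    return (n - 1) * s**2 / sigma0**2

# Summary data from question 2 (television habits of 30 children).
chi2_cal = chi_square_variance(s=12.4, n=30, sigma0=16)

CHI2_CRIT = 39.087  # assumed table value: chi-square(0.10, 29)
reject_h0 = chi2_cal > CHI2_CRIT  # right-tailed test (H1: sigma > 16)

print(f"chi2_cal = {chi2_cal:.3f}, reject H0: {reject_h0}")
```

The sample standard deviation (12.4) is below the hypothesised value (16), so a right-tailed test cannot reject Ho here, which agrees with the stated decision.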

3. A post office finds that the population variance for normally distributed waiting times for
customers on Monday is less than 27.04 minutes. The post office experiments with a
single, main waiting line and finds that for a random sample of 25 customers, the waiting
times for customers have a variance of 12.25 minutes. At the 5% level of significance, test
whether a single line causes smaller variation among waiting times for customers.
[Answer: χ² = 10.873, reject Ho]

4. The population variance of scores on a statistics test for all semester five students was
152. A sample of scores for 20 students who took this test gave a variance of 170.
Assuming that the scores are normally distributed, test at the 5% significance level whether the
variance of scores for all semester five students on this test is different from 152.
[Answer: χ² = 21.25, do not reject Ho]

5. A company produces metal pipes of a standard length, and claims that the standard
deviation of the length is more than 1.2 cm. One of its clients decides to test this claim
by taking a sample of 20 pipes and checking their lengths. They found that the standard
deviation of the sample is 1.5 cm. Test at the 10% significance level whether the company's claim is
true.

[Answer: χ² = 29.688, reject Ho]

6. The manufacturer has designed the helmets so that the population mean force
transmitted by the helmets to the workers is 800 pounds with a standard deviation of 40
pounds. Tests were run on a random sample of n = 35 helmets. The sample mean and
sample standard deviation were found to be 942 pounds and 45.5 pounds, respectively.
Do the data provide sufficient evidence that the population standard deviation is not
equal to 40 pounds? Use α = 0.01.
[Answer: χ² = 43.993, do not reject Ho]

7. A company produces electric devices operated by a thermostatic control. The population
standard deviation of the temperature at which these controls actually operate is 2.0
degrees Fahrenheit. The standard deviation for a sample of 18 of these controls was found
to be 2.36 degrees Fahrenheit. Test at the 1% level whether the standard deviation of the
operating temperature is less than 2.0 degrees Fahrenheit.

[Answer: χ² = 23.67, do not reject Ho]

Exercise 3.5 - Hypothesis Test for Two Population Variances

1. The amounts of time required by Dr R and Dr Y to do a routine insurance checkup have
standard deviations of 4.2 minutes and 3.0 minutes respectively, based on samples of 25
patients examined by Dr R and 21 patients examined by Dr Y. Test at the α = 0.05 level of
significance whether the amounts of time are more variable for Dr R.

[Answer: F0 = 1.96, do not reject Ho]
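
The F statistic is the ratio of the two sample variances, with the variance expected to be larger (under H1) in the numerator. A minimal sketch for this question follows; the function name `f_ratio` is illustrative, and the critical value is copied from a standard F table (24 and 20 degrees of freedom, 0.05 in the upper tail).

```python
def f_ratio(s1, s2):
    """F statistic for H0: sigma1^2 = sigma2^2 from two sample
    standard deviations, with sample 1 in the numerator."""
    return s1**2 / s2**2

# Summary data from question 1 (Dr R: n = 25, Dr Y: n = 21).
f_cal = f_ratio(4.2, 3.0)
df1, df2 = 25 - 1, 21 - 1

F_CRIT = 2.08  # assumed table value: F(0.05; 24, 20)
reject_h0 = f_cal > F_CRIT  # right-tailed test (H1: sigma_R^2 > sigma_Y^2)

print(f"F0 = {f_cal:.2f}, df = ({df1}, {df2}), reject H0: {reject_h0}")
```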


2. A psychologist surveyed a random sample of 34 male college students and 29
female college students about the fastest speed they had driven. Here is a descriptive
summary of the results of her survey:

Male            Female
n = 34          n = 29
Mean = 105.5    Mean = 90.9
S = 20.1        S = 12.2

Is there sufficient evidence at the α = 0.05 level to conclude that the variance of the
fastest speed driven by male college students differs from the variance of the fastest
speed driven by female college students?

[Answer: F0 = 2.7144, reject Ho]

3. A pharmaceutical manufacturer purchases a particular material from two different
suppliers, Y and Z. The manufacturer selects 9 shipments from each of the two suppliers and
measures the percentage of impurities in the raw material for each shipment. The
variance for supplier Y is 0.273, while that for supplier Z is 0.094. Do the data provide
sufficient evidence to indicate that there is a difference in the variability of the shipment
impurity levels for the two suppliers? Test using α = 0.1.

[Answer: F0 = 2.904, do not reject Ho]

4. The breaking strengths of 12 bundles of wool fibres have a sample standard deviation of 10.
In addition, the breaking strengths of another 13 bundles of synthetic fibres have a
sample standard deviation of 5.06. Assume the breaking strengths of the two populations are
normally distributed. Test at the 5% level of significance whether the two populations have
unequal variances.

[Answer: F0 = 3.906, reject Ho]
