
Chapter 11

Introduction to
Hypothesis Testing

11.1 Introduction
• The purpose of hypothesis testing is to determine
whether there is enough statistical evidence in favor of a
certain belief about a parameter.
• Examples
– Is there statistical evidence in a random sample of potential
customers to support the hypothesis that more than 10% of the
potential customers will purchase a new product?
– Is a new drug effective in curing a certain disease? A sample of
patients is randomly selected. Half of them are given the drug while
the other half are given a placebo. The improvement in the patients'
conditions is then measured and compared.
11.2 Concepts of Hypothesis Testing
• The critical concepts of hypothesis testing.
– Example:
• An operations manager needs to determine if the mean
demand during lead time is greater than 350.
• If so, changes in the ordering policy are needed.
– There are two hypotheses about a population mean:
• H0 (the null hypothesis): μ = 350
• H1 (the alternative hypothesis): μ > 350
(H1 is what you want to prove.)
• Assume the null hypothesis is true (μ = 350).
– Sample from the demand population, and build a statistic
related to the parameter hypothesized (the sample mean).
– Pose the question: How probable is it to obtain a sample
mean at least as extreme as the one observed from the
sample, if H0 is correct?
• Still assuming the null hypothesis is true (μ = 350), compare two possible sample means:
– x̄ = 450: Since x̄ is much larger than 350, the mean μ is likely
to be greater than 350. Reject the null hypothesis.
– x̄ = 355: In this case x̄ is close to 350, so there is little evidence that the mean μ is greater than
350. Do not reject the null hypothesis.
Types of Errors
• Two types of errors may occur when deciding whether to
reject H0 based on the statistic value.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
• Example continued
– Type I error: Reject H0 (μ = 350) in favor of H1 (μ > 350)
when the real value of μ is 350.
– Type II error: Believe that H0 is correct (μ = 350) when the
real value of μ is greater than 350.
Controlling the probability of
committing a Type I error
• Recall:
– H0: μ = 350 and H1: μ > 350.
– H0 is rejected if x̄ is sufficiently large.
• Thus, a Type I error is made if x̄ > critical value
when μ = 350.
• By properly selecting the critical value we can limit the
probability of committing a Type I error to an acceptable
level.
[Figure: sampling distribution of x̄ centered at μ = 350, with the critical value marked in the right tail.]
11.3 Testing the Population Mean When the
Population Standard Deviation is Known
• Example 11.1
– A new billing system for a department store will be cost-
effective only if the mean monthly account is more than
$170.
– A sample of 400 accounts has a mean of $178.
– If accounts are approximately normally distributed with
σ = $65, can we conclude that the new system will be
cost-effective?
Testing the Population Mean (σ is Known)

• Example 11.1 – Solution
– The population of interest is the credit accounts at
the store.
– We want to know whether the mean account for all
customers is greater than $170.
H1: μ > 170
– The null hypothesis must specify a single value of
the parameter μ.
H0: μ = 170
Approaches to Testing
• There are two approaches to testing whether the
sample mean supports the alternative
hypothesis (H1):
– The rejection region method, which is mandatory for manual
testing (but can also be used when testing is supported
by statistical software).
– The p-value method, which is mostly used when
statistical software is available.
The Rejection Region Method

The rejection region is a range of values such
that if the test statistic falls into that range, the
null hypothesis is rejected in favor of the
alternative hypothesis.
The Rejection Region Method
for a Right-Tail Test
Example 11.1 – solution continued

• Recall: H0: μ = 170 and H1: μ > 170. Therefore,
• It seems reasonable to reject the null hypothesis and
believe that μ > 170 if the sample mean is sufficiently large.
[Figure: sampling distribution of x̄ with the region "Reject H0 here" to the right of the critical value of the sample mean.]
• Define a critical value x̄_L for x̄ that is just large enough
to reject the null hypothesis.
• Reject the null hypothesis if

$$\bar{x} > \bar{x}_L$$
Determining the Critical Value for the
Rejection Region
• Let the probability of committing a Type I error
be α (also called the significance level).
• Find the value of the sample mean that is just
large enough so that the actual probability of
committing a Type I error does not exceed α.
Determining the Critical Value
for a Right-Tail Test
Example 11.1 – solution continued

P(commit a Type I error) = P(reject H0 given that H0 is true)
= P(x̄ > x̄_L given that H0 is true), which is allowed to be α.

Since P(Z > z_α) = α, we have:

$$z_\alpha = \frac{\bar{x}_L - 170}{65/\sqrt{400}}$$

[Figure: sampling distribution of x̄ centered at μ_x̄ = 170, with area α to the right of x̄_L.]
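Determining the Critical Value
for a Right-Tail Test
Example 11.1 – solution continued

$$z_\alpha = \frac{\bar{x}_L - 170}{65/\sqrt{400}} \quad\Rightarrow\quad \bar{x}_L = 170 + z_\alpha \frac{65}{\sqrt{400}}$$

If we select α = 0.05, z_.05 = 1.645, so

$$\bar{x}_L = 170 + 1.645 \cdot \frac{65}{\sqrt{400}} \approx 175.34$$

[Figure: sampling distribution of x̄ centered at μ_x̄ = 170, with area α = 0.05 to the right of x̄_L.]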
Determining the Critical Value
for a Right-Tail Test

Reject the null hypothesis if x̄ > 175.34.

Conclusion
Since the sample mean (178) is greater than
the critical value of 175.34, there is sufficient
evidence to infer that the mean monthly
balance is greater than $170 at the 5%
significance level.
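The critical value for Example 11.1 can be checked numerically. The following is a minimal Python sketch that assumes only the standard library's `statistics.NormalDist`; the function name `critical_value_right_tail` is illustrative and does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_right_tail(mu0, sigma, n, alpha):
    """Sample mean above which H0 is rejected in a right-tail z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # P(Z > z_alpha) = alpha
    return mu0 + z_alpha * sigma / sqrt(n)


# Example 11.1: H0: mu = 170, sigma = 65, n = 400, alpha = 0.05
x_L = critical_value_right_tail(170, 65, 400, 0.05)
reject_h0 = 178 > x_L  # the observed sample mean is 178
print(f"critical value = {x_L:.2f}, reject H0: {reject_h0}")
```

Because `inv_cdf` works with an unrounded z (about 1.6449 rather than 1.645), the computed critical value is about 175.35 rather than the 175.34 quoted in the example; the decision is the same.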
The standardized test statistic
– Instead of using the statistic x̄, we can use the
standardized value z.

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

– Then, the rejection region for a one-tail (right-tail) test becomes

$$z > z_\alpha$$
• Example 11.1 – continued
– We redo this example using the standardized test
statistic.
Recall: H0: μ = 170
H1: μ > 170
– Test statistic:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{178 - 170}{65/\sqrt{400}} = 2.46$$

– Rejection region: z > z_.05 = 1.645.
Reject the null hypothesis if Z > 1.645.

Conclusion
Since Z = 2.46 > 1.645, reject the null
hypothesis in favor of the alternative
hypothesis.
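The standardized version of the test can be sketched in Python as follows. This is an illustration under the numbers of Example 11.1, using only the standard library; the helper name `z_statistic` is an assumption made for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """Standardized test statistic for a mean when sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Example 11.1: x_bar = 178, H0: mu = 170, sigma = 65, n = 400
z = z_statistic(178, 170, 65, 400)
z_alpha = NormalDist().inv_cdf(1 - 0.05)  # right-tail critical value for alpha = 0.05
reject_h0 = z > z_alpha
print(f"z = {z:.2f}, z_alpha = {z_alpha:.3f}, reject H0: {reject_h0}")
```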
P-value Method
– The p-value provides information about the amount of
statistical evidence that supports the alternative
hypothesis.
– The p-value of a test is the probability of observing a
test statistic at least as extreme as the one computed,
given that the null hypothesis is true.
– Let us demonstrate the concept on Example 11.1.
The probability of observing a
test statistic at least as extreme as 178,
given that μ = 170, is

$$P(\bar{x} > 178 \mid \mu = 170) = P\left(z > \frac{178 - 170}{65/\sqrt{400}}\right) = P(z > 2.4615) = .0069$$

[Figure: sampling distribution of x̄ centered at μ_x̄ = 170; the p-value is the area to the right of x̄ = 178.]
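The p-value for Example 11.1 can be reproduced with a few lines of Python. This is a sketch that assumes the standard library's `statistics.NormalDist` for the standard normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 11.1: x_bar = 178, H0: mu = 170, sigma = 65, n = 400
z = (178 - 170) / (65 / sqrt(400))

# Right-tail test: the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```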
Interpreting the p-value
Because the probability that the sample mean will
assume a value of more than 178 when μ = 170 is
so small (.0069), there are reasons to believe that
μ > 170.
Note how the event x̄ > 178 is rare under H0,
when μ_x̄ = 170, but it becomes more
probable under H1, when μ_x̄ > 170.
[Figure: sampling distributions of x̄ under H0: μ_x̄ = 170 and under H1: μ_x̄ > 170, with x̄ = 178 marked.]
We can conclude that the smaller the p-value,
the more statistical evidence exists to support the
alternative hypothesis.
• Describing the p-value
– If the p-value is less than 1%, there is overwhelming
evidence that supports the alternative hypothesis.
– If the p-value is between 1% and 5%, there is strong
evidence that supports the alternative hypothesis.
– If the p-value is between 5% and 10%, there is weak
evidence that supports the alternative hypothesis.
– If the p-value exceeds 10%, there is no evidence that
supports the alternative hypothesis.
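The verbal scale above can be written as a small lookup. The sketch below is only an illustration: the function name `describe_p_value` is hypothetical, and the treatment of values exactly on a boundary is an assumption, since the scale does not specify it.

```python
def describe_p_value(p_value):
    """Map a p-value to the verbal evidence scale listed above.

    Assumption: a value exactly on a boundary (0.01, 0.05, 0.10) is
    placed in the weaker category; the scale itself does not say.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "overwhelming"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "weak"
    return "none"


# The p-value of Example 11.1 (.0069) falls in the first category.
print(describe_p_value(0.0069))
```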
The p-value and the Rejection Region
Methods
– The p-value can be used when making decisions
based on rejection region methods as follows:
• Define the hypotheses to test, and the required
significance level α.
• Perform the sampling procedure, calculate the test statistic
and the p-value associated with it.
• Compare the p-value to α. Reject the null hypothesis only
if p-value < α; otherwise, do not reject the null hypothesis.
[Figure: for Example 11.1, α = 0.05 is the area to the right of x̄_L = 175.34; the p-value is the smaller area to the right of x̄ = 178.]
Conclusions of a Test of Hypothesis

• If we reject the null hypothesis, we conclude that
there is enough evidence to infer that the alternative
hypothesis is true.
• If we do not reject the null hypothesis, we conclude
that there is not enough statistical evidence to infer
that the alternative hypothesis is true.

The alternative hypothesis is the more important
one. It represents what we are investigating.
A Left-Tail Test
• The SSA Envelope Example.
– The chief financial officer at FedEx believes that
including a stamped self-addressed (SSA) envelope in
the monthly invoice sent to customers will decrease
the amount of time it takes for customers to pay their
monthly bills.
– Currently, customers return their payments in 24
days on average, with a standard deviation of 6
days.
• The SSA envelope example – continued
– It was calculated that an improvement of two days on
average will cover the costs of the envelopes (checks can be
deposited earlier).
– A random sample of 220 customers was selected and SSA
envelopes were included with their invoice packs.
– The times at which customers' payments were received were
recorded (SSA.xls).
– Can the CFO conclude that the plan will be profitable at
the 10% significance level?
• The SSA envelope example – Solution
– The parameter tested is the population mean
payment period (μ).
– The hypotheses are:
H0: μ = 22
H1: μ < 22 (the CFO wants to know whether the
plan will be profitable)
• The SSA envelope example – Solution continued
– The rejection region:
It makes sense to believe that μ < 22 if the sample
mean is sufficiently smaller than 22.
– Reject the null hypothesis if

$$\bar{x} < \bar{x}_S$$

where x̄_S is a critical value just small enough to reject the null hypothesis.
• The SSA envelope example – Solution continued
– The standardized one-tail (left-tail) test statistic is:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{21.63 - 22}{6/\sqrt{220}} = -.91$$

– Define the rejection region:

$$z < -z_\alpha = -z_{.10} = -1.28$$

Since −.91 > −1.28, do not reject the null hypothesis.

The p-value = P(Z < −.91) = .1814.
Since .1814 > .10, do not reject the null hypothesis.
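The left-tail calculation for the SSA envelope example can be sketched in Python. The sample mean 21.63 is taken from the solution above; everything else is standard-library code, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# SSA envelope example: H0: mu = 22, H1: mu < 22
mu0, sigma, n, alpha = 22, 6, 220, 0.10
x_bar = 21.63

std_normal = NormalDist()
z = (x_bar - mu0) / (sigma / sqrt(n))
z_critical = std_normal.inv_cdf(alpha)  # left tail: equals -z_alpha
p_value = std_normal.cdf(z)             # area to the left of z
reject_h0 = z < z_critical

print(f"z = {z:.2f}, critical z = {z_critical:.2f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value computed from the unrounded z is about 0.180, slightly below the .1814 obtained from the rounded z = −.91; the conclusion does not change.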
A Two - Tail Test
• Example 11.2
– AT&T has been challenged by competitors who
argued that their rates resulted in lower bills.
– A statistics practitioner determines that the mean
and standard deviation of monthly long-distance bills
for all AT&T residential customers are $17.09 and
$3.87, respectively.
• Example 11.2 – continued
– A random sample of 100 customers is selected and the
customers’ bills are recalculated using a leading
competitor’s rates (see Xm11-02).
– Assuming the standard deviation is the same (3.87),
can we infer that there is a difference between
AT&T’s bills and the competitor’s bills (on
average)?
• Solution
– Is the mean different from 17.09?
H0: μ = 17.09
H1: μ ≠ 17.09
– Define the rejection region:

$$z < -z_{\alpha/2} \quad \text{or} \quad z > z_{\alpha/2}$$
Solution – continued

If H0 is true (μ = 17.09), x̄ can still fall far
above or far below 17.09, in which case
we erroneously reject H0 in favor of H1
(μ ≠ 17.09). We want this erroneous
rejection of H0 to be a rare event, say a 5%
chance.
[Figure: sampling distribution of x̄ centered at 17.09, with area α/2 = 0.025 in each tail.]
Solution – continued
From the sample we have x̄ = 17.55, so

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19$$

With α/2 = 0.025 in each tail, the rejection region is z < −z_{α/2} = −1.96 or z > z_{α/2} = 1.96.
Since −1.96 < 1.19 < 1.96, the test statistic does not fall in the rejection region.
There is insufficient evidence to infer that there is a
difference between the bills of AT&T and the competitor.

Also, by the p-value approach:
The p-value = P(Z < −1.19) + P(Z > 1.19)
= 2(.1173) = .2346 > .05
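The two-tail test of Example 11.2 can be sketched in Python in the same way. The sample mean 17.55 is taken from the solution above; the code uses only the standard library, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 11.2: H0: mu = 17.09, H1: mu != 17.09
mu0, sigma, n, alpha = 17.09, 3.87, 100, 0.05
x_bar = 17.55

std_normal = NormalDist()
z = (x_bar - mu0) / (sigma / sqrt(n))
z_critical = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
p_value = 2 * (1 - std_normal.cdf(abs(z)))      # both tails
reject_h0 = abs(z) > z_critical

print(f"z = {z:.2f}, critical z = ±{z_critical:.2f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Doubling the one-tail area is valid here because the standard normal distribution is symmetric about zero.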
20.025 20.025

-1.19 0 1.19
x 17.55  17.09 -z= -1.96 z= 1.96
z   1.19
 n 3.87 100 38
11.4 Calculating the Probability of a
Type II Error
• To properly interpret the results of a test of
hypothesis, we need to
– specify an appropriate significance level or judge the
p-value of a test;
– understand the relationship between Type I and
Type II errors;
– know how to compute the probability of a Type II error.
Calculation of the Probability
of a Type II Error
• To calculate the probability of a Type II error we need to
– express the rejection region directly, in terms of the
parameter hypothesized (not standardized);
– specify the alternative value under H1.
• Let us revisit Example 11.1.
– Express the rejection region directly, not in standardized terms:
the rejection region was x̄ > 175.34 with α = .05.
– Specify the alternative value under H1: let the alternative value be μ = 180 (rather than just
μ > 170).
H0: μ = 170
H1: μ = 180
[Figure: sampling distributions of x̄ centered at μ = 170 and μ = 180; H0 is not rejected to the left of x̄_L = 175.34, and α = .05 is the area to the right of it under H0.]
– A Type II error occurs when a false H0 is not
rejected.
[Figure: H0: μ = 170 is false and H1: μ = 180 holds; H0 is not rejected when x̄ < 175.34, which is the area to the left of x̄_L = 175.34 under the distribution centered at 180.]
$$
\begin{aligned}
\beta &= P(\bar{x} < 175.34 \text{ given that } H_0 \text{ is false}) \\
      &= P(\bar{x} < 175.34 \text{ given that } \mu = 180) \\
      &= P\left(z < \frac{175.34 - 180}{65/\sqrt{400}}\right) = .0764
\end{aligned}
$$
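The Type II error probability for Example 11.1 can be checked with a short Python sketch. It assumes the alternative value μ = 180 used above and the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 11.1: H0: mu = 170, alternative value mu = 180
mu0, mu1, sigma, n, alpha = 170, 180, 65, 400, 0.05

std_error = sigma / sqrt(n)
x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error  # critical value under H0

# beta = P(x_bar < x_L) when the true mean is mu1
beta = NormalDist(mu1, std_error).cdf(x_L)
print(f"x_L = {x_L:.2f}, beta = {beta:.4f}")
```

With an unrounded critical value the result is about 0.076, which agrees with the .0764 above up to rounding.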
Effects on β of changing α

• Decreasing the significance level α increases
the value of β, and vice versa.
[Figure: sampling distributions centered at μ = 170 and μ = 180; moving the critical value to the right gives α2 < α1 and β2 > β1.]
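The trade-off between α and β can be illustrated numerically. The sketch below reuses the setting of Example 11.1 with the alternative value μ = 180; the helper name `beta_right_tail` and the particular α values are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def beta_right_tail(mu0, mu1, sigma, n, alpha):
    """P(Type II error) for a right-tail z-test when the true mean is mu1."""
    std_error = sigma / sqrt(n)
    x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return NormalDist(mu1, std_error).cdf(x_L)


# Example 11.1 with three significance levels (chosen for illustration)
betas = {a: beta_right_tail(170, 180, 65, 400, a) for a in (0.10, 0.05, 0.01)}
for a, b in betas.items():
    print(f"alpha = {a:.2f} -> beta = {b:.4f}")
```

As α is lowered from 0.10 to 0.01, the critical value moves to the right and β grows.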
Judging the Test

• A hypothesis test is effectively defined by the
significance level α and by the sample size n.
• If the probability of a Type II error is judged to
be too large, we can reduce it by
– increasing α, and/or
– increasing the sample size.
• Increasing the sample size reduces β.

Recall:

$$z_\alpha = \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}, \quad \text{thus} \quad \bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}}$$

By increasing the sample size, the
standard deviation of the sampling
distribution of the mean decreases.
Thus, x̄_L decreases.
Note what happens when n increases:
α does not change,
but β becomes smaller.
[Figure: as n increases, x̄_L moves left toward μ = 170 and away from the alternative value 180.]
• In Example 11.1, suppose n increases from 400
to 1000.

$$\bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}} = 170 + 1.645 \cdot \frac{65}{\sqrt{1000}} = 173.38$$

$$\beta = P\left(Z < \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z < -3.22) \approx 0$$

• α remains 5%, but the probability of a Type II
error drops dramatically.
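The effect of the sample size on β can be reproduced with a short Python sketch. It keeps α = 0.05 and the alternative value μ = 180 from Example 11.1; the function name `beta_for_sample_size` and its default arguments are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def beta_for_sample_size(n, mu0=170, mu1=180, sigma=65, alpha=0.05):
    """P(Type II error) of the right-tail z-test for a given sample size."""
    std_error = sigma / sqrt(n)
    x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return NormalDist(mu1, std_error).cdf(x_L)


beta_400 = beta_for_sample_size(400)
beta_1000 = beta_for_sample_size(1000)
print(f"n = 400:  beta = {beta_400:.4f}")
print(f"n = 1000: beta = {beta_1000:.4f}")
```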
Judging the Test
• Power of a test
– The power of a test is defined as 1 − β.
– It represents the probability of rejecting the null
hypothesis when it is false.
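As a worked illustration, the power of the test in Example 11.1 can be computed from the Type II error probability found earlier (β = .0764 for the alternative value μ = 180, with n = 400 and α = .05):

$$\text{Power} = 1 - \beta = 1 - .0764 = .9236$$

Under these assumptions, the test would reject a false H0 about 92% of the time when the true mean is 180.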
