Hypothesis Testing
7-2

Using Statistics
• A hypothesis is a statement or assertion about the state of nature (about
the true value of an unknown population parameter):

✓The accused is innocent
✓μ = 100
• Every hypothesis implies its contradiction or alternative:
✓The accused is guilty
✓μ ≠ 100
• A hypothesis is either true or false, and you may fail to reject it or you
may reject it on the basis of information:

✓Trial testimony and evidence


✓Sample data
7-3

Statistical Hypothesis Testing


• A null hypothesis, denoted by H0, is an assertion about one or more population
parameters. This is the assertion we hold to be true until we have sufficient
statistical evidence to conclude otherwise.
✓Often represents the status quo situation or an existing belief.
✓There is nothing new happening, the old theory is still true, the old standard
is correct, and the system is in control
✓Is maintained, or held to be true, until a test leads to its rejection in favor of
the alternative hypothesis.
✓Is accepted as true or rejected as false on the basis of a consideration of a test
statistic.

• The alternative hypothesis, denoted by H1 or Ha, is the assertion of all
situations not covered by the null hypothesis.
✓The new theory is true, there are new standards, the system is out of control,
and/or something is happening
✓Generally speaking, new hypotheses that business researchers want to
“prove” are stated in the alternative hypothesis.
• H0 and H1 are:
✓ Mutually exclusive
– Only one can be true.
✓ Exhaustive
– Together they cover all possibilities, so one or the other must be
true.
Example
Suppose flour packaged by a manufacturer is sold by weight; and a
particular size of package is supposed to average 40 ounces. Suppose
the manufacturer wants to test to determine whether their packaging
process is out of control as determined by the weight of the flour
packages.
7-6

1-Tailed and 2-Tailed Tests

The tails of a statistical test are determined by the need for an action. If action
is to be taken if a parameter is greater than some value a, then the alternative
hypothesis is that the parameter is greater than a, and the test is a right-tailed
test.
H0: μ ≤ 50
H1: μ > 50

If action is to be taken if a parameter is less than some value a, then the
alternative hypothesis is that the parameter is less than a, and the test is a
left-tailed test.
H0: μ ≥ 50
H1: μ < 50

If action is to be taken if a parameter is either greater than or less than some
value a, then the alternative hypothesis is that the parameter is not equal to a,
and the test is a two-tailed test.
H0: μ = 50
H1: μ ≠ 50
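
As a rough illustration of how the direction of the alternative hypothesis picks the tail or tails, the sketch below looks up standard normal critical values with Python's `statistics.NormalDist`. The function name `critical_z`, the tail labels, and the choice of α = 0.05 are assumptions made for this example, and it presumes the test statistic is a z value.

```python
from statistics import NormalDist

def critical_z(alpha, tail):
    """Return the critical z value(s) for a test at significance level alpha.

    tail is 'right', 'left', or 'two' (hypothetical labels for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":   # H1: parameter > a, reject for large z
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "left":    # H1: parameter < a, reject for small z
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":     # H1: parameter != a, alpha is split across both tails
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'right', 'left', or 'two'")

for tail in ("right", "left", "two"):
    print(tail, [round(z, 3) for z in critical_z(0.05, tail)])
```

A two-tailed test uses a larger cutoff in absolute value than a one-tailed test at the same α, because only α/2 of the area sits in each tail.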
Example
Suppose a company has held an 18% share of the market. However,
because of an increased marketing effort, company officials believe the
company’s market share is now greater than 18%, and the officials
would like to prove it.
7-9

Hypothesis about other Parameters

• Hypotheses about other parameters such as population
proportions and population variances are also possible. For
example:

✓H0: p ≥ 40%
✓H1: p < 40%

✓H0: σ² ≤ 50
✓H1: σ² > 50
Example
A vendor claims that his company fills any accepted order, on the
average, in at most six working days. You suspect that the average is
greater than six working days and want to test the claim. How will you
set up the null and alternative hypotheses?
Example
A manufacturer of golf balls claims that the variance of the weights of
the company’s golf balls is controlled to within 0.0028 oz². If you wish
to test this claim, how will you set up the null and alternative
hypotheses?
Example
At least 20% of the visitors to a particular commercial Web site where
an electronic product is sold are said to end up ordering the product. If
you wish to test this claim, how will you set up the null and alternative
hypotheses?
Example
Suppose a company has held an 18% share of the market. However,
because of an increased marketing effort, company officials believe the
company’s market share is now greater than 18%, and the officials
would like to prove it.

The company officials are interested only in “proving” that the market share
has increased, so including the “less than” sign in the null hypothesis can be
confusing. Also, if the “equal” part of the null hypothesis is rejected because
the market share is seemingly greater, then certainly the “less than” portion
of the null hypothesis is also rejected, because it is further away from
“greater than” than “equal” is. Using this logic, the null hypothesis for the
market share problem can be written with an equals sign:
H0: p = 0.18
Ha: p > 0.18
Hypothesis Testing Process

The first four steps in testing hypotheses should always be completed before
the study is undertaken.
Example
Suppose flour packaged by a manufacturer is sold by weight; and a
particular size of package is supposed to average 40 ounces. Suppose
the manufacturer wants to test to determine whether their packaging
process is out of control as determined by the weight of the flour
packages.
Solution
i) Suppose a sample of 100 such packages is randomly selected, and
a sample mean of 40.01 ounces is obtained.
ii) Suppose a sample mean of 50 ounces is obtained for 100 packages.

When is the sample mean so far away from the population mean that
the null hypothesis is rejected?
• Confidence Interval
7-19

Rejection Region

• The rejection region of a statistical hypothesis test is the range of
numbers that will lead us to reject the null hypothesis in case the
test statistic falls within this range. The rejection region, also
called the critical region, is defined by the critical points. The
rejection region is defined so that, before the sampling takes
place, our test statistic will have a probability α of falling within
the rejection region if the null hypothesis is true.
7-20

Nonrejection Region
• The nonrejection region is the range of values (also
determined by the critical points) that will lead us not to reject
the null hypothesis if the test statistic should fall within this
region. The nonrejection region is designed so that, before the
sampling takes place, our test statistic will have a probability
1 − α of falling within the nonrejection region if the null hypothesis
is true.

✓In a two-tailed test, the rejection region consists of the
values in both tails of the sampling distribution.
7-21

The Concepts of Hypothesis Testing


• A test statistic is a sample statistic computed from sample data.
The value of the test statistic is used in determining whether or not
we may reject the null hypothesis.

• The decision rule of a statistical hypothesis test is a rule that
specifies the conditions under which the null hypothesis may be
rejected.

Consider H0: μ = 100. We may have a decision rule that says:
“Reject H0 if the sample mean is less than 95 or more than 105.”
7-22

Decision Making

• There are two possible states of nature:


✓H0 is true
✓H0 is false
• There are two possible decisions:
✓Fail to reject H0 as true
✓Reject H0 as false
7-23

Decision Making
• A decision may be correct in two ways:
✓Fail to reject a true H0
✓Reject a false H0
• A decision may be incorrect in two ways:
✓Type I Error: Reject a true H0
• The probability of a Type I error is denoted by α.
✓Type II Error: Fail to reject a false H0
• The probability of a Type II error is denoted by β.
7-24

Errors in Hypothesis Testing


• A decision may be incorrect in two ways:
✓Type I Error: Reject a true H0
◼The probability of a Type I error is denoted by α.
◼α is called the level of significance of the test.

✓Type II Error: Fail to reject a false H0
◼The probability of a Type II error is denoted by β.
◼1 − β is called the power of the test.

• α and β are conditional probabilities:
✓α = P(Reject H0 | H0 is true)
✓β = P(Fail to reject H0 | H0 is false)
7-25

Type I and Type II Errors


A contingency table illustrates the possible outcomes
of a statistical hypothesis test.

The “state of nature” is how things actually are and the “action” is
the decision that the business researcher actually makes
Type - I Error
Suppose the flour-packaging process actually is “in control” and is
averaging 40 ounces of flour per package. Suppose also that a business
researcher randomly selects 100 packages, weighs the contents of
each, and computes a sample mean. It is possible, by chance, to
randomly select 100 of the more extreme packages (mostly heavy
weighted or mostly light weighted) resulting in a mean that falls in the
rejection region. The decision is to reject the null hypothesis even
though the population mean is actually 40 ounces. In this case, the
business researcher has committed a Type I error.
✓ If a manager fires an employee because some evidence indicates
that she is stealing from the company, and she really is not stealing
from the company, then the manager has committed a Type I error.
✓ Suppose a worker on the assembly line of a large manufacturer
hears an unusual sound and decides to shut the line down. If the
sound turns out not to be related to the assembly line and no
problems are occurring with the assembly line, then the worker has
committed a Type I error.
Type - II Error
Suppose in the case of the flour problem that the packaging process is
actually producing a population mean of 41 ounces even though the
null hypothesis is 40 ounces. A sample of 100 packages yields a sample
mean of 40.2 ounces, which falls in the nonrejection region. The
business decision maker decides not to reject the null hypothesis. The
packaging procedure is out of control and the hypothesis testing
process does not identify it.
✓ Suppose in the business world an employee is stealing from the
company. A manager sees some evidence that the stealing is
occurring but lacks enough evidence to conclude that the employee
is stealing from the company. The manager decides not to fire the
employee based on theft. The manager has committed a Type II error.
✓ Suppose the worker decides not enough noise is heard to shut the
line down, but in actuality, one of the cords on the line is unraveling,
creating a dangerous situation. The worker has committed a Type II error.
How are α and β related?
1. A Type I error (α) can only be committed when the null hypothesis is
rejected, and a Type II error (β) can only be committed when the null
hypothesis is not rejected. A business researcher therefore cannot commit
both a Type I error and a Type II error at the same time on the same
hypothesis test.
2. For a fixed sample size, α and β are inversely related. If α is reduced,
then β is increased, and vice versa.

➢ In terms of the manufacturing assembly line, if management makes it
harder for workers to shut down the assembly line (reduce Type I
error), then there is a greater chance that bad product will be made or
that a serious problem with the line will arise (increase Type II error).
➢ Legally, if the courts make it harder to send innocent people to jail,
then they have made it easier to let guilty people go free.

One way to reduce both errors is to increase the sample size. If a larger
sample is taken, it is more likely that the sample is representative of the
population, which translates into a better chance that a business
researcher will make the correct choice
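
The effect of sample size can be sketched numerically. The example below holds α at 0.05 for a two-tailed test of the flour process (hypothesized mean 40 ounces, true mean 41 ounces, as in the Type II illustration) and recomputes β for several sample sizes. The population standard deviation of 4 ounces, the sample sizes, and the function name `beta_two_tailed` are assumptions chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def beta_two_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu = mu0) when the true mean is mu_true."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)                      # standard error of the sample mean
    lower = mu0 - z_crit * se                 # edges of the nonrejection region
    upper = mu0 + z_crit * se
    sampling = NormalDist(mu_true, se)        # actual sampling distribution
    return sampling.cdf(upper) - sampling.cdf(lower)

# mu0 = 40 and a true mean of 41 follow the flour example; sigma = 4 is assumed.
betas = {n: beta_two_tailed(40, 41, sigma=4, n=n) for n in (25, 100, 400)}
for n, beta in betas.items():
    print(f"n = {n:3d}: alpha = 0.05, beta = {beta:.4f}")
```

With α held fixed, β shrinks as n grows, which is the sense in which a larger sample lets the researcher control both error probabilities.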
7-29

Statistical Significance
While the null hypothesis is maintained to be true throughout a
hypothesis test, until sample data lead to a rejection, the aim of a
hypothesis test is often to disprove the null hypothesis in favor of the
alternative hypothesis. This is because we can determine and
regulate α, the probability of a Type I error, making it as small as we
desire, such as 0.01 or 0.05. Thus, when we reject a null hypothesis,
we have a high level of confidence in our decision, since we know
there is a small probability that we have made an error.
Example
A survey of CPAs across the United States found that the average net
income for sole proprietor CPAs is $74,914.Because this survey is now
more than ten years old, an accounting researcher wants to test this
figure by taking a random sample of 112 sole proprietor accountants in
the United States which showed a sample mean of $78,695. Assume
the population standard deviation of net incomes for sole proprietor
CPAs is $14,530.
Solution
Step 1
H0: μ = $74,914
Ha: μ ≠ $74,914

Step 2
Because the population standard deviation is known and the sample is large,
the z test for a single mean applies:
\( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

Step 3
The Type I error rate, or alpha, is .05 in this problem.

Step 4
Because the test is two-tailed and alpha is .05, there is α/2 or .025 area in each
of the tails of the distribution. Thus, the rejection region is in the two ends
of the distribution with 2.5% of the area in each. The critical values are
z = ±1.96.

Step 5
n = 112, x̄ = $78,695, σ = $14,530, hypothesized μ = $74,914

Step 6
\( z = \frac{78{,}695 - 74{,}914}{14{,}530 / \sqrt{112}} = 2.75 \)
Because this test statistic, z = 2.75, is greater than the critical value of z in the
upper tail of the distribution, z = +1.96, it falls in the rejection region.

Step 7
Reject the null hypothesis.
Step 8
Statistically, the researcher has enough evidence to reject the figure of
$74,914 as the true national average net income for sole proprietor CPAs.
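
The arithmetic behind Steps 4 to 7 can be checked with a short script. This is only a sketch using the figures quoted in the example; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 74_914    # hypothesized mean net income
x_bar = 78_695   # sample mean
sigma = 14_530   # population standard deviation (given)
n = 112
alpha = 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))          # observed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_h0}")
```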
7-34

Critical Value Method


A company that delivers packages within a large metropolitan
area claims that it takes an average of 28 minutes for a package to
be delivered from your door to the destination. Suppose you want
to carry out a hypothesis test of this claim at 95% confidence. You
take a sample of 100 packages and find an average delivery time of
31.5 minutes and a standard deviation of 5 minutes.
7-35

Picturing Hypothesis Testing

[Figure: the population mean under H0, μ0 = 28, lies outside the 95% confidence
interval around the observed sample mean x̄ = 31.5, which runs from 30.52 to 32.48.]

It seems reasonable to reject the null hypothesis, H0: μ = 28, since the hypothesized
value lies outside the 95% confidence interval. If we are 95% sure that the
population mean is between 30.52 and 32.48 minutes, it is very unlikely that the
population mean will actually be 28 minutes.

Note that the population mean may be 28 (the null hypothesis might be true), but
then the observed sample mean, 31.5, would be a very unlikely occurrence. There
is still the small chance (α = 0.05) that we might reject the true null hypothesis.
α represents the level of significance of the test.
7-36

Nonrejection Region

If the observed sample mean falls within the nonrejection region, then you fail to
reject the null hypothesis. Construct a 95% nonrejection region around
the hypothesized population mean, and compare it with the 95% confidence
interval around the observed sample mean:

95% nonrejection region around the population mean:
\( \mu_0 \pm z_{.025} \frac{s}{\sqrt{n}} = 28 \pm 1.96 \frac{5}{\sqrt{100}} = 28 \pm 0.98 = [27.02, 28.98] \)

95% confidence interval around the sample mean:
\( \bar{x} \pm z_{.025} \frac{s}{\sqrt{n}} = 31.5 \pm 1.96 \frac{5}{\sqrt{100}} = 31.5 \pm 0.98 = [30.52, 32.48] \)

The nonrejection region and the confidence interval are the same width, but
centered on different points. In this instance, the nonrejection region does not
include the observed sample mean, and the confidence interval does not include
the hypothesized population mean.
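
The two intervals can be computed side by side to see that they always give the same verdict. The sketch below uses the delivery-time figures from the example; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu_0, x_bar, s, n = 28, 31.5, 5, 100
z = NormalDist().inv_cdf(0.975)      # z.025, about 1.96
margin = z * s / sqrt(n)             # shared half-width of both intervals

nonrejection = (mu_0 - margin, mu_0 + margin)    # centered on the hypothesized mean
confidence = (x_bar - margin, x_bar + margin)    # centered on the sample mean

sample_mean_in_region = nonrejection[0] <= x_bar <= nonrejection[1]
mu_0_in_interval = confidence[0] <= mu_0 <= confidence[1]

print("nonrejection region:", [round(v, 2) for v in nonrejection])
print("confidence interval:", [round(v, 2) for v in confidence])
print(sample_mean_in_region, mu_0_in_interval)
```

Because both intervals share the same half-width, the sample mean lies in the nonrejection region exactly when the hypothesized mean lies in the confidence interval.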
7-37

Solution
Set the null and alternative hypotheses:
H0: μ = 28
H1: μ ≠ 28

Collect sample data:
n = 100
x̄ = 31.5
s = 5

Construct a 95% confidence interval for the average delivery times of all
packages:
\( \bar{x} \pm z_{.025} \frac{s}{\sqrt{n}} = 31.5 \pm 1.96 \frac{5}{\sqrt{100}} = 31.5 \pm 0.98 = [30.52, 32.48] \)

We can be 95% sure that the average time for all packages is between 30.52
and 32.48 minutes. Since the asserted value, 28 minutes, is not in this 95%
confidence interval, we may reasonably reject the null hypothesis.
7-40

The p-Value

The p-value is the probability of obtaining a value of the test statistic as
extreme as, or more extreme than, the actual value obtained, when the null
hypothesis is true.

The p-value is the smallest level of significance, α, at which the null
hypothesis may be rejected using the obtained value of the test statistic.

RULE: When the p-value is less than α, reject H0.


Example
An automatic bottling machine fills cola into two-liter (2000 cc) bottles. A
consumer advocate wants to test the null hypothesis that the average amount
filled by the machine into a bottle is at least 2000 cc. A random sample of 40
bottles coming out of the machine was selected and the exact contents of the
selected bottles were recorded. The sample mean was 1999.6 cc. The population
standard deviation is known from past experience to be 1.30 cc. Test this
hypothesis at the 5% level of significance (95% confidence) using the p-value.
H0: μ ≥ 2000
H1: μ < 2000
n = 40, x̄ = 1999.6, σ = 1.3

For α = 0.05, the critical value of z is −1.645.

The test statistic is:
\( z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{1999.6 - 2000}{1.3 / \sqrt{40}} = -1.95 \)

Do not reject H0 if: z ≥ −1.645
Reject H0 if: z < −1.645

Since z = −1.95 < −1.645, reject H0.

With n = 40, μ0 = 2000, x̄ = 1999.6, and σ = 1.3, the test statistic is z = −1.95,
so the p-value is:
p-value = P(Z ≤ −1.95) = 0.5000 − 0.4744 = 0.0256

Since the p-value, 0.0256, is less than α = 0.05, reject H0.
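
The same p-value can be approximated in code rather than from a table. This is a minimal sketch with the bottling figures; because it does not round z to two decimals before looking up the tail area, the result differs slightly from the table-based 0.0256.

```python
from math import sqrt
from statistics import NormalDist

mu_0, x_bar, sigma, n, alpha = 2000, 1999.6, 1.3, 40, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))   # observed test statistic
p_value = NormalDist().cdf(z)            # left-tailed test: area below z

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```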
Example
In an attempt to determine why customer service is important to
managers in the United Kingdom, researchers surveyed managing
directors of manufacturing plants in Scotland. One of the reasons
proposed was that customer service is a means of retaining customers.
On a scale from 1 to 5, with 1 being low and 5 being high, the survey
respondents rated this reason more highly than any of the others, with
a mean response of 4.30. Suppose U.S. researchers believe American
manufacturing managers would not rate this reason as highly and
conduct a hypothesis test to prove their theory. Alpha is set at .05. Data
are gathered and the following results are obtained. Use these data to
determine whether U.S. managers rate this reason significantly lower
than the 4.30 mean ascertained in the United Kingdom. Assume from
previous studies that the population standard deviation is 0.574.
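Solution
Step 1
H0: μ = 4.30
Ha: μ < 4.30

Step 2
Because the population standard deviation is known, the z test for a single
mean applies.

Step 3
The Type I error rate, or alpha, is .05 in this problem.
Step 4
Because the test is a one-tailed test and alpha is .05, there is .05 area in the
left tail of the distribution. The critical value is z = −1.645.
Step 5, 6
The sample data yield a sample mean of 4.156 and an observed test statistic
of z = −1.42.
Step 7
1. Observed value method
Because the observed test statistic, z = −1.42, is not less than the critical
value, z = −1.645, and is not in the rejection region, the statistical conclusion
is that the null hypothesis cannot be rejected.
2. Critical value method
Because the mean obtained from the sample data, 4.156, is not below the
critical value of the sample mean, the researchers fail to reject the null
hypothesis.
3. p-value method
The observed test statistic is z = −1.42. The probability of getting a z value at
least this extreme when the null hypothesis is true is .5000 − .4222 = .0778.
Because this p-value is greater than α = .05, the researchers fail to reject the
null hypothesis.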

Step 8
The test does not result in enough evidence to conclude that U.S. managers
think it is less important to use customer service as a means of retaining
customers than do UK managers. Customer service is an important tool for
retaining customers in both countries according to managers.
Solving For Type II Errors
In business, failure to reject the null hypothesis may mean staying with
the status quo, not implementing a new process, or not making
adjustments. If a new process, product, theory, or adjustment is not
significantly better than what is currently accepted practice,
the decision maker makes a correct decision. However, if the new
process, product, theory, or adjustment would significantly improve
sales, the business climate, costs, or morale, the decision maker makes
an error in judgment (Type II)
• In business, Type II errors can translate to lost opportunities, poor
product quality (as a result of failure to discern a problem in the
process), or failure to react to the marketplace.
• The Type II error plays an important role in business statistical
decision making
• Determining the probability of committing a Type II error is more
complex than finding the probability of committing a Type I error.
Computing β

Suppose a researcher is conducting a statistical test on the following
hypotheses:
H0: μ = 12 ounces
Ha: μ < 12 ounces

A Type II error can be committed only when the researcher fails to
reject the null hypothesis and the null hypothesis is false. Often,
when the null hypothesis is false, the value of the alternative mean is
unknown, so the researcher will compute the probability of
committing Type II errors for several possible values.

Suppose that, in testing the preceding hypotheses, a sample of 60 cans
of beverage yields a sample mean of 11.985 ounces. Assume that the
population standard deviation is 0.10 ounces. For α = .05, the critical value
is z.05 = −1.645. The observed z value from the sample data is
\( z = \frac{11.985 - 12.00}{0.10 / \sqrt{60}} = -1.16 \)
Because −1.16 is greater than −1.645, the researcher fails to reject the null
hypothesis.

Calculate a critical value for the sample mean, x̄c:
\( \bar{x}_c = 12.00 - 1.645 \frac{0.10}{\sqrt{60}} = 11.979 \)

By not rejecting the null hypothesis, the researcher either makes a
correct decision or commits a Type II error. What is the probability of
committing a Type II error in this problem if the population mean
actually is 11.99?

If the null hypothesis is false, the researcher will fail to reject the null
hypothesis whenever x̄ is in the nonrejection region, x̄ > 11.979
ounces. With an alternative mean of 11.99 ounces:
\( z_1 = \frac{11.979 - 11.99}{0.10 / \sqrt{60}} = -0.85 \)
β = P(z > −0.85) = .3023 + .5000 = .8023

Hence there is an 80.23% chance of committing a Type II
error if the alternative mean is 11.99 ounces.
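
The β calculation can be wrapped in a small function. The sketch below assumes the beverage test is left-tailed with a hypothesized mean of 12 ounces (consistent with the critical sample mean of 11.979), and the function name `type_ii_error_left_tailed` is made up for this example. It carries full precision instead of rounding the critical mean and z to table values, so it returns roughly 0.81 rather than the table-based .8023.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_left_tailed(mu_0, mu_alt, sigma, n, alpha):
    """beta for H0: mu = mu_0 vs Ha: mu < mu_0 when the true mean is mu_alt."""
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean falls below this critical value.
    x_crit = mu_0 + NormalDist().inv_cdf(alpha) * se
    # beta = P(sample mean >= x_crit | true mean = mu_alt)
    return 1 - NormalDist(mu_alt, se).cdf(x_crit)

beta = type_ii_error_left_tailed(12.00, 11.99, sigma=0.10, n=60, alpha=0.05)
print(f"beta = {beta:.4f}")
```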
Example
Consider the following null and alternative hypotheses:
H0: μ ≥ 1000
H1: μ < 1000

Let σ = 5, α = 5%, and n = 100. Compute β when the alternative mean is
998.
7-57

Solution
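
One way to work this example is sketched below, assuming the test is left-tailed (the alternative mean of 998 lies below 1000) and that the value 5 is the population standard deviation. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, mu_alt, sigma, n, alpha = 1000, 998, 5, 100, 0.05

se = sigma / sqrt(n)                                # standard error = 0.5
x_crit = mu_0 + NormalDist().inv_cdf(alpha) * se    # reject H0 below about 999.18
z_alt = (x_crit - mu_alt) / se                      # critical mean in z units under mu = 998
beta = 1 - NormalDist().cdf(z_alt)                  # P(fail to reject | mu = 998)
power = 1 - beta

print(f"critical mean = {x_crit:.2f}, beta = {beta:.4f}, power = {power:.4f}")
```

Under these assumptions β comes out below 1%, so a true mean of 998 would almost always be detected with this sample size.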
Example
Suppose a researcher is conducting a statistical test on the following
hypotheses:
H0: μ = 12 ounces
Ha: μ < 12 ounces

Let σ = 0.10 ounces, α = 5%, and n = 60. Compute β when the
alternative mean is 11.96 ounces.
Solution
Observations About Type II Errors
• If the alternative mean or proportion is close to the hypothesized
value, the probability of committing a Type II error is high.
• If the alternative value is relatively far away from the hypothesized
value, the probability of committing a Type II error is small

Suppose a researcher is testing to determine whether a company really
is filling 2-liter bottles of cola with an average of 2 liters. If the company
decides to underfill the bottles by filling them with only 1 liter, a sample
of 50 bottles is likely to average a quantity near the 1-liter fill rather than
near the 2-liter fill. Committing a Type II error is highly unlikely. Even a
customer probably could see by looking at the bottles on the shelf that
they are underfilled.
However, if the company fills 2-liter bottles with 1.99 liters, the bottles
are close in fill volume to those filled with 2.00 liters. A customer
probably could not catch the underfill just by looking.
Operating Characteristics & Power Curves
The power of a statistical hypothesis test is the probability of
rejecting the null hypothesis when the null hypothesis is false.

Power = (1 − β)
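
A power curve is just power evaluated over a range of alternative means. The sketch below tabulates it for the beverage example, assuming H0: μ = 12 against Ha: μ < 12 with σ = 0.10, n = 60, and α = 0.05. The alternative means 11.99 and 11.96 appear in the examples; the other values and the function name `power_at` are added here only to fill in the curve.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 12.00, 0.10, 60, 0.05
se = sigma / sqrt(n)
x_crit = mu_0 + NormalDist().inv_cdf(alpha) * se   # reject H0 below this sample mean

def power_at(mu_alt):
    """P(reject H0: mu = 12 | true mean is mu_alt) for Ha: mu < 12."""
    return NormalDist(mu_alt, se).cdf(x_crit)

# 11.99 and 11.96 come from the examples; the remaining values are assumed.
alternatives = [11.99, 11.98, 11.97, 11.96, 11.95]
curve = [(mu, power_at(mu)) for mu in alternatives]
for mu, power in curve:
    print(f"mu = {mu:.2f}: beta = {1 - power:.4f}, power = {power:.4f}")
```

Power rises as the alternative mean moves away from the hypothesized value, which matches the observations about Type II errors above.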
7-62

The Hypothesis Test

We will see the three different types of hypothesis tests, namely

✓Tests of hypotheses about population means.


✓Tests of hypotheses about population proportions.
✓Tests of hypotheses about population variances.
7-63

Testing Population Means

• Cases in which the test statistic is Z

✓σ is known and the population is normal.
✓σ is known and the sample size is at least 30. (The population
need not be normal.)

The formula for calculating Z is:
\( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)
7-64

Testing Population Means

• Cases in which the test statistic is t

✓σ is unknown but the sample standard deviation is known and
the population is normal.

The formula for calculating t is:
\( t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \)
7-65

Example

A coin is to be tested for fairness. It is tossed 25 times and only 8 heads are
observed. Test whether the coin is fair at an α of 5% (significance level).

Let p denote the probability of a head.
H0: p = 0.5
H1: p ≠ 0.5
Because this is a two-tailed test, the p-value = 2 × P(X ≤ 8).
From the binomial tables, with n = 25 and p = 0.5, this value is
2 × 0.054 = 0.108.
Since 0.108 > α = 0.05, do not reject H0.
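
The binomial tail probability can also be computed directly instead of being read from tables. This is a small sketch using `math.comb`; the helper name `binom_cdf` is chosen for this example.

```python
from math import comb

n, heads, p0, alpha = 25, 8, 0.5, 0.05

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

lower_tail = binom_cdf(heads, n, p0)   # P(X <= 8) under H0
# Doubling one tail follows the slide; Binomial(25, 0.5) is symmetric.
p_value = 2 * lower_tail

print(f"P(X <= 8) = {lower_tail:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```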
7-67

Testing Population Variances

• For testing hypotheses about population variances, the test
statistic (chi-square) is:
\( \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} \)
where \( \sigma_0^2 \) is the claimed value of the population variance in the
null hypothesis. The degrees of freedom for this chi-square
random variable is (n − 1).

Note: Since the chi-square table only provides the critical values, it cannot
be used to calculate exact p-values. As in the case of the t-tables, only a
range of possible values can be inferred.
7-68

Example

A manufacturer of golf balls claims that they control the weights of the golf balls
accurately so that the variance of the weights is not more than 1 mg². A random sample
of 31 golf balls yields a sample variance of 1.62 mg². Is that sufficient evidence to
reject the claim at an α of 5%?

Solution

Let σ² denote the population variance. Then
H0: σ² ≤ 1
H1: σ² > 1
The test statistic is χ² = (31 − 1)(1.62)/1 = 48.6 with 30 degrees of freedom,
which gives a p-value of 0.0173.
Since this value is less than the α of 5%, we reject the null hypothesis.
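
The p-value quoted above can be reproduced without a statistics library. The sketch below relies on a closed form for the chi-square upper tail that holds only when the degrees of freedom are even (30 here); the function name `chi2_sf_even_df` is an assumption of this example, not a library call.

```python
from math import exp

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x).

    Uses exp(-x/2) * sum_{i < df/2} (x/2)**i / i!, valid only for even df.
    """
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0            # i = 0 term of the sum
    for i in range(1, df // 2):
        term *= half / i              # builds (x/2)**i / i! incrementally
        total += term
    return exp(-half) * total

n, s2, sigma0_sq, alpha = 31, 1.62, 1.0, 0.05
chi2 = (n - 1) * s2 / sigma0_sq       # test statistic
p_value = chi2_sf_even_df(chi2, n - 1)

print(f"chi-square = {chi2:.1f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

For odd degrees of freedom a statistics library or a chi-square table would be needed instead.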
7-70

Example

As part of a survey to determine the extent of required in-cabin storage capacity, a
researcher needs to test the null hypothesis that the average weight of carry-on baggage
per person is μ0 = 12 pounds, versus the alternative hypothesis that the average weight is
not 12 pounds. The analyst wants to test the null hypothesis at α = 0.05.

H0: μ = 12
H1: μ ≠ 12

For α = 0.05, critical values of z are ±1.96.

The test statistic is:
\( z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \)

Do not reject H0 if: −1.96 ≤ z ≤ 1.96
Reject H0 if: z < −1.96 or z > 1.96

[Figure: The Standard Normal Distribution, with a central nonrejection region of
area .95 between z = −1.96 and z = 1.96, and lower and upper rejection regions of
area .025 each.]
7-72

Additional Examples (a): Solution

n = 144
x̄ = 14.6
s = 7.8

\( z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{14.6 - 12}{7.8 / \sqrt{144}} = \frac{2.6}{0.65} = 4 \)

[Figure: The Standard Normal Distribution, with a central nonrejection region of
area .95 between z = −1.96 and z = 1.96, and lower and upper rejection regions of
area .025 each.]

Since the test statistic falls in the upper rejection region, H0 is rejected, and we may
conclude that the average amount of carry-on baggage is more than 12 pounds.
7-73

Examples

An insurance company believes that, over the last few years, the average liability
insurance per board seat in companies defined as “small companies” has been $2000.
Using α = 0.01, test this hypothesis using Growth Resources, Inc. survey data.

H0: μ = 2000
H1: μ ≠ 2000

n = 100, x̄ = 2700, s = 947

For α = 0.01, critical values of z are ±2.576.

The test statistic is:
\( z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{2700 - 2000}{947 / \sqrt{100}} = \frac{700}{94.7} = 7.39 \)

Do not reject H0 if: −2.576 ≤ z ≤ 2.576
Reject H0 if: z < −2.576 or z > 2.576

Since z = 7.39 > 2.576, reject H0.


7-75

Example

[Figure: The Standard Normal Distribution, with a central nonrejection region of
area .99 between z = −2.576 and z = 2.576, and lower and upper rejection regions
of area .005 each.]

Since the test statistic falls in the upper rejection region, H0 is rejected, and
we may conclude that the average insurance liability per board seat in “small
companies” is more than $2000.
7-76

Example

The average time it takes a computer to perform a certain task is believed to be 3.24
seconds. It was decided to test the statistical hypothesis that the average performance
time of the task using the new algorithm is the same, against the alternative that the
average performance time is no longer the same, at the 0.05 level of significance.
7-77

Examples

The average time it takes a computer to perform a certain task is believed to be 3.24
seconds. It was decided to test the statistical hypothesis that the average performance
time of the task using the new algorithm is the same, against the alternative that the
average performance time is no longer the same, at the 0.05 level of significance.

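H0: \(\mu = 3.24\)
H1: \(\mu \neq 3.24\)

n = 200, \(\bar{x} = 3.48\), s = 2.8

For \(\alpha = 0.05\), critical values of z are ±1.96.

The test statistic is:
\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Do not reject H0 if: \(-1.96 \le z \le 1.96\)
Reject H0 if: \(z < -1.96\) or \(z > 1.96\)

\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.48 - 3.24}{2.8/\sqrt{200}} = \frac{0.24}{0.20} = 1.21 \;\Rightarrow\; \text{Do not reject } H_0 \]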
7-78

Example

[Figure: The Standard Normal Distribution. The central area 0.95 between z = -1.96 and z = 1.96 is the nonrejection region; each tail has area 0.025 and forms the lower and upper rejection regions.]

Since the test statistic falls in the nonrejection region, H0 is not rejected, and we may
not conclude that the average performance time has changed from 3.24 seconds.
7-79

Example

According to the Japanese National Land Agency, average land prices in central Tokyo
soared 49% in the first six months of 1995. An international real estate investment
company wants to test this claim against the alternative that the average price did not rise
by 49%, at a 0.01 level of significance.
7-80

Example


H0: \(\mu = 49\)
H1: \(\mu \neq 49\)

n = 18, \(\bar{x} = 38\), s = 14

For \(\alpha = 0.01\) and (18 - 1) = 17 df, critical values of t are ±2.898.

The test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Do not reject H0 if: \(-2.898 \le t \le 2.898\)
Reject H0 if: \(t < -2.898\) or \(t > 2.898\)

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{38 - 49}{14/\sqrt{18}} = \frac{-11}{3.3} = -3.33 \;\Rightarrow\; \text{Reject } H_0 \]
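
For small samples the same arithmetic applies, but the critical value comes from the t table. The sketch below is a hedged illustration: the helper names `t_statistic` and `two_tailed_decision` are assumptions, and the critical value 2.898 is simply copied from the slide because the Python standard library has no t distribution.

```python
from math import sqrt


def t_statistic(x_bar, mu_0, s, n):
    """t = (x_bar - mu_0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu_0) / (s / sqrt(n))


def two_tailed_decision(t, t_crit):
    """Compare t with the table critical values -t_crit and +t_crit."""
    if t < -t_crit:
        return "reject H0 (lower rejection region)"
    if t > t_crit:
        return "reject H0 (upper rejection region)"
    return "do not reject H0"


# Tokyo land prices: n = 18, sample mean 38, s = 14.
# Table value for alpha = 0.01 (two-tailed) and 17 df, as given in the slide.
T_CRIT_17_DF = 2.898
t_tokyo = t_statistic(x_bar=38, mu_0=49, s=14, n=18)
print(f"t = {t_tokyo:.2f}: {two_tailed_decision(t_tokyo, T_CRIT_17_DF)}")
```

Reporting which rejection region the statistic lands in supports the directional reading that the slides give for this example.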


7-81

Example

[Figure: The t Distribution. The central area 0.99 between t = -2.898 and t = 2.898 is the nonrejection region; each tail has area 0.005 and forms the lower and upper rejection regions.]

Since the test statistic falls in the rejection region, H0 is rejected, and we may conclude
that the average price has not risen by 49%. Since the test statistic is in the lower
rejection region, we may conclude that the average price has risen by less than 49%.
7-82

Example

Canon, Inc., has introduced a copying machine that features two-color copying capability
in a compact system copier. The average speed of the standard compact system copier is
27 copies per minute. Suppose the company wants to test whether the new copier has the
same average speed as its standard compact copier. Conduct a test at an \(\alpha = 0.05\) level of
significance.
7-83

Example

H0: \(\mu = 27\)
H1: \(\mu \neq 27\)

n = 24, \(\bar{x} = 24.6\), s = 7.4

For \(\alpha = 0.05\) and (24 - 1) = 23 df, critical values of t are ±2.069.

The test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Do not reject H0 if: \(-2.069 \le t \le 2.069\)
Reject H0 if: \(t < -2.069\) or \(t > 2.069\)

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{24.6 - 27}{7.4/\sqrt{24}} = \frac{-2.4}{1.51} = -1.59 \;\Rightarrow\; \text{Do not reject } H_0 \]


7-84

Example

[Figure: The t Distribution. The central area 0.95 between t = -2.069 and t = 2.069 is the nonrejection region; each tail has area 0.025 and forms the lower and upper rejection regions.]

Since the test statistic falls in the nonrejection region, H0 is not rejected, and we may
not conclude that the average speed is different from 27 copies per minute.
7-85

Example

An investment analyst for Goldman Sachs and Company wanted to test the hypothesis
made by British securities experts that 70% of all foreign investors in the British market
were American. The analyst gathered a random sample of 210 accounts of foreign
investors in London and found that 130 were owned by U.S. citizens. At the \(\alpha = 0.05\)
level of significance, is there evidence to reject the claim of the British securities experts?
7-86

Example


H0: \(p = 0.70\)
H1: \(p \neq 0.70\)

n = 210, \(\hat{p} = 130/210 = 0.619\)

For \(\alpha = 0.05\), critical values of z are ±1.96.

The test statistic is:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} \]

Do not reject H0 if: \(-1.96 \le z \le 1.96\)
Reject H0 if: \(z < -1.96\) or \(z > 1.96\)

\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.619 - 0.70}{\sqrt{(0.70)(0.30)/210}} = \frac{-0.081}{0.0316} = -2.5614 \;\Rightarrow\; \text{Reject } H_0 \]
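
The large-sample test for a proportion follows the same pattern, with the standard error computed from the hypothesized proportion. A minimal sketch, assuming a hypothetical helper named `proportion_z_test`:

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p_0, alpha):
    """Two-tailed z test of H0: p = p_0 using the normal approximation."""
    p_hat = successes / n
    # The standard error uses p_0 (not p_hat), because H0 is assumed true.
    standard_error = sqrt(p_0 * (1 - p_0) / n)
    z = (p_hat - p_0) / standard_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, z, z_crit, abs(z) > z_crit


# Foreign investors example: 130 U.S.-owned accounts out of 210, p_0 = 0.70.
p_hat, z, z_crit, reject = proportion_z_test(successes=130, n=210, p_0=0.70, alpha=0.05)
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```

The sketch keeps the sample proportion unrounded, so its z value differs slightly in the third decimal from the slide's figure, which rounds the sample proportion to 0.619 first. The decision is the same either way.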
7-87

Example

The EPA sets limits on the concentrations of pollutants emitted by various industries. Suppose that the
upper allowable limit on the emission of vinyl chloride is set at an average of 55 ppm within a range of two
miles around the plant emitting this chemical. To check compliance with this rule, the EPA collects a
random sample of 100 readings at different times and dates within the two-mile range around the plant. The
findings are that the sample average concentration is 60 ppm and the sample standard deviation is 20 ppm.
Is there evidence to conclude that the plant in question is violating the law?
7-88

Example


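H0: \(\mu \le 55\)
H1: \(\mu > 55\)

n = 100, \(\bar{x} = 60\), s = 20

For \(\alpha = 0.01\), the critical value of z is 2.326.

The test statistic is:
\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Do not reject H0 if: \(z \le 2.326\)
Reject H0 if: \(z > 2.326\)

\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{60 - 55}{20/\sqrt{100}} = \frac{5}{2} = 2.5 \;\Rightarrow\; \text{Reject } H_0 \]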
7-89

Example

[Figure: Critical Point for a Right-Tailed Test. Standard normal density f(z) with area 0.99 to the left of the critical point z = 2.326 (nonrejection region) and the rejection region to its right; the test statistic z = 2.5 lies in the rejection region.]

Since the test statistic falls in the rejection region, H0 is rejected, and we may conclude
that the average concentration of vinyl chloride is more than 55 ppm.
7-90

Example
A certain kind of packaged food bears the following statement on the package: “Average net weight 12 oz.”
Suppose that a consumer group has been receiving complaints from users of the product who believe that they are
getting smaller quantities than the manufacturer states on the package. The consumer group wants, therefore, to
test the hypothesis that the average net weight of the product in question is 12 oz. versus the alternative that the
packages are, on average, underfilled. A random sample of 144 packages of the food product is collected, and it is
found that the average net weight in the sample is 11.8 oz. and the sample standard deviation is 6 oz. Given these
findings, is there evidence the manufacturer is underfilling the packages?
7-91

Example

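H0: \(\mu \ge 12\)
H1: \(\mu < 12\)

n = 144, \(\bar{x} = 11.8\), s = 6

For \(\alpha = 0.05\), the critical value of z is -1.645.

The test statistic is:
\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Do not reject H0 if: \(z \ge -1.645\)
Reject H0 if: \(z < -1.645\)

\[ z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{11.8 - 12}{6/\sqrt{144}} = \frac{-0.2}{0.5} = -0.4 \;\Rightarrow\; \text{Do not reject } H_0 \]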
7-92

Example

[Figure: Critical Point for a Left-Tailed Test. Standard normal density f(z) with area 0.05 to the left of the critical point z = -1.645 (rejection region) and area 0.95 to its right (nonrejection region); the test statistic z = -0.4 lies in the nonrejection region.]

Since the test statistic falls in the nonrejection region, H0 is not rejected, and we may
not conclude that the manufacturer is underfilling packages on average.
7-93

Additional Examples (i)

A floodlight is said to last an average of 65 hours. A competitor believes that the average life of the
floodlight is less than that stated by the manufacturer and sets out to prove that the manufacturer’s
claim is false. A random sample of 21 floodlight elements is chosen and shows that the sample
average is 62.5 hours and the sample standard deviation is 3. Using \(\alpha = 0.01\), determine whether
there is evidence to conclude that the manufacturer’s claim is false.
7-94

Additional Examples (i)


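H0: \(\mu \ge 65\)
H1: \(\mu < 65\)

n = 21, \(\bar{x} = 62.5\), s = 3

For \(\alpha = 0.01\) and (21 - 1) = 20 df, the critical value of t is -2.528.

The test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{62.5 - 65}{3/\sqrt{21}} = \frac{-2.5}{0.6547} = -3.82 \;\Rightarrow\; \text{Reject } H_0 \]

Do not reject H0 if: \(t \ge -2.528\)
Reject H0 if: \(t < -2.528\)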
7-95

Additional Examples (i) : Continued

[Figure: Critical Point for a Left-Tailed Test. t density f(t) with the rejection region to the left of the critical point t = -2.528 and the nonrejection region to its right; the test statistic t = -3.82 lies in the rejection region.]

Since the test statistic falls in the rejection region, H0 is rejected, and we may conclude
that the manufacturer’s claim is false, that the average floodlight life is less than 65
hours.
7-96

Additional Examples (j)


“After looking at 1349 hotels nationwide, we’ve found 13 that meet our standards.” This statement by the Small
Luxury Hotels Association implies that the proportion of all hotels in the United States that meet the association’s
standards is 13/1349=0.0096. The management of a hotel that was denied acceptance to the association wanted to
prove that the standards are not as stringent as claimed and that, in fact, the proportion of all hotels in the United
States that would qualify is higher than 0.0096. The management hired an independent research agency, which
visited a random sample of 600 hotels nationwide and found that 7 of them satisfied the exact standards set by the
association. Is there evidence to conclude that the population proportion of all hotels in the country satisfying the
standards set by the Small Luxury Hotels Association is greater than 0.0096?
7-97

Additional Examples (j)



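H0: \(p \le 0.0096\)
H1: \(p > 0.0096\)

n = 600, \(\hat{p} = 7/600 = 0.0117\)

For \(\alpha = 0.10\), the critical value of z is 1.282.

The test statistic is:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{7/600 - 0.0096}{\sqrt{(0.0096)(0.9904)/600}} = \frac{0.002067}{0.003981} = 0.519 \;\Rightarrow\; \text{Do not reject } H_0 \]

Do not reject H0 if: \(z \le 1.282\)
Reject H0 if: \(z > 1.282\)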
7-98

Additional Examples (j) : Continued

[Figure: Critical Point for a Right-Tailed Test. Standard normal density f(z) with area 0.90 to the left of the critical point z = 1.282 (nonrejection region) and area 0.10 to its right (rejection region); the test statistic z = 0.519 lies in the nonrejection region.]

Since the test statistic falls in the nonrejection region, H0 is not rejected, and we may
not conclude that the proportion of all hotels in the country that meet the association’s
standards is greater than 0.0096.
7-99

The p-Value Revisited

[Figure: two panels, each titled Standard Normal Distribution and plotting f(z) against z. Left panel: test statistic z = 0.519, p-value = area to the right of the test statistic = 0.3018. Right panel: test statistic z = 2.5, p-value = area to the right of the test statistic = 0.0062.]

The p-value is the probability of obtaining a value of the test statistic as extreme as,
or more extreme than, the actual value obtained, when the null hypothesis is true.

The p-value is the smallest level of significance, \(\alpha\), at which the null hypothesis
may be rejected using the obtained value of the test statistic.
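
As an illustration of the definition, the right-tail areas for the two test statistics on this slide (0.519 and 2.5) can be computed from the standard normal CDF. This is a sketch using the standard library; the function name `right_tail_p_value` is an assumption.

```python
from statistics import NormalDist


def right_tail_p_value(z):
    """P(Z >= z) for a standard normal Z: the p-value of a right-tailed z test."""
    return 1 - NormalDist().cdf(z)


for z in (0.519, 2.5):
    print(f"z = {z}: p-value = {right_tail_p_value(z):.4f}")
```

Because the first statistic is itself rounded to three decimals, the computed area can differ from the slide's 0.3018 in the last digit.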
7-100

The p-Value: Rules of Thumb

When the p-value is smaller than 0.01, the result is considered to be very significant.

When the p-value is between 0.01 and 0.05, the result is considered to be significant.

When the p-value is between 0.05 and 0.10, the result is considered by some as marginally
significant (and by most as not significant).

When the p-value is greater than 0.10, the result is considered not significant.
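
These rules of thumb amount to a simple lookup. In the sketch below the function name `significance_label` is hypothetical, and the handling of values that fall exactly on a boundary (0.01, 0.05, 0.10) is an assumption, since the rules do not say which side they belong to.

```python
def significance_label(p_value):
    """Map a p-value to the rule-of-thumb wording used above."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "very significant"
    if p_value < 0.05:  # boundary handling is an assumption
        return "significant"
    if p_value <= 0.10:  # boundary handling is an assumption
        return "marginally significant (by most: not significant)"
    return "not significant"


for p in (0.0062, 0.03, 0.07, 0.3018):
    print(f"p-value = {p}: {significance_label(p)}")
```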
7-101

p-Value: Two-Tailed Tests

[Figure: standard normal density f(z) with the tails beyond z = -0.4 and z = 0.4 shaded. p-value = double the area to the left of the test statistic = 2(0.3446) = 0.6892.]

In a two-tailed test, we find the p-value by doubling the area in the tail of the
distribution beyond the value of the test statistic.
7-102

The p-Value and Hypothesis Testing

The further away in the tail of the distribution the test statistic falls, the smaller
is the p-value and, hence, the more convinced we are that the null hypothesis is
false and should be rejected.

In a right-tailed test, the p-value is the area to the right of the test statistic if the
test statistic is positive.

In a left-tailed test, the p-value is the area to the left of the test statistic if the
test statistic is negative.

In a two-tailed test, the p-value is twice the area to the right of a positive test
statistic or to the left of a negative test statistic.

For a given level of significance, \(\alpha\):

Reject the null hypothesis if and only if \(\alpha \ge\) p-value
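
The three p-value cases and the decision rule can be put together in a few lines. This is a sketch for z tests only, using the standard library; the names `z_test_p_value` and `reject_h0` and the `tail` labels are illustrative assumptions.

```python
from statistics import NormalDist


def z_test_p_value(z, tail):
    """p-value of a z test; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf
    if tail == "right":
        return 1 - cdf(z)  # area to the right of the test statistic
    if tail == "left":
        return cdf(z)  # area to the left of the test statistic
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # double the area beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_h0(p_value, alpha):
    """Decision rule: reject H0 if and only if alpha >= p-value."""
    return alpha >= p_value


# Two-tailed p-value for z = -0.4, then the right-tailed EPA example.
p_two = z_test_p_value(-0.4, "two")
p_epa = z_test_p_value(2.5, "right")
print(f"two-tailed, z = -0.4: p-value = {p_two:.4f}")
print(f"right-tailed, z = 2.5: p-value = {p_epa:.4f}, reject at 0.01: {reject_h0(p_epa, 0.01)}")
```

Comparing the p-value with the significance level gives the same decision as comparing the test statistic with the critical point, without needing a separate table lookup for each level.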
