# 3/16/2011

Hypothesis Testing 2
CE 21. Engineering Statistics


Summary:
Let X1, …, Xn be a large (n > 30) sample from a population with
mean µ and standard deviation σ.
To test a null hypothesis of the form
H0: µ ≤ µ0, H0: µ ≥ µ0, or H0: µ = µ0:
- Compute the z-score: $z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$.
  If σ is unknown, it can be approximated with s.
- Compute the P-value. The P-value is an area under the
  normal curve, which depends on the alternate hypothesis as
  follows:

| Alternate Hypothesis | P-value |
| --- | --- |
| H1: µ > µ0 | Area to the right of z |
| H1: µ < µ0 | Area to the left of z |
| H1: µ ≠ µ0 | Sum of the areas in the tails cut off by z and −z |

- The smaller the P-value, the more certain we can be that H0 is false.
- The larger the P-value, the more plausible H0 becomes, but we can never be certain that H0 is true.
- A common rule of thumb is to reject H0 whenever P ≤ 0.05. While this rule is convenient, it has no scientific basis.
- If P ≤ 0.05, the result is statistically significant at the 5% level. Equivalently, the null hypothesis is rejected at the 5% level.


The Relationship Between Hypothesis Tests and Confidence Intervals

Example: The sample mean lifetime of 50 microdrills was $\bar{X} = 12.68$ holes drilled, with s = 6.83. Setting
α to 0.05 (5%), the 95% confidence interval for µ
was computed to be (10.79, 14.57).

Test H0: µ = 10.79 versus H1: µ ≠ 10.79. Do a
similar test for µ = 14.57.
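The two tests requested above can be sketched in a few lines of Python. This is only an illustration: the helper name `two_sided_z_test` is made up for this sketch, and s is used in place of σ, which the large-sample rule (n > 30) allows.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar, s, n, mu0):
    """Return (z, P) for H0: mu = mu0 versus H1: mu != mu0 (large sample)."""
    z = (xbar - mu0) / (s / sqrt(n))          # s approximates sigma since n > 30
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # sum of the two tail areas
    return z, p

# Microdrill data from the example: n = 50, sample mean 12.68, s = 6.83.
# The two null values are the endpoints of the 95% confidence interval.
for mu0 in (10.79, 14.57):
    z, p = two_sided_z_test(12.68, 6.83, 50, mu0)
    print(f"mu0 = {mu0}: z = {z:.2f}, P = {p:.3f}")
```

Both P-values should come out at about 0.05, which is what one expects when the null value sits exactly on an endpoint of the 95% confidence interval.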


- The 95% confidence interval consists of precisely those values of µ whose P-values are greater than 0.05 in a hypothesis test.
- The confidence interval contains all the values that are plausible for the population mean µ.

Quiz:
1. For which value is the null hypothesis more plausible:
P=0.5 or P=0.05?
2. If P=0.01, which is the best conclusion?
a. H0 is definitely false.
b. H0 is definitely true.
c. There is a 1% probability that H0 is true.
d. H0 might be true, but it’s unlikely.
e. H0 might be false, but it’s unlikely.
f. H0 is plausible.
3. True or False: If P=0.02, then
a. The result is statistically significant at the 5% level.
b. The result is statistically significant at the 1% level.
c. The null hypothesis is rejected at the 5% level.
d. The null hypothesis is rejected at the 1% level.


Proportion

The same procedures essentially apply when
dealing with population proportions.
However, as discussed in previous lessons, the sample proportion has mean and variance

$\mu = p$ and $\sigma^2 = \dfrac{p(1-p)}{n}$.

This test requires that the sample proportion be approximately
normally distributed.
- This assumption is justified when both np0 > 10 and n(1 − p0) > 10.
- p0 is the population proportion specified in the null hypothesis.


Example 7:
A supplier of semiconductor wafers claims
that of all the wafers he supplies, no more than
10% are defective. A sample of 400 wafers is
tested and 50 of them, or 12.5%, are defective.
Can we conclude that the claim is false?
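As a worked sketch of this example (the hypotheses are one reading of the claim, with p0 = 0.10 taken from "no more than 10%"): test H0: p ≤ 0.10 versus H1: p > 0.10. Both np0 = 40 and n(1 − p0) = 360 exceed 10, so the normal approximation should be reasonable.

$$
z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}
  = \frac{0.125 - 0.10}{\sqrt{(0.10)(0.90)/400}}
  = \frac{0.025}{0.015} \approx 1.67
$$

The P-value is the area to the right of z, roughly 0.048. That is just under 0.05, so H0 would be rejected at the 5% level, although the evidence against the supplier's claim is not overwhelming.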


Example 8:
The article “Refinement of Gravimetric Geoid
Using GPS and Leveling Data” presents a
method for measuring orthometric heights
above sea level. For a sample of 1225 baselines,
926 gave results that were within the class C
spirit leveling tolerance limits. Can we conclude
that this method produces results within the
tolerance limits more than 75% of the time?

Summary:
Let X be the number of successes in n independent
Bernoulli trials, each with success probability p; in other
words, let X ~ Bin(n, p).
To test a null hypothesis of the form H0: p ≤ p0, H0: p ≥ p0, or H0: p = p0, assuming that both np0 and
n(1 − p0) are greater than 10:
- Compute the z-score: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$, where $\hat{p} = X/n$.
- Compute the P-value. The P-value is an area under the
  normal curve, which depends on the alternate hypothesis
  as follows:

| Alternate Hypothesis | P-value |
| --- | --- |
| H1: p > p0 | Area to the right of z |
| H1: p < p0 | Area to the left of z |
| H1: p ≠ p0 | Sum of the areas in the tails cut off by z and −z |
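The procedure in this summary can be sketched as a small Python function and applied to Example 8. The function name and the `alternative` labels ("greater", "less", "two-sided") are invented for this sketch, and the hypotheses chosen for Example 8 (H0: p ≤ 0.75 versus H1: p > 0.75) are one reading of the question.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alternative):
    """Large-sample test for a proportion; returns (z, P-value).

    alternative: "greater" (H1: p > p0), "less" (H1: p < p0),
    or "two-sided" (H1: p != p0). These labels are illustrative only.
    """
    if n * p0 <= 10 or n * (1 - p0) <= 10:
        raise ValueError("need np0 > 10 and n(1 - p0) > 10")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)                  # area to the left of z
    if alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("unknown alternative")
    return z, p_value

# Example 8: 926 of 1225 baselines were within tolerance; is p > 0.75?
z, p_value = proportion_z_test(926, 1225, 0.75, "greater")
print(f"z = {z:.2f}, P = {p_value:.2f}")
```

For these data z is a little under 0.5 and the P-value is about 0.3, so the sample would not justify concluding that the method is within tolerance more than 75% of the time.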


Population Mean

- For a small sample from an approximately normal population, use the t-test rather than the z-test.
- If σ is known, use z, not t.

Example 9: Spacer collars for a transmission
countershaft have a thickness specification of 38.98-39.02 mm.
The process that manufactures the collars is
supposed to be calibrated so that the mean thickness
is 39.00 mm.

A sample of six collars is drawn and measured for
thickness. The six thicknesses are 39.030, 38.997,
39.012, 39.008, 39.019, and 39.002. Assume that the
population of thicknesses is approximately normal.
Can we conclude that the process needs recalibration?
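A sketch of this test in Python, assuming the two-sided hypotheses H0: µ = 39.00 versus H1: µ ≠ 39.00 and the usual statistic t = (X̄ − µ0)/(s/√n) with n − 1 degrees of freedom. The standard library has no t distribution, so the tail area is obtained here by numerically integrating the t density with Simpson's rule; the helper names are made up for this illustration.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_upper_tail(t, nu, steps=2000):
    """Area to the right of t >= 0: 0.5 minus Simpson's rule on [0, t]."""
    h = t / steps                              # steps must be even
    total = t_pdf(0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3

thickness = [39.030, 38.997, 39.012, 39.008, 39.019, 39.002]
n = len(thickness)
mu0 = 39.00
t_stat = (mean(thickness) - mu0) / (stdev(thickness) / sqrt(n))
p_value = 2 * t_upper_tail(abs(t_stat), n - 1)   # two tails for H1: mu != mu0
print(f"t = {t_stat:.3f}, df = {n - 1}, P = {p_value:.3f}")
```

The statistic is about 2.33 on 5 degrees of freedom, which puts the P-value between 0.05 and 0.10. That is suggestive but not conclusive evidence that recalibration is needed.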


Example 10: Before a substance can be deemed
safe for landfilling, its chemical properties
must be characterized. An article reports that
in a sample of six replicates of sludge from a
New Hampshire wastewater treatment plant,
the mean pH was 6.68 with a standard
deviation of 0.20. Can we conclude that the
mean pH is less than 7.0?
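A worked sketch, assuming the pH measurements come from an approximately normal population (required for a t-test with n = 6, and not stated in the example): test H0: µ ≥ 7.0 versus H1: µ < 7.0.

$$
t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}
  = \frac{6.68 - 7.0}{0.20/\sqrt{6}}
  \approx \frac{-0.32}{0.0816}
  \approx -3.92
$$

There are n − 1 = 5 degrees of freedom, and the P-value is the area to the left of t. In a t table with 5 degrees of freedom, 3.92 falls between 3.365 (upper-tail area 0.01) and 4.032 (upper-tail area 0.005), so 0.005 < P < 0.01. Under the normality assumption, this is strong evidence that the mean pH is less than 7.0.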


## Tests for the Difference Between Two Means

## For Large Samples (nX > 30 and nY > 30):

To test a null hypothesis of the form
H0: µX − µY ≤ Δ0,
H0: µX − µY ≥ Δ0, or
H0: µX − µY = Δ0:

If σX and σY are unknown, they may be
approximated with sX and sY, respectively.


Example 11:
An engineer claims that a new type of power
supply for home computers lasts longer than
the old type. Independent random samples of
75 each of the two types are chosen, and the
sample means and standard deviations of their
lifetimes are:
New: $\bar{X}_1 = 4387$ h, s1 = 252 h
Old: $\bar{X}_2 = 4260$ h, s2 = 231 h

Can you conclude that the mean lifetime of
new power supplies is greater than that of the
old power supplies?
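A sketch of the large-sample calculation for this example. It assumes the standard two-sample statistic z = ((X̄ − Ȳ) − Δ0)/√(sX²/nX + sY²/nY) with Δ0 = 0, and the hypotheses H0: µ(new) − µ(old) ≤ 0 versus H1: µ(new) − µ(old) > 0. The function name is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(xbar, sx, nx, ybar, sy, ny, delta0=0.0):
    """z-score for a difference of two means, large independent samples."""
    standard_error = sqrt(sx**2 / nx + sy**2 / ny)
    return ((xbar - ybar) - delta0) / standard_error

# New supplies: mean 4387 h, s = 252 h; old supplies: mean 4260 h, s = 231 h
z = two_sample_z(4387, 252, 75, 4260, 231, 75)
p_value = 1 - NormalDist().cdf(z)     # area to the right of z
print(f"z = {z:.2f}, P = {p_value:.4f}")
```

Here z is a little above 3.2 and the P-value is well below 0.001, so the data support the engineer's claim that the new power supplies last longer on average.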

## For Small Samples:

The samples should come from normal populations
with means µX and µY and standard
deviations σX and σY.
If σX and σY are not known to be equal, use the
following procedure in testing the null
hypothesis.

Compute the degrees of freedom, ν, rounded
down to the nearest integer.
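The formulas for this procedure are not written out here. The usual unequal-variance (Welch-Satterthwaite) forms, assumed to be the ones intended, are:

$$
\nu = \frac{\left(\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}\right)^{2}}
           {\dfrac{(s_X^2/n_X)^2}{n_X - 1} + \dfrac{(s_Y^2/n_Y)^2}{n_Y - 1}},
\qquad
t = \frac{(\bar{X} - \bar{Y}) - \Delta_0}{\sqrt{\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}}}
$$

Rounding ν down rather than to the nearest integer is the conservative choice, since fewer degrees of freedom give a slightly larger P-value.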


iii. Compute the P-value. The P-value is an area under the
Student's t curve with ν degrees of freedom, which depends on the alternate hypothesis as
follows:

| Alternate Hypothesis | P-value |
| --- | --- |
| H1: µX − µY > Δ0 | Area to the right of t |
| H1: µX − µY < Δ0 | Area to the left of t |
| H1: µX − µY ≠ Δ0 | Sum of the areas in the tails cut off by t and −t |

Tests for the difference between two proportions (pages 425-428)
Tests with paired data (pages 439-441)


Schedule
- March 16: Chi Square Tests
- March 18: F Tests, Power
- March 23: Third Long Exam, w/ cheat sheet (Wednesday)
  - 4-6 PM
  - Conflict: 25 1 PM
- April 1: Final Presentation (Friday)


The chi-square test is used when the data consist of nominal or ordinal variables.

Nominal variables:
Variables with no inherent order or ranking sequence,
- e.g. numbers used as names (group 1, group 2, ...), gender.

Ordinal variables:
Variables with an ordered series,
- e.g. "greatly dislike, moderately dislike, indifferent,
  moderately like, greatly like".

Note: Numbers assigned to such variables indicate rank order only;
the "distance" between the numbers has no meaning.

Multinomial trial: an experiment that
can result in k outcomes, where k ≥ 2
- A generalization of the Bernoulli trial
- Example: roll of a fair die (6 outcomes)


The chi-square test has two main uses:
- Comparing the distribution of one categorical variable (nominal or ordinal) with another.
- Comparing an observed distribution with a theoretically expected one.

Comparing the distribution of one categorical variable with another.

Example:
Of 120 male and 100 female applicants to university, 90 male and 40 female had
work experience.
Does the gender of an applicant to university correspond to whether or not they
have prior work experience?


Comparing an observed distribution with a theoretically expected one.
Example:
In a population of mice, do the proportions differ from those
expected?

Examples


Example:
A gambler wants to test a die to see whether it is
fair.
H0: Die is fair. (p01 = … = p06 = 1/6)
He rolls the die 600 times and obtains the following
results:

| Category | Observed | Expected |
| --- | --- | --- |
| 1 | 115 | 100 |
| 2 | 97 | 100 |
| 3 | 91 | 100 |
| 4 | 101 | 100 |
| 5 | 110 | 100 |
| 6 | 86 | 100 |
| Total | 600 | 600 |

- Expected value: the mean number of trials that would result in a specific outcome if H0 were true.
- Chi-square statistic: measures the closeness of the observed values to the expected values:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$


- The larger χ² is, the stronger the evidence against H0.
- For k outcomes, there are k − 1 degrees of freedom.
- The chi-square test provides a good approximation when all the expected values are ≥ 5.
- The chi-square statistic for the example is 6.12.

P-value:
- Check that all expected values are ≥ 5.
- Look up the chi-square value in the table. The areas given across the top are the areas to the right of the critical value.
- The P-value for the example is > 0.10. We therefore do not reject H0.
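The die example can be reproduced with a short Python sketch. The table lookup is replaced here by a direct calculation of the upper-tail area, using a series for the regularized incomplete gamma function; that numerical method and the helper name `chi2_upper_tail` are choices made for this illustration and are adequate only for moderate values of χ².

```python
from math import exp, gamma

def chi2_upper_tail(x, df, terms=200):
    """Area to the right of x under the chi-square curve with df degrees
    of freedom, via the series for the lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half = df / 2, x / 2
    term = total = 1.0
    for n in range(1, terms):
        term *= half / (a + n)        # half^n / ((a+1)(a+2)...(a+n))
        total += term
    lower = total * half**a * exp(-half) / gamma(a + 1)
    return 1.0 - lower

observed = [115, 97, 91, 101, 110, 86]
n_rolls = sum(observed)
expected = [n_rolls / 6] * 6          # fair die: each face has p = 1/6

if min(expected) < 5:
    raise ValueError("chi-square approximation needs all expected values >= 5")

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                # k - 1 degrees of freedom
p_value = chi2_upper_tail(chi2, df)
print(f"chi-square = {chi2:.2f}, df = {df}, P = {p_value:.2f}")
```

The statistic comes out to 6.12 on 5 degrees of freedom, and the computed P-value is roughly 0.3, consistent with the table-based conclusion that P > 0.10.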


Example 1:
Rivets are manufactured for a certain purpose. The
length specification is 1.20-1.30 cm. It is thought that
90% of the rivets manufactured meet the specification,
while 5% are too short, and 5% are too long.

In a random sample of 1000 rivets, 860 met the specs, 60
were too short, and 80 were too long. Can you conclude
that the true percentages differ from 90%, 5%, and 5%?
- State the appropriate null hypothesis.
- Compute the expected values under the null hypothesis.
- Compute the value of the chi-square statistic.
- Find the P-value. What do you conclude?
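One way to work through these steps, offered as a sketch rather than an official solution: take H0: p1 = 0.90, p2 = 0.05, p3 = 0.05 (meets spec, too short, too long). With n = 1000, the expected values are 900, 50, and 50, all at least 5.

$$
\chi^2 = \frac{(860 - 900)^2}{900} + \frac{(60 - 50)^2}{50} + \frac{(80 - 50)^2}{50}
       \approx 1.78 + 2.00 + 18.00 = 21.78
$$

There are k − 1 = 2 degrees of freedom. For 2 degrees of freedom the upper-tail area has the closed form $e^{-\chi^2/2}$, which gives

$$
P = e^{-21.78/2} \approx 1.9 \times 10^{-5}
$$

This is far below 0.05, so H0 is rejected: the true percentages appear to differ from 90%, 5%, and 5%.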

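## Chi-Square test for homogeneity

- When several experiments are conducted, each with the same set of possible outcomes, this test checks whether the outcome probabilities are the same across experiments.

H0: The probabilities of the outcomes are the
same for each experiment.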


Example:
Four machines manufacture cylindrical steel pins. The pins are
subject to a diameter specification. A pin may meet the
specification, or it may be too thin or too thick. Pins are sampled
for each machine, and the number of pins in each category is
counted. The results are shown in the contingency table:

| | Too thin | OK | Too thick | Total |
| --- | --- | --- | --- | --- |
| Machine 1 | 10 | 102 | 8 | 120 |
| Machine 2 | 34 | 161 | 5 | 200 |
| Machine 3 | 12 | 79 | 9 | 100 |
| Machine 4 | 10 | 60 | 10 | 80 |
| Total | 66 | 402 | 32 | 500 |

- Cell: each row-column intersection.
- Marginal totals: the row and column totals.


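Notation for observed values (i indexes rows, j indexes columns)
H0: For each column j, p1j = … = pIj
Oi. = sum of observed values in row i
O.j = sum of observed values in column j

| | Column 1 | Column 2 | … | Column J | Total |
| --- | --- | --- | --- | --- | --- |
| Row 1 | O11 | O12 | … | O1J | O1. |
| Row 2 | O21 | O22 | … | O2J | O2. |
| : | : | : | : | : | : |
| Row I | OI1 | OI2 | … | OIJ | OI. |
| Total | O.1 | O.2 | … | O.J | O.. |

For cell ij,

$$E_{ij} = \frac{O_{i.}\,O_{.j}}{O_{..}}$$

$$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$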


Example 2: Given the table below, test the null
hypothesis that the proportions of pins that are too
thin, OK, or too thick are the same for all the
machines.

| | Too thin | OK | Too thick | Total |
| --- | --- | --- | --- | --- |
| Machine 1 | 10 | 102 | 8 | 120 |
| Machine 2 | 34 | 161 | 5 | 200 |
| Machine 3 | 12 | 79 | 9 | 100 |
| Machine 4 | 10 | 60 | 10 | 80 |
| Total | 66 | 402 | 32 | 500 |
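A sketch of Example 2 in Python, built on the expected-value rule E_ij = (row total)(column total)/(grand total). Two things here go beyond the notation given earlier and are stated as assumptions: the degrees of freedom are taken as (I − 1)(J − 1), the standard value for an I × J table, and the P-value uses a closed form for the chi-square upper tail that holds only when the degrees of freedom are even. The function names are made up for this illustration.

```python
from math import exp, factorial

def expected_counts(observed):
    """Expected cell counts from the marginal totals of an I x J table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi2_upper_tail_even_df(x, df):
    """Area to the right of x; closed form valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))

# Rows: machines 1-4; columns: too thin, OK, too thick
observed = [[10, 102, 8],
            [34, 161, 5],
            [12, 79, 9],
            [10, 60, 10]]
expected = expected_counts(observed)

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_upper_tail_even_df(chi2, df)
print(f"chi-square = {chi2:.2f}, df = {df}, P = {p_value:.3f}")
```

All twelve expected values are at least 5 (the smallest is 5.12), the statistic is about 15.58 on 6 degrees of freedom, and the P-value is between 0.01 and 0.025. This suggests that the machines do not all produce the same proportions of thin, acceptable, and thick pins.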

## Chi-square test for Independence

- In the previous example (pins from four machines), the column totals were random, while the row totals were fixed in advance.
- When both the row and column totals are random, the same procedure applies. The null hypothesis is then that the row and column classifications are independent.


A public opinion poll surveyed a simple random sample of
1000 voters. Respondents were classified by gender and by
voting preference. Results are shown in the contingency
table below.

Is there a gender gap? Do the men's voting preferences
differ significantly from the women's preferences?

Example 3:
Cylindrical steel pins are subject to a length and
diameter specification. With respect to length, a pin
may meet the specification, or it may be too long or
too short.

A total of 1021 pins are sampled and categorized
with respect to both length and diameter specification. The
results are presented in the table below.

Test the null hypothesis that the proportions of pins
that are too thin, OK, or too thick with respect to the diameter
specification do not depend on the classification
with respect to the length specification.


Observed values for 1021 steel pins

| Length \ Diameter | Too thin | OK | Too thick | Total |
| --- | --- | --- | --- | --- |
| Too short | 13 | 117 | 4 | 134 |
| OK | 62 | 664 | 80 | 806 |
| Too long | 5 | 68 | 8 | 81 |
| Total | 80 | 849 | 92 | 1021 |

Example 4:
For the given table of observed values,
- Construct the corresponding table of expected values.
- If appropriate, perform the chi-square test for the null hypothesis that the row and column outcomes are independent. If not appropriate, explain why.

Observed Values

| | 1 | 2 | 3 |
| --- | --- | --- | --- |
| A | 15 | 10 | 12 |
| B | 3 | 11 | 11 |
| C | 9 | 14 | 12 |

End.
