Inferential Statistics

Source: http://www.discover6sigma.org/post/2005/12/statistics-simplified/
Main Objectives
• Estimation: To estimate the parameters on the
basis of sample observations through a
statistic.
• Hypothesis Testing: To compare these
parameters among themselves on the basis of
observations and their estimates.
What is a Statistical Hypothesis?
• A hypothesis is a claim (assumption) about a
population parameter.

• Example:
– Average height of students is greater than 165 cm.
– The mean monthly cell phone bill of this city is $42.
– Older workers are more loyal to a company
– Companies with more than $1 billion of assets spend a
higher percentage of their annual budget on advertising
than do companies with less than $1 billion of assets.
Example
Imagine an automatic bottling machine that
fills two-liter bottles with cola.
The amount of cola filled in each bottle is, of
course, expected to average 2 liters.
Suppose a consumer advocate suspects that the
average amount of cola is less than 2 liters and
wants to test it.
Example (contd.)
In fact the company sells cola in bottles
labeled 2 liters.
This implies a claim by the company that on
the average each bottle contains at least 2
liters.
The consumer advocate suspects that the
claim made by the company about the
population is not correct!
Null Hypothesis
• A null hypothesis (H0) is an assertion about the
value of a population parameter.
• It is an assertion that we hold as true unless we
have sufficient statistical evidence to conclude
otherwise.
• Examples: (Claim about the population)
– On the average each bottle contains at least 2 liters.
We write the null hypothesis:
H0: µ ≥ 2
Null Hypothesis
• The term null hypothesis arises from earlier
agricultural and medical applications of
statistics. In order to test the effectiveness of a
new fertilizer or drug, the tested hypothesis
(the null hypothesis) was that it had no effect,
that is, there was no difference between
treated and untreated samples.
Alternative Hypothesis
• An alternative hypothesis (H1 or Ha) is the negation
of the null hypothesis.
• Examples: (Suspicion about the claim)
– We write the alternative hypothesis:
H1: µ < 2
H0 is clearly specified and of intrinsic interest,
whereas H1 serves only to indicate what types of
departure from H0 are of interest.
Example 1
• A researcher has a theory that the average age of managers in
a particular industry is over 35‑years‑old, and he wishes to
prove this. The null hypothesis to conduct a statistical test on
this theory would be ____________.

A. the population mean is < 35
B. the population mean is > 35
C. the population mean is = 35
D. the population mean is ≤ 35
Example 2
• A company produces an item that is supposed to have a six inch
hole punched in the center. A quality control inspector is concerned
that the machine which punches the hole is "out‑of‑control" (hole
is too large or too small). In an effort to test this, the inspector is
going to gather a sample punched by the machine and measure the
diameter of the hole. The alternative hypothesis used in the statistical
test to determine whether the machine is out‑of‑control is

A. the mean diameter is > 6 inches
B. the mean diameter is < 6 inches
C. the mean diameter is = 6 inches
D. the mean diameter is not equal to 6 inches
Example 3
• How will you set up the null and alternative
hypotheses, if you want to test the following
claims,
A. a vendor claims that his company fills any
accepted order, on the average, in at most six
working days.
B. A manufacturer of golf balls claims that the
variance of the weights of the company’s golf
balls is controlled to within 0.0028 oz².
Identifying Null and Alternative Hypotheses

• If the null hypothesis is true, then no
corrective action would be necessary.
• If the null hypothesis is not true, then some
corrective action would be necessary.
Identifying Null and Alternative Hypotheses

• Let us take another look at the bottling example:
• Assume that consumers are satisfied with the
bottle, but the owner suspects that the machine is
filling more than 2 liters on the average and thus
wasting cola. Then from the owner’s point of view,
no corrective action is necessary if the average is
less than or equal to 2 liters. Then,
• Null hypothesis: H0: µ ≤ 2
• Alternative hypothesis: H1: µ > 2
Identifying Null and Alternative Hypotheses

• Let us take another look at the bottling example:
• Suppose the engineer in charge of the accuracy of
the machine wants to test the average amount
filled. The engineer has to take corrective action
when the average is either more than or less than
2 liters. Then,
• Null hypothesis: H0: µ = 2
• Alternative hypothesis: H1: µ ≠ 2
Hypothesis Testing Process
Population claim: the population mean age is 50 years.
H0: µ = 50

Draw a sample. Suppose the sample mean age is 43, i.e., X̄ = 43.

Question: Is X̄ = 43 likely if µ = 50?
If NOT likely, DO NOT ACCEPT H0.
Errors Committed by the Decision Taker

• No error is committed when a good prospect
is accepted or a bad one is rejected.
• There is a small chance that a bad prospect is
accepted or a good one is rejected.
• Minimize the chances of such errors.
Type I and Type II Errors
• Type I Error
– Rejecting a true null hypothesis
– The probability of committing a Type I error is called α, the
level of significance, i.e.,
α = P(Reject H0 | H0 is true)

• Type II Error
– Failing to reject a false null hypothesis
– The probability of committing a Type II error is called β, i.e.,
β = P(Fail to reject H0 | H0 is false)
Decision Table
for Hypothesis Testing

                      Null True           Null False
Fail to reject null   Correct decision    Type II error (β)
Reject null           Type I error (α)    Correct decision
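The meaning of α as P(Reject H0 | H0 is true) can be checked with a small simulation. The sketch below is only an illustration: it reuses the bottling setting with H0: µ ≥ 2, and the population standard deviation (0.05 liters), the sample size, the number of trials, and the function name `type_i_error_rate` are all assumed values chosen for the example, not figures from the slides.

```python
import random
from statistics import NormalDist


def type_i_error_rate(mu0=2.0, sigma=0.05, n=30, alpha=0.05,
                      trials=5000, seed=0):
    """Estimate P(reject H0 | H0 true) for a left-tailed Z-test of H0: mu >= mu0.

    sigma, n, trials and seed are hypothetical values for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)   # left-tail critical value (negative)
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Sample from a population in which H0 holds (mu equals mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / std_error
        if z < z_crit:                      # test statistic in rejection region
            rejections += 1                 # a true H0 was rejected: Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"empirical Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```

The empirical rejection rate should land close to the nominal α, which is what "level of significance" promises when the null hypothesis is true.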
Example 4
• Consider the use of metal detectors in airports to
test people for concealed weapons. In essence, this is
a form of hypothesis testing.
a. What are the null and alternative hypotheses?
b. What are type I and type II errors in this case?
c. Which type of error is more costly?
Process of Hypothesis Testing
• Task 1: Hypothesize
– Establish a null and alternative hypothesis.
• Task 2: Test
– Determine the appropriate statistical test.
– Set the value of α, Type I error rate.
– Establish the decision rule.
– Gather sample data.
– Analyze the data.
• Task 3: Take Statistical Action
– Reach a statistical conclusion.
• Task 4: Determine the Business Implications
– Make a business decision.
Evidence Gathering
• After construction of null and alternative hypotheses,
the next step is to gather evidence.
• If we could measure the whole population and
calculate the exact value of the population
parameter in question, we would have perfect
evidence and be 100 % confident in our conclusion.
• Generally, evidence is gathered from a random
sample of the population.
• When we make inferences from sample data, we
cannot be 100% confident about it.
Accept-Reject Type Decisions Based on
Sample Data
• An inspector has to accept or reject a batch of parts
supplied by a vendor, usually based on test results of a
random sample.
• A recruiter has to accept or reject a job applicant,
usually based on evidence gathered from a resume
and interview.
• A bank manager has to accept or reject a loan
application, based on financial data on the application.
• A car buyer has to buy or not buy a car, usually based
on a test drive.
The p-value
• Let the null and alternative hypotheses be:
H0: µ ≥ 1000
H1: µ < 1000
A random sample of 30 yields a sample mean of only 999.
• The evidence goes against H0.
• If we reject H0, then there is a chance of committing a
type I error.
• If we accept H0, then there is a chance of committing a
type II error.
The p-value
• How likely is evidence like this if H0 is in fact
true?
• More precisely, when the actual µ = 1000 and the
sample size is 30, what is the probability of getting
a sample mean that is less than or equal to 999?
• Suppose the answer to the question is 0.26.
Statisticians call this probability the p-value.
• The p-value is a kind of credibility measure of H0
in the light of evidence.
The Significance Level
• For which p-values would we reject H0?
• Here we need to establish a significance level,
α.
• Policy: when p-value is less than α, reject H0.
• Level of significance:
– Typical values are 10%, 5% and 1%.
The Test Statistic
• A test statistic is a sample statistic computed
from sample data. The value of the test
statistic is used in determining p-value.
• Examples:
• Z-test (Z statistic)
• t-test (t-statistic)
• F-test (F-statistic)
• χ2-test (χ2-statistic) (chi-square)
Z-test
• Cases in which the test statistic is Z
– σ is known and the population is normal.
– σ is known and the sample size is at least 30. (The
population need not be normal.)
– Formula for calculating Z:
$$ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
t-test
• Cases in which the test statistic is t
– σ is unknown but the sample standard deviation (s)
is known and the population is normal.
– Formula for calculating t:
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
p-value Calculation
• The p-value is the probability of obtaining a value of
the test statistic as extreme as, or more extreme
than, the actual value obtained, when the null
hypothesis is true.
Let the null and alternative hypotheses be:
H0: µ ≥ 1000 vs. H1: µ < 1000

A random sample of 100 yields a sample mean of only 999.
Assume the population standard deviation is known (say 5).
$$ p = P(\bar{X} \le 999) = P\left( \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le \frac{999 - 1000}{5 / \sqrt{100}} \right) = P(Z \le -2) $$
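The probability P(Z ≤ −2) can be read from a standard normal table or computed directly. A minimal sketch using Python's standard library follows; the variable names are chosen here for illustration, and the numbers are the ones given on this slide.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 1000    # boundary value of H0: mu >= 1000
xbar = 999    # observed sample mean
sigma = 5     # population standard deviation (assumed known on the slide)
n = 100       # sample size

# Z statistic: distance of the sample mean from mu0 in standard-error units.
z = (xbar - mu0) / (sigma / sqrt(n))

# Left-tailed test, so the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value comes out at roughly 0.023, so H0 would be rejected at α = 5% but not at α = 1%.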
Rejection Region
• The rejection region of a statistical hypothesis test is
the range of numbers that will lead us to reject the
null hypothesis in case the test statistic falls within
this range. The rejection region, also called the
critical region, is defined by the critical points. The
rejection region is defined so that, before the
sampling takes place, our test statistic will have a
probability α of falling within the rejection region if
the null hypothesis is true.
Non-rejection Region

• The non-rejection region is the range of values
(also determined by the critical points) that will
lead us not to reject the null hypothesis if the test
statistic should fall within this region.
• The non-rejection region is designed so that,
before the sampling takes place, our test statistic
will have a probability 1 − α of falling within the
non-rejection region if the null hypothesis is true.
Rejection and Non Rejection Regions

[Figure: sampling distribution centered at µ = 50. The central non-rejection region lies between the two critical values; each tail beyond a critical value is a rejection region of area α/2.]


1-Tailed and 2-Tailed Tests
If action is to be taken if a parameter is less than
some value a, then the alternative hypothesis
is that the parameter is less than a, and the test is a
left-tailed test.
H0: µ ≥ 50 vs. H1: µ < 50

Level of significance = α
[Figure: left-tailed test. The shaded rejection region of area α lies in the left tail, below the critical value.]
1-Tailed and 2-Tailed Tests
If action is to be taken if a parameter is greater
than some value a, then the alternative
hypothesis is that the parameter is greater than a, and
the test is a right-tailed test.
H0: µ ≤ 50 vs. H1: µ > 50

Level of significance = α
[Figure: right-tailed test. The shaded rejection region of area α lies in the right tail, above the critical value.]
1-Tailed and 2-Tailed Tests
If action is to be taken if a parameter is either greater
than or less than some value a, then the alternative
hypothesis is that the parameter is not equal to a, and
the test is a two-tailed test.
H0: µ = 50 vs. H1: µ ≠ 50

Level of significance = α
[Figure: two-tailed test. A shaded rejection region of area α/2 lies in each tail, beyond the lower and upper critical values.]
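For a Z-test, the critical values that bound these rejection regions come from the inverse of the standard normal CDF. The helper below is a sketch: the function name `critical_values` and the `tail` labels are hypothetical, and the values are on the standardized Z scale rather than in the original units.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the Z critical value(s) for a given significance level.

    tail: "left", "right", or "two" (labels chosen for this example).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # All of alpha sits in the left tail.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # All of alpha sits in the right tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # alpha is split evenly: alpha/2 in each tail.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    values = ", ".join(f"{c:.3f}" for c in critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: {values}")
```

At α = 0.05 this gives about −1.645 for a left-tailed test, +1.645 for a right-tailed test, and ±1.960 for a two-tailed test.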
Six Steps in Hypothesis Testing

1. State the null hypothesis, H0, and the
alternative hypothesis, H1
2. Choose the level of significance, α, and the
sample size, n
3. Determine the appropriate test statistic and
sampling distribution
4. Collect data and compute the value of the
test statistic
Final two Steps for p-value Approach
5. Determine the p-value based on your test
6. If the p-value is less than α, reject the
null hypothesis; otherwise, do not reject the
null hypothesis. Express the managerial
conclusion in the context of the problem
Final two Steps for Critical Value Approach

5. Determine the critical values that divide the
rejection and nonrejection regions
6. If the test statistic falls into the non-
rejection region, do not reject the null
hypothesis H0. If the test statistic falls into
the rejection region, reject the null
hypothesis. Express the managerial
conclusion in the context of the problem
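The p-value approach and the critical value approach always lead to the same decision. The sketch below runs both for a Z-test; the function name `z_test` and its parameters are hypothetical, and the numbers reuse the earlier slide with H0: µ ≥ 1000, sample mean 999, σ = 5, n = 100, with α = 0.05 assumed for the illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha, tail):
    """Run a Z-test and return (z, p_value, reject_by_p, reject_by_critical)."""
    std_normal = NormalDist()
    z = (xbar - mu0) / (sigma / sqrt(n))          # step 4: test statistic
    if tail == "left":
        p_value = std_normal.cdf(z)
        in_rejection_region = z < std_normal.inv_cdf(alpha)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
        in_rejection_region = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        in_rejection_region = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    reject_by_p = p_value < alpha                  # p-value approach
    return z, p_value, reject_by_p, in_rejection_region


z, p_value, by_p, by_cv = z_test(xbar=999, mu0=1000, sigma=5, n=100,
                                 alpha=0.05, tail="left")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0 (p-value approach):", by_p)
print("reject H0 (critical value approach):", by_cv)
```

Here both approaches reject H0, since z = −2 lies below the left-tail critical value and the p-value is below 0.05.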
Example 5
• Use the information given and test the hypotheses.
a) H0: μ ≤ 1200 vs. H1: μ > 1200,
α = .1, x̄ = 1215, n = 113, σ = 100

b) H0: μ = 16 vs. H1: μ ≠ 16,
α = .05, x̄ = 16.45, n = 20, s² = 3.59; also assume x is
normally distributed.
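One possible way to work both parts is sketched below. Part (a) has σ known, so it uses the Z statistic; part (b) has σ unknown with a normal population, so it uses the t statistic with n − 1 = 19 degrees of freedom. The standard library has no t distribution, so the critical value 2.093 is taken from a t table (two-tailed, α = .05, 19 degrees of freedom) and hard-coded as an assumption.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# (a) Right-tailed Z-test: H0: mu <= 1200 vs. H1: mu > 1200, alpha = 0.10
z_a = (1215 - 1200) / (100 / sqrt(113))
z_crit_a = std_normal.inv_cdf(1 - 0.10)    # upper-tail critical value
p_a = 1 - std_normal.cdf(z_a)              # area to the right of z_a
reject_a = z_a > z_crit_a

# (b) Two-tailed t-test: H0: mu = 16 vs. H1: mu != 16, alpha = 0.05
n_b = 20
s_b = sqrt(3.59)                           # sample standard deviation from s^2
t_b = (16.45 - 16) / (s_b / sqrt(n_b))
T_CRIT_B = 2.093                           # assumption: t-table value, df = 19
reject_b = abs(t_b) > T_CRIT_B

print(f"(a) z = {z_a:.3f}, critical = {z_crit_a:.3f}, "
      f"p = {p_a:.4f}, reject H0: {reject_a}")
print(f"(b) t = {t_b:.3f}, critical = ±{T_CRIT_B}, reject H0: {reject_b}")
```

Under these calculations H0 is rejected in part (a) and not rejected in part (b).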
Example 6
• A test of breaking strength of six ropes manufactured by a
company showed a mean breaking strength of 6425 lb and a
standard deviation of 120 lb. However, the manufacturer
claimed a mean breaking strength of 7500 lb.
a) Can we support the manufacturer’s claim at a level of
significance of 0.10?
b) What assumption did you make for this problem?
Example 7
• In an attempt to determine why customer service is important to
managers in the United Kingdom, researchers surveyed managing
directors of manufacturing plants in Scotland. One of the reasons
proposed was that customer service is a means of retaining customers.
On a scale from 1 to 5, with 1 being low and 5 being high, the survey
respondents rated this reason more highly than any of the others, with a
mean response of 4.30. Suppose U.S. researchers believe American
manufacturing managers would not rate this reason as highly and
conduct a hypothesis test to prove their theory. Alpha is set at .05. Data
are gathered and the following results are obtained. Use these data and
the six steps of hypothesis testing to determine whether U.S.
managers rate this reason significantly lower than the 4.30 mean
ascertained in the United Kingdom. Assume from previous studies that
the population standard deviation is 0.574.

3 4 5 5 4 5 5 4 4 4 4 4 4 4 4 5 4 4 4 3 4 4 4 3 5 4 4 5 4 4 4 5
Example 8
• A manufacturer claims that the mean life of batteries
manufactured by his company is at least 44 months. A
random sample of 40 of these batteries was tested, resulting
in a sample mean life of 41 months with a sample standard
deviation of 16 months. Test at α = 0.01 whether the
manufacturer’s claim is correct.
Type II Errors
• Failure to reject null hypothesis means staying with the status
quo, not implementing a new process, not making
adjustments.
• If a new process, product, theory or adjustment is not
significantly better than what is currently accepted practice,
the decision maker makes a correct decision.
• If a new process, product, theory or adjustment would
significantly improve sales, business climate, costs, or morale,
the decision maker makes an error in judgement (Type II).
Type II Errors
• Type II errors can translate to
– Lost opportunities
– Poor product quality
– Failure to react to the marketplace
– Loss of the ability to react to change, new developments, or new
opportunities
Solving for Type II Errors
• Decide on the appropriate test statistic
• Determine the rejection region using a given alpha
• Find the probability that the observed test statistic does not
fall in the rejection region assuming H1 is true, i.e.,
β = P(Test statistic falls in the non-rejection region | H1 is true)
Example 9
• Suppose a hypothesis states that the mean is exactly 50. If a
random sample of 35 items is taken to test this hypothesis,
what is the value of β if the population standard deviation is 7
and the alternative mean is 53? Use α = .01.
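The steps for finding β can be sketched for these numbers as follows. The sketch assumes a two-tailed Z-test (since H0 states the mean is exactly 50) and a normally distributed sample mean; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50       # mean under H0
mu1 = 53       # alternative mean
sigma = 7      # population standard deviation
n = 35         # sample size
alpha = 0.01   # significance level

std_error = sigma / sqrt(n)

# Non-rejection region for the sample mean under H0 (two-tailed test).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
lower = mu0 - z_crit * std_error
upper = mu0 + z_crit * std_error

# beta: probability that the sample mean falls in the non-rejection
# region when the true mean is mu1.
sampling_dist_h1 = NormalDist(mu1, std_error)
beta = sampling_dist_h1.cdf(upper) - sampling_dist_h1.cdf(lower)

print(f"non-rejection region: [{lower:.3f}, {upper:.3f}]")
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

With these inputs β comes out at about 0.52, so a test at α = .01 with only 35 items would miss a true mean of 53 roughly half the time.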
