
INFERENTIAL STATISTICS

HYPOTHESIS TESTING
Inferential Statistics
+ offers varied tools and techniques that help
researchers draw valid and reliable
inferences or generalizations about the
population on the basis of a sample.


Are our inferences valid? The best we can do is to calculate probabilities about our inferences.
Two Areas of Inferential Statistics
+ Estimation
Point Estimation
Interval Estimation

+ Hypothesis Testing

Research Problem: How effective is a certain
drug in treating a disease?
Specific Objectives:
1. To estimate the population proportion of patients who will show
improvement after being treated with the drug.
(This can be answered by ESTIMATION.)
2. To determine whether treatment using the drug is better than the
existing treatment, which is known to show improvement among 40%
of patients with the disease.
(This can be answered by HYPOTHESIS TESTING.)

Question: How do we achieve these objectives using inferential statistics?
What is Hypothesis Testing?
+ evaluates a conjecture about some
characteristic of the parent population based
upon the information contained in the
random sample.

+ Usually the conjecture concerns one of
the unknown parameters of the
population.
What is Hypothesis Testing?
• a statistical process of determining whether a
hypothesis made is reasonable or not, based
upon sample evidence
Goal of Hypothesis Testing
+ The goal of hypothesis testing is not to question
the computed value of the sample statistic but to
make a judgement about the difference between
the sample statistic and a hypothesized
population parameter.

What is a Hypothesis?
+ an assumption about a population or an assertion
about the possible value of a population parameter
+ a claim or statement about the population parameter
+ Examples of parameters are population mean
and population proportion
+ The parameter must be identified before analysis
Example of Hypothesis
+ The mean body temperature for patients
admitted to elective surgery is not equal to
37.0°C.

Note: The parameter of interest here is µ, which is
the mean body temperature for patients
admitted to elective surgery.
Example of a Hypothesis
+ The proportion of registered voters in
Angeles City favoring Candidate A exceeds
0.60.

Note: The parameter of interest here is p, which is the
proportion of registered voters in Angeles City
favoring Candidate A.
Hypothesis Testing
+ Example:
+ Hypothesis : “The mean of a population is 50”
+ Test: We decide to take a sample of 40 observations and
compute the sample mean x̄ to test whether our hypothesis is valid.
Hypothesis Testing
• Null Hypothesis (H0)
– is also known as the hypothesis of no difference or no relationship
– it implies neutrality and objectivity, which must be
present in any research undertaking
Hypothesis Testing
• Alternative Hypothesis (Ha or H1)
– is the opposite of the null hypothesis
– it specifies the existence of a difference
– it is also called a predictive hypothesis, which may specify
that one group is better than the other
Hypothesis Testing
• Rejection of a hypothesis
• is to conclude that the hypothesis is false
• Acceptance of a hypothesis (failure to reject)
• implies that there is insufficient evidence to believe
otherwise
• Critical region
• is a set of values of the test statistic that is chosen
before the experiment to define the conditions under
which the null hypothesis will be rejected
Parametric and Nonparametric statistics
+ Parametric statistical tests generally require interval or
ratio level data and assume that the scores were drawn
from a normally distributed population or that both sets of
scores were drawn from populations with the same
variance or spread of scores
+ Nonparametric methods do not make assumptions about
the shape of the population distribution. They are
typically less powerful than parametric tests when the parametric
assumptions hold, so they often need larger samples
Things to keep in mind
+ Analyze a sample in an attempt to distinguish
between results that can easily occur and
results that are highly unlikely
We can explain the occurrence of highly unlikely
results by saying either that a rare event has indeed
occurred or that things aren’t as they are assumed to
be.
Note About Stating Your Own Hypotheses:
+ If you are conducting a research study and
you want to use a hypothesis test to
support your claim, the claim must be
stated in such a way that it becomes the
alternative hypothesis, so it cannot contain
the condition of equality.
Example in Stating your Hypothesis
If you believe that your brand of
refrigerator lasts longer than the mean
of 14 years for other brands, state the
claim that μ > 14, where μ is the mean
life of your refrigerators.

Ho: μ = 14 vs. Ha: μ > 14


Some Notes:
+ In this context of trying to support the goal of the
research, the alternative hypothesis is sometimes
referred to as the research hypothesis.

+ Also in this context, the null hypothesis is
assumed true for the purpose of conducting the
hypothesis test, but it is hoped that the
conclusion will be rejection of the null hypothesis
so that the research hypothesis is supported.
What is a Test of Significance?
+ A test of significance is a problem of deciding between the
null and the alternative hypotheses on the basis of the
information contained in a random sample.

+ The goal will be to reject Ho in favor of Ha, because the
alternative is the hypothesis that the researcher believes to
be true. If we are successful in rejecting Ho, we then declare
the results to be “significant”.

Note About Testing the Validity of
Someone Else’s Claim
Sometimes we test the validity of someone else’s claim,
such as the claim of the Coca Cola Bottling Company that
“the mean amount of Coke in cans is at least 355 ml,”
which becomes the null hypothesis of Ho: μ ≥ 355.

In this context of testing the validity of someone else’s
claim, their original claim sometimes becomes the null
hypothesis (because it contains equality), and it
sometimes becomes the alternative hypothesis (because
it does not contain the equality).
Two Types of Errors

+ Type I Error

+ Type II Error
Type I Error
+ The mistake (error) of rejecting the null hypothesis when it
is true.

+ It is not a miscalculation or procedural misstep; it is an
actual error that can occur when a rare event happens by
chance.

+ The probability of rejecting the null hypothesis when it is
true is called the significance level (α).

+ The value of α is typically predetermined, and very common
choices are α = 0.05 and α = 0.01.
Examples of Type I Error
1. The mistake of rejecting the null
hypothesis that the mean body
temperature is 37.0°C when that mean
really is 37.0°C.

2. The FDA allows the release of an ineffective
medicine (taking “the medicine is ineffective” as the null hypothesis).
Type II Error
+ The mistake of failing to reject the
null hypothesis when it is false.

+ The symbol β (beta) is used to
represent the probability of a type II
error.
Examples of Type II Errors
1. The mistake of failing to reject the
null hypothesis (μ = 37.0) when it is
actually false (that is, the mean is not
37.0).

2. The FDA does not allow the release of an
effective drug (taking “the drug is ineffective” as the null hypothesis).
Summary of Possible Decisions in Hypothesis Testing

                                  True Situation
Decision                          The null hypothesis is true.    The null hypothesis is false.
We decide to reject the           TYPE I error (rejecting a       CORRECT decision
null hypothesis.                  true null hypothesis)
We fail to reject the             CORRECT decision                TYPE II error (failing to reject
null hypothesis.                                                  a false null hypothesis)
Analogy to Decisions in Hypothesis Testing: Trial

                 The Truth
Verdict          Innocent    Guilty
Innocent         Correct     Error
Guilty           Error       Correct


Controlling Type I and Type II Errors

+ The experimenter is free to determine α. If the test leads to
the rejection of Ho, the researcher can then conclude that
there is sufficient evidence supporting Ha at the α level of
significance.

+ Usually, β is unknown because it is hard to calculate. The
common solution to this difficulty is to “withhold judgment”
if the test leads to the failure to reject Ho.

+ α and β are inversely related. For a fixed sample size n, as α
decreases, β increases.
Controlling Type I and Type II Errors

+ In almost all statistical tests, both α and β can
be reduced by increasing the sample size.

• Because of the inverse relationship of
α and β, setting a very small α should
also be avoided if the researcher
cannot afford a very large risk of
committing a Type II error.
Controlling Type I and Type II Errors
+ The choice of α usually depends on the
consequences associated with making a
Type I error.

Common Choices of α    Consequences of Type I error
0.01 or smaller        very serious
0.05                   moderately serious
0.10                   not too serious

Controlling Type I and Type II Errors
+ The usual practice in research and industry is to
determine in advance the values of α and n, so the value
of β is determined.

+ Depending on the seriousness of a type I error, try to use
the largest α that you can tolerate.

+ For type I errors with more serious consequences, select
smaller values of α. Then choose a sample size n as large
as is reasonable, based on considerations of time, cost,
and other such relevant factors.
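
The trade-off between α, β, and n described above can be checked numerically. The sketch below is only an illustration: it assumes a right-tailed z test with a known population standard deviation, and the function name `type_ii_error` and all numbers (hypothesized mean 100, true mean 104, σ = 10) are hypothetical, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, n, mu0, mu_true, sigma):
    """Return beta for a right-tailed z test of Ho: mu = mu0 vs. Ha: mu > mu0.

    Assumes sigma is known and the true mean is mu_true (hypothetical inputs).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # boundary of the critical region
    # When the true mean is mu_true, the test statistic is N(shift, 1).
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = P(test statistic does not land in the critical region).
    return std_normal.cdf(z_crit - shift)


beta_alpha_05 = type_ii_error(alpha=0.05, n=25, mu0=100, mu_true=104, sigma=10)
beta_alpha_01 = type_ii_error(alpha=0.01, n=25, mu0=100, mu_true=104, sigma=10)
beta_larger_n = type_ii_error(alpha=0.05, n=100, mu0=100, mu_true=104, sigma=10)

print(f"alpha=0.05, n=25:  beta = {beta_alpha_05:.3f}")
print(f"alpha=0.01, n=25:  beta = {beta_alpha_01:.3f}")
print(f"alpha=0.05, n=100: beta = {beta_larger_n:.3f}")
```

With n fixed, lowering α from 0.05 to 0.01 raises β, while increasing n lowers β at the same α, which is the pattern the bullets describe.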
Example to illustrate Type I and Type II Errors
Consider M&Ms (produced by Mars, Inc.) and
Bufferin brand aspirin tablets (produced by Bristol-
Myers Products).

The M&M package contains 1498 candies. The mean
weight of the individual candies should be at least
0.9085 g, because the M&M package is labeled as
containing 1361 g.
Example to illustrate Type I and Type II Errors

The Bufferin package is labeled as holding 30 tablets,
each of which contains 325 mg of aspirin.

Because M&Ms are candies used for enjoyment
whereas Bufferin tablets are drugs used for treatment of
health problems, we are dealing with two very different
levels of seriousness.
Example to illustrate Type I and Type II Errors

If the M&Ms don’t have a population mean
weight of 0.9085 g, the consequences are
not very serious, but if the Bufferin tablets
don’t have a mean of 325 mg of aspirin, the
consequences could be very serious.
Example to illustrate Type I and Type II Errors

If the M&Ms have a mean that is too large,
Mars will lose some money but consumers will
not complain.

In contrast, if the Bufferin tablets have too
much aspirin, Bristol-Myers could be faced
with consumer lawsuits.
Example to illustrate Type I and Type II Errors
Consequently, in testing the claim that
μ = 0.9085 g for M&Ms, we might choose
α = 0.05 and a sample size of n = 100.
In testing the claim of μ = 325 mg for Bufferin tablets, we
might choose α = 0.01 and a sample size of n = 500.

The smaller significance level α and larger sample size n are
chosen because of the more serious consequences associated
with the commercial drug.
The Test Statistic
• a statistic computed from the sample data that is especially
sensitive to the differences between Ho and Ha

• tends to take on certain values when Ho is true and different
values when Ha is true.

• The decision to reject Ho depends on the value of the test
statistic

• A decision rule based on the value of the test
statistic: Reject Ho if the computed value of the
test statistic falls in the region of rejection.
Region of Rejection or Critical Region: the set of all
values of the test statistic which will lead to the rejection of Ho

Factors that Determine the Region of Rejection
▪ the behavior of the test statistic if the null
hypothesis were true
▪ the alternative hypothesis: the location of the region
of rejection depends on the form of Ha
▪ level of significance (α): the smaller α is,
the smaller the region of rejection
Critical Value/s
+ the value or values that separate the
critical region from the values of the test
statistic that would not lead to rejection
of the null hypothesis.

+ They depend on the nature of the null
hypothesis, the relevant sampling
distribution, and the level of significance.
Types of Tests
+ Two-tailed Test. If we are primarily concerned with deciding
whether the true value of a population parameter is different
from a specified value, then the test should be two-tailed. For
the case of the mean, we say Ha: μ ≠ μ0.
+ Left-tailed Test. If we are primarily concerned with deciding
whether the true value of a parameter is less than a specified
value, then the test should be left-tailed. For the case of the
proportion, we say Ha: P < P0.
+ Right-tailed Test. If we are primarily concerned with deciding
whether the true value of a parameter is greater than a
specified value, then we should use the right-tailed test. For
the case of the standard deviation, we say Ha: σ > σ0.
The p-value: the smallest level of significance at
which Ho will be rejected based on the
information contained in the sample

An Alternative Form of Decision Rule (based on the p-value)

Reject Ho if the p-value is less than or
equal to the level of significance (α).
Example of Making Decisions Using the p-value
If the level of significance α = 0.05,

p-value    Decision
0.01       Reject Ho.
0.05       Reject Ho.
0.10       Do not reject Ho.
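
The p-value decision rule can be written as a short sketch. The helper names `p_value_from_z` and `decide` are hypothetical, and the p-value calculation assumes the test statistic follows a standard normal distribution under Ho (a z test); other tests would need their own distribution.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """p-value of a z statistic; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


def decide(p_value, alpha=0.05):
    """Apply the rule: reject Ho if the p-value is <= alpha."""
    return "Reject Ho." if p_value <= alpha else "Do not reject Ho."


# The three rows of the table above, at alpha = 0.05.
for p in (0.01, 0.05, 0.10):
    print(p, decide(p))
```

Note that a p-value exactly equal to α leads to rejection, because the rule uses "less than or equal to".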
Conclusions in Hypothesis Testing
1. Fail to reject the null hypothesis Ho.
2. Reject the null hypothesis Ho.
Notes:
+ Some texts say “accept the null hypothesis”
instead of “fail to reject the null hypothesis.”

+ Whether we use the term accept or fail to reject,
we should recognize that we are not proving the
null hypothesis; we are merely saying that the
sample evidence is not strong enough to warrant
rejection of the null hypothesis.
Wording of Final Conclusion
Start: Does the original claim contain the condition of equality?
No (original claim does not contain equality and becomes Ha).
Do you reject Ho?
+ Yes (Reject Ho): “The sample data support the claim that….(original claim).”
(This is the only case in which the original claim is supported.)
+ No (Fail to Reject Ho): “The sample does not provide sufficient evidence to
support the claim that….(original claim).”
Wording of Final Conclusion
Start: Does the original claim contain the condition of equality?
Yes (original claim contains equality and becomes Ho).
Do you reject Ho?
+ Yes (Reject Ho): “The sample provides sufficient evidence to warrant
rejection of the claim that….(original claim).”
(This is the only case in which the original claim is rejected.)
+ No (Fail to Reject Ho): “The sample does not provide sufficient evidence to
warrant rejection of the claim that….(original claim).”
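
The two flowcharts above reduce to four cases, which can be sketched as one small function. The name `final_conclusion` and its parameters are hypothetical; the sentence templates are the ones given in the flowcharts.

```python
def final_conclusion(claim, claim_contains_equality, reject_h0):
    """Return the wording of the final conclusion for a hypothesis test.

    claim: the original claim as a phrase, e.g. "the mean is 37.0"
    claim_contains_equality: True if the claim becomes Ho, False if it becomes Ha
    reject_h0: True if the test rejected Ho
    """
    if claim_contains_equality:
        # The original claim is Ho, so it can be rejected but never supported.
        if reject_h0:
            return ("The sample provides sufficient evidence to warrant "
                    f"rejection of the claim that {claim}.")
        return ("The sample does not provide sufficient evidence to warrant "
                f"rejection of the claim that {claim}.")
    # The original claim is Ha, so it is supported only when Ho is rejected.
    if reject_h0:
        return f"The sample data support the claim that {claim}."
    return ("The sample does not provide sufficient evidence to support "
            f"the claim that {claim}.")


print(final_conclusion("the mean body temperature differs from 37.0",
                       claim_contains_equality=False, reject_h0=True))
```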
Example in Making Final Conclusion
+ If you want to justify the claim that the
mean body temperature is different from
37.0°C, then make the claim that
μ ≠ 37.0. This claim will be an alternative
hypothesis that will be supported if you
reject the null hypothesis of Ho: μ = 37.0.
Example in Making Final Conclusion

+ If, on the other hand, you claim that the mean
body temperature is 37.0°C, that is μ = 37.0,
you will either reject or fail to reject the claim;
in either case, you will not support the original
claim.
Steps in Hypothesis Testing
+ Step 1: Formulate the Null Hypothesis
+ Denoted by H0
+ Is the hypothesis we hope to reject
+ This is the hypothesis used for testing and is the starting
point of the testing process
+ Must always express the idea of no significant
difference or relationship
+ It is a precise statement of equality, such as μ = 10.
+ We always first consider that the null hypothesis is true
unless the sample evidence shows otherwise.
Steps in Hypothesis Testing
+ Assuming K to be any constant, we have the
following possible forms of the null hypothesis:
+ H0: μ = K
+ H0: σ = K
+ As a precaution, we never make hypothesis
tests regarding the statistic. Hence, the null
hypothesis H0: x̄ = K is not acceptable. We only
hypothesize about the value of the population parameter.
Steps in Hypothesis Testing
+ Step 2: Formulate the Alternative Hypothesis
+ Denoted by H1
+ Is the opposite of the null hypothesis
+ It specifies the existence of a difference or a
relationship
+ The acceptance of H1 would mean that the
difference between the statistic and the
hypothesized population parameter is too
great for us to allow the acceptance of H0.
Steps in Hypothesis Testing
+ Assuming K to be any constant, we have the
following possible forms of the alternative
hypothesis:
+ H1: μ ≠ K
+ H1: σ ≠ K
+ H1: μ < K or H1: μ > K
+ H1: σ < K or H1: σ > K
+ If H1 is in the form of H1: μ ≠ K, we have a
two-tailed test.
Steps in Hypothesis Testing
H0: μ = K
H1: μ ≠ K
Two-tail Test
[Figure: H0 is rejected in both tails of the distribution and not rejected in the central region.]
One-tailed and Two-tailed test
• Two-tailed test is used when the rejection region is
located on both tails of the distribution.
Steps in Hypothesis Testing
H0: μ = K
H1: μ < K
One-tail Test
[Figure: H0 is rejected in the left tail of the distribution and not rejected elsewhere.]


One-tailed and Two-tailed test
• One-tailed test is used when the rejection region is
located in only one tail of the distribution.
Steps in Hypothesis Testing
H0: μ = K
H1: μ > K
One-tail Test
[Figure: H0 is rejected in the right tail of the distribution and not rejected elsewhere.]
Steps in Hypothesis Testing
+ Step 3: Specify the level of significance, α
+ Specifies the area of the region within which H1 is accepted
(that is, H0 is rejected).
+ 1-α is the level of confidence
+ α is the level of significance
+ α divides the graph into 2 regions, the
region where H0 is not rejected and the
region where H0 is rejected in favor of H1 (critical
region).
+ It is customary to use an α of 0.05 or 0.01.
Level of Significance
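H0: μ = K
H1: μ ≠ K
Two-tail Test
[Figure: central area 1-α, where H0 is not rejected; an area of α/2 in each tail, where H0 is rejected.]

Level of Significance
H0: μ = K
H1: μ < K
One-tail Test
[Figure: left tail, where H0 is rejected; remaining area 1-α, where H0 is not rejected.]

Level of Significance
H0: μ = K
H1: μ > K
One-tail Test
[Figure: right tail, where H0 is rejected; remaining area 1-α, where H0 is not rejected.]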

Steps in Hypothesis Testing
+ Step 4: Decide which sampling distribution to
choose
+ Determine the appropriate test statistic
whose sampling distribution is known
under the assumption that H0 is true.

Parameter Tested    Test Statistic Used
μ                   z or t
σ                   χ²
Steps in Hypothesis Testing
+ Step 5: Determine the critical value and the critical
region
+ The critical value is the dividing point between
the acceptance and rejection regions.
+ The critical region is the region within which H0
is rejected.
+ The critical region (CR) depends upon the value
of the level of significance selected in Step 3.
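Critical Regions
H0: μ = K
H1: μ ≠ K
Two-tail Test
CR: Z < -Z_{α/2} or Z > Z_{1-α/2}
[Figure: central area 1-α, where H0 is not rejected; an area of α/2 to the left of -Z_{α/2} and to the right of Z_{1-α/2}, where H0 is rejected.]

Critical Regions
H0: μ = K
H1: μ < K
One-tail Test
CR: Z < -Z_{α}
[Figure: H0 is rejected to the left of -Z_{α}; the remaining area 1-α is where H0 is not rejected.]

Critical Regions
H0: μ = K
H1: μ > K
One-tail Test
CR: Z > Z_{1-α}
[Figure: H0 is rejected to the right of Z_{1-α}; the remaining area 1-α is where H0 is not rejected.]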


Steps in Hypothesis Testing
+ Step 6: Compute the value of the statistic
+ Using the data from a sample of size n, compute
the value of the test statistic z, t or χ², whichever
was selected in Step 4, using the appropriate
formula for each case.
+ Step 7: State the conclusion and make the decision
+ Determine whether the test statistic is in the
critical region or not. If the test statistic is in the
CR, we reject H0 in favor of H1. Otherwise, we fail to reject H0.
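
The seven steps can be strung together in a short sketch for a z test on a mean. The function name `z_test_mean`, its parameters, and the sample figures (sample mean 52.1, standard deviation 6) are hypothetical; only the hypothesized mean of 50 and the sample size of 40 echo the earlier illustration. The sketch assumes the z statistic is appropriate (known σ or a large sample).

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, k, sd, n, alpha, alternative):
    """z test of H0: mu = k; alternative is 'two-sided', 'less', or 'greater'.

    Returns (z, critical value, whether H0 is rejected).
    """
    # Step 6: compute the test statistic.
    z = (x_bar - k) / (sd / sqrt(n))
    inv_cdf = NormalDist().inv_cdf
    # Step 5: critical value and critical region for the chosen alpha.
    if alternative == "two-sided":
        crit = inv_cdf(1 - alpha / 2)
        reject = z < -crit or z > crit
    elif alternative == "less":
        crit = inv_cdf(1 - alpha)
        reject = z < -crit
    elif alternative == "greater":
        crit = inv_cdf(1 - alpha)
        reject = z > crit
    else:
        raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")
    # Step 7: the caller states the conclusion from the returned decision.
    return z, crit, reject


# Steps 1-3: H0: mu = 50, H1: mu != 50, alpha = 0.05 (hypothetical data).
z, crit, reject = z_test_mean(x_bar=52.1, k=50, sd=6, n=40,
                              alpha=0.05, alternative="two-sided")
print(f"z = {z:.3f}, critical value = ±{crit:.3f}, reject H0: {reject}")
```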
Test of Means

Gives us an indication of the true
average of a population.
Test of Means (Summary)
H0: μ = K
H1: μ ≠ K, μ < K or μ > K
α: the chosen level of significance

H1        CR (z test)                          CR (t test)
μ ≠ K     Z < -Z_{α/2} or Z > Z_{1-α/2}        t < -t_{α/2} or t > t_{α/2}
μ < K     Z < -Z_{α}                           t < -t_{α}
μ > K     Z > Z_{1-α}                          t > t_{α}
Test of Means (Summary)
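Computations:
Used if n ≥ 30:
z = (x̄ - μ) / (σ/√n) or z = (x̄ - μ) / (s/√n)
Used if n < 30:
t = (x̄ - μ) / (s/√n), with n - 1 degrees of freedom
(Here μ is the value K hypothesized in H0.)
Conclusion: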
Tests on the Mean of a Normal Distribution, Variance Known
Tests on the Mean of a Normal Distribution, Variance Unknown
Example 1:
+ A new process will be installed if its mean
processing time is at most 20 minutes. The new
procedure was tried. In a random sample of 50
trials, an average processing time of 22.2 minutes
with a standard deviation of 4.3 minutes was
obtained. At a level of significance α = 0.05, should
the new process be installed?
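
One way to work Example 1 is sketched below. The setup is an assumption, not stated on the slide: H0: μ = 20 against H1: μ > 20 (right-tailed), with the z statistic and s in place of σ because n ≥ 30.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Example 1.
n, x_bar, s = 50, 22.2, 4.3
k = 20.0       # hypothesized mean processing time, in minutes
alpha = 0.05

# Assumed hypotheses: H0: mu = 20 vs. H1: mu > 20 (right-tailed test).
z = (x_bar - k) / (s / sqrt(n))           # n >= 30, so s stands in for sigma
z_crit = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
reject_h0 = z > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under this setup z is about 3.62, which falls in the critical region, so H0 is rejected: the sample suggests the mean processing time exceeds 20 minutes, and the new process would not be installed.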
Example 2:
+ A company that makes chocolates claims that the
mean weight of a bag of chocolates is 240 grams
with a standard deviation of 20.5 grams. Using a
0.05 significance level, would you agree with the
company if a random sample of 50 bags of
chocolates was found to have a mean weight of
230 grams?
Example 3:
+ A random sample of 100 recorded deaths in the
United States during the past year showed
an average life span of 71.8 years. Assuming a
population standard deviation of 8.9 years,
does this seem to indicate that the mean life
span today is greater than 70 years? Use a
0.05 level of significance.

Example 4:
+ A manufacturer of sports equipment has developed a
new synthetic fishing line that the company claims has a
mean breaking strength of 8 kilograms with a standard
deviation of 0.5 kilogram. Test the hypothesis that 𝜇 = 8
kilograms against the alternative that 𝜇 ≠ 8 kilograms if
a random sample of 50 lines is tested and found to have
a mean breaking strength of 7.8 kilograms. Use a 0.01
level of significance.
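
Example 4 states both hypotheses, so it can be sketched directly as a two-tailed z test, here decided with the p-value rather than the critical region. Treating the claimed standard deviation of 0.5 kilogram as the known population σ is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Figures from Example 4.
n, x_bar = 50, 7.8
mu0, sigma = 8.0, 0.5   # sigma assumed known (the claimed value)
alpha = 0.01

std_normal = NormalDist()
z = (x_bar - mu0) / (sigma / sqrt(n))
# Two-tailed p-value: probability of a statistic at least this extreme.
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_h0 = p_value <= alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Here z is about -2.83 and the p-value is below 0.01, so H0: μ = 8 is rejected at the 0.01 level. The same decision follows from the critical region, since |z| exceeds the two-tailed critical value of about 2.576.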

Example 5:
+ The Edison Electric Institute has published figures on the
annual number of kilowatt-hours expended by various
home appliances. It is claimed that a vacuum cleaner
expends an average of 46 kilowatt-hours per year. If a
random sample of 12 homes included in a planned study
indicates that vacuum cleaners expend an average of 42
kilowatt-hours with a standard deviation of 11.9 kilowatt-
hours, does this suggest at the 0.05 level of significance
that vacuum cleaners expend, on the average, less than 46
kilowatt-hours annually?
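
Example 5 has n < 30, so the t statistic applies. The sketch below assumes H0: μ = 46 against H1: μ < 46 and an approximately normal population. Python's standard library has no t distribution, so the critical value is entered by hand from a t table (1.796 for α = 0.05 with 11 degrees of freedom).

```python
from math import sqrt

# Figures from Example 5.
n, x_bar, s = 12, 42.0, 11.9
mu0 = 46.0

# Table value: t with n - 1 = 11 degrees of freedom, alpha = 0.05, one tail.
T_CRIT_05_DF11 = 1.796

t = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = t < -T_CRIT_05_DF11   # left-tailed critical region

print(f"t = {t:.3f}, critical value = {-T_CRIT_05_DF11}, reject H0: {reject_h0}")
```

With t about -1.16, the statistic is not in the critical region, so we fail to reject H0: the sample does not provide sufficient evidence that the mean is less than 46 kilowatt-hours.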
Example 6:
+ A new process for producing synthetic diamonds can be
operated at a profitable level only if the average weight of
the diamonds is greater than 0.5K. To evaluate the
profitability of the process, 6 diamonds are generated
with a mean and a standard deviation of 0.53 and 0.0559
respectively. Do the 6 diamonds’ measurements present
sufficient evidence to indicate that the average weight of
the diamond produced by the process is in excess of 0.5K?
Example 7:
+ Historically, evening long-distance calls from a
particular city have averaged 15.2 minutes per
call. In a random sample of 35 calls, the
sample mean time was 14.3 minutes. Assume
the standard deviation is known to be 5
minutes. Using a 0.05 level of significance, is
there sufficient evidence to conclude that the
average evening long-distance call has
decreased?
Seatwork / Assignment
+ It is claimed that automobiles are driven on average more than 20,000
kilometers per year. To test this claim, 100 randomly selected
automobile owners are asked to keep a record of the kilometers they
travel. Would you agree with this claim if the random sample showed
an average of 23,500 kilometers and a standard deviation of 3900
kilometers? Use a 0.02 level of significance.
+ In a research report, Richard H. Weindruch of the UCLA Medical
School claims that mice with an average life span of 32 months will live
to be about 40 months old when 40% of the calories in their diet are
replaced by vitamins and protein. Is there any reason to believe that
𝜇 < 40 if 64 mice that are placed on this diet have an average life of
38 months with a standard deviation of 5.8 months? Use a 0.05 level of
significance.

Tests on the Variance and Standard Deviation of a Normal Distribution

Hypothesis Tests on the Variance


Example:
+ An automated filling machine is used to fill bottles with liquid
detergent. A random sample of 20 bottles results in a sample
variance of fill volume of s² = 0.0153 (fluid ounces)². If the
variance of fill volume exceeds 0.01 (fluid ounces)², an
unacceptable proportion of bottles will be underfilled or
overfilled. Is there evidence in the sample data to suggest
that the manufacturer has a problem with underfilled or
overfilled bottles? Use α = 0.05, and assume that fill volume
has a normal distribution.
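
A sketch of this example follows, using the standard chi-square statistic for a variance, χ² = (n - 1)s² / σ0², with n - 1 degrees of freedom. The formula does not survive in the extracted slides, so it is supplied here as an assumption. The hypotheses are taken to be H0: σ² = 0.01 against H1: σ² > 0.01, and the critical value is entered by hand from a chi-square table (30.14 for α = 0.05 with 19 degrees of freedom).

```python
# Figures from the filling-machine example.
n = 20
s_squared = 0.0153       # sample variance, (fluid ounces)^2
sigma0_squared = 0.01    # hypothesized variance under H0

# Table value: upper 0.05 point of chi-square with n - 1 = 19 degrees of freedom.
CHI2_CRIT_05_DF19 = 30.14

chi2 = (n - 1) * s_squared / sigma0_squared
reject_h0 = chi2 > CHI2_CRIT_05_DF19   # right-tailed critical region

print(f"chi-square = {chi2:.2f}, critical value = {CHI2_CRIT_05_DF19}, "
      f"reject H0: {reject_h0}")
```

The statistic is about 29.07, just short of the critical value, so we fail to reject H0: at α = 0.05 the sample does not give strong evidence that the variance of fill volume exceeds 0.01 (fluid ounces)².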

Example:
+ The population standard deviation of strengths of steel bars
produced by a large manufacturer is 2.95. In order to meet
tighter specifications engineers are trying to reduce the
variability of the process. A sample of 28 bars gives a sample
standard deviation of 2.65. Assume that the strengths of
steel bars are normally distributed. Is there evidence at the
5% level of significance that the standard deviation has
decreased?
