
Decision Making and Tests of Hypotheses


Statistical Hypothesis
• In many situations we are forced to make decisions based on limited available statistical data.
• This is a complex process because of the natural randomness of the data.
• Because decisions are basically comparisons, the decision process requires two basic things: a hypothesis and a significance criterion.
• A statistical hypothesis is a statement made about the parameters of a population.
• The level of significance is an agreed probability value that is considered significant enough to accept or reject the hypothesis.
• Example: Let’s define a fair coin to be one whose probability of showing a head (or a tail) is 50% in a sequence of trials. Say we tossed the coin 20 times and obtained 16 heads. Obviously 16/20 is far from 50%, so we are likely to say the coin is not fair, even though in reality such a situation can happen. However, as the number of trials increases, the probability of such a situation decreases (unless the coin is indeed not fair).
• In this example, the implied hypothesis was: the mean value for the coin in any future sequence of tosses is 50% (μ = 50%). The criterion was: if the estimated mean is far from 50%, we reject the hypothesis. The significance level lies in defining what “far” means. Most of the time we define “far” as having a probability of less than 5% for the “estimated” mean to exceed the hypothesized value.
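The coin example can be checked with an exact binomial tail probability; a minimal sketch in Python (the helper name `tail_prob` is our own):

```python
from math import comb

def tail_prob(heads, tosses, p=0.5):
    """P(X >= heads) for X ~ Binomial(tosses, p) -- exact sum."""
    return sum(comb(tosses, k) * p**k * (1 - p)**(tosses - k)
               for k in range(heads, tosses + 1))

# 16 or more heads out of 20 tosses of a fair coin:
p16 = tail_prob(16, 20)
print(f"P(16+ heads in 20 tosses) = {p16:.4f}")  # ~0.0059, well below 5%
```

Since 0.59% is far below the usual 5% significance level, the fairness hypothesis would be rejected for this outcome.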
Statistical Hypothesis
• To systematize the decision process, two types of hypotheses must be defined:
1. The Null Hypothesis (called Ho): here we propose that no change has occurred, or no variation from the original situation has occurred. (In the previous example: Ho: μ = 50%.)
2. The Alternate Hypothesis (called HA): any other statement indicating a change or variation (for example: HA: μ ≠ 50%, or HA: μ > 50%).
• When formulating hypotheses, it is very important to make sure that the two statements are mutually exclusive (i.e. they cannot both hold).
• The level of significance upon which we accept or reject Ho depends on the question, the data, and the importance of the decision. Generally, 5% or 1% is used. In this course, if no information is given about the significance level, you can assume it to be 5%.
• Example: A material is tested and its mean strength is found to be some value μo. The material is modified and re-tested, and a new mean strength x̄ is obtained. A question is: did the strength improve? A null hypothesis is: no, it did not (i.e. Ho: μ = μo). The alternate hypothesis can take many forms (HA: μ ≠ μo, HA: μ > μo, or HA: μ < μo). Based on the chosen HA, you need to define a criterion that helps you decide how significant you consider the probability of obtaining a mean value different from μo. If the probability of having a mean value bigger than μo is more than 5%, we say that the new mean is not different from μo, and we accept Ho at the 5% significance level.
Type I, Type II Errors, and the Power of the Test
• There are two types of errors that can happen during a decision process.
• Type I Error (α-error): rejecting a hypothesis when it is in fact true.
• Type II Error (β-error): accepting a hypothesis when it is in fact wrong.
• Based on this, we can have four situations, as shown in the table.
• The probability of making a Type I error is called “the level of significance, α” [i.e. P(Type I Error) = α].
• The probability of making a Type II error is called β [P(Type II Error) = β].
• The Power of the Test is defined as the probability of not making a Type II error, i.e.: Power = 1 − β.
Example:
• 40 samples were tested for the crushing strength of concrete; the mean compressive strength was 55 MPa and the standard deviation was 10 MPa. The Project Specifications require the strength to be at least 53 MPa.
a) Test whether the strength of the concrete is acceptable or not.
b) Calculate the probability of an α error for the test above {i.e. the P-value for the test} and the new required specification mean value above which a 95% confidence is achieved.
c) Calculate the probability of a β error if the true mean were i) 52 MPa, ii) 54 MPa.

a) Our null hypothesis Ho is: indeed, the mean value is 53 MPa or higher {Ho: μ ≥ 53}. The alternate hypothesis is {HA: μ < 53}. Let’s find the probability of Ho: P(μ > 53) = P(Z > (53 − 55)/(10/√40)) = P(Z > −1.26) = 89.7%, which is smaller than 95%, and thus we reject the null hypothesis (i.e. the mean value does not have a significant probability of being bigger than 53 MPa).
b) The P-value is simply α = 1 − P(x > 53) = 1 − 0.897 = 10.3%.
The α-error occurs when we reject Ho when it is in fact true. This happens when the value of μ is so small that we say it is “almost impossible” to happen. The “impossible” here is equivalent to having a probability of less than 5% of being smaller than x {or having a probability of 95% of being bigger than x}. Thus, for α = 5% = P(μ < x), this gives x = 55 − 1.645 × (10/√40) ≈ 52.4 MPa.
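Parts (a) and (b) can be reproduced with the standard library’s `statistics.NormalDist`; a short sketch using the same numbers (variable names are our own):

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: n = 40, sample mean 55 MPa, s = 10 MPa, spec 53 MPa
n, xbar, s, spec = 40, 55.0, 10.0, 53.0
se = s / sqrt(n)                       # standard error of the mean, ~1.58
Z = NormalDist()

p_exceed = Z.cdf((xbar - spec) / se)   # (a) P(mean > 53) ~ 0.897 < 0.95 -> reject Ho
p_val = 1 - p_exceed                   # (b) P-value ~ 0.103
x_crit = xbar - Z.inv_cdf(0.95) * se   # (b) spec value giving 95% confidence ~ 52.4 MPa
print(f"P = {p_exceed:.3f}, P-value = {p_val:.3f}, x = {x_crit:.1f} MPa")
```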
Example / Continued
• c) The β error happens when you accept Ho while it is in fact wrong. Since we now know the true mean, we can decide when we are wrong or correct about the “estimated mean”.
i) Based on the old mean (55), we accepted Ho when μ > 52.4. But now the true mean is 52 MPa {i.e. Ho: μ ≥ 53 is wrong}, so the probability of accepting Ho when it is wrong is P(μ > 52.4), but computed using the new “correct” mean value of 52. So this gives β = P(Z > (52.4 − 52)/(10/√40)) = P(Z > 0.25) ≈ 40%.
ii) If the true mean is 54 MPa, then the null hypothesis (Ho: μ ≥ 53) is correct, so β = 0 and the power of the test is 1 (i.e. the test is 100% effective in accepting the null hypothesis).
• Of course, the “true” mean here is supposed to be the “real” population
mean.
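The Type II error probability in part (c) can be sketched the same way, reusing the 5% acceptance cutoff x = 52.4 MPa from part (b) (helper name is our own):

```python
from math import sqrt
from statistics import NormalDist

n, s = 40, 10.0
se = s / sqrt(n)
x_cut = 52.4   # 5% acceptance cutoff derived in part (b)

def beta(true_mean):
    """P(accepting Ho), i.e. P(estimated mean > x_cut), under the true mean."""
    return 1 - NormalDist(true_mean, se).cdf(x_cut)

print(f"beta(52) = {beta(52):.2f}")   # ~0.40: a 40% chance of wrongly accepting Ho
# If the true mean is 54 MPa, Ho (mu >= 53) is actually correct, so accepting
# it is no error: beta = 0 and the power of the test is 1.
```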
Cases for Hypothesis Tests and the P-Value
• Depending on the context in question, there can be three cases for testing Ho:
1. Ho: x > xo. This is called a one-sided test, and it is testable by verifying that P(x > xo) > 1 − α.
2. Ho: x < xo. This is also a one-sided test, and it is testable by verifying that P(x < xo) > 1 − α {or P(x > xo) < α}.
3. Ho: x = xo. This is called a two-sided test, and it is verifiable by checking whether P(−xo < x < xo) = 1 − α {or P(x < −xo) > α/2 or P(x > xo) > α/2}.
Similar to 3 above, we can have a hypothesis Ho: x1 < x < x2, and this is verifiable by checking that the probability of such a case is significant.
• Another common approach is to calculate the values which satisfy the significance criteria and then compare them to the claimed (hypothesized) mean values.
• The P-value for the test is a measure of the significance level itself for accepting Ho based on the claimed mean value, i.e. the actual α value. For case 1 above, for example, α = P-value = 1 − P(x > xo).
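The three cases can be folded into one small helper that returns the actual α (the P-value); a sketch, with function name and signature of our own choosing:

```python
from statistics import NormalDist

def p_value(xbar, se, xo, side):
    """Actual alpha for accepting Ho, given the sample mean, its standard
    error, the hypothesized value xo, and the test side (">", "<", "=")."""
    z = (xbar - xo) / se
    N = NormalDist()
    if side == ">":                      # case 1: Ho: x > xo
        return 1 - N.cdf(z)
    if side == "<":                      # case 2: Ho: x < xo
        return N.cdf(z)
    return 2 * (1 - N.cdf(abs(z)))      # case 3: Ho: x = xo (two-sided)

print(f"{p_value(55, 1.581, 53, '>'):.3f}")   # ~0.103, the concrete example again
```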
Sample Size Based on an Acceptable β Error
• Let’s say we have a sample of size n, with mean x̄ and standard deviation s, taken from a normal population. Suppose that we have the hypotheses Ho: μ = μo and HA: μ ≠ μo.
• To test Ho at significance level α, the estimated mean must fall within the central 1 − α region of the sampling distribution; thus we accept Ho if μo − z(α/2)·σ/√n ≤ x̄ ≤ μo + z(α/2)·σ/√n.
• Suppose that {based on more information} the true mean μ was ≠ μo, and that Ho was therefore wrong. Based on this new information, we can compute the probability of making a wrong decision (Type II error), because now we have the “true” distribution for the mean (centered at μ).
• Let’s say we will allow a variation in the mean by a magnitude δ = μ − μo; the mean of the sampling distribution is therefore shifted by δ, and the standardized shift is δ√n/σ. Let β be the allowed probability of a Type II error computed using the new data: β is the probability that the shifted standardized mean still falls inside the acceptance region based on which we accepted Ho, i.e. (for δ > 0) β ≈ Φ(z(α/2) − δ√n/σ). Setting this equal to β gives z(α/2) − δ√n/σ = −z(β), and from this we can compute the required sample size to achieve both β and δ as n ≈ (z(α/2) + z(β))²·σ²/δ².
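The sample-size formula above can be sketched directly; this assumes the two-sided test as set up in this section (function name is our own):

```python
from math import ceil
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, beta=0.05):
    """n ~ ((z_(alpha/2) + z_beta) * sigma / delta)^2, two-sided test."""
    N = NormalDist()
    z_a = N.inv_cdf(1 - alpha / 2)
    z_b = N.inv_cdf(1 - beta)
    return ceil(((z_a + z_b) * sigma / delta) ** 2)

print(sample_size(delta=5, sigma=15))   # ~117 samples for delta = 5, sigma = 15
```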
Example
• In a lab experiment, 30 specimens were tested for a material, with a mean of 25 kN and standard deviation of 15 kN.
• A) Check, using 95% confidence, whether the true mean value of the material is at least 20 kN.
• B) If the new target mean is now 25 kN, what is the sample size needed to achieve a β error of 5% in obtaining it by the experiment?
A) Our null hypothesis is Ho: μ ≥ 20, and this is testable by checking whether the probability of this hypothesis is significant: P(X > 20) > 0.95. This is equivalent to P(Z > (20 − 25)/(15/√30)) = P(Z > −1.83) ≈ 96%. Thus, we accept Ho.
B) Here, the allowed variation in the mean is δ = 25 − 20 = 5, and the acceptable probability of making a false decision (β) is 0.05. With α = 5% two-sided (z(α/2) = 1.96, z(β) = 1.645) and s = 15, n ≈ (1.96 + 1.645)² × 15²/5² ≈ 117 specimens.
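Part A can be verified numerically; a minimal sketch using only the standard library (variable names are our own):

```python
from math import sqrt
from statistics import NormalDist

# Part A: n = 30 specimens, mean 25 kN, s = 15 kN; is the true mean >= 20 kN?
n, xbar, s = 30, 25.0, 15.0
z = (20 - xbar) / (s / sqrt(n))     # ~ -1.83
p = 1 - NormalDist().cdf(z)         # P(Z > -1.83) ~ 0.966
print(f"z = {z:.2f}, P = {p:.3f}")  # P > 0.95 -> accept Ho
```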
Operating Characteristic Curves
• The process of determining the sample size for a given tolerance in the mean and a given probability of Type II error can be converted into practical curves.
• These curves are a function of the relative tolerance in the mean, d = |δ|/σ, and the probability of accepting Ho when it is false, β (equivalently, the power 1 − β).
• These curves are useful for designing experiments based on an acceptable level of error for detecting a mean value.
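Points on such a curve can be generated from the same relations; a sketch assuming the two-sided test at α = 5% (function name is our own):

```python
from math import sqrt
from statistics import NormalDist

def oc_beta(d, n, alpha=0.05):
    """Probability beta of accepting Ho for a relative mean shift d = |delta|/sigma
    and sample size n, for a two-sided test at significance alpha."""
    N = NormalDist()
    z_a = N.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n)
    return N.cdf(z_a - shift) - N.cdf(-z_a - shift)

# beta falls as the relative shift d grows, for fixed n:
for d in (0.2, 0.5, 1.0):
    print(f"d = {d}: beta = {oc_beta(d, n=20):.3f}")
```

Plotting `oc_beta` against d for several values of n reproduces the familiar family of operating characteristic curves.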
