
Lecture 2

Hypothesis Testing II

Niza Talukder
Type I and Type II Errors

• When we perform a statistical test we hope that our decision will be correct, but there is
always a possibility of reaching an incorrect conclusion. Two kinds of error can be made in
a hypothesis test.

• Definition:
Type I error (also known as a false positive): rejecting the null when it is true.

Type II error (also known as a false negative): failing to reject the null when it is false.

• The risk of a type I error is often called the alpha risk, α. Given that the null is true, the
probability of making this error is equal to the significance level. The lower the α,
the lower the probability of making the error.

• A type II error is failing to reject the null when it is false. The probability of this error is denoted β.
• Consider a criminal trial. We test the hypotheses:
$H_0$: defendant did not commit the crime
$H_1$: defendant committed the crime
We only reject the null if we have very strong evidence against it.
Type I error: convicting a person who in reality did not commit the crime.
Type II error: acquitting a person who in reality committed the crime.

• Examples:
1) It has been shown many times that on a certain memory test, recognition is substantially better than
recall. However, the probability value for the data from your sample was 0.12, so you were unable
to reject the null hypothesis that recall and recognition produce the same results. What type of error
did you make?

Answer: Type II. There really is a difference in the population between recognition and recall, but we
did not find a significant difference in the sample. So the null is false, and yet we
failed to reject it. Thus we made a type II error.

2) If the null hypothesis is false, you cannot make which kind of error? 
Answer: Type I because the null has to be true to make this type of error.
Example 1
1) A bag of potato chips is packaged by weight. A total of nine bags are purchased, weighed
and the mean weight is found to be 10.5 ounces. Suppose 0.6 ounces is the standard
deviation of the population of all such bags of chips and the stated weight on them is 11
ounces.  

a) Does the sample support the hypothesis that the true population mean is less than 11 ounces?
Set the level of significance at 0.01.

b) What is the probability of Type I error?

Answer: a) Work through the test as shown in class.

b) A type I error occurs when we reject a null hypothesis that is true. The probability of such an
error is equal to the significance level. In this case significance level is equal to 0.01; thus the
probability of a type I error is 0.01.
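Part (a) can be checked numerically. The sketch below is one possible way to do it, assuming a lower-tail z-test of $H_0$: µ ≥ 11 against $H_1$: µ < 11 with known σ = 0.6. The function name `lower_tail_z_test` is made up for this illustration, and only Python's standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu0, sigma, n, alpha):
    """Lower-tail z-test for a mean when the population sigma is known.

    Hypothetical helper for illustration; not taken from the lecture.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # standardized test statistic
    critical = NormalDist().inv_cdf(alpha)        # lower-tail critical value
    p_value = NormalDist().cdf(z)                 # P(Z <= z) when the null is true
    return z, critical, p_value, z < critical

# Numbers from Example 1: nine bags, mean 10.5 oz, sigma 0.6 oz, stated weight 11 oz.
z, critical, p_value, reject = lower_tail_z_test(10.5, 11.0, 0.6, 9, 0.01)
print(f"z = {z:.3f}, critical = {critical:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these inputs the statistic is z = -2.5, which lies below the 1% critical value of about -2.33, so the null is rejected at the 0.01 level.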
• Power: the probability of rejecting the null when it is false
Formula: Power = 1 − β

where β is the probability of making a type II error
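Power depends on what the true parameter value actually is, so it can only be computed for a specific alternative. The sketch below reuses the setting of Example 1 (σ = 0.6, n = 9, α = 0.01, lower-tail test against 11 ounces) and assumes, purely for illustration, that the true mean is 10.4 ounces. That value and the function name `power_lower_tail` are assumptions, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tail(mu_true, mu0, sigma, n, alpha):
    """Power of a lower-tail z-test when the true mean is mu_true (hypothetical helper)."""
    se = sigma / sqrt(n)                              # standard error of the sample mean
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se   # reject H0 when the sample mean < cutoff
    # Probability that the sample mean falls in the rejection region under mu_true.
    return NormalDist(mu_true, se).cdf(cutoff)

# Assumed true mean of 10.4 oz (illustrative only).
power = power_lower_tail(mu_true=10.4, mu0=11.0, sigma=0.6, n=9, alpha=0.01)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Under this assumed alternative the power comes out at about 0.75, so β is about 0.25. Setting `mu_true` equal to `mu0` returns α itself, which ties power back to the type I error rate.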

• Difference between alpha and p-value

If the null is true, alpha is the probability of rejecting the null; alpha sets the standard for how extreme the
data must be before we can reject the null hypothesis.
If the null is true, the p-value is the probability of getting the observed test value or a more extreme one. Test
value here means the z-stat, t-stat or chi-squared statistic. (Recall the example done in class and refer to
explanation II)

We compare the p-value with alpha to determine whether the observed data are statistically
significantly different from what the null hypothesis predicts. Reject the null if p < α
 
Test of a mean of a normal population when population variance is unknown

Difference between z stat and t stat


Z-scores and t-scores are both used in hypothesis testing.
• The z-stat is preferred when the sample size is greater than 30. A z-score tells you how many standard deviations
from the mean your result is.
• The t-score is computed without knowledge of the population standard deviation, which is replaced by the
sample standard deviation.
• In the previous examples, we used the sample standard deviation to estimate the population standard deviation. We
say that the sample standard deviation is a good approximation for the population standard deviation when
the sample size is large.
• The general rule of thumb is to use a t-stat when the following two conditions both hold:

1) The sample size is small (less than 30).

2) σ (the population standard deviation) is unknown.

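• t = (estimator − hypothesized value) / (standard error of the estimator). For a test of the mean:

$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} $$

The t-stat is characterized by its degrees of freedom, v = n − 1.
Degrees of freedom refer to the number of values that are free to vary, or the number of independent
quantities that can be assigned to a statistical distribution.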
Decision Rule
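A sketch of the usual decision rule for a t-test of the mean at significance level α, assuming a random sample from a normal population. The notation $t_{v,\alpha}$ for the upper-α critical value of the t distribution with v degrees of freedom is an assumption made here, not taken from the slide.

```latex
\text{Reject } H_0 \text{ if }
\begin{cases}
t > t_{v,\alpha}      & \text{when } H_1: \mu > \mu_0 \\
t < -t_{v,\alpha}     & \text{when } H_1: \mu < \mu_0 \\
|t| > t_{v,\alpha/2}  & \text{when } H_1: \mu \neq \mu_0
\end{cases}
\qquad \text{where } t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \quad v = n - 1
```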
Example 2

The time needed for college students to complete a certain maze follows a normal
distribution with a mean of 45 seconds. To see if the mean time µ (in seconds) is changed by
vigorous exercise, we have a group of nine college students exercise vigorously for 30 minutes
and then complete the maze. The sample mean and standard deviation of the collected data are 49.2
seconds and 3.5 seconds respectively. Use these data to perform an appropriate test of hypothesis
at the 5% significance level.
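One way to check Example 2 by hand or by machine is sketched below. It assumes a two-sided test ($H_0$: µ = 45 against $H_1$: µ ≠ 45, since the question asks whether the mean has "changed") and takes the critical value 2.306 from a standard t-table for 8 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt

# Data from Example 2.
n = 9                # number of students
sample_mean = 49.2   # seconds
sample_sd = 3.5      # seconds
mu0 = 45.0           # hypothesized mean under H0

# t statistic: (estimator - hypothesized value) / standard error.
t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))
df = n - 1

# Assumption: critical value read from a t-table (df = 8, upper-tail area 0.025),
# hard-coded because the standard library has no t-distribution.
T_CRITICAL = 2.306

reject = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f} with {df} degrees of freedom, reject H0: {reject}")
```

Here t = 3.6, which exceeds 2.306, so under these assumptions the null of no change in mean completion time is rejected at the 5% level.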
Example 3

A beer distributor claims that a new display featuring a life-size picture of a well-known
rock singer will increase product sales in supermarkets by an average of 50 cases in a
week. For a random sample of 20 high volume liquor outlets, the average sales increase
was 41.3 cases, and the sample standard deviation was 12.2 cases. Test at the 5% level
the null hypothesis that the population mean sales increase is at least 50 cases, stating
any assumptions you make.
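A possible worked solution, assuming that the weekly sales increases are normally distributed (the assumption the question asks you to state) and using the table value $t_{19,\,0.05} = 1.729$:

```latex
\begin{align*}
H_0 &: \mu \ge 50 \qquad H_1 : \mu < 50 \\
t &= \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
   = \frac{41.3 - 50}{12.2/\sqrt{20}}
   \approx \frac{-8.7}{2.728}
   \approx -3.19 \\
-t_{19,\,0.05} &= -1.729
\end{align*}
```

Since -3.19 < -1.729, the null hypothesis would be rejected at the 5% level under these assumptions.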
Test of the population proportion (large sample)

We often want to test hypotheses about the proportion of members of a large population with some particular
attribute. Inference about the population proportion is based on the proportion of individuals in a random
sample who possess the attribute we are interested in. For example:

1. In a group of 371 LSE students, 42 were left-handed. Is this significantly lower than the proportion of
all British who are left-handed, which is .12?

2. In a group of 371 students, 45 chose the number seven when picking a number between one and twenty
“at random”. Does this provide convincing statistical evidence of bias in favour of the number seven,
that is, is the proportion of students picking seven significantly higher than 1/20 = .05?

3. A university has found over the years that out of all the students who are offered admission, the
proportion who accept is .70. After a new director of admissions is hired, the university wants to check if
the proportion of students accepting has changed significantly. Suppose they offer admission to 1200
students and 888 accept. Is this evidence of a change from the status quo?
• Let p denote the proportion of individuals or objects in a
population who possess a specified property. Then $\hat{p}$ is the proportion in a random sample of n
observations. If the null hypothesis is that the population proportion is equal to some specific
value $p_0$, it follows that when this hypothesis is true, the random variable

$$ z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}} $$

follows a standard normal distribution. The appropriate tests for the population proportion are
shown in the next slide. You will find this in Newbold’s book.
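The statistic above is straightforward to compute. The sketch below applies it to the admissions example (888 acceptances out of 1200 offers, tested against a proportion of .70), assuming a two-sided alternative because the university asks whether the proportion has "changed". The function name `proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Large-sample z-test for a proportion (hypothetical helper).

    Returns the sample proportion, the z statistic and the two-sided p-value.
    """
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0, the null value
    p_two_sided = 2 * NormalDist().cdf(-abs(z))  # area in both tails beyond |z|
    return p_hat, z, p_two_sided

# Admissions example: 888 of 1200 accept, null proportion 0.70.
p_hat, z, p_value = proportion_z_test(888, 1200, 0.70)
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

For these data the sample proportion is 0.74 and z is about 3.02, which is well beyond the usual two-sided critical values, so the data point to a change from the status quo.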
Example 4
Market Research, Inc., wants to know if shoppers are sensitive to the prices of items
sold in a supermarket. A random sample of 802 shoppers was obtained, and 378 of
those supermarket shoppers were able to state the correct price of an item immediately
after putting it into their cart. Test at the 7% level the null hypothesis that at least one-half
of all shoppers are able to state the correct price.

Answer: z = -1.64

Since -1.64 is less than the critical value of about -1.48 for a 7% lower-tail test, we reject the null at
the 7% significance level (the same conclusion holds at the 10% level, where the critical value is -1.28).
Thus there is strong evidence that less than one half of the shoppers can correctly state the price of an
item.
The p-value of the test is 0.0505, or 5.05%. This is below the 7% significance level, so there is
evidence against the null.

Note: when p-value is less than significance level, we say that the result we found is statistically
significant. In other words, there is significant evidence against the null hypothesis.
Example 5

In a random sample of 361 owners of small businesses that had gone
into bankruptcy, 105 reported conducting no marketing studies prior to
opening the business. Test the hypothesis that at most 25% of all
members of this population conducted no marketing studies before
opening their businesses. Use α = 0.05.
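A possible worked solution, assuming the sample is large enough for the normal approximation and using the upper-tail critical value $z_{0.05} = 1.645$:

```latex
\begin{align*}
H_0 &: p \le 0.25 \qquad H_1 : p > 0.25 \\
\hat{p} &= \frac{105}{361} \approx 0.291 \\
z &= \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}
   = \frac{0.291 - 0.25}{\sqrt{0.25 \times 0.75 / 361}}
   \approx \frac{0.0409}{0.0228}
   \approx 1.79 \\
z_{0.05} &= 1.645
\end{align*}
```

Since 1.79 > 1.645, the null hypothesis would be rejected at the 5% level.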
 Chi Square distribution:

• In addition to the need for finding information about the population means, there are a number of
situations where we want to determine if the population variance is a particular value or set
of values. In modern quality-control work, this need is particularly important because a
process that, for example, has an excessively large variance can produce many defective
items.

• We now focus on the procedures for testing the population variance based
on the sample variance computed using a random sample of n observations from a
normally distributed population.

• The chi-square distribution is constructed so that the total area under the curve is equal to 1.

• Each member of the chi-square family is characterized by a single parameter called the degrees of freedom, v. A
random variable having a chi-square distribution with v degrees of freedom will be denoted as $\chi^2_v$. Its mean and
variance are equal to the number of degrees of freedom, v, and twice the number of degrees of freedom, 2v, respectively.
• Chi-square distributions are positively skewed, with the degree of skew decreasing with
increasing degrees of freedom. As the degrees of freedom increase, the chi-square
distribution approaches a normal distribution.

Recall degrees of freedom: a mathematical restriction that needs to be applied when estimating one
statistic from an estimate of another; the number of values in our calculation that are free to vary.

• The chi-square distribution is defined only over positive values, since a variance cannot be negative.


The Chi-Square Statistic
Suppose we conduct the following statistical experiment. We select a random sample of
size n from a normal population having a standard deviation equal to σ. We find that the
standard deviation in our sample is equal to s. Given these data, we can define a statistic,
called chi-square, using the following equation:

$$ \chi^2 = \frac{(n-1)s^2}{\sigma^2} $$

This follows a chi-square distribution with n-1 degrees of freedom.


 Hypothesis test of the variance of a normal population

• If we want to test the null hypothesis that the population variance is equal to some
specified value $\sigma_0^2$, that is

$$ H_0: \sigma^2 = \sigma_0^2 $$

then when this hypothesis is true, the random variable

$$ \chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2} $$

follows a chi-square distribution with n-1 degrees of freedom.


 Example 6:

The quality control manager of Stonehead Chemicals has asked you to determine if the
variance of impurities in its shipments of fertilizer is within the established standard.
This standard states that for 100-pound bags of fertilizer, the variance in the pounds of
impurities cannot exceed 4. A random sample of 20 bags is obtained, and the pounds of
impurities are measured for each bag. The sample variance is computed to be 6.62. In this
problem we are testing the null hypothesis

$H_0: \sigma^2 \le 4$
$H_1: \sigma^2 > 4$

Do you have enough evidence to reject the null? Test at 10%
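Example 6 can be checked with a few lines of arithmetic. The sketch below assumes an upper-tail test of the null that the variance is at most 4, computes (n − 1)s²/σ₀² with σ₀² = 4, and compares it with the chi-square table value 27.204 for 19 degrees of freedom and an upper-tail area of 0.10. The variable names are illustrative.

```python
# Data from Example 6.
n = 20                   # bags sampled
sample_variance = 6.62   # s^2, in pounds squared
sigma0_sq = 4.0          # largest variance allowed under H0

# Test statistic: (n - 1) * s^2 / sigma0^2, chi-square with n - 1 df under H0.
chi2_stat = (n - 1) * sample_variance / sigma0_sq
df = n - 1

# Assumption: critical value read from a chi-square table (df = 19, upper-tail area 0.10),
# hard-coded because the standard library has no chi-square distribution.
CHI2_CRITICAL = 27.204

reject = chi2_stat > CHI2_CRITICAL
print(f"chi-square = {chi2_stat:.3f} with {df} df, reject H0: {reject}")
```

The statistic is 31.445, which exceeds 27.204, so under these assumptions there is enough evidence at the 10% level to conclude that the variance exceeds the standard.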


Example 7

One way to evaluate the effectiveness of a teaching assistant is to examine the scores
achieved by his or her students on an examination at the end of the
course. Obviously, the mean score is of interest. However, the variance also contains
useful information—some teachers have a style that works very well with more-able
students but is unsuccessful with less-able or poorly motivated students. A professor
sets a standard examination at the end of each semester for all sections of a course.
The variance of the scores on this test is typically very close to 300. A new teaching
assistant has a class of 30 students whose test scores had a variance of 480. Regarding
these students’ test scores as a random sample from a normal population, test, against a
two-sided alternative, the null hypothesis that the population variance of their scores is
300.
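A possible worked solution. The question does not state a significance level, so a 5% two-sided test is assumed here, with table values $\chi^2_{29,\,0.975} = 16.047$ and $\chi^2_{29,\,0.025} = 45.722$:

```latex
\begin{align*}
H_0 &: \sigma^2 = 300 \qquad H_1 : \sigma^2 \neq 300 \\
\chi^2 &= \frac{(n-1)s^2}{\sigma_0^2} = \frac{29 \times 480}{300} = 46.4 \\
\text{Reject } H_0 &\text{ if } \chi^2 < 16.047 \text{ or } \chi^2 > 45.722
\end{align*}
```

Since 46.4 > 45.722, the null hypothesis would be rejected at the assumed 5% level, though only narrowly.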
