
# Probability, Statistics and Random Processes
IC 210

Hypothesis Testing-1
Reference: Introductory Statistics by Prem S. Mann, Chapter 9 (available on Moodle)

Inferential Statistics

Researchers use inferential statistics to address two tasks:

1. Estimating the value of population parameters
2. Hypothesis testing

Statistics:
1. Model: $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, iid.
2. Estimation: $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = s^2$
3. Hypothesis test: $\mu = \mu_0$, $\sigma^2 = \sigma_0^2$

Hypothesis testing
The purpose of hypothesis testing is to determine whether there is
enough statistical evidence in favor of a certain belief about a parameter.
For Example:
A soft-drink company may claim that, on average, its cans contain 12
ounces of soda. A government agency may want to test whether or not
such cans do contain, on average, 12 ounces of soda. Here we are to
test a hypothesis about the population mean $\mu$.
According to some survey, 75% of the total charitable contributions in
2008 were given by individuals. An economist wants to check whether this
percentage still holds this year. Here we are to test a hypothesis
about the population proportion $p$.

Hypothesis testing

Hypothesis testing is designed to detect significant differences: differences that did not occur by random chance.

In the one-sample case, we compare a random sample (from a large group) to a population.

We compare a sample statistic to a population parameter to see if there is a significant difference.

A criminal trial is an example of hypothesis testing without the statistics.
Based on the available evidence, the judge or jury will make one of two possible decisions:

1. The defendant is innocent or not guilty
2. The defendant is guilty

At the outset of the trial, the person is presumed not guilty. The prosecutor's efforts are to prove that the person has committed the crime and, hence, is guilty.

## Nonstatistical Hypothesis Testing

In statistics, "the person is not guilty" is called the null hypothesis, and "the person is guilty" is called the alternative hypothesis.
The null hypothesis is denoted by H0:
H0: The person is not guilty
The alternative hypothesis is denoted by H1:
H1: The person is guilty
At the beginning of the trial it is assumed that the person is not guilty. The null hypothesis is usually the hypothesis that is assumed to be true to begin with.

In statistics, the null hypothesis states that a given claim (or statement) about a population parameter is true.
Therefore, convicting the defendant is called rejecting the null hypothesis in favor of the alternative hypothesis. That is, the jury is saying that there is enough evidence to conclude that the defendant is guilty (i.e., there is enough evidence to support the alternative hypothesis).

## Example soft drink

A soft-drink company claims that, on average, its cans contain 12 ounces of soda. In reality, this claim may not be true. However, we will initially assume that the company's claim is true (that is, the company is not guilty of cheating and lying).
To test the claim of the soft-drink company, the null hypothesis is that the company's claim is true:
H0: $\mu = 12$ ounces
The null hypothesis can also be written as $\mu \ge 12$ ounces, because the company's claim will still be true.
H1: $\mu < 12$ ounces

## How do we judge the plausibility of the null hypothesis?

The sample mean should be plausible under the sampling distribution of the mean.

[Figure: sampling distribution $p(\bar{X})$ with three sample means $\bar{X}$ marked as implausible, fairly plausible, and highly plausible]

The further the observed value is from the mean of the expected distribution, the more significant the difference.

The plausibility of the null hypothesis is judged by computing the probability p of observing a sample mean that is at least as deviant from the population mean as the value we have observed.

[Figure: sampling distribution $p(\bar{X})$ with the tail area p shaded]

This computation is simplified by converting to z-scores. Under the assumption of normality, we can determine this probability from a standard normal table.

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

[Figure: standard normal density $p(z)$ with the tail area p shaded]
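As a rough illustration of the z-score conversion, the Python sketch below standardizes a sample mean and then reads the "at least as deviant" probability from the standard normal distribution instead of a printed table. The function names and every number in it (sample mean 11.9, population mean 12, $\sigma = 0.3$, n = 36) are made up for illustration and do not come from the lecture; $\sigma$ is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist


def z_score(sample_mean, pop_mean, pop_sd, n):
    """Standardize a sample mean using the standard error pop_sd / sqrt(n)."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))


def two_tailed_probability(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. the area in both tails."""
    return 2 * NormalDist().cdf(-abs(z))


# Hypothetical values, chosen only to make the arithmetic easy to follow.
z = z_score(sample_mean=11.9, pop_mean=12.0, pop_sd=0.3, n=36)
p = two_tailed_probability(z)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The smaller p is, the less plausible the observed sample mean is under the assumed population mean.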

## Two Types of Error (in nonstatistical example)

The person has not committed the crime but is declared guilty. In this case, the court has made an error by punishing an innocent person. In statistics, this kind of error is called a type I or an $\alpha$ (alpha) error.

The person has committed the crime but, because of lack of evidence, is declared not guilty. In this case, the court has committed an error by setting a guilty person free. In statistics, this kind of error is called a type II or a $\beta$ (beta) error.

## Two Types of Error (statistical example)

A type I error will occur when H0 is actually true (that is, the cans do contain, on average, 12 ounces of soda), but it just happens that we draw a sample with a mean which is much less than 12 ounces and we wrongfully reject the null hypothesis H0.

The value of $\alpha$, called the significance level of the test, represents the probability of making a type I error. In other words, $\alpha$ is the probability of rejecting the null hypothesis when in fact it is true.

$$\alpha = P(H_0 \text{ is rejected} \mid H_0 \text{ is true})$$

Note: the size of the rejection region depends on the value assigned to $\alpha$.

## Two Types of Error (statistical example)

A type II error will occur when the null hypothesis is actually false (that is, the soda contained in all cans, on average, is less than 12 ounces), but it happens by chance that we draw a sample with a mean that is close to or greater than 12 ounces and we wrongfully fail to reject H0.

The value of $\beta$ represents the probability of making a type II error. It represents the probability that H0 is not rejected when H0 is false.

$$\beta = P(H_0 \text{ is not rejected} \mid H_0 \text{ is false})$$

The value of $1 - \beta$ is called the power of the test. It represents the probability of not making a type II error.
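To make $\alpha$, $\beta$, and power concrete, here is a minimal Python sketch for a left-tailed z-test of the soda claim. It assumes a known population standard deviation and a specific "true" mean under the alternative; those values ($\sigma = 0.3$, n = 36, true mean 11.9) and the function name are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_errors(mu0, mu_true, sigma, n, alpha):
    """Return (cutoff, beta, power) for H0: mu = mu0 versus H1: mu < mu0."""
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean falls below this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    # beta = P(sample mean >= cutoff) when the true mean is mu_true.
    beta = 1 - NormalDist(mu_true, se).cdf(cutoff)
    return cutoff, beta, 1 - beta


# Hypothetical inputs: none of these numbers appear in the lecture.
cutoff, beta, power = left_tailed_errors(
    mu0=12.0, mu_true=11.9, sigma=0.3, n=36, alpha=0.05
)
print(f"reject H0 if sample mean < {cutoff:.3f}")
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Re-running the sketch with a smaller `alpha` moves the cutoff further into the tail and raises `beta`, which is the trade-off between the two error types at a fixed sample size.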

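H0: Innocent

Jury Trial

| Verdict | Actual Situation: Innocent | Actual Situation: Guilty |
|---|---|---|
| Innocent | Correct | Error |
| Guilty | Error | Correct |

Hypothesis Test

| Decision | Actual Situation: H0 True | Actual Situation: H0 False |
|---|---|---|
| Accept H0 | Correct ($1 - \alpha$) | Type II Error ($\beta$), False Negative |
| Reject H0 | Type I Error ($\alpha$), False Positive | Power ($1 - \beta$) |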

## Type I and Type II Errors

Type I error (false rejection error): rejecting a true null hypothesis; its probability is $\alpha$.
Type II error (false acceptance error): failing to reject a false null hypothesis; its probability is $\beta$.

| Researcher's Decision | Actual Situation: H0 true | Actual Situation: H0 false |
|---|---|---|
| Accept the Null Hypothesis | $p(\text{accept } H_0 \mid H_0 \text{ true}) = 1 - \alpha$ | $p(\text{accept } H_0 \mid H_0 \text{ false}) = \beta$ |
| Reject the Null Hypothesis | $p(\text{reject } H_0 \mid H_0 \text{ true}) = \alpha$ | $p(\text{reject } H_0 \mid H_0 \text{ false}) = 1 - \beta$ (power) |

The two probabilities $\alpha$ and $\beta$ are inversely related. Decreasing one increases the other, for a fixed sample size.

Note
By rejecting H0, we are saying that the difference between the value of $\mu$ stated in H0 and the value of $\bar{x}$ obtained from the sample is too large to have occurred because of sampling error alone. Consequently, this difference is real.
By not rejecting H0, we are saying that the difference between the value of $\mu$ stated in H0 and the value of $\bar{x}$ obtained from the sample is small and may have occurred because of sampling error alone.

Tailed Tests

Two-tailed hypothesis test: A hypothesis test in which the region of rejection falls equally within both tails of the sampling distribution.

One-tailed hypothesis test: A hypothesis test in which the alternative is stated in such a way that the probability of making a Type I error is entirely in one tail of the sampling distribution.

Right-tailed test: A one-tailed test in which the sample outcome is hypothesized to be at the right tail of the sampling distribution.

## Two-Tailed Tests

Example: According to a survey conducted in 2008, a sample of sixth graders in schools carried backpacks weighing an average of 18.4 pounds. Some magazine wants to check whether or not this mean has changed since that survey.
H0: the mean weight has not changed, $\mu = 18.4$
H1: the mean weight has changed, $\mu \ne 18.4$

Right-tailed test
Example: The average price of homes in New Jersey was \$461,216 in 2007. Suppose a real estate researcher wants to check whether the current mean price of homes there is higher than \$461,216.
H0: $\mu$ = \$461,216
H1: $\mu$ > \$461,216

Left-tailed test
Example: The company claims that its soft-drink cans, on average, contain 12 ounces of soda. However, if these cans contain less than the claimed amount of soda, then the company can be accused of cheating. Suppose a consumer agency wants to test whether the mean amount of soda per can is less than 12 ounces.
H0: $\mu = 12$ ounces (the mean is equal to 12 ounces)
H1: $\mu < 12$ ounces (the mean is less than 12 ounces)

## One-tail vs. Two-tail Test

Hypothesis tests
Type I and type II errors
Type I error: H0 rejected, when H0 is true.
Type II error: H0 not rejected, when H0 is false.
Significance level: $\alpha$ is the probability of committing a Type I error.

[Figure: one-sided test, with rejection area $\alpha$ in one tail; two-sided test, with rejection area $\alpha/2$ in each tail]

Production

The machine that produces metal cylinders is set to make cylinders with a diameter of 50 mm.

The two-sided hypotheses of interest are
H0: $\mu = 50$ versus
HA: $\mu \ne 50$
where the null hypothesis states that the machine is calibrated correctly.

A manufacturer claims that its cars achieve an average of at least 35 miles per gallon in highway driving.

The one-sided hypotheses of interest are
H0: $\mu \ge 35$ versus
H1: $\mu < 35$

The null hypothesis states that the manufacturer's claim regarding the fuel efficiency of its cars is correct.

There are two approaches to test whether the sample mean supports the alternative hypothesis (H1):

1. The rejection region method
2. The p-value method

The rejection region is a range of values such that if the test statistic falls within that range, the null hypothesis is rejected in favour of the alternative hypothesis.

1. Construct appropriate hypotheses.
2. Determine the test statistic to be used.
3. Determine the critical value.
4. Compare the test statistic with the critical value. Reject the null hypothesis if the test statistic falls in the rejection region (for a right-tailed test, if it is greater than the critical value).
5. Make an appropriate conclusion.

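Calculating Test Statistics

For one-sample tests, use the Z test statistic if the population is normal and $\sigma$ is known, or if the sample size is large.
For one-sample tests, use the t statistic if $\sigma$ is not known or if the sample size is small (less than 30).

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{N}}, \qquad z_c = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

In the worked example, $\bar{X} = 265$ and $z_c = 1.80$.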

Procedure
First we find the critical value(s) of z from the normal distribution table for the given significance level. Then we find the value of the test statistic z for the observed value of the sample statistic. Finally we compare these two values and make a decision.
Remember, if the test is one-tailed, there is only one critical value of z, and it is obtained by using the value of $\alpha$, which gives the area in the left or right tail of the normal distribution curve depending on whether the test is left-tailed or right-tailed, respectively. However, if the test is two-tailed, there are two critical values of z, and they are obtained by using an area of $\alpha/2$ in each tail of the normal distribution curve.
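The procedure above can be sketched in Python, with `statistics.NormalDist().inv_cdf` standing in for the normal distribution table. The function names and the `tail` labels (`"left"`, `"right"`, `"two"`) are illustrative choices, not notation from the lecture.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Critical z value(s) for a left-, right-, or two-tailed test."""
    z = NormalDist()
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_h0(z_obs, alpha, tail):
    """True if the observed z falls in the rejection region."""
    crit = critical_values(alpha, tail)
    if tail == "left":
        return z_obs < crit[0]
    if tail == "right":
        return z_obs > crit[0]
    return z_obs < crit[0] or z_obs > crit[1]


for tail in ("left", "right", "two"):
    values = ", ".join(f"{c:.3f}" for c in critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: critical z = {values}")
```

With an observed z of 1.80 and $\alpha = 0.05$, `reject_h0` gives different answers for a right-tailed and a two-tailed test, because the two-tailed critical value is further out.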

## Hypothesis Setups for Testing a Mean ($\mu$)

## Hypothesis Setups for Testing a Proportion ($p$)

Problem: A used car dealer says that the mean price of a 1995 Ford F-150 Super Cab is at least \$16,500. You suspect this claim is incorrect and find that a random sample of 14 similar vehicles has a mean price of \$15,700 and a standard deviation of \$1250. Is there enough evidence to reject the dealer's claim at $\alpha = 0.05$?

Solution:
The claim is that the mean price is at least \$16,500.
H0: $\mu \ge$ \$16,500 (Claim) and H1: $\mu <$ \$16,500

The test is left-tailed and the level of significance is $\alpha = 0.05$. There are d.f. = 14 - 1 = 13 degrees of freedom, and the critical value (from the table) is t = -1.771. The rejection region is t < -1.771. Using the t-test, the standardized test statistic is:

$$t_0 = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{15{,}700 - 16{,}500}{1250/\sqrt{14}} \approx -2.39$$

Since $t_0 < t$, we reject H0.

[Figure: t distribution showing the rejection region t < -1.771 and the standardized test statistic $t_0$]

Because $t_0$ is in the rejection region, you should decide to reject the null hypothesis. There is enough evidence at the 5% level of significance to reject the claim that the mean price of a 1995 Ford F-150 Super Cab is at least \$16,500.
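The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name is illustrative, and the critical value -1.771 is copied from the t table quoted in the example rather than computed, since the standard library has no t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


t0 = t_statistic(sample_mean=15_700, mu0=16_500, sample_sd=1_250, n=14)
t_crit = -1.771  # table value for d.f. = 13, left tail area 0.05

print(f"t0 = {t0:.2f}")
print("reject H0" if t0 < t_crit else "do not reject H0")
```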

Example: An industrial company claims that the mean pH level of the water in a nearby river is 6.8. You randomly select 19 water samples and measure the pH of each. The sample mean and standard deviation are 6.7 and 0.24, respectively. Is there enough evidence to reject the company's claim at $\alpha = 0.05$? Assume the population is normally distributed.

The claim is that the mean pH level is 6.8. So, the null and alternative hypotheses are:
H0: $\mu = 6.8$ (Claim) and Ha: $\mu \ne 6.8$
The test is two-tailed and the level of significance is $\alpha = 0.05$. There are d.f. = 19 - 1 = 18 degrees of freedom, and the critical values are -t = -2.101 and t = 2.101. The rejection regions are t < -2.101 and t > 2.101. Using the t-test, the standardized test statistic is:

$$t_0 = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{6.7 - 6.8}{0.24/\sqrt{19}} \approx -1.82$$

[Figure: t distribution showing the rejection regions t < -2.101 and t > 2.101 and the standardized test statistic $t_0$]

Because $t_0$ is not in the rejection region, you should decide not to reject the null hypothesis. There is not enough evidence at the 5% level of significance to reject the claim that the mean pH is 6.8.

t distribution table

Probability Values
Z statistic (obtained): The test statistic computed by converting a sample statistic (such as the mean) to a Z score. The formula for obtaining Z varies from test to test.
P value: The probability associated with the obtained value of Z.

## The p-Value Approach

In this procedure, we find a probability value such that a given null hypothesis is rejected for any $\alpha$ (significance level) greater than this value and is not rejected for any $\alpha$ less than this value.
In this approach, we calculate the p-value for the test, which is defined as the smallest level of significance at which the given null hypothesis is rejected.
Using this p-value, we state the decision. If we have a predetermined value of $\alpha$, then we compare the value of p with $\alpha$ and make a decision.

Probability Values

Alpha ($\alpha$): The level of probability at which the null hypothesis is rejected. It is customary to set alpha at the .05, .01, or .001 level.

## Example: Normal Body Temperature

What is normal body temperature? Is it actually 37.6 °C (on average)?
State the null and alternative hypotheses:
H0: $\mu$ = 37.6 °C
Ha: $\mu \ne$ 37.6 °C

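## Example Normal Body Temp (cont)

Data: random sample of n = 18 normal body temperatures (°C)

37.2, 36.8, 38.0, 37.6, 37.2, 36.8, 37.4, 38.7, 37.2
36.4, 36.6, 37.4, 37.0, 38.2, 37.6, 36.1, 36.2, 37.5

| Variable | n | Mean | SD | SE | $t_0$ | P |
|---|---|---|---|---|---|---|
| Temperature | 18 | 37.22 | 0.68 | 0.161 | -2.38 | 0.029 |

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad \text{standard error} = \frac{s}{\sqrt{n}}$$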

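## Student's t Distribution Table

Probability (p value) is the total area in both tails; the 0.05 column corresponds to an area of 0.025 in each tail.

| Degrees of freedom | 0.10 | 0.05 | 0.01 |
|---|---|---|---|
| 1 | 6.314 | 12.706 | 63.657 |
| 5 | 2.015 | 2.571 | 4.032 |
| 10 | 1.813 | 2.228 | 3.169 |
| 17 | 1.740 | 2.110 | 2.898 |
| 20 | 1.725 | 2.086 | 2.845 |
| 24 | 1.711 | 2.064 | 2.797 |
| 25 | 1.708 | 2.060 | 2.787 |
| $\infty$ | 1.645 | 1.960 | 2.576 |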

## Example Normal Body Temp (cont)

Find the p-value
df = n - 1 = 18 - 1 = 17
p-value = 0.029
From the t table: $t_{17,\,.025} = 2.11$
Calculated $t_0 = -2.38$
Since $|t_0| > t$, reject the null hypothesis.

[Figure: t distribution with rejection regions below -2.11 and above +2.11, with $t_0$ marked]

## Example Normal Body Temp (cont)

Decide whether or not the result is statistically significant based on the p-value.
Using $\alpha = 0.05$ as the level of significance criterion, the results are statistically significant because 0.029 is less than 0.05. In other words, we can reject the null hypothesis.

## Report the Conclusion

We can conclude, based on these data, that the mean temperature in the human population does not equal 37.6 °C.
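The summary statistics for the body temperature sample can be recomputed from the 18 listed values. The sketch below uses Python's `statistics` module; the variable names are illustrative, and the critical value 2.110 is copied from the t table (17 degrees of freedom, 0.025 in each tail) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# The 18 body temperatures (degrees C) listed in the example.
temps = [
    37.2, 36.8, 38.0, 37.6, 37.2, 36.8, 37.4, 38.7, 37.2,
    36.4, 36.6, 37.4, 37.0, 38.2, 37.6, 36.1, 36.2, 37.5,
]

mu0 = 37.6                 # hypothesized mean under H0
n = len(temps)
x_bar = mean(temps)
s = stdev(temps)           # sample standard deviation (n - 1 denominator)
se = s / sqrt(n)           # standard error of the mean
t0 = (x_bar - mu0) / se    # one-sample t statistic

t_crit = 2.110             # table value, d.f. = 17, two-tailed alpha = 0.05
reject = abs(t0) > t_crit

print(f"n = {n}, mean = {x_bar:.2f}, SD = {s:.2f}, SE = {se:.3f}, t0 = {t0:.2f}")
print("reject H0" if reject else "do not reject H0")
```

The statistic comes out negative because the sample mean lies below 37.6; for the two-sided test only its magnitude is compared with the critical value.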

Example using p-value

We want to see whether our data confirm a specific hypothesis.
Example: NYC Blackout Baby Boom
Data are births per day from two weeks in August 1966
Test against the usual birth rate in NYC (430 births/day)

1. State the hypotheses: we need a null hypothesis and an alternative hypothesis.
2. Calculate the test statistic: the test statistic summarizes the difference between the data and the null hypothesis.
3. Find the p-value for the test statistic: how probable is your data if the null hypothesis is true?

Null hypothesis (H0): no effect or no change in the population
Alternative hypothesis (Ha): real difference or real change in the population
If there is a large discrepancy between the data and the null hypothesis, then we will reject the null hypothesis.
NYC dataset: $\mu$ = mean birth rate in Aug. 1966
The null hypothesis is that the blackout has no effect on the birth rate, so August 1966 should be the same as any other month.
H0: $\mu = 430$ (usual birth rate for NYC)
Ha: $\mu \ne 430$

Test Statistic

The test statistic measures the difference between the observed data and the null hypothesis: how many standard deviations is our observed sample value from the hypothesized value?

For our birth rate dataset, the observed sample mean is 433.6 and our hypothesized mean is 430.

p-value

The p-value is the probability of observing a sample value at least as extreme as ours if the null hypothesis is true.
If the null hypothesis is true, then the test statistic T follows a standard normal distribution.

[Figure: standard normal density with tail probability 0.367 below T = -0.342 and tail probability 0.367 above T = 0.342]

If our alternative hypothesis were one-sided (Ha: $\mu > 430$), then our p-value would be 0.367.

Since our alternative hypothesis was two-sided, our p-value is the sum of both tail probabilities (0.734).
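The one-sided and two-sided p-values for the birth rate example can be reproduced approximately from the reported test statistic. The sketch below takes T = 0.342 as given and uses `statistics.NormalDist` in place of a normal table, so the results may differ from the quoted values in the third decimal place because of rounding.

```python
from statistics import NormalDist

T = 0.342  # test statistic reported for the NYC birth rate data

# One-sided (Ha: mu > 430): area to the right of T.
p_one_sided = 1 - NormalDist().cdf(T)

# Two-sided (Ha: mu != 430): area in both tails beyond |T|.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(T)))

print(f"one-sided p = {p_one_sided:.3f}")
print(f"two-sided p = {p_two_sided:.3f}")
```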

Statistical Significance

Is the test statistic T = 0.342 statistically significant?
If the p-value is smaller than $\alpha$, we say the difference is statistically significant at level $\alpha$.
The $\alpha$-level is also used as a threshold for rejecting the null hypothesis (most common: $\alpha = 0.05$).
If the p-value < $\alpha$, we reject the null hypothesis that there is no change or difference.
The p-value = 0.734 for the NYC data, so we cannot reject the null hypothesis at an $\alpha$-level of 0.05.
The difference between the null hypothesis and our data is not statistically significant.
The data do not support the idea that there was a different birth rate than usual for the first two weeks of August 1966.

There is a close connection between confidence intervals and two-sided hypothesis tests.
A 100·C% confidence interval contains likely values for a population parameter, like the population mean $\mu$.
The interval is centered around the sample mean $\bar{x}$.
The width of the interval is a multiple of the standard error of the sample mean.
An $\alpha$-level hypothesis test rejects the null hypothesis that $\mu = \mu_0$ if the test statistic T has a p-value less than $\alpha$.

If our confidence level C is equal to $1 - \alpha$, where $\alpha$ is the level of the hypothesis test, then we have the following connection between tests and intervals:
A two-sided hypothesis test rejects the null hypothesis ($\mu = \mu_0$) if our hypothesized value $\mu_0$ falls outside the confidence interval for $\mu$.

So, if we have already calculated a confidence interval for $\mu$, then we can test any hypothesized value $\mu_0$ just by checking whether or not $\mu_0$ is in the interval!

The difference between our sample mean and the population mean $\mu_0 = 430$ had a p-value of 0.734, so we did not reject the null hypothesis at an $\alpha$-level of 0.05.
We could have also calculated a $100(1-\alpha)\% = 95\%$ confidence interval.

Since our hypothesized $\mu_0 = 430$ is within our interval of likely values, we do not reject the null hypothesis.
If the hypothesis were $\mu_0 = 410$, then we would reject it!

Let $\mu$ be the mean calcium intake for people below the poverty line.
The null hypothesis is that calcium intake for people below the poverty line is not different from the RDA: $\mu_0 = 850$ mg/day.

To calculate the test statistic, we need to know the population standard deviation of daily calcium intake. From a previous study, we know $\sigma = 188$ mg.

We need a p-value: if $\mu_0 = 850$, what is the probability that we get a sample mean as extreme as (or more extreme than) 747?

We have a two-sided alternative, so the p-value includes standard normal probabilities on both sides:

[Figure: standard normal density with tail probability 0.010 below T = -2.32 and tail probability 0.010 above T = 2.32]

Looking up the probability in the table, we see that the two-sided p-value is 0.010 + 0.010 = 0.02.
Since the p-value is less than 0.05, we can reject the null hypothesis.

Conclusion: people below the poverty line have significantly (at the $\alpha = 0.05$ level) lower calcium intake than the RDA.

Alternatively, we calculate a confidence interval for the calcium intake of people below the poverty line.
Use confidence level $100 \cdot C = 100(1-\alpha) = 95\%$.
A 95% confidence level means critical value $Z^* = 1.96$.

Since our hypothesized value $\mu_0 = 850$ mg is not in the 95% confidence interval, we can reject that hypothesis right away!
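The agreement between the p-value and the confidence interval can be illustrated with a short Python sketch for the calcium example. One input is an assumption: the sample size is not stated in the text, so n = 18 is used here only because it reproduces the reported T = -2.32 from the given sample mean (747), hypothesized mean (850), and $\sigma = 188$. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 850.0    # hypothesized mean (RDA), mg/day
x_bar = 747.0  # observed sample mean, mg/day
sigma = 188.0  # population standard deviation from a previous study, mg
n = 18         # ASSUMPTION: not given in the text; chosen to match T = -2.32

se = sigma / sqrt(n)
T = (x_bar - mu0) / se
p_two_sided = 2 * NormalDist().cdf(-abs(T))

z_star = 1.96  # critical value for a 95% confidence level
ci = (x_bar - z_star * se, x_bar + z_star * se)

reject_by_p = p_two_sided < 0.05
reject_by_ci = not (ci[0] <= mu0 <= ci[1])

print(f"T = {T:.2f}, two-sided p = {p_two_sided:.3f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"reject by p-value: {reject_by_p}, reject by CI: {reject_by_ci}")
```

Under that assumed n, both routes lead to the same decision, which is the connection between two-sided tests and confidence intervals described above.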

Statistical significance does not necessarily mean real significance.
If the sample size is large, even very small differences can have a low p-value.

Lack of significance does not necessarily mean that the null hypothesis is true.
If the sample size is small, there could be a real difference, but we are not able to detect it.

The presence of outliers, low sample sizes, etc. make our assumptions less realistic.
We will try to address some of these problems next class