Original Title: Lectue 11_Hypothesis Testing

Uploaded by AjayMeena


Random Processes

IC 210

Hypothesis Testing-1

Reference: Introductory Statistics by Prem S. Mann, Chapter 9 (available on Moodle)

Inferential Statistics

broad goals:

Estimate the value of population parameters

Hypothesis testing

Statistics:

1. Model: $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, iid.

2. Estimation: $\hat{\mu} = \bar{x}$, $\hat{\sigma}^2 = s^2$

3. Hypothesis test: $\mu = \mu_0$, $\sigma^2 = \sigma_0^2$

Hypothesis testing

The purpose of hypothesis testing is to determine whether there is

enough statistical evidence in favor of a certain belief about a parameter.

For Example:

A soft-drink company may claim that, on average, its cans contain 12 ounces of soda. A government agency may want to test whether or not such cans do contain, on average, 12 ounces of soda. Here we are to test a hypothesis about the population mean μ.

According to a survey, 75% of the total charitable contributions in 2008 were given by individuals. An economist wants to check whether this percentage still holds this year. Here we are to test a hypothesis about the population proportion p.

Hypothesis testing

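Hypothesis testing looks for significant differences: differences that did not occur by random chance.

We compare a random sample (from a large group) to a population.

We compare a sample statistic to a population parameter to see if there is a significant difference.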

statistics.

Consider a court trial. Based on the available evidence, the judge or jury will make one of two possible decisions:

1. The defendant is not guilty

2. The defendant is guilty

At the outset of the trial, the person is presumed not guilty. The prosecutor's efforts are to prove that the person has committed the crime and, hence, is guilty.

In statistics, "the person is not guilty" is called the null hypothesis, and "the person is guilty" is called the alternative hypothesis.

The null hypothesis is denoted by H0:

H0: The person is not guilty

The alternative hypothesis is denoted by H1:

H1: The person is guilty

At the beginning of the trial it is assumed that the person is not guilty. The null hypothesis is usually the hypothesis that is assumed to be true to begin with.

A hypothesis test asks whether a claim (or statement) about a population parameter is true.

Therefore, convicting the defendant is called rejecting the null

hypothesis in favor of the alternative hypothesis. That is, the

jury is saying that there is enough evidence to conclude that

the defendant is guilty (i.e., there is enough evidence to

support the alternative hypothesis).

A soft-drink company claims that, on average, its cans contain 12 ounces of soda. We initially assume that the company's claim is true (that is, the company is not guilty of cheating or lying).

To test the claim of the soft-drink company, the null hypothesis is that the company's claim is true:

H0: μ = 12 ounces

The null hypothesis can also be written as μ ≥ 12 ounces, because the company's claim is still true if the cans contain, on average, more than 12 ounces.

H1: μ < 12 ounces

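How plausible is the observed sample mean under the null hypothesis? We judge this against the sampling distribution of the mean.

[Figure: sampling distribution p(X̄) with three sample means marked as highly plausible, fairly plausible, and implausible.]

The further the sample mean lies from the centre of the expected distribution, the more significant the difference.

We measure this with the probability p of observing a sample mean that is at least as deviant from the population mean as the value we have observed.

[Figure: sampling distribution p(X̄) with the tail area p shaded.]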

Under the assumption of normality, we can determine

this probability from a standard normal table.

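$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

[Figure: standard normal density p(z) with the tail area p beyond the observed z shaded.]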
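A minimal Python sketch of the standardization step described above, assuming a known population standard deviation. The numbers (μ = 12 oz, σ = 0.3 oz, n = 36, sample mean 11.9 oz) are hypothetical and are not taken from the slides; `NormalDist` from the standard library stands in for the standard normal table.

```python
from math import sqrt
from statistics import NormalDist

def z_score(sample_mean, mu, sigma, n):
    """Standardize a sample mean: z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / sqrt(n))

def tail_probability(z):
    """Area of the standard normal curve beyond |z| in one tail."""
    return 1.0 - NormalDist().cdf(abs(z))

# Hypothetical values for illustration only (not from the slides).
z = z_score(sample_mean=11.9, mu=12.0, sigma=0.3, n=36)
p = tail_probability(z)
print(f"z = {z:.2f}, one-tail probability = {p:.4f}")
```

A small tail probability means the observed sample mean would be unusual if the hypothesized population mean were correct.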

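The person has not committed the crime but is declared guilty. In this case, the court has committed an error by punishing an innocent person. In statistics, this kind of error is called a type I or an α (alpha) error.

The person has committed the crime but, because of lack of evidence, is declared not guilty. In this case, the court has committed an error by setting a guilty person free. In statistics, this kind of error is called a type II or a β (beta) error.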

Type I error (soft-drink example)

A type I error will occur when H0 is actually true (that is, the cans do contain, on average, 12 ounces of soda), but it happens by chance that we draw a sample with a mean which is much less than 12 ounces and we wrongfully reject the null hypothesis H0.

The value of α, called the significance level of the test, represents the probability of making a type I error. In other words, α is the probability of rejecting the null hypothesis when in fact it is true.

α = P(H0 is rejected | H0 is true)

Note: the size of the rejection region depends on the value assigned to α.

Type II error (soft-drink example)

A type II error will occur when the null hypothesis is actually false (that is, the soda contained in all cans, on average, is less than 12 ounces), but it happens by chance that we draw a sample with a mean that is close to or greater than 12 ounces and we wrongfully fail to reject H0.

The value of β represents the probability of making a type II error. It represents the probability that H0 is not rejected when H0 is false.

β = P(H0 is not rejected | H0 is false)

The value of 1 − β is called the power of the test. It represents the probability of not making a type II error.
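The meaning of a type I error can be checked by simulation. The sketch below repeatedly draws samples from a population in which H0 is true and counts how often a left-tailed z-test rejects it anyway; the rejection rate should land near the chosen significance level. The values σ = 0.3, n = 25 and the number of trials are assumptions made for illustration, not figures from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Run a left-tailed z-test on many samples drawn with H0 true; return the rejection rate."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # left-tail critical value (negative)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if z < z_crit:  # H0 is true here, so every rejection is a type I error
            rejections += 1
    return rejections / trials

# Hypothetical soft-drink setting: true mean really is 12 oz.
rate = type_i_error_rate(mu0=12.0, sigma=0.3, n=25, alpha=0.05, trials=20000)
print(f"estimated type I error rate: {rate:.3f}")
```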

Jury Trial (H0: Innocent)

Verdict     Actually innocent   Actually guilty
Innocent    Correct             Error
Guilty      Error               Correct

Hypothesis Test

Decision    H0 true                             H0 false
Accept H0   Correct (1 − α)                     Type II error, false negative (β)
Reject H0   Type I error, false positive (α)    Correct, power (1 − β)

Type I error (false rejection error): the probability (equal to α) associated with rejecting a true null hypothesis.

Type II error (false acceptance error): the probability (equal to β) associated with failing to reject a false null hypothesis.

Researcher's decision    H0 true                             H0 false
Accept the hypothesis    p(accept H0 | H0 true) = 1 − α      p(accept H0 | H0 false) = β
Reject the hypothesis    p(reject H0 | H0 true) = α          p(reject H0 | H0 false) = 1 − β (power)

α and β are inversely related. Decreasing one increases the other, for a fixed sample size.
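The trade-off between the two error probabilities can be made concrete with a short calculation. This sketch computes β for a left-tailed z-test at several values of α while the sample size stays fixed. The true mean of 11.9 oz, σ = 0.3 and n = 25 are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """beta for a left-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se  # reject H0 when x_bar < cutoff
    # Probability that the sample mean stays at or above the cutoff, so H0 is not rejected.
    return 1.0 - NormalDist(mu_true, se).cdf(cutoff)

alphas = (0.10, 0.05, 0.01)
# Hypothetical setting: H0 says 12 oz, but the true mean is 11.9 oz.
betas = [type_ii_error(12.0, 11.9, 0.3, 25, a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  beta = {b:.3f}  power = {1 - b:.3f}")
```

As α is tightened from 0.10 to 0.01, β grows, which is the inverse relationship stated above.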

Note

By rejecting H0, we are saying that the difference between the value of μ stated in H0 and the value of x̄ obtained from the sample is too large to have occurred because of sampling error alone. Consequently, this difference is real.

By not rejecting H0, we are saying that the difference between the value of μ stated in H0 and the value of x̄ obtained from the sample is small and may have occurred because of sampling error alone.

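Tailed Tests

Two-tailed test: the region of rejection falls equally within both tails of the sampling distribution.

One-tailed test: the alternative hypothesis is stated in such a way that the probability of making a Type I error is entirely in one tail of the sampling distribution.

Right-tailed test: the region of rejection is hypothesized to be at the right tail of the sampling distribution. A left-tailed test places it at the left tail.

Whether a test is two-tailed, right-tailed, or left-tailed is determined by the sign in the alternative hypothesis (≠, >, or <).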

Two-tailed test

Example: According to a survey conducted in 2008, a sample of sixth graders in schools carried backpacks that weighed an average of 18.4 pounds. A magazine wants to check whether or not this mean has changed since that survey.

H0: the mean weight has not changed, μ = 18.4

H1: the mean weight has changed, μ ≠ 18.4

Right-tailed test

Example: The average price of homes in a New Jersey town was $461,216 in 2007. Suppose a real estate researcher wants to check whether the current mean price of homes in this town is higher than $461,216.

H0: μ = $461,216

H1: μ > $461,216

Left-tailed test

Example: The company claims that its soft-drink cans, on average, contain 12 ounces of soda. However, if these cans contain less than the claimed amount of soda, then the company can be accused of cheating. Suppose a consumer agency wants to test whether the mean amount of soda per can is less than 12 ounces.

H0: μ = 12 ounces (the mean is equal to 12 ounces)

H1: μ < 12 ounces (the mean is less than 12 ounces)

Hypothesis tests

Type I and type II errors

Type I error: H0 rejected, when H0 is true.

Type II error: H0 not rejected, when H0 is false.

Significance level: α is the probability of committing a Type I error.

[Figure: rejection regions. One-sided test: area α in one tail. Two-sided test: area α/2 in each tail.]

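Example: Production

A machine is supposed to make cylinders with a diameter of 50 mm. The hypotheses are

H0: μ = 50 versus

HA: μ ≠ 50

where the null hypothesis states that the machine is calibrated correctly.

Example: Fuel efficiency

A car manufacturer claims that its cars achieve at least 35 miles per gallon in highway driving. The hypotheses are

H0: μ ≥ 35 versus

H1: μ < 35

where the null hypothesis states that the manufacturer's claim regarding the fuel efficiency of its cars is correct.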

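There are two ways to decide whether the sample mean supports the alternative hypothesis (H1):

The rejection region (critical value) method

The p-value method

The rejection region is a range of values such that, if the test statistic falls within that range, the null hypothesis is rejected in favour of the alternative hypothesis.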

Determine the test statistic to be used

Determine the critical value

Compare the test statistic with the critical value. Reject the null hypothesis if the test statistic falls beyond the critical value, in the direction of the alternative hypothesis.

Make an appropriate conclusion.

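Calculating Test Statistics

For one-sample tests, use the Z statistic if the population is normal and σ is known, or if the sample size is large.

For one-sample tests, use the T statistic if σ is not known (it is estimated by s) or if the sample size is small (less than 30).

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}, \qquad s_{\bar{X}} = \frac{s_x}{\sqrt{N}}, \qquad z_c = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$$

In the slide's worked example, $\bar{X} = 265$ and $z_c = 1.80$.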

Procedure

First we find the critical value(s) of z from the normal

distribution table for the given significance level.

Then we find the value of the test statistic z for the observed

value of the sample statistic.

Finally we compare these two values and make a decision.

Remember, if the test is one-tailed, there is only one critical value of z, and it is obtained by using the value of α, which gives the area in the left or right tail of the normal distribution curve depending on whether the test is left-tailed or right-tailed, respectively. However, if the test is two-tailed, there are two critical values of z, and they are obtained by using an area of α/2 in each tail of the normal distribution curve.
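The procedure above can be sketched in Python, with the standard library's `NormalDist.inv_cdf` standing in for the normal distribution table. The function names are illustrative. The test value z = 1.80 is the z_c that appears on the earlier slide; α = 0.05 is an assumed significance level.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical z value(s) for a 'left', 'right', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)          # area alpha in the left tail
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)      # area alpha in the right tail
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)        # area alpha/2 in each tail
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def rejects_h0(z, alpha, tail):
    """Compare the observed z with the critical value(s) and return the decision."""
    crit = critical_values(alpha, tail)
    if tail == "left":
        return z < crit[0]
    if tail == "right":
        return z > crit[0]
    return z < crit[0] or z > crit[1]

for tail in ("right", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)],
          "reject H0:", rejects_h0(1.80, 0.05, tail))
```

The same observed z can lead to different decisions depending on the tail: 1.80 exceeds the right-tailed critical value at α = 0.05 but not the two-tailed one.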

Mean (μ)

Proportion (p)

Problem: A used car dealer says that the mean price of a 1995 Ford F-150 Super Cab is at least $16,500. You suspect this claim is incorrect and find that a random sample of 14 similar vehicles has a mean price of $15,700 and a standard deviation of $1250. Is there enough evidence to reject the dealer's claim at α = 0.05?

Solution:

The claim is that the mean price is at least $16,500.

H0: μ ≥ $16,500 (Claim) and H1: μ < $16,500

There are d.f. = 14 − 1 = 13 degrees of freedom, and the critical value (from the t table) is t0 = −1.771.

The rejection region is t < −1.771. Using the t-test, the standardized test statistic is:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{15{,}700 - 16{,}500}{1250 / \sqrt{14}} \approx -2.39$$

The graph shows the location of the rejection region and the standardized test statistic, t. Because t is in the rejection region, you should decide to reject the null hypothesis. There is enough evidence at the 5% level of significance to reject the claim that the mean price of a 1995 Ford F-150 Super Cab is at least $16,500.

Problem: A company claims that the mean pH level of the water in a nearby river is 6.8. You randomly select 19 water samples and measure the pH of each. The sample mean and standard deviation are 6.7 and 0.24, respectively. Is there enough evidence to reject the company's claim at α = 0.05? Assume the population is normally distributed.

The claim is that the mean pH level is 6.8. So, the null and alternative hypotheses are:

H0: μ = 6.8 (Claim) and Ha: μ ≠ 6.8

Because the test is a two-tailed test, the level of significance α = 0.05 is split between the two tails. There are d.f. = 19 − 1 = 18 degrees of freedom, and the critical values are −t0 = −2.101 and t0 = 2.101. The rejection regions are t < −2.101 and t > 2.101. Using the t-test, the standardized test statistic is:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{6.7 - 6.8}{0.24 / \sqrt{19}} \approx -1.82$$

The graph shows the location of the rejection region and the standardized test statistic, t. Because t is not in the rejection region, you should decide not to reject the null hypothesis. There is not enough evidence at the 5% level of significance to reject the claim that the mean pH is 6.8.
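Both worked problems can be checked with a few lines of Python. This is a sketch, not the slides' own method: the helper name `t_statistic` is illustrative, and the critical values −1.771 and ±2.101 are copied from the slides rather than computed, because the standard library has no t distribution.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: t = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Used-car problem: left-tailed test, critical value -1.771 (df = 13) from the slides.
t_ford = t_statistic(15_700, 16_500, 1250, 14)
reject_ford = t_ford < -1.771

# pH problem: two-tailed test, critical values +/-2.101 (df = 18) from the slides.
t_ph = t_statistic(6.7, 6.8, 0.24, 19)
reject_ph = abs(t_ph) > 2.101

print(f"Ford: t = {t_ford:.2f}, reject H0: {reject_ford}")
print(f"pH:   t = {t_ph:.2f}, reject H0: {reject_ph}")
```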

t distribution table

Probability Values

Z statistic (obtained): the test statistic computed by converting a sample statistic (such as the mean) to a Z score. The formula for obtaining Z varies from test to test.

P value: the probability associated with the obtained value of Z.

In this procedure, we find a probability value such that a given null hypothesis is rejected for any α (significance level) greater than this value and is not rejected for any α less than this value.

In this approach, we calculate the p-value for the test, which is defined as the smallest level of significance at which the given null hypothesis is rejected.

Using this p-value, we state the decision. If we have a predetermined value of α, then we compare the value of p with α and make a decision.

Alpha (α): the level of probability at which the null hypothesis is rejected. It is customary to set alpha at the .05, .01, or .001 level.

What is normal body temperature? Is it actually 37.6°C (on average)?

State the null and alternative hypotheses

H0: μ = 37.6°C

Ha: μ ≠ 37.6°C

Data: random sample of n = 18 normal body temperatures (°C)

37.2  36.4  36.8  36.6  38.0  37.4
37.6  37.0  37.2  38.2  36.8  37.6
37.4  36.1  38.7  36.2  37.2  37.5

Variable      n    Mean    SD     SE      t0      P
Temperature   18   37.22   0.68   0.161   -2.38   0.029

$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where $s / \sqrt{n}$ is the standard error.

Degrees of freedom   Two-tailed probability (p value)
                     0.10      0.05      0.01
1                    6.314     12.706    63.657
5                    2.015     2.571     4.032
10                   1.813     2.228     3.169
17                   1.740     2.110     2.898
20                   1.725     2.086     2.845
24                   1.711     2.064     2.797
25                   1.708     2.060     2.787
∞                    1.645     1.960     2.576

The 0.05 column corresponds to an area of 0.025 in each tail.

Find the p-value

df = n − 1 = 18 − 1 = 17

From the t table: t(17, 0.025) = 2.11, so the rejection region is t < −2.11 or t > +2.11.

Calculated t0 = −2.38, with p-value = 0.029.

Since |t0| > 2.11, reject the null hypothesis.

Decide whether or not the result is statistically significant based on the p-value

Using α = 0.05 as the level of significance criterion, the results are statistically significant because 0.029 is less than 0.05. In other words, we can reject the null hypothesis.

We can conclude, based on these data, that the mean temperature in the human population does not equal 37.6°C.
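The body-temperature example can be reproduced from the 18 data values. The sketch below uses only the Python standard library, so the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule. That integration is an implementation choice made here, not the method used in the slides, which report software output.

```python
from math import gamma, pi, sqrt
from statistics import fmean, stdev

temps = [37.2, 36.4, 36.8, 36.6, 38.0, 37.4, 37.6, 37.0, 37.2,
         38.2, 36.8, 37.6, 37.4, 36.1, 38.7, 36.2, 37.2, 37.5]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area under the t density over [-|t|, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area over [0, |t|]
    return 1.0 - 2.0 * half_area

n = len(temps)
mean, sd = fmean(temps), stdev(temps)
se = sd / sqrt(n)                 # standard error of the mean
t0 = (mean - 37.6) / se           # test statistic under H0: mu = 37.6
p = two_sided_p(t0, df=n - 1)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.3f}, t0 = {t0:.2f}, p = {p:.3f}")
```

The statistic comes out negative because the sample mean lies below the hypothesized 37.6°C; the decision depends only on its magnitude in a two-sided test.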

Example using the p-value

Example: NYC Blackout Baby Boom

Data is births per day from two weeks in August 1966

Test against usual birth rate in NYC (430 births/day)

Formulate your hypotheses:

Need a Null Hypothesis and an Alternative Hypothesis

Calculate the test statistic:

Test statistic summarizes the difference between data

and your null hypothesis

Find the p-value for the test statistic:

How probable is your data if the null hypothesis is true?

Null hypothesis (H0):

no effect or no change in the population

Alternative hypothesis (Ha):

real difference or real change in the population

If there is a large discrepancy between data and null

hypothesis, then we will reject the null hypothesis

NYC dataset: μ = mean birth rate in Aug. 1966

Null hypothesis is that the blackout has no effect on birth rate, so August 1966 should be the same as any other month

H0: μ = 430 (usual birth rate for NYC)

Ha: μ ≠ 430

Test Statistic

The test statistic measures the difference between the observed data and the null hypothesis.

How many standard deviations is our observed sample value from the hypothesized value?

For our NYC dataset, the observed sample mean is 433.6 and our hypothesized mean is 430.

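p-value

The p-value is the probability of observing a sample value as extreme as ours, or more extreme, if our null hypothesis is true.

If the null hypothesis is true, then the test statistic T follows a standard normal distribution.

[Figure: standard normal curve with tail areas prob = 0.367 below T = −0.342 and prob = 0.367 above T = 0.342.]

If our alternative hypothesis was one-sided (Ha: μ > 430), then our p-value would be 0.367.

Since our alternative hypothesis was two-sided, our p-value is the sum of both tail probabilities (0.734).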
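The tail probabilities for the NYC example can be recomputed from the reported test statistic T = 0.342. This sketch assumes the standard normal reference distribution stated in the slides; the results may differ from the slide values of 0.367 and 0.734 in the third decimal place because of rounding.

```python
from statistics import NormalDist

def normal_p_values(t):
    """Return (upper-tail p, two-sided p) for a standard normal test statistic."""
    std_normal = NormalDist()
    upper = 1.0 - std_normal.cdf(t)                   # for Ha: mu > mu0
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(t)))  # for Ha: mu != mu0
    return upper, two_sided

upper, two_sided = normal_p_values(0.342)  # T as reported in the slides
print(f"one-sided p = {upper:.3f}, two-sided p = {two_sided:.3f}")
```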

Statistical Significance

If the p-value is smaller than α, we say the difference is statistically significant at level α

The α-level is also used as a threshold for rejecting the null hypothesis (most common α = 0.05)

If the p-value < α, we reject the null hypothesis that there is no change or difference

The p-value = 0.734 for the NYC data, so we cannot reject the null hypothesis at an α-level of 0.05

Difference between null hypothesis and our data is not statistically significant

Data do not support the idea that there was a different birth rate than usual for the first two weeks of August, 1966

Relationship between confidence intervals and two-sided hypothesis tests

A 100·C% confidence interval contains likely values for a population parameter, like the population mean μ

Interval is centered around the sample mean

Width of interval is a multiple of the standard error of the sample mean

An α-level hypothesis test rejects the null hypothesis that μ = μ0 if the test statistic T has a p-value less than α

If our confidence level C is equal to 1 − α, where α is the level of the hypothesis test, then we have the following connection between tests and intervals:

A two-sided hypothesis test rejects the null hypothesis (μ = μ0) if our hypothesized value μ0 falls outside the confidence interval for μ

So if we have already calculated a confidence interval for μ, then we can test any hypothesized value μ0 just by checking whether or not μ0 is in the interval!

For the NYC dataset, the test of the hypothesized population mean μ0 = 430 had a p-value of 0.734, so we did not reject the null hypothesis at an α-level of 0.05

We could have also calculated a 100(1 − α)% = 95% confidence interval for μ.

Since our hypothesized value μ0 = 430 is within that interval of likely values, we do not reject the null hypothesis.

If the hypothesis was μ0 = 410, then we would reject it!

Example: calcium intake of people below the poverty line

Null hypothesis is that calcium intake for people below the poverty line is not different from the RDA: μ0 = 850 mg/day

The test needs the population standard deviation of daily calcium intake. From a previous study, we know σ = 188 mg

What is the probability of observing a sample mean as extreme as (or more extreme than) 747 mg?

The alternative is two-sided, so we add the normal probabilities on both sides:

[Figure: standard normal curve with prob = 0.010 below T = −2.32 and prob = 0.010 above T = 2.32.]

Looking up the probability in the table, we see that the two-sided p-value is 0.010 + 0.010 = 0.02

Since the p-value is less than 0.05, we can reject the null hypothesis

Conclusion: people below the poverty line have significantly (at the α = 0.05 level) lower calcium intake than the RDA

We can also compute a confidence interval for the calcium intake of people below the poverty line

Use confidence level 100C = 100(1 − α) = 95%

95% confidence level means critical value Z* = 1.96

If a hypothesized value falls outside the 95% confidence interval, we can reject that hypothesis right away!
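The link between the two-sided test and the confidence interval can be sketched with the calcium numbers. The sample mean (747), σ (188) and μ0 (850) come from the slides. The sample size is not stated in the extracted text, so n = 18 below is an assumption, chosen only because it gives a test statistic close to the reported T = −2.32.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, mu0 = 747.0, 188.0, 850.0  # values given in the slides
n = 18        # ASSUMPTION: not stated in the slides; reproduces T of about -2.32
alpha = 0.05

se = sigma / sqrt(n)
z_star = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for a 95% interval
ci = (x_bar - z_star * se, x_bar + z_star * se)   # 95% confidence interval for mu

t = (x_bar - mu0) / se                            # z test statistic
p = 2 * (1 - NormalDist().cdf(abs(t)))            # two-sided p-value

reject_by_p = p < alpha
reject_by_ci = not (ci[0] <= mu0 <= ci[1])        # mu0 outside the interval
print(f"T = {t:.2f}, p = {p:.3f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print("reject by p-value:", reject_by_p, "| reject by interval:", reject_by_ci)
```

Under these assumptions both routes give the same decision, which is the equivalence the slides describe.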

real significance

the null hypothesis is true

a low p-value

we are not able to detect it

assumptions less realistic

We will try to address some of these problems next class
