
MAT20306 - Advanced Statistics

Lecture 1: Confidence intervals and hypothesis testing

Dr. Jos Hageman, Biometris, Wageningen University

Biometris
Quantitative Methods brought to Life
Today’s lecture

• Refresher session on
● Hypothesis testing: t-tests
● One sample t-test
● Paired t-test
● Two sample t-test
● Confidence intervals

• In detail: two sample t-test

Tape worms in sheep (Ex. 6.3, p. 298 O&L)
Researchers want to investigate the effectiveness of a new
drug for tape worms in the stomach of sheep.
A random sample of 24 sheep is randomly divided into
two groups.
One group receives the new drug, the other group
receives no treatment.
After 6 months, the lambs are slaughtered and the worms
are counted.

Setup

• We want to compare two populations:

population of sheep receiving the new drug,
population of sheep receiving no treatment.

• The sheep are the experimental units.

• The response is the number of tape worms in the sheep.

• We cannot look at all sheep, so we infer the difference between the
populations from two random samples, one from each population.

• We have one random sample of sheep receiving the new drug, and
another random sample of sheep receiving no treatment.
H0 and HA

• We want to compare the population means μ1 and μ2:

μ1 for the new drug,
μ2 for no treatment.

• The research hypothesis is that μ1 and μ2 are different; we expect
μ1 to be less than μ2.

• This will be the alternative hypothesis, so HA: μ1 < μ2.

• The null hypothesis will be H0: μ1 = μ2.

• When we reject H0 we say that we have shown (proven) that the
research hypothesis is true, i.e. we have shown that the number of
tape worms is lower in treated sheep than in untreated sheep.
Statistical model

• The samples are assumed to be from two normal populations:

y ~ N(μ1, σ²), for sheep receiving the new drug
y ~ N(μ2, σ²), for sheep receiving no treatment

• Responses y1, ..., y24 from the 24 sheep are independent
(random sample of sheep, randomly assigned to the groups).

• Note that equal variance for the two distributions is assumed.

• So, we assume:
- normality
- equal variance
- independence

The test statistic

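• The test statistic measures how well the data match up with H0:

$$t = \frac{\bar{y}_1 - \bar{y}_2}{se(\bar{y}_1 - \bar{y}_2)}$$

• Where the standard error is

$$se(\bar{y}_1 - \bar{y}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

and sp² is the pooled variance estimate:

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

• Very negative t suggests μ1 < μ2, so reject H0.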

Outcome of the test statistic in the sheep example

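sp = 14.11

$$t = \frac{26.58 - 39.67}{14.11 \sqrt{\frac{1}{12} + \frac{1}{12}}} = \frac{-13.09}{5.76} = -2.272$$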
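As a rough check of this outcome, the statistic can be recomputed from summary values. The sketch below takes the group means 26.58 and 39.67 and sp = 14.11 from this lecture and assumes group sizes of 12 and 12 (an even split of the 24 sheep, which agrees with df = 22). The function names are illustrative only.

```python
import math

def pooled_variance(s1, s2, n1, n2):
    """Pooled variance estimate sp^2 from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t(mean1, mean2, sp, n1, n2):
    """Two-sample t statistic using a pooled standard deviation sp."""
    se = sp * math.sqrt(1 / n1 + 1 / n2)  # standard error of mean1 - mean2
    return (mean1 - mean2) / se

# Sheep example: group 1 = new drug, group 2 = no treatment.
# n1 = n2 = 12 is an assumption (even split of the 24 sheep).
t = pooled_t(26.58, 39.67, sp=14.11, n1=12, n2=12)
print(round(t, 3))  # -2.272
```

When only the two sample standard deviations are available, `pooled_variance` gives sp² first; its square root is then passed to `pooled_t` as `sp`.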

The rejection region

[Figure: distribution of t under H0, with the left tail (area α) shaded green.]

The shaded area contains the outcomes of t that are rare under H0; usually α = 0.05.

Reject H0 when the outcome of test statistic t is too small (very
negative).

The green area forms the rejection region: the outcomes of t
that lead to rejection of H0.
The t distribution

William Gosset (13 June 1876 – 16 October 1937)
& Ronald Fisher (17 February 1890 – 29 July 1962)


To determine the rejection region we need to know the
distribution of t under H0, to decide which values of t are rare.

Under H0: t follows a t-distribution with df = 22.

In general, for sample sizes n1 and n2: df = n1 + n2 - 2.


The t-distribution and the rejection region

Rejection region: all outcomes of t smaller than –1.717

Outcome was -2.272, so H0 is rejected: we have shown
that the number of tapeworms is lower in the animals
treated with the new drug.

The P-value

P-value = probability under H0 for the outcome of test
statistic t and anything more extreme (supporting HA).

P-value = P(t ≤ -2.272), with t following a t-distribution with df = 22.

P-value < 0.05, so we reject H0.
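The tail probability can be approximated without statistical tables. The sketch below is one possible way to do it: it uses the standard density of the t-distribution and integrates it numerically with Simpson's rule. `t_pdf` and `t_cdf` are illustrative helper names, not part of the course material; in practice a statistics package would report this value directly.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(t <= x), via Simpson's rule on [0, |x|] plus symmetry around 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# One sided P-value for the sheep example (left tail, df = 22).
p_one_sided = t_cdf(-2.272, df=22)
print(p_one_sided < 0.05)  # True: H0 is rejected at alpha = 0.05
```

The same helper can be used to check a critical value: `t_cdf(-1.717, 22)` should come out very close to 0.05.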


One sided vs two sided

• We expect the drug to be better than no treatment at all, so:
HA: μ1 < μ2 (one sided HA)

• What if we had no expectation?
HA: μ1 ≠ μ2 (two sided HA)

two sided HA: critical region

[Figure: distribution of t under H0, with both tails shaded (red and green); each tail has area α/2.]

The two shaded areas contain the outcomes of t that are rare under H0;
usually α = 0.05.

Reject H0 when the outcome of test statistic t is too large (very
positive), or too small (very negative).

Together, the red and green areas form the rejection region:
the outcomes of t that lead to rejection of H0.
two sided HA: critical region

Rejection region:
all outcomes of t smaller than –2.074, and
all outcomes of t larger than +2.074.

CR: (-∞, –2.074) and (2.074, ∞)

two sided HA: p-value

P-value = probability under H0 for the outcome of test
statistic t and anything more extreme (supporting HA).

Left tail: P(t ≤ -2.272)

Right tail: P(t ≥ 2.272)

P-value = P(t ≤ -2.272) + P(t ≥ 2.272) = 2 * P(t ≥ 2.272)
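The definition of the two sided P-value (the probability under H0 of the observed outcome or anything more extreme in either tail) can also be illustrated by simulation. The sketch below is not the method used in the lecture: it draws many pairs of samples from one common normal population, so that H0 holds, and counts how often |t| is at least as large as the observed 2.272. Group sizes of 12 and 12, the number of repetitions, and the seed are assumptions.

```python
import math
import random

def pooled_t_from_samples(sample1, sample2):
    """Pooled two-sample t statistic computed from raw observations."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((y - mean1) ** 2 for y in sample1)
    ss2 = sum((y - mean2) ** 2 for y in sample2)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)  # pooled variance estimate
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def simulated_two_sided_p(t_obs, n1, n2, reps=10000, seed=1):
    """Fraction of simulated t values under H0 with |t| >= |t_obs|."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # Under H0 both groups come from the same normal population.
        s1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        s2 = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        if abs(pooled_t_from_samples(s1, s2)) >= abs(t_obs):
            extreme += 1
    return extreme / reps

p_two_sided = simulated_two_sided_p(-2.272, n1=12, n2=12)
print(p_two_sided)
```

The simulated value is only an approximation of the exact tail area, but it stays below 0.05, in line with the outcome -2.272 lying outside ±2.074.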
Equal variances or not?

What if the variances of the two populations are not equal?

Then we cannot pool the sample variances into sp.

Use a different expression for the standard error.

SPSS
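Equal variances (pooled sp):

$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad \text{two sided P-value} = 2\,P(t \ge |t_{obs}|)$$

Test for equality of variances:

$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_A: \sigma_1^2 \neq \sigma_2^2$$

Unequal variances:

$$t' = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad \text{two sided P-value} = 2\,P(t' \ge |t'_{obs}|)$$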
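To see how the two standard errors differ, the sketch below computes both statistics from the same summary values. The numbers in the example call are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative and are not SPSS output labels.

```python
import math

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """t statistic with pooled variance (equal variances assumed)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def unpooled_t(mean1, mean2, s1, s2, n1, n2):
    """t' statistic with separate variances (equal variances not assumed)."""
    return (mean1 - mean2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics: unequal spreads and unequal group sizes.
args = dict(mean1=10.0, mean2=12.0, s1=3.0, s2=4.0, n1=9, n2=16)
print(pooled_t(**args), unpooled_t(**args))
```

With equal group sizes the two statistics coincide; they diverge when both the group sizes and the sample variances differ, as in the hypothetical call above.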

General structure of t statistic

t = (estimate – value from H0) / standard error

• estimate is e.g. a difference between two sample means
• value from H0 is often zero (but not always).
• standard error is the estimated standard deviation of the estimator.

t-test will appear in this course in many situations, e.g.

• testing a slope in regression
• comparing means in ANOVA

Other t-tests: one sample

One sample t-test


one random sample, interest in single population mean μ.

Example: sample from population of Dutch people,


response y = daily salt consumption of a person (g),
μ = population mean for daily salt consumption of Dutch people.

one random sample of size n, interest in mean μ

H0: μ = μ0, with test statistic

$$t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$$

t-distr. df = (n - 1)
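A minimal sketch of the one sample statistic, using only Python's standard library. The salt intake values and the reference value of 6 g are hypothetical, made up purely for illustration; `one_sample_t` is an illustrative name.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for testing H0: mu = mu0 from one random sample."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    t = (statistics.mean(sample) - mu0) / se
    return t, n - 1

# Hypothetical daily salt consumption (g) of 8 people; mu0 = 6 is assumed.
salt = [8.1, 9.4, 7.6, 10.2, 8.8, 9.9, 7.9, 9.1]
t, df = one_sample_t(salt, mu0=6.0)
print(t, df)
```

The returned t is then compared with a t-distribution with n - 1 degrees of freedom, exactly as in the two sample case.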

Other t-tests: paired data
Paired t-test
one random sample, two (dependent) observations per unit
interest in difference between population means μ1, μ2

Example: sample of patients with blood pressure disorder
response y = blood pressure, measured before and after medication
μ1 and μ2 are the population means before and after medication

one random sample of size n, n pairs of observations, so 2n observations
interest in difference between means μ1 - μ2
calculate differences d within each pair

Apply the one-sample t-test to the differences d, with H0: μ1 - μ2 = 0, using

$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$

t-distr. df = (n - 1)
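The paired procedure reduces to a one sample calculation on the within-pair differences. The sketch below shows this with hypothetical blood pressure readings (the values are invented for illustration) and tests the difference against zero; `paired_t` is an illustrative name.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for paired data, testing a mean difference of zero."""
    if len(first) != len(second):
        raise ValueError("paired data need equally long lists")
    d = [a - b for a, b in zip(first, second)]  # differences within each pair
    n = len(d)
    se = statistics.stdev(d) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(d) / se, n - 1

# Hypothetical blood pressure before and after medication for 6 patients.
before = [152, 148, 160, 155, 149, 158]
after = [145, 146, 151, 150, 147, 149]
t, df = paired_t(before, after)
print(t, df)
```

Note that df is based on the number of pairs n, not on the 2n individual observations.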

Steps in testing
1. mention H0 and Ha
2. mention the test statistic
3. mention the behaviour of the test statistic,
e.g. under H0 a t distribution and large values expected under Ha,
4. mention the type of critical region (or rejection region, RR): left, right or two-sided

5. give the critical region for a given α (often α = 0.05), or 5alt. value of test statistic
6. give the value of the test statistic, or 6alt. determine the P-value
7. is this value in the critical region? or 7alt. P-value below α?

8. conclude whether H0 is rejected or not,


9. also formulate the conclusion in words:
H0 is (not) rejected, Ha is (not) proven,
it is (not) shown that … (Ha in words)

The alternative steps (5alt. to 7alt.) apply when, more commonly, the P-value is used, rather than the critical region.
A confidence interval (CI)

• A confidence interval is a range of “likely” values for a parameter.
• They are values that we want to consider with some “confidence”.
• The confidence level is often 0.95.

• The width of an interval reflects the precision:
narrow interval → precise estimate.

• Bounds of the interval are random (depend on the sample we draw).
• The interval is constructed in such a way that the probability
that it contains the parameter equals e.g. 0.95.

Confidence interval and t-test

• Confidence interval consists of all values for e.g. μ1 - μ2
that are likely on the basis of the data observed.
• These are all parameter values not rejected by the two sided t-test.
• This is something entirely different from the rejection
region!

Often a confidence interval has the following structure:

(estimate ± constant * standard error),

where the constant here comes from a t-distribution.
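That the interval bounds are random, and that an interval of this structure contains the true parameter in about 95% of samples, can be checked by simulation. The sketch below is an illustration under assumptions, not part of the lecture: two groups of 12 observations (so that the constant 2.074 for df = 22 applies), a common standard deviation of 14, and a true difference of -13 are all chosen arbitrarily.

```python
import math
import random
import statistics

def pooled_ci(sample1, sample2, constant):
    """Interval (estimate +/- constant * standard error) for mu1 - mu2."""
    n1, n2 = len(sample1), len(sample2)
    estimate = statistics.mean(sample1) - statistics.mean(sample2)
    sp2 = ((n1 - 1) * statistics.variance(sample1)
           + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return estimate - constant * se, estimate + constant * se

def coverage(true_diff, reps=3000, sigma=14.0, seed=1):
    """Fraction of simulated 0.95 intervals that contain true_diff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Group size 12 is fixed because 2.074 belongs to df = 22.
        s1 = [rng.gauss(true_diff, sigma) for _ in range(12)]
        s2 = [rng.gauss(0.0, sigma) for _ in range(12)]
        low, high = pooled_ci(s1, s2, constant=2.074)
        hits += low <= true_diff <= high
    return hits / reps

print(coverage(true_diff=-13.0))
```

Each simulated sample gives a different interval; the printed fraction should be close to the confidence level 0.95, up to simulation noise.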


Confidence interval for the sheep example

(estimate ± constant * standard error)

estimate = 26.58 - 39.67 = -13.09
constant = 2.074 (t-distr., df = 22)

0.95-confidence interval:
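The interval can be worked out from the values given in this lecture. The sketch below assumes group sizes of 12 and 12 and uses sp = 14.11 to obtain the standard error; `confidence_interval` is an illustrative helper name.

```python
import math

def confidence_interval(estimate, constant, standard_error):
    """Return (lower, upper) for estimate +/- constant * standard_error."""
    margin = constant * standard_error
    return estimate - margin, estimate + margin

estimate = 26.58 - 39.67                         # difference of sample means
se = 14.11 * math.sqrt(1 / 12 + 1 / 12)          # assumes n1 = n2 = 12
low, high = confidence_interval(estimate, 2.074, se)
print(round(low, 2), round(high, 2))  # -25.04 -1.14
```

The value 0 lies outside this interval, which agrees with the two sided test: the outcome -2.272 falls in the critical region beyond -2.074.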

Confidence interval for the sheep example

(estimate ± constant * standard error)

0.95-CI: (-25.04, -1.14)

Are the following null hypotheses rejected or not?
