t-test

Central Tendency & Dispersion


Independent 2 Samples t-Test


When to use the independent samples t-test

The independent samples t-test is one of the most widely used tests in statistics.

It is used to compare the means of two separate (independent) groups.

In the social sciences, these groups are often formed by randomly assigning research participants to conditions.

However, this test can also be used to explore differences between naturally occurring groups.

For example, we may be interested in differences in emotional intelligence between males and females.

Distribution of Differences Between Means

$H_0: \mu_1 = \mu_2$

$H_1: \mu_1 \neq \mu_2$

Variance Sum Law: The variance of the sum or difference of two independent variables is equal to the sum of their variances.


The t statistic
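The equation on this slide is not reproduced in the text. A standard form that follows from the variance sum law, assuming the sample variances $s_1^2$ and $s_2^2$ stand in for the unknown population variances, would be:

```latex
\sigma^2_{\bar{X}_1 - \bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
\qquad\Rightarrow\qquad
t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

Under the null hypothesis $\mu_1 - \mu_2 = 0$, so the numerator reduces to the difference between the two sample means.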


Pooling Variances

The previous equation is appropriate when the sample sizes are equal, but it can be improved for unequal sample sizes.

The pooled-variance equation provides a better estimate of the population variance.

One of the assumptions of the t-test is that the population variances are equal (homogeneity of variance).


Pooling Variances

If we want a better estimate of $\sigma^2$ from our two sample estimates, namely $s_1^2$ and $s_2^2$, it seems appropriate to take an average of these two values.

But a simple average is not suitable when the sample sizes differ, because it gives equal weight to both values. Instead, each variance is weighted by its degrees of freedom.
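A minimal sketch of this weighting, assuming the usual degrees-of-freedom weights. The function name `pooled_variance` and the numbers are hypothetical, chosen only to show how the larger sample dominates:

```python
def pooled_variance(var1: float, n1: int, var2: float, n2: int) -> float:
    """Average two sample variances, weighting each by its degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * var1 + df2 * var2) / (df1 + df2)


# Hypothetical values: group 1 is three times larger in degrees of freedom.
simple_average = (10.0 + 20.0) / 2
pooled = pooled_variance(10.0, 31, 20.0, 11)
print(simple_average, pooled)
```

The pooled value sits closer to the variance of the larger group, whereas the simple average treats both groups alike. With equal sample sizes the two results coincide.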

11/19/2012

The t equation again
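The equation on this slide is missing from the text. Assuming it is the usual pooled-variance version, with $s_p^2$ denoting the pooled variance (a symbol introduced here for illustration), it would read:

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
```

When $n_1 = n_2$, this gives the same value as the unpooled form.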


Degrees of freedom (df)

$df = (n_1 - 1) + (n_2 - 1)$


Example 1

Suppose we assign two groups to 2 different types of diet:

nasi lemak diet

teh tarik diet

Subjects are randomly assigned to the nasi lemak group and the teh tarik group for one week.

This may be unethical, because nasi lemak must be eaten together with teh tarik!

But this is only an example.

Example 1 (cont.)

At the end of the week, we measure the change in body weight.

Which diet causes the greater weight gain?

So the null hypothesis is:

H0: weight gain on the nasi lemak diet = weight gain on the teh tarik diet

Example 1 (cont.)

Why?

The null hypothesis is the opposite of what we hope to find.

In this case, our research hypothesis is that there ARE differences between the 2 diets.

Therefore, our null hypothesis is that there are NO differences between these 2 diets.


6 Steps of Hypothesis Testing

1. State the hypotheses

2. Set alpha (α)

3. Perform the calculations

4. Obtain the critical value

5. Sketch the rejection region for the null hypothesis

6. Make a decision and write the conclusion


Formula

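The formula for the independent samples t-test is:

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$, $df = (n_1 - 1) + (n_2 - 1)$

where $s_p^2$ is the pooled variance of the two samples.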

Example 1 (cont.)

The first step in calculating the independent samples t-test is to calculate the variance and mean in each condition.

In the previous example, there are a total of 10 people, with 5 in each condition.

Since there are different people in each condition, these samples are independent of one another, giving rise to the name of the test.


Example 1 (cont.)

The variances and means are calculated separately for each condition (nasi lemak and teh tarik).

In short, we take each observed weight gain in the nasi lemak condition, subtract the mean gain of the nasi lemak dieters, and square the result. The squared deviations are then summed and divided by n - 1 to give the variance. The same is done for the teh tarik condition.


Example 1 (cont.)

                  X1: nasi lemak   X2: teh tarik   Column 3   Column 4
                  1                3               1          1
                  2                4               0          0
                  2                4               0          0
                  2                4               0          0
                  3                5               1          1
Mean              2                4
Variance          0.5              0.5

Column 3 = $(X_1 - \bar{X}_1)^2$, Column 4 = $(X_2 - \bar{X}_2)^2$

$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
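The table can be checked with a short script. This is only a sketch using Python's standard library; the variable names are illustrative and not part of the slides:

```python
from statistics import mean, variance

nasi_lemak = [1, 2, 2, 2, 3]   # X1
teh_tarik = [3, 4, 4, 4, 5]    # X2

for name, gains in (("nasi lemak", nasi_lemak), ("teh tarik", teh_tarik)):
    m = mean(gains)
    # Columns 3 and 4 of the table: squared deviations from the group mean.
    squared_deviations = [(x - m) ** 2 for x in gains]
    # Sample variance: sum of squared deviations divided by n - 1.
    s2 = sum(squared_deviations) / (len(gains) - 1)
    print(name, m, squared_deviations, s2)
```

The hand-computed `s2` matches `statistics.variance`, which also divides by n - 1.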


Example 1 (cont.)

From the previous calculations, we have everything needed to find t:

$t = \dfrac{2 - 4}{\sqrt{0.5 \left( \dfrac{1}{5} + \dfrac{1}{5} \right)}} = -4.47$, $df = (5 - 1) + (5 - 1) = 8$

After calculating the t value, we need to know if it is large enough to reject the null hypothesis.
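The same calculation can be sketched in code, assuming the pooled-variance form of the t statistic. Variable names are illustrative:

```python
from math import sqrt
from statistics import mean, variance

nasi_lemak = [1, 2, 2, 2, 3]   # X1
teh_tarik = [3, 4, 4, 4, 5]    # X2

n1, n2 = len(nasi_lemak), len(teh_tarik)
df = (n1 - 1) + (n2 - 1)

# Pool the two sample variances, weighting each by its degrees of freedom.
pooled_var = ((n1 - 1) * variance(nasi_lemak) + (n2 - 1) * variance(teh_tarik)) / df
standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (mean(nasi_lemak) - mean(teh_tarik)) / standard_error
print(f"t({df}) = {t:.2f}")   # prints "t(8) = -4.47"
```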


Some theory

The t is calculated under the assumption, called the null hypothesis, that there are no differences between the nasi lemak and teh tarik diets.

If this were true, when we repeatedly sample 10 people from the population and put them in our 2 diets, most often we would calculate a t close to 0.


Some theory - Why?

Look again at the formula for the t.

Most often the numerator ($\bar{X}_1 - \bar{X}_2$) will be close to 0, because the means of the two conditions should be the same under the null hypothesis.

That is, weight gain is the same under both the nasi lemak and teh tarik diets.


Some theory - Why (cont.)

Sometimes the weight gain might be a bit higher under the nasi lemak diet, leading to a positive t value.

In other samples of 10 people, weight gain might be a little higher under the teh tarik diet, leading to a negative t value.

The important point, however, is that under the null hypothesis we should expect that most t values that we compute are close to 0.


Some theory (cont.)

Our computed t-value is not 0; in fact, it is negative (t(8) = -4.47).

Although the t-value is negative, this should not bother us.

Remember that the t-value is negative only because we named the nasi lemak diet $X_1$ and the teh tarik diet $X_2$.

This is, of course, completely arbitrary.

If we had reversed our order of calculation, with the nasi lemak diet as $X_2$ and the teh tarik diet as $X_1$, then our calculated t-value would be positive 4.47.


Example 1 (again) Calculations

The calculated t-value is 4.47 (notice, I've eliminated the unnecessary minus sign), and the degrees of freedom are 8.

In the research question we did not specify which diet should cause more weight gain, therefore this t-test is a so-called 2-tailed t.


Example 1 (again) Calculations

In the last step, we need to find the critical value for a 2-tailed t with 8 degrees of freedom.

This is available from tables found in the back of most statistics textbooks.

Look in the back for "Critical Values of the t-distribution", or something similar.

The value you should find (for α = 0.05) is:

C.V. t(8), 2-tailed = 2.31.
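If no table is at hand, the critical value can also be computed. The sketch below assumes SciPy is installed and that α = 0.05, as in the reported result; it is an alternative to the table lookup, not part of the original slides:

```python
from scipy import stats

alpha = 0.05
df = 8

# Two-tailed test: put alpha / 2 in each tail of the t distribution.
critical_value = stats.t.ppf(1 - alpha / 2, df)
print(f"C.V. t({df}), 2-tailed = {critical_value:.2f}")
```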


Example 1 (cont.)

The calculated t-value of 4.47 is larger in magnitude than the C.V. of 2.31, therefore we can reject the null hypothesis.

Even for the results section of a journal article, this language is a bit too formal and general. It is more important to state the research result, namely:

Participants on the teh tarik diet (M = 4.00) gained significantly more weight than those on the nasi lemak diet (M = 2.00), t(8) = 4.47, p < 0.05.


Example 1 (concluding comment)

Repeat from previous slide:

Participants on the teh tarik diet (M= 4.00)

gained significantly more weight than those

on the nasi lemak diet (M= 2.00), t(8) = 4.47,

p < 0.05.

Making this conclusion requires inspection of

the t tables.


Example 2

IQ scores after training were recorded for students in a special class (smart students) and a normal class.

            Special Class   Normal Class
Mean        24.0            16.5
Variance    148.87          139.16
n           35              29
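As a sketch of how these summary statistics feed the t-test, the pooled-variance calculation is shown below. It assumes equal population variances, which the F max test on the next slide is meant to check. Variable names are illustrative:

```python
from math import sqrt

mean_special, var_special, n_special = 24.0, 148.87, 35
mean_normal, var_normal, n_normal = 16.5, 139.16, 29

df = (n_special - 1) + (n_normal - 1)

# Unequal sample sizes, so weight each variance by its degrees of freedom.
pooled_var = ((n_special - 1) * var_special + (n_normal - 1) * var_normal) / df
standard_error = sqrt(pooled_var * (1 / n_special + 1 / n_normal))

t = (mean_special - mean_normal) / standard_error
print(f"t({df}) = {t:.2f}")
```

The resulting t is then compared with the two-tailed critical value for 62 degrees of freedom.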

The F Max Test

Test for differences in variances.

Assumptions:

Data sampled randomly

Data are normally distributed


The F Max Test

State the hypotheses.

df = n - 1

k = number of groups

Compute $F_{max}$ and compare it with your critical $F_{max}$.

Draw a conclusion.

Should the pooled variance be used or not?
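A minimal sketch of the F max statistic for Example 2, assuming the usual definition (largest sample variance divided by the smallest). The critical value still has to be looked up in an F max table; it is not computed here:

```python
variances = {"special class": 148.87, "normal class": 139.16}

k = len(variances)  # number of groups
f_max = max(variances.values()) / min(variances.values())
print(f"k = {k}, F max = {f_max:.2f}")
```

If the computed F max is smaller than the critical F max, the variances are treated as homogeneous and the pooled variance can be used.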


95% Confidence Interval
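The equation on this slide is missing from the text. Assuming the usual interval for a difference between two means (difference ± critical t × standard error), a sketch using the Example 1 numbers and the critical value 2.31 quoted for df = 8 would be:

```python
from math import sqrt

mean_1, mean_2 = 2.0, 4.0   # nasi lemak, teh tarik
var_1, var_2 = 0.5, 0.5
n_1, n_2 = 5, 5
t_critical = 2.31           # two-tailed critical value for df = 8, from the t table

df = (n_1 - 1) + (n_2 - 1)
pooled_var = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / df
standard_error = sqrt(pooled_var * (1 / n_1 + 1 / n_2))

diff = mean_1 - mean_2
lower = diff - t_critical * standard_error
upper = diff + t_critical * standard_error
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

Because the interval does not contain 0, it leads to the same decision as the t-test: reject the null hypothesis.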


Example 3: Two-Sample t, Empathy by College Major

Suppose we have a professionally developed test of empathy. The test has people view film clips and guess what the people in the clips are feeling. Scores come from comparing what people guess to what the people in the films said they felt at the time. We want to know whether Psychology majors and Physics majors differ in their average scores on this test. We predict no direction; we just want to know if there is a difference. So we find 15 students of each major (n = 15 per group) and give each the test. The results look like this:

Empathy Scores

Person Psychology Physics

1 10 8

2 12 14

3 13 12

4 10 8

5 8 12

6 15 9

7 13 10

8 14 11

9 10 12

10 12 13

11 10 8

12 12 14

13 13 12

14 10 8

15 8 12
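The slides run this analysis in SPSS. As a rough equivalent for checking hand calculations, the sketch below assumes SciPy is available and uses its pooled-variance t-test. It is not the SPSS output itself:

```python
from scipy import stats

psychology = [10, 12, 13, 10, 8, 15, 13, 14, 10, 12, 10, 12, 13, 10, 8]
physics = [8, 14, 12, 8, 12, 9, 10, 11, 12, 13, 8, 14, 12, 8, 12]

# equal_var=True requests the pooled-variance (Student) t-test; two-sided by default.
result = stats.ttest_ind(psychology, physics, equal_var=True)
df = (len(psychology) - 1) + (len(physics) - 1)
print(f"t({df}) = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```

A p-value above 0.05 would mean the data give no evidence of a difference in mean empathy between the two majors.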

SPSS Output

Check your answers

Now Let's Use SPSS to Run Our Analysis

Example 1 Using SPSS

As long as this p-value falls below the

standard of 0.05, we can declare a

significant difference between our mean

values.

Since .002 is below .05 we can conclude:

Participants on the teh tarik diet (M= 4.00)

gained significantly more weight than

those on the nasi lemak diet (M= 2.00),

t(8) = 4.47, p < 0.01 (two-tailed).


Example 1 Using SPSS (cont.)

Repeat from previous slide:

Participants on the teh tarik diet (M = 4.00) gained significantly more weight than those on the nasi lemak diet (M = 2.00), t(8) = 4.47, p < 0.01 (two-tailed).

In APA style we normally report significance to only 2 decimal places.

Therefore, the probability of .002 is reported as p < 0.01, which is the smallest threshold that can be expressed at this level of precision.


Example 1 Using SPSS (cont.)

The SPSS output also displays Levene's Test for Equality of Variances (see the first 2 columns in the second table on slide 30).

Why?

Strictly speaking, the t-test is only valid if we have approximately equal variances within each of our two groups.

In our example, this was not a problem because the 2 variances were exactly equal (variance nasi lemak = 0.5 and variance teh tarik = 0.5).


Example 1 Using SPSS (cont.)

However, if Levene's test is significant, meaning that the p-value given is less than 0.05, then we should use the bottom line of the SPSS output (equal variances not assumed) when interpreting our results.

This bottom line makes slight adjustments to the t-test to account for unequal variances in the two conditions.
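The same decision rule can be sketched outside SPSS. The example below assumes SciPy is available and reuses the empathy scores from Example 3. `stats.levene` and the `equal_var` switch of `stats.ttest_ind` are real SciPy calls, but the workflow is only an illustration of the rule, not a reproduction of the SPSS output:

```python
from scipy import stats

psychology = [10, 12, 13, 10, 8, 15, 13, 14, 10, 12, 10, 12, 13, 10, 8]
physics = [8, 14, 12, 8, 12, 9, 10, 11, 12, 13, 8, 14, 12, 8, 12]

# center="mean" gives the original Levene statistic; SciPy's default is "median".
levene = stats.levene(psychology, physics, center="mean")
equal_variances = bool(levene.pvalue >= 0.05)

# equal_var=False requests Welch's t-test, which does not pool the variances.
result = stats.ttest_ind(psychology, physics, equal_var=equal_variances)
row = "equal variances assumed" if equal_variances else "equal variances not assumed"
print(f"Levene p = {levene.pvalue:.3f}; using '{row}': t = {result.statistic:.2f}")
```

With equal group sizes the two versions give the same t value and differ only in their degrees of freedom.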


END.


Dependent 2 Samples t-Test

Introduction

So what if we have two related data sets?

Pre- and post-test data?

Level of love felt by a husband and wife?

Repeated measures

Matched/related samples

Twins, husband-wife, father-son, mother-daughter, mother-son

Two scores for one case.


When the dependent samples test is used

When comparing matched samples or repeated scores.

Instead of the raw scores, we use the DIFFERENCE SCORE (D).


The t equation again
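The equation on this slide is missing from the text. Assuming the usual form for difference scores, with $\bar{D}$ the mean difference, $s_D$ the standard deviation of the differences and $n$ the number of pairs, it would read:

```latex
t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}, \qquad df = n - 1
```

The 0 in the numerator is the mean difference expected under the null hypothesis.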


Degrees of freedom (df)

df = number of pairs - 1


Example 1

A study of a therapy for anorexia was conducted. The sample consisted of 17 girls. Body weight was recorded before and after undergoing the therapy. The data are as follows:

        Before   After   Diff Score
Mean    83.23    90.49   7.26
S       5.02     8.48    7.16
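A sketch of the calculation from these summary statistics, assuming the usual dependent samples formula (mean difference divided by its standard error). The critical value 2.12 is the standard two-tailed table value for df = 16 at α = 0.05 and is not taken from the slides:

```python
from math import sqrt

mean_diff = 7.26   # mean of the difference scores (after - before)
sd_diff = 7.16     # standard deviation of the difference scores
n_pairs = 17

df = n_pairs - 1
standard_error = sd_diff / sqrt(n_pairs)
t = mean_diff / standard_error

t_critical = 2.12  # assumed: t table, df = 16, two-tailed, alpha = 0.05
print(f"t({df}) = {t:.2f}, reject H0: {abs(t) > t_critical}")
```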

6 Steps of Hypothesis Testing

1. State the hypotheses

2. Set alpha (α)

3. Perform the calculations

4. Obtain the critical value

5. Sketch the rejection region for the null hypothesis

6. Make a decision and write the conclusion

Hypotheses

Set alpha

α = 0.05

Perform the Calculations

Obtain the Critical Value

Sketch the rejection region for the null hypothesis and make a decision.

Report your results

Exercise 1

Subject Before After Diff score

A 10 14

B 15 13

C 12 15

D 11 12

Mean

S
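To check a hand-worked answer to this exercise, a small script such as the following could be used. It assumes difference scores are taken as after minus before, matching the anorexia example, and the variable names are illustrative:

```python
from math import sqrt
from statistics import mean, stdev

before = [10, 15, 12, 11]   # subjects A-D
after = [14, 13, 15, 12]

# Difference score D for each subject.
diff = [a - b for a, b in zip(after, before)]

mean_diff = mean(diff)
sd_diff = stdev(diff)       # sample standard deviation (divides by n - 1)
df = len(diff) - 1
t = mean_diff / (sd_diff / sqrt(len(diff)))
print(diff, mean_diff, round(sd_diff, 2), df, round(t, 2))
```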


