
Testing the difference of means

(T-test)
Dr. Lloyd C. Bautista
BOL Plebiscite
Have you ever wondered about the
DIFFERENCES OF MEANS between
the groups voting in the BOL
plebiscite?
Muslims vs Non-Muslims; Male
vs Female; Old vs. Young; or
Working vs. Non-working.
 We collect sample data from the same population
and determine whether to accept or reject
the null hypothesis (H0).
 Remember, the null hypothesis (H0) states that (μA
= μB): “NO DIFFERENCE… if there are any differences,
discrepancies, or suspiciously outlying results, they
are purely due to sampling error.”
 The alternative hypothesis (Ha) states
that (μA ≠ μB): “YES, THERE IS A DIFFERENCE… the difference
between samples is too large to ignore and is
statistically significant.”
Comparing two means
• Is there a difference between the annual incomes of college graduates
and graduate-degree holders?
• Does compensation prior to making decisions have an effect on
the corruption index?
• Is the training program effective in increasing residents before and
after the program?
 Let us go back to BOL.
 There are two sample groups drawn from the same
voters’ population: 30 Female respondents (F)
were evaluated with respect to their extent of
support for the BOL vis-à-vis 30 Males (M).
 If the mean score of Females (x̄F) is 45 while that of Males (x̄M) is
40, there is a difference of 5.
 The question is whether the difference of
5 is statistically significant enough to reject the NULL
HYPOTHESIS. Put another way: is the difference between
the sample means large enough to indicate a true
difference in the population?
 We could draw thousands of pairs of samples (M vs F) to obtain a
frequency distribution of the differences, but this is inefficient.
Testing our Hypothesis
 Null hypothesis (H0) means that (μA = μB): “THERE IS NO
DIFFERENCE BETWEEN THE POPULATION MEANS OF
FEMALE AND MALE VOTERS… if there are any
differences, discrepancies, or suspiciously outlying results, they are
purely due to sampling error.”
 Alternative hypothesis (Ha) means that (μA ≠ μB): “THERE IS A
DIFFERENCE: the difference between the sample means is
too large to ignore and is statistically significant.”
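TABLE A. Percentage of Area under the Normal Curve
z       Area between Mean and z    Area beyond z
2.50    49.38                      0.62

$$z = \frac{\bar{x}_F - \bar{x}_M}{\sigma_{\bar{x}_F - \bar{x}_M}} = \frac{45 - 40}{2} = +2.5$$

$\bar{x}_F$ = Mean of the Female
$\bar{x}_M$ = Mean of the Male
$\sigma_{\bar{x}_F - \bar{x}_M}$ = Standard deviation of the sampling distribution of differences between means

[Figure: normal curve centered at 0 with z = 2.50 marked; 49.38% of the area lies between the mean and z, and .62% lies beyond z in each tail.]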
With a Z score = +2.5
 A mean difference of 5 or more (in either direction) has a total probability of 1.24% (= .62% + .62%) if the null hypothesis is true.
 That is roughly 1 chance out of 100.
 Without an agreed cut-off, we cannot yet rule out that the difference of means between Female and Male voters is just a matter of sampling error.
 If WE FAIL TO REJECT the null hypothesis (retain the Ho), we might also commit an error, since these are just samples.
[Figure: normal curve labeled "Retain the Ho" with z = 2.50 marked; 49.38% of the area on each side between the mean and z, and .62% in each tail.]

 Suppose we set a level of significance of 5% (2.5% + 2.5%).
 WE can now REJECT the null hypothesis, since z = 2.50 is farther from the mean than the critical value (z critical = 1.96) at the .05 level of significance.
 The Z score of 2.50 falls within the rejection area.
 We will explain this further…
[Figure: normal curve labeled "Reject the Ho"; a 2.50% rejection area in each tail beyond z critical = 1.96; z = 2.50 falls inside the right-hand rejection area, with .62% of the area beyond it.]
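The z calculation and the table look-up above can be reproduced with a short sketch. It assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original slides.

```python
from statistics import NormalDist

mean_female, mean_male = 45.0, 40.0
se_difference = 2.0  # standard error of the difference, as used on the slide

# z = difference between sample means / standard error of the difference
z = (mean_female - mean_male) / se_difference

std_normal = NormalDist()  # standard normal curve: mean 0, standard deviation 1
area_beyond_z = 1.0 - std_normal.cdf(abs(z))       # one tail (Table A: "Area beyond z")
p_two_tailed = 2.0 * area_beyond_z                 # both tails added together
z_critical = std_normal.inv_cdf(1.0 - 0.05 / 2.0)  # cut-off for a 5% two-tailed test

reject_h0 = abs(z) > z_critical
print(f"z = {z:.2f}, area beyond z = {area_beyond_z:.2%}, two-tailed p = {p_two_tailed:.2%}")
print(f"z critical = {z_critical:.2f}, reject Ho: {reject_h0}")
```

The one-tail area matches the .62% read from Table A, and comparing |z| with the critical value is the same decision as checking whether z falls in the rejection area.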
Probability (sampling distribution) is
analogous to frequency distribution
Sampling distribution is like a frequency distribution. We know that
frequency distribution in large numbers becomes a NORMAL CURVE.
Thus, no need to get thousands of pairs.
On a normal curve, the null hypothesis (Ho) becomes more and
more unlikely as we move farther away from the mean difference
(0).
Normal curve
Sometimes called normal distribution.
Mean, mode and median are all equal.
The normal curve is symmetrical around the center (the mean).
Standard deviation (σ) is the spread of the distribution.
A smaller σ indicates that the data are clustered around the mean and the normal curve is steeper. A larger σ indicates that the data are spread out and the curve is flatter.
The total area under the standard normal curve is 100% of cases, or 1.
[Figure: normal curve with the horizontal axis marked −3σ, −2σ, −1σ, 0, +1σ, +2σ, +3σ; 68.26% of the area lies within ±1σ, 95.44% within ±2σ, and 99.74% within ±3σ; 2.5% is marked in each tail; total area 100% or 1.]
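As a rough check of the areas quoted for the normal curve, the sketch below computes the share of cases within ±1σ, ±2σ and ±3σ. It assumes the standard-library `statistics.NormalDist`; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

def area_within(k: float) -> float:
    """Proportion of the area under the standard normal curve between -k and +k."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Percentage of cases within 1, 2 and 3 standard deviations of the mean
areas = {k: round(100 * area_within(k), 2) for k in (1, 2, 3)}
print(areas)
```

Printed tables such as Table A round each half of the curve first (for example 34.13% + 34.13% = 68.26%), so the last digit can differ slightly from the values computed here.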
Rules
[Figure: right half of the normal curve; 47.72% of the area lies between the mean and Z = +2.0, 2.28% lies beyond Z = +2.0, and 0.62% lies beyond Z = +2.5; labeled "Standard of Error = 4.4%".]

The rules are simple: REMEMBER THIS.

• If Z is far away from zero (0), the probability that such a difference arises from sampling error alone is LOW, and therefore we can
reject the NULL HYPOTHESIS. The difference between means is too LARGE to ignore.
• If Z is close to zero (0), the probability that the difference arises from sampling error alone is HIGH. We
retain the NULL HYPOTHESIS. This means that the difference in means between the
samples can be a result of sampling error.
• Note the critical areas for rejection.
We want to ‘test’ the DIFFERENCE
BETWEEN MEANS of two independent
population samples?
T-test
 Two samples are collected from the same population, and their sample means
are computed.
 The difference between means can be slim or large.
 The assumption is that they are equal (Ho= no effect) since they come from
the same population
 An example is a new cancer drug that improves life expectancy.
 The control group lived for more than 5 years, while the experimental group
lived for more than 6 years.
 The result might show that the drug was effective.
 The T-test indicates whether the results are likely to be repeatable in the population or merely due to chance.
T-test
 We compare the DIFFERENCE between the sample means we
collected to the DIFFERENCE between the sample means we expect to
get if Ho was true.
 If the DIFFERENCE is statistically large, we reject the Ho
and accept the Ha.
Standard error of the Sample Means vs.
Standard error of the Sample Means difference
Ages of Male workers vs Female workers
Male Female

23 23
25 25
21 +2 21
-1
22 22
22 22
19 19
+3

Standard error of the Sample Means = standard deviation of the Sample Means from the Population Mean
Standard error of the Sample Means difference = standard deviation of the Sample Means DIFFERENCE from the Population Mean DIFFERENCE
Case study
 Remember your survey forms. We want to test the difference of
means between two barangays with respect to their perception on
item 11.a. “The presence and visibility of law enforcers in the
community is adequate.” (Rate the perception with 1 as the least and
10 as the highest.)
We want to know if the difference between the means of perception
on item 11.a. on DETERRENCE is significant or not.
 Null hypothesis (H0) means that (μA = μB): “THERE IS NO DIFFERENCE BETWEEN
THE MEANS… if there are any differences, discrepancies, or
suspiciously outlying results, they are purely due to
sampling error.”
 Alternative hypothesis (Ha) means that (μA ≠ μB): “THERE IS A
DIFFERENCE: the difference between the sample means
is too large to ignore and is statistically significant.”
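                 Brgy Upper Hills    Brgy Lower Hills
N                10                  10
Mean             7.8                 4
s2 (variance)    0.56                1.8
s (stdev)        0.75                1.3

$$\bar{x} = \frac{\sum fx}{N} \quad \text{(mean)}$$

$$s^2 = \frac{\sum (x - \bar{x})^2}{N} \quad \text{(variance)}$$

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} \quad \text{(standard deviation)}$$
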
 Go to 103_T-Test_visibility_11a
Find the STANDARD ERROR OF THE DIFFERENCE between two Sample means (pooled)

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\left(\frac{10(.56) + 10(1.8)}{10 + 10 - 2}\right)\left(\frac{10 + 10}{(10)(10)}\right)} = \sqrt{(1.311)(0.20)} \approx 0.512$$

Find the t ratio by dividing the difference between means ($\bar{x}_1 - \bar{x}_2$) by the standard error of difference between means

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{7.8 - 4}{0.512} \approx 7.42$$

Here $\bar{x}_1 - \bar{x}_2$ is the difference between means (7.8 and 4). This is the T test of difference between means of independent samples.

Find df
df = N1 + N2 − 2
df = 10 + 10 − 2
df = 18

 The calculated t (two-tailed test) is 7.42 with degrees of freedom = 18 and α = .05.
 Since the calculated t is MORE THAN the t critical value of 2.101 from the table, WE REJECT THE NULL HYPOTHESIS, which
means the difference between the means is too STATISTICALLY SIGNIFICANT to ignore.
 See the graph below.
TABLE C. Critical Values of t
df    Level of significance for two-tailed test (α) = 0.05
18    2.101

[Figure: t distribution centered at 0, labeled "Reject the Ho"; rejection area of α/2 = 2.5% in each tail beyond t critical = 2.101, with 47.50% between the mean and t critical; t calculated = 7.42 falls inside the right-hand rejection area.]
Workshop 1
 Go back to your survey forms. We want to test the difference of
means between two barangays, Upper Hills and Lower Hills, with
respect to their perception on item 12.b. “Law enforcers respect
human rights and due process.” (Rate the perception with 1 as the
least and 10 as the highest.)
We want to know if the difference between the means of perception
on item 12.b is significant or not.

 Go to 103_T-Test Workshop_human rights_12b


PNP (N1 = 10)                        BUREAU OF FIRE
Respondent     X1      X1²           Respondent    X2      X2²
Ignacio        5.5                   Martinez      5.2
Dela Cruz      5.6                   Adique        5.8
Evangelista    5.8                   Zebago        5.6
Mati           5.9                   Baguilad      5.7
Aberin         5.6                   Hall          5.8
Enriquez       5.7                   German        5.9
Benitez        5.4                   Fabian        5.4
Rojas          5.6                   Sison         5.3
Albay          5.4                   Magno         4.11
Villagracia    5.3                   Maambot       5.3

N              10                                  10
               ΣX1     ΣX1²                        ΣX2     ΣX2²
               X̅1      X̅1²                         X̅2      X̅2²


Is the difference between ‘means’ in the heights of PNP
personnel and Bureau of Fire personnel statistically significant?
PNP (N1 = 10)                                   BF (N2 = 10)
Respondent     X1     (X - X̅)   (X - X̅)²        Respondent    X2      (X - X̅)   (X - X̅)²
Ignacio        5.5    5.50      30.25           Martinez      5.2     5.20      27.04
Dela Cruz      5.6    5.60      31.36           Adique        5.8     5.80      33.64
Evangelista    5.8                              Zebago        5.6
Mati           5.9                              Baguilad      5.7
Aberin         5.6                              Hall          5.8
Enriquez       5.7                              German        5.9
Benitez        5.4                              Fabian        5.4
Rojas          5.6                              Sison         5.3
Albay          5.4                              Magno         4.11
Villagracia    5.3                              Maambot       5.3
                      Σ(X - X̅)  Σ(X - X̅)²                             Σ(X - X̅)  Σ(X - X̅)²

N              10     Variance S²   0.000       N             10      Variance S²   0.000
ΣX1            55.8   StDev s       0.000       ΣX2           54.11   StDev s       0.000
X̅1                                              X̅2
PNP Bureau of Fire


N 10 10
Mean 5.58 5.411
s2(variance) 
s (stdev)
Find the STANDARD ERROR OF THE DIFFERENCE between means (pooled)

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\left(\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}\right)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\left(\frac{10(.0316) + 10(.2423)}{10 + 10 - 2}\right)\left(\frac{10 + 10}{(10)(10)}\right)} \approx 0.174$$

                 PNP       Bureau of Fire
N                10        10
Mean             5.58      5.411
s2 (variance)    0.0316    0.2423
s (stdev)        0.178     0.492
Find the t ratio by dividing the difference between means ($\bar{x}_1 - \bar{x}_2$) by the standard error of difference between means

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{5.58 - 5.411}{.1744} = \frac{.169}{.1744} \approx .969$$
Find df
df = N1 + N2 − 2
df = 10 + 10 − 2
df = 18
 The calculated t (two-tailed test) is .969 with degrees of freedom = 18 and α= .05.
 Since calculated t is LESS THAN the t critical value table of 2.101 WE RETAIN THE NULL HYPOTHESIS, which
means the difference between the means MIGHT BE SIMPLY A SAMPLING ERROR.
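 See the graph below.
TABLE C. Critical Values of t
df    Level of significance for two-tailed test (α) = 0.05
18    2.101

[Figure: t distribution centered at 0, labeled "Retain the Ho"; rejection area of α/2 = 2.5% in each tail beyond t critical = 2.101, with 47.50% between the mean and t critical; t calculated = 0.969 falls outside the rejection area.]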
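The pooled procedure for the PNP and Bureau of Fire heights can be sketched from the raw scores as follows. This is a minimal illustration, assuming the variance is computed by dividing by N (as in the variance formula on the slides) and using only the Python standard library; the function name `pooled_t` and the constant name are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance

pnp = [5.5, 5.6, 5.8, 5.9, 5.6, 5.7, 5.4, 5.6, 5.4, 5.3]
bureau_of_fire = [5.2, 5.8, 5.6, 5.7, 5.8, 5.9, 5.4, 5.3, 4.11, 5.3]

def pooled_t(sample_1, sample_2):
    """Return (t, standard error of the difference, df) for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = fmean(sample_1), fmean(sample_2)
    # pvariance divides by N, matching s2 = sum((x - mean)^2) / N
    var_1, var_2 = pvariance(sample_1), pvariance(sample_2)
    se = sqrt(((n1 * var_1 + n2 * var_2) / (n1 + n2 - 2)) * ((n1 + n2) / (n1 * n2)))
    return (mean_1 - mean_2) / se, se, n1 + n2 - 2

t_value, se_diff, df = pooled_t(pnp, bureau_of_fire)

T_CRITICAL_DF18 = 2.101  # two-tailed, alpha = .05, df = 18 (from TABLE C)
reject_h0 = abs(t_value) > T_CRITICAL_DF18
print(f"SE = {se_diff:.3f}, t = {t_value:.3f}, df = {df}, reject Ho: {reject_h0}")
```

Because the variances are divided by N and the pooled term divides by N1 + N2 − 2, the result is the same t that the usual equal-variance t-test produces from variances divided by N − 1.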
What if we are testing the difference
between means of two different
populations?
 This test is used when the sample variances are so dissimilar that we reject the assumption that the population
variances are the same. We can call this non-pooled testing of the differences between means.
Observe the different sample sizes.
 We want to know whether there is a difference between the mean number of days required for processing
fire safety permits in two separate provinces, Bulacan and Batangas. The researcher took random
samples of applicants for fire safety permits in each province from January to June. Here is
the result.
Bulacan Batangas
N 36 23
Mean 6.5 5.6
s2(variance) 7.8 3.6
s (stdev) 2.8 1.8
Find the STANDARD ERROR OF THE DIFFERENCE between means (Non-Pooled)

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}$$

Different populations

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{7.8}{36 - 1} + \frac{3.6}{23 - 1}} \approx .622$$

Find the t ratio by dividing the difference between means ($\bar{x}_1 - \bar{x}_2$) by the standard error of difference between means

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{6.5 - 5.6}{.622} \approx 1.45$$

Find df
df = 23
We use the smaller of the two sample sizes.
 The calculated t (two-tailed test) is 1.45 with degrees of freedom = 23 and α = .05. Since it does not
exceed the t table value of 2.069 (it is closer to the mean difference of 0), we retain the null
hypothesis, which means there is no statistically significant difference between the mean processing times of Bulacan and
Batangas.
 The difference might only be sampling error.
 However, if we think we might commit a Type II error (retain a false null hypothesis), we can increase the
level of significance to .20, where the t table value is 1.319. Since the calculated t (1.45) is now greater
than 1.319, we can reject the null hypothesis.

Critical values of t Table
df    Level of significance for two-tailed test (α)
      0.05     0.20
23    2.069    1.319

[Figure: t distribution centered at 0; at α/2 = 2.5% the cut-off is t critical = 2.069 and t calculated = 1.45 falls short of it (Retain Ho); at α/2 = 10% the cut-off is t critical = 1.319 and t calculated = 1.45 falls beyond it (Reject Ho).]
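The non-pooled calculation can be sketched directly from the summary table for Bulacan and Batangas. The function name `non_pooled_t` and the dictionary of critical values are illustrative assumptions; the critical values themselves are the ones quoted from the t table for df = 23. Small differences from hand-calculated figures can come from rounding the inputs.

```python
from math import sqrt

def non_pooled_t(mean_1, var_1, n1, mean_2, var_2, n2):
    """Return (t, standard error of the difference) without pooling the variances."""
    se = sqrt(var_1 / (n1 - 1) + var_2 / (n2 - 1))
    return (mean_1 - mean_2) / se, se

# Summary statistics from the table: Bulacan (group 1) and Batangas (group 2)
t_value, se_diff = non_pooled_t(6.5, 7.8, 36, 5.6, 3.6, 23)

# Two-tailed critical values of t for df = 23, keyed by level of significance
T_CRITICAL_DF23 = {0.05: 2.069, 0.20: 1.319}
decisions = {alpha: abs(t_value) > crit for alpha, crit in T_CRITICAL_DF23.items()}

print(f"SE = {se_diff:.3f}, t = {t_value:.2f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'retain'} Ho")
```

The same t leads to opposite decisions at the two levels, which is the point of the example: the conclusion depends on the cut-off chosen before the test.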
Concept of Significance Levels
 We need a cut-off to determine whether a difference is
significant, that is, too large to ignore.
 When we say that the DIFFERENCE BETWEEN MEANS is
statistically significant, we mean the difference is real or large
enough to be generalized to the population.
 Remember: with large samples, a small difference can be
statistically significant, while with small samples even a large
difference might be due to sampling error.
[Figure: normal curve with cut-offs marked at −2.58, −1.96, 0, +1.96 and +2.58; α/2 = 2.5% in each tail beyond ±1.96 and α/2 = 0.5% in each tail beyond ±2.58; 47.50% of the area lies between the mean and 1.96.]

 With z = ±1.96, there is a 5% chance that the sampled difference falls at or beyond this point due to sampling error.
 With z = ±2.58, there is 1 chance out of 100 that the sampled difference could happen due to sampling error.
 Our level of significance opens us to the chance of making an error.
 The more stringent our α, the farther out in the tail the cut-off lies.
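The link between the level of significance and the cut-off can be illustrated with a small sketch. It assumes the standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha: float) -> float:
    """Two-tailed cut-off: the z value that leaves alpha/2 of the area in each tail."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

# A smaller (more stringent) alpha pushes the cut-off farther out in the tail
cutoffs = {alpha: round(z_critical(alpha), 2) for alpha in (0.05, 0.01)}
print(cutoffs)
```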
Workshop 2
 Go back to your survey forms. We want to test the difference of
means between two barangays, A-1 and Lower Hills, with respect to
their perception on item 13.a. “Law enforcers can put offenders
behind bars.” (Rate the perception with 1 as the least and 10 as the
highest.) They have different populations.
We want to know if the difference between the means of perception
on item 13.a is significant or not.

 Go to 103_T-Test _arrest_non-pooled
What if we are testing the difference
between related means (like before-
and-after)?
 This is a T-test of the difference between Means for the Same Sample Measured Twice.
 Usually, this is a before-and-after comparison.
 For example, a sample of informal settlers was asked about their level of satisfaction before and after they were
transferred by NHA to another housing settlement, with 1 being the lowest satisfaction and 4 being the
highest. The mean before the program is 2.33 and after is 1.33. Is there a significant difference?
Respondent    Before program (X1)    After program (X2)    Difference D = (X1 - X2)    D²
1             2                      1                     1                           1
2             1                      2                     -1                          1
3             3                      1                     2                           4
4             3                      1                     2                           4
5             1                      2                     -1                          1
6             4                      1                     3                           9

ΣX1 = 14      ΣX2 = 8       ΣD² = 20
X̅1 = 2.33     X̅2 = 1.33     ΣD²/N = 3.33
N = 6

Null hypothesis (H0): (μ1 = μ2) The degree of satisfaction does not differ before and after.
Research hypothesis (Ha): (μ1 ≠ μ2) The degree of satisfaction differs before and after.
Find the STANDARD DEVIATION OF THE DIFFERENCES between related means

$$s_D = \sqrt{\frac{\sum D^2}{N} - (\bar{x}_1 - \bar{x}_2)^2}$$

where $\sum D^2 = 20$, $N = 6$, $\bar{x}_1 = 2.33$ and $\bar{x}_2 = 1.33$. Use the means, not variances.

$$s_D = \sqrt{\frac{20}{6} - (2.33 - 1.33)^2} = \sqrt{3.33 - 1} = 1.526$$
Find the STANDARD ERROR OF THE DIFFERENCES between means of related samples

$$s_{\bar{D}} = \frac{s_D}{\sqrt{N - 1}} = \frac{1.526}{\sqrt{6 - 1}} = \frac{1.526}{2.236} = 0.682$$
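Find the T-TEST OF THE DIFFERENCES between means of related samples

Divide the difference between means by the standard error of the differences:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{D}}} = \frac{2.33 - 1.33}{0.682} = 1.47$$

Degree of freedom

df = N − 1 = 6 − 1 = 5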
 The calculated t (test of difference between means of related samples) is 1.47 with degrees of
freedom = 5 and α = .05. Since it does not exceed the t table value of 2.571 (it is closer to the mean
difference of 0), we retain the null hypothesis, which means there is no statistically significant difference
between the means.
 The difference might only be sampling error.
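 See the graph below.
TABLE C. Critical Values of t
df    Level of significance for two-tailed test (α) = 0.05
5     2.571

[Figure: t distribution centered at 0, labeled "Retain Ho"; rejection area of α/2 = 2.5% in each tail beyond t critical = 2.571, with 47.50% between the mean and t critical; t calculated = 1.47 falls outside the rejection area.]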
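The before-and-after steps can be sketched from the six respondents' scores. This is a minimal illustration using the formulas as given on the slides (standard deviation of the differences from ΣD²/N minus the squared mean difference, then division by √(N − 1)); the function name `related_samples_t` is hypothetical.

```python
from math import sqrt

before = [2, 1, 3, 3, 1, 4]  # X1: satisfaction before the program
after = [1, 2, 1, 1, 2, 1]   # X2: satisfaction after the program

def related_samples_t(x1, x2):
    """Return (t, s_D, standard error, df) for the same sample measured twice."""
    n = len(x1)
    diffs = [a - b for a, b in zip(x1, x2)]   # D = X1 - X2 for each respondent
    mean_diff = sum(diffs) / n                # equals mean(X1) - mean(X2)
    s_d = sqrt(sum(d * d for d in diffs) / n - mean_diff ** 2)
    se = s_d / sqrt(n - 1)
    return mean_diff / se, s_d, se, n - 1

t_value, s_d, se_diff, df = related_samples_t(before, after)

T_CRITICAL_DF5 = 2.571  # two-tailed, alpha = .05, df = 5 (from TABLE C)
reject_h0 = abs(t_value) > T_CRITICAL_DF5
print(f"s_D = {s_d:.3f}, SE = {se_diff:.3f}, t = {t_value:.2f}, df = {df}, reject Ho: {reject_h0}")
```

Working from unrounded means gives a t of about 1.46, marginally below the hand-calculated 1.47 that uses rounded intermediate values; the decision to retain Ho is the same.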
Calculating the t statistic in computer
• When we use an estimate of the SE, we do not use the z distribution
• We use the t distribution and calculate the t statistic

We calculate the probability (p) of obtaining
the t statistic under the assumption that Ho
(no differences) is true.

If p < .05, we reject Ho.
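Statistical software reports p directly. As a rough sketch of what happens behind that number, the code below integrates the t distribution's density numerically with Simpson's rule, using only the standard library. The function names `t_pdf` and `two_tailed_p` are hypothetical, and this is an approximation for illustration, not the routine any particular package uses.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p: area beyond -|t| and +|t| (Simpson's rule; steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t

# Critical values from the t table should give p close to .05
print(round(two_tailed_p(2.101, 18), 3))
print(round(two_tailed_p(2.571, 5), 3))
```

A calculated t below the table value, such as .969 with df = 18, gives p above .05, so Ho is retained under the p < .05 rule.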


Workshop 3
 We want to test the difference of means between the perception of
the security and safety climate before and after the Intensified Crime
Reduction Campaign. (Rate the perception with 1 as the least and 10
as the highest.)

 Go to 103_T-Test Before&After_exercise.
What if we are testing the difference
between proportions?
 This is a Z-test of the difference between Proportions.
 We want to compare the proportions of male and female lawmakers who support the bill
increasing the age of criminal responsibility.

                           MALE (X1)    FEMALE (X2)    TOTAL
N                          180          150            330
Support the bill           81           48             129
Proportion who supports    0.45         0.32           0.39

*Observe that the sample sizes are different.

Null hypothesis (H0): (P1 = P2) The proportions of men and women lawmakers who are in favour of the bill are equal.
Research hypothesis (Ha): (P1 ≠ P2) The proportions of men and women lawmakers who are in favour of the bill are NOT equal.

P1 = .45 (male)
P2 = .32 (female)
P3 = .39 (combined)
Find the STANDARD ERROR OF THE DIFFERENCES between SAMPLE PROPORTIONS

$$s_{p_1 - p_2} = \sqrt{P_3 (1 - P_3)\left(\frac{N_1 + N_2}{N_1 N_2}\right)}$$

where $P_3 = .39$, $N_1 = 180$ and $N_2 = 150$.

$$s_{p_1 - p_2} = \sqrt{.39 (1 - .39)\left(\frac{180 + 150}{180 \times 150}\right)} = .0539$$
Find the Z-TEST OF THE DIFFERENCES between SAMPLE PROPORTIONS

$$z = \frac{p_1 - p_2}{s_{p_1 - p_2}} = \frac{.45 - .32}{.0539} = \frac{.13}{.0539} = 2.4118$$
 The calculated z (test of difference between sample proportions) is 2.41 with degrees of freedom ∞ and
α = .05. Since IT IS MORE THAN the table value of 1.960 (df is ∞), we REJECT the null hypothesis, which
means there is a statistically significant difference between the PROPORTIONS.
 See the graph below.
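Critical values of t Table
df    Level of significance for two-tailed test (α) = 0.05
∞     1.96

[Figure: normal curve centered at 0, labeled "Reject the Ho"; rejection area of α/2 = 2.5% in each tail beyond the critical value of 1.96, with 47.50% between the mean and the critical value; z calculated = 2.41 falls inside the right-hand rejection area.]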
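The proportion test can be sketched from the counts in the table. The function name `proportions_z` is hypothetical; the combined proportion is computed from the totals (129 of 330), so the result differs slightly in the third decimal from the hand calculation that rounds it to .39.

```python
from math import sqrt

def proportions_z(supporters_1, n1, supporters_2, n2):
    """Return (z, standard error) for the difference between two sample proportions."""
    p1 = supporters_1 / n1
    p2 = supporters_2 / n2
    p_combined = (supporters_1 + supporters_2) / (n1 + n2)
    se = sqrt(p_combined * (1 - p_combined) * ((n1 + n2) / (n1 * n2)))
    return (p1 - p2) / se, se

# Male lawmakers: 81 of 180 support the bill; female lawmakers: 48 of 150
z_value, se_diff = proportions_z(81, 180, 48, 150)

Z_CRITICAL = 1.96  # two-tailed, alpha = .05
reject_h0 = abs(z_value) > Z_CRITICAL
print(f"SE = {se_diff:.4f}, z = {z_value:.2f}, reject Ho: {reject_h0}")
```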
