Use
Test for equality of two means
E.g., compare two groups of subjects given
different treatments
Test for value of a single mean
E.g., test to see if a single group of subjects
differs from a known value
Also matched sample test where a single
group is compared before and after treatment
(test for zero treatment effect)
Advanced
Tests of significance of correlation/regression
coefficients.
Student's t statistic
Assumptions
Parent population is normal
Sample observations (subjects) are
independent.
Robustness
To normality: Departures from normality affect the Type I error rate and power
and may lead to inappropriate
interpretation. In real life, we can't expect
exactly normal data, but the data should not be
strongly skewed.
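Student's t statistic
Formula (single group)
Let $x_1, x_2, \ldots, x_n$ be a random sample from a normal
population with mean $\mu$ and variance $\sigma^2$. Then the
following statistic is distributed as Student's t with
$(n-1)$ degrees of freedom:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.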
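As a rough illustration of the single-group formula, the R sketch below computes t by hand and compares it with the built-in t.test. The data vector x and the hypothesized mean mu0 are made-up values for illustration, not taken from the course dataset.

```r
# Hypothetical sample; the values are invented for illustration only.
x   <- c(48.2, 51.5, 49.8, 53.1, 50.4, 47.9, 52.6, 50.0)
mu0 <- 50                                       # hypothesized population mean

n      <- length(x)
t_hand <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # t = (xbar - mu) / (s / sqrt(n))
p_hand <- 2 * pt(-abs(t_hand), df = n - 1)      # two-sided p-value, n - 1 d.f.

res <- t.test(x, mu = mu0)                      # built-in one-sample t-test
c(by_hand = t_hand, t.test = unname(res$statistic))
```

The two numbers printed on the last line should agree, which is a quick way to check that the formula has been read correctly.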
Student's t statistic
Formula (two groups)
Case 1: Two matched samples
The following statistic follows a t distribution with $n-1$ d.f.:
$$t = \frac{\bar{d}}{s_d/\sqrt{n}}$$
where $d$ is the difference between the two matched samples, $\bar{d}$ is its mean, and $s_d$ is
the standard deviation of the variable $d$.
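The matched-samples statistic is a one-sample t statistic applied to the differences. A minimal R sketch, assuming made-up before/after measurements on the same six subjects (the names before and after are illustrative only):

```r
# Hypothetical before/after measurements on the same six subjects.
before <- c(12.1, 14.3, 11.8, 15.0, 13.2, 12.7)
after  <- c(11.4, 13.9, 11.9, 13.8, 12.5, 12.0)

d      <- before - after                  # within-subject differences
n      <- length(d)
t_hand <- mean(d) / (sd(d) / sqrt(n))     # t = dbar / (s_d / sqrt(n)), n - 1 d.f.

res <- t.test(before, after, paired = TRUE)
c(by_hand = t_hand, t.test = unname(res$statistic))
```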
Student's t statistic
Formula (two groups)
Case 2: Equal population standard deviations
The following statistic follows a t distribution with $(n_1 + n_2 - 2)$ d.f.:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
The pooled standard deviation is
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$
where $n_1$ and $n_2$ are the sample sizes and $S_1$ and $S_2$ are the sample
standard deviations of the two groups.
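The pooled (equal-variance) statistic can be checked numerically. This is a sketch under assumed data: x1 and x2 are invented samples, and the hand computation is compared with t.test(..., var.equal = TRUE).

```r
# Hypothetical independent samples.
x1 <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7)
x2 <- c(4.2, 5.0, 4.4, 4.9, 4.1, 4.6, 4.8)
n1 <- length(x1)
n2 <- length(x2)

# Pooled standard deviation, then the t statistic with n1 + n2 - 2 d.f.
s_p    <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
t_hand <- (mean(x1) - mean(x2)) / (s_p * sqrt(1 / n1 + 1 / n2))

res <- t.test(x1, x2, var.equal = TRUE)
c(by_hand = t_hand, t.test = unname(res$statistic))
```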
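Student's t statistic
Formula (two groups)
Case 3: Unequal population standard deviations
The following statistic approximately follows a t distribution:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Under the null hypothesis of equal means, $\mu_1 - \mu_2 = 0$.
The d.f. of this statistic is
$$v = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$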
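For the unequal-variance case, the sketch below computes the statistic and its approximate degrees of freedom by hand and compares both with t.test, which uses var.equal = FALSE unless told otherwise. The samples x1 and x2 are made up, and the null hypothesis is taken to be equal means.

```r
# Hypothetical independent samples.
x1 <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7)
x2 <- c(4.2, 5.0, 4.4, 4.9, 4.1, 4.6, 4.8)
n1 <- length(x1)
n2 <- length(x2)

v1 <- var(x1) / n1                        # s1^2 / n1
v2 <- var(x2) / n2                        # s2^2 / n2

t_hand  <- (mean(x1) - mean(x2)) / sqrt(v1 + v2)
df_hand <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

res <- t.test(x1, x2)                     # var.equal = FALSE is the default
c(t = t_hand, df = df_hand)
```

The degrees of freedom are usually not a whole number here, which is expected for this approximation.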
Student's t statistic
One-sided
There can only be one direction of effect
The investigator is only interested in one
direction of effect.
Greater power to detect a difference in the
expected direction
Two-sided
Difference could go in either direction
More conservative
Student's t statistic
|           | One group | Two groups |
|-----------|-----------|------------|
| One sided | A single mean differs from a known value in a specific direction, e.g., mean > 0 | Two means differ from one another in a specific direction, e.g., $\text{mean}_2 < \text{mean}_1$ |
| Two sided | A single mean differs from a known value in either direction, e.g., mean $\neq$ 0 | Two means are not equal, that is, $\text{mean}_1 \neq \text{mean}_2$ |
Student's t statistic
SPSS
One Group: Analyze > Compare Means > One-Sample T Test
Two Groups (Matched Samples): Analyze > Compare Means > Paired-Samples T Test
Two Groups: Analyze > Compare Means > Independent-Samples T Test
Student's t statistic
R
The default t-test is
t.test(x, y = NULL, alternative = "two.sided", mu = 0,
paired = FALSE, var.equal = FALSE, conf.level = 0.95)
where x and y are numeric vectors holding the data for the two variables.
We only need to change the default settings that differ for
the case we want to perform. For example,
One Group: t.test(x, alternative = "greater", mu = 30)
Two Groups (Matched Samples): t.test(x, y,
alternative = "less", mu = 0, paired = TRUE)
Two Groups: t.test(x, y, alternative = "greater",
mu = 0, var.equal = TRUE)
Student's t statistic
MS Excel (in Tools > Data Analysis)
One Group: Not available
Two Groups (Matched Samples):
t-Test: Paired Two Sample for Means
Two Groups (Independent Samples):
t-Test: Two-Sample Assuming Equal Variances
t-Test: Two-Sample Assuming Unequal Variances
Example 1
Consider the heights of children 4 to 12 years
old in dataset 1 of our course website
(variable hgt). Suppose we want to test if
the average height ($\mu$) for this age group in
the population is 50 inches, using our sample
of 60 children. We will use a 5% level of
significance.
This is a one-sample, two-sided test.
Example 1
Hypotheses:
$H_0: \mu = 50$
$H_a: \mu \neq 50$
Computation in Excel:
Excel does not have a one-sample test, but we can
fool it.
Create a dummy column parallel to the hgt column
with an equal number of cells, all set to 0.0.
Run the matched-sample test using hgt and the
dummy column, with 50 as the hypothesized mean
difference.
The p-value for the two-tailed test is 0.0092.
Example 1
Using SPSS:
Analyze > Compare Means > One-Sample T
Test > Select hgt > Test Value: 50 > OK
The p-value is .009.
Using R:
t.test(df1$hgt, mu = 50)
The two-tailed p-value is .0092.
Example 2
Suppose we want to compare the heights of two
groups (hgt for each sex in the dataset).
$H_0$: Mean heights are equal for the two sexes.
$H_a$: Mean heights are not equal.
Using MS Excel:
Sort the data by sex (Data > Sort > by: sex)
In Data Analysis, choose t-Test: Two-Sample Assuming Equal Variances
Select the range of hgt for all sex = f as the Variable 1 Range
Select the range of hgt for all sex = m as the Variable 2 Range
The p-value for the two-sided test is 0.205.
Example 2
Using SPSS:
Analyze > Compare Means > Independent-Samples T
Test
Select hgt as the Test Variable
Select sex as the Grouping Variable
In Define Groups, type f for Group 1 and m for Group 2
Click Continue, then OK
This gives a p-value of 0.205. We can assume equal
variances because the p-value of the F statistic for testing
equality of variances is 0.845.
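The slides give Excel and SPSS steps for Example 2 but no R call. One way it might look in R is sketched below. The course dataset is not reproduced here, so the data frame demo is a simulated stand-in with the same column names (hgt, sex); its p-values will not match the 0.205 and 0.845 reported above. With the real data, demo would be replaced by df1.

```r
# Simulated stand-in for the course data (NOT the real dataset).
set.seed(1)
demo <- data.frame(
  sex = rep(c("f", "m"), each = 30),
  hgt = round(rnorm(60, mean = 52, sd = 4), 1)
)

# Two-sided, two-sample t-test assuming equal variances.
res <- t.test(hgt ~ sex, data = demo, var.equal = TRUE)
res$p.value

# F ratio test for equality of the two variances. SPSS output typically
# reports Levene's test instead, so the two p-values need not coincide.
var.test(hgt ~ sex, data = demo)$p.value
```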
Sign Test (Nonparametric)
Use:
(1) Compare the median of a single group with a
specified value (instead of the single-sample t-test).
(2) Compare the medians of two matched groups
(instead of the two-matched-samples t-test).
Test Statistic:
Number of positive differences $(x_i - c)$, where $c$ is the
hypothesized median. Under the null hypothesis, the
number of positive differences follows a Binomial
distribution with success probability 0.5.
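Because the number of positive differences is binomial, the sign test can be sketched with base R alone. The sample x and the hypothesized median c0 below are invented for illustration; observations equal to c0 are dropped, which is a common convention rather than something stated in the slides.

```r
# Hypothetical sample and hypothesized median.
x  <- c(48.2, 51.5, 49.8, 53.1, 50.4, 47.9, 52.6, 50.0, 54.3, 51.1)
c0 <- 50

d     <- x - c0
d     <- d[d != 0]           # drop observations equal to the hypothesized median
n_pos <- sum(d > 0)          # test statistic: number of positive differences

# Under H0 the count is Binomial(length(d), 0.5).
res <- binom.test(n_pos, length(d), p = 0.5)
c(n_pos = n_pos, n_used = length(d), p_value = res$p.value)
```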
Sign Test (Nonparametric)
SPSS: Analyze > Nonparametric Tests >
Binomial
R: SIGN.test(x, y = NULL, md = 0,
alternative = "two.sided", conf.level =
0.95)
For testing the median (md) of a single
sample, supply data for one variable only.
To compare paired data, supply the two paired
variables.
NB: This test requires the BSDA package, where the function name is written in capitals (SIGN.test).
Wilcoxon Signed-Rank Test
Use:
Compares the medians of two paired samples.
Test Statistic:
Consider n pairs of data on two variables X
and Y. The following statistic is known
as the Wilcoxon signed-rank statistic:
WS = sum of the ranks of the positive
differences, after assigning ranks to the
absolute values of the differences.
Wilcoxon Rank-Sum Test
Use: Compares the medians of two
independent groups.
Test Statistic:
Let X and Y be two samples of sizes m and
n, and let N = m + n. Compute the ranks of all
N observations combined. The statistic is
$W_m$ = sum of the ranks of all observations of
variable X.
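The rank-sum statistic can be computed directly from the combined ranks. In this sketch x and y are made-up samples. Note that R's wilcox.test reports the statistic shifted by m(m+1)/2, so its W is the rank sum of X minus that constant.

```r
# Hypothetical independent samples of sizes m = 4 and n = 5.
x <- c(1.1, 2.3, 0.7, 3.4)
y <- c(2.9, 4.1, 3.8, 5.0, 2.0)
m <- length(x)

r   <- rank(c(x, y))             # ranks of all N = m + n observations
W_m <- sum(r[seq_along(x)])      # sum of the ranks belonging to X

res <- wilcox.test(x, y)         # reports W = W_m - m * (m + 1) / 2
c(W_m = W_m, R_statistic = unname(res$statistic))
```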
Wilcoxon Signed-Rank Test &
Wilcoxon Rank-Sum Test
SPSS:
Two Matched Groups: Analyze >
Nonparametric Tests > 2 Related Samples
Two Groups: Analyze > Nonparametric
Tests > 2 Independent Samples
Wilcoxon Signed-Rank Test /
Wilcoxon Rank-Sum Test
R:
The default test is
wilcox.test(x, y = NULL, alternative = "two.sided", mu
= 0, paired = FALSE, exact = NULL, correct = TRUE, conf.int
= FALSE, conf.level = 0.95)
Two Matched Groups: wilcox.test(x, y, alternative =
"less", paired = TRUE)
Two Groups: wilcox.test(x, y, alternative =
"greater")
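Example 3 (two matched samples)
Hours of sleep for each subject under drug and placebo:
| Subject | Drug | Placebo | Difference | Rank (ignoring sign) |
|---------|------|---------|------------|----------------------|
| 1       | 6.1  | 5.2     | 0.9        | 3.5                  |
| 2       | 7.0  | 7.9     | -0.9       | 3.5                  |
| 3       | 8.2  | 3.9     | 4.3        | 10                   |
| 4       | 7.6  | 4.7     | 2.9        | 7                    |
| 5       | 6.5  | 5.3     | 1.2        | 5                    |
| 6       | 8.4  | 5.4     | 3.0        | 8                    |
| 7       | 6.9  | 4.2     | 2.7        | 6                    |
| 8       | 6.7  | 6.1     | 0.6        | 2                    |
| 9       | 7.4  | 3.8     | 3.6        | 9                    |
| 10      | 5.8  | 6.3     | -0.5       | 1                    |
The 3rd and 4th ranks are tied and hence averaged.
The p-value of this test is 0.02. Hence the test is significant at any level above
2%, indicating that the drug is more effective than the placebo.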
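The signed-rank statistic for the sleep data in Example 3 can be reproduced with a short R sketch. The differences are rounded to one decimal so that the two 0.9 values tie exactly (floating-point subtraction would otherwise separate them), and a two-sided normal approximation is assumed because exact p-values are not available with ties. The slides do not say which variant produced the reported 0.02.

```r
# Hours of sleep from the Example 3 table.
drug    <- c(6.1, 7.0, 8.2, 7.6, 6.5, 8.4, 6.9, 6.7, 7.4, 5.8)
placebo <- c(5.2, 7.9, 3.9, 4.7, 5.3, 5.4, 4.2, 6.1, 3.8, 6.3)

# Round so that the two 0.9 differences tie exactly.
d  <- round(drug - placebo, 1)
r  <- rank(abs(d))               # tied values receive averaged ranks (3.5, 3.5)
WS <- sum(r[d > 0])              # sum of the ranks of the positive differences

# One-sample signed-rank test on the differences, normal approximation.
res <- wilcox.test(d, exact = FALSE)
c(WS = WS, p_value = res$p.value)
```

The hand-computed WS should match the statistic V that wilcox.test reports, and the p-value should round to about 0.02.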
Proportion Tests
Use
Test for equality of two Proportions
E.g. proportions of subjects in two treatment
groups who benefited from treatment.
Test for the value of a single proportion
E.g., to test if the proportion of smokers in a
population is some specified value (less than 1)
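Proportion Tests
Formula
One Group:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$$
Two Groups:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \quad \text{where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Here $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$ are the sample proportions, $x_1$ and $x_2$ are the numbers of successes, and $n_1$ and $n_2$ are the sample sizes.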
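Both z statistics can be checked against prop.test, whose chi-square statistic equals $z^2$ when the continuity correction is switched off. The counts below are hypothetical.

```r
# One group: hypothetical 18 successes out of 40, testing p0 = 0.5.
x <- 18; n <- 40; p0 <- 0.5
z_one <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)
res_one <- prop.test(x, n, p = p0, correct = FALSE)

# Two groups: hypothetical success counts and sample sizes.
x1 <- 45; n1 <- 100
x2 <- 30; n2 <- 90
p_pool <- (x1 + x2) / (n1 + n2)                  # pooled proportion
z_two  <- (x1 / n1 - x2 / n2) /
  sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_two  <- 2 * pnorm(-abs(z_two))                 # two-sided p-value
res_two <- prop.test(c(x1, x2), c(n1, n2), correct = FALSE)

c(z_one_sq = z_one^2, chisq_one = unname(res_one$statistic),
  z_two_sq = z_two^2, chisq_two = unname(res_two$statistic))
```

With the default correct = TRUE, prop.test applies a continuity correction, so its statistic would no longer equal $z^2$ exactly.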
Proportion Test
SPSS:
One Group: Analyze > Nonparametric Tests > Binomial
Two Groups?
R:
The default tests are:
One Group: binom.test(x, n, p = 0.5, alternative =
"two.sided", conf.level = 0.95)
Two Groups: prop.test(c(x, y), c(m, n), p = NULL,
alternative = "two.sided", conf.level = 0.95, correct
= TRUE)
where x and y are the numbers of successes and m and n
are the sample sizes.
Example 4: Proportion of males in
Dataset 1
R:
n = 60 and there are 30 males.
binom.test(30, 60) returns a p-value of 1.0.
SPSS:
Recode sex as numeric:
Transform > Recode > Into Different Variables > make
all selections there and click Change after
recoding the character variable into numeric.
Analyze > Nonparametric Tests > Binomial > select the Test
Variable > Test Proportion
Set the null hypothesis proportion to 0.5.
The p-value = 1.0
Chi-square statistic
Use
Testing the population variance: $\sigma^2 = \sigma_0^2$.
Testing the goodness of fit.
Testing the independence/association of attributes.
Assumptions
Sample observations should be independent.
Expected cell frequencies should be >= 5.
Total observed and expected frequencies are
equal.
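Chi-square statistic
Formula: If $x_i$ $(i = 1, 2, \ldots, n)$ are independent
and normally distributed with mean $\mu$ and
standard deviation $\sigma$, then
$$\sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^2 \text{ follows a } \chi^2 \text{ distribution with } n \text{ d.f.}$$
If we don't know $\mu$, then we estimate it using
the sample mean $\bar{x}$, and then
$$\sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^2 \text{ follows a } \chi^2 \text{ distribution with } (n-1) \text{ d.f.}$$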
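Chi-square statistic
For a contingency table we use the following
chi-square test statistic:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}, \text{ distributed as } \chi^2 \text{ with } (n-1) \text{ d.f.}$$
where $O_i$ = observed frequency and $E_i$ = expected frequency.
The $(n-1)$ d.f. apply when the $n$ categories form a single classification (goodness of fit). When the sum runs over the cells of an $r \times c$ contingency table to test independence, the d.f. is $(r-1)(c-1)$.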
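A sketch of the contingency-table computation in R, assuming a made-up 2 x 3 table of counts. The row and column labels (sex, grp, and the group names a, b, c) are placeholders, not the course data. Expected counts come from the row and column totals, and the hand result is compared with chisq.test.

```r
# Hypothetical 2 x 3 table of observed counts.
obs <- matrix(c(12,  8, 10,
                 9, 11, 10),
              nrow = 2, byrow = TRUE,
              dimnames = list(sex = c("f", "m"), grp = c("a", "b", "c")))

# Expected counts under independence: (row total * column total) / grand total.
expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)

chi2    <- sum((obs - expd)^2 / expd)
df_hand <- (nrow(obs) - 1) * (ncol(obs) - 1)     # (r - 1)(c - 1)
p_hand  <- pchisq(chi2, df_hand, lower.tail = FALSE)

res <- chisq.test(obs, correct = FALSE)
c(chi2 = chi2, df = df_hand, p_value = p_hand)
```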
Chi-square statistic
SPSS:
Analyze > Descriptive Statistics > Crosstabs >
Statistics > Chi-square
Select variables.
Click the Cells button to select the items you
want in cells, rows, and columns.
Example 5 (class demonstration)
Make a contingency table using the two variables
sex and grp from our dataset.
Analyze > Descriptive Statistics > Crosstabs >
select variables for rows and columns >
Statistics > Chi-square > Continue > Cells >
make selections > OK.
This gives a contingency table and the p-value
of the Pearson chi-square test.
For this particular case, the p-value of the
Pearson chi-square test is 0.549 and the d.f. is 2.
F-statistic
Use:
Testing the equality of population
variances.
Testing the significance of difference of
several means in analysis of variance.
F-statistic
Let X and Y be two independent chi-square variables with
$n_1$ and $n_2$ d.f., respectively. Then the following statistic
follows an F distribution with $n_1$ and $n_2$ d.f.:
$$F_{n_1, n_2} = \frac{X/n_1}{Y/n_2}$$
Let X and Y be two independent normal variables with equal population variances and
sample sizes $n_1$ and $n_2$. Then the following statistic follows
an F distribution with $n_1 - 1$ and $n_2 - 1$ d.f.:
$$F_{n_1 - 1,\, n_2 - 1} = \frac{s_x^2}{s_y^2}$$
where $s_x^2$ and $s_y^2$ are the sample variances of X and Y.
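The variance-ratio statistic can be illustrated with R's var.test, which uses $n_1 - 1$ and $n_2 - 1$ degrees of freedom. The samples x and y below are invented, and the two-sided p-value is computed by hand for comparison.

```r
# Hypothetical independent samples.
x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3)
y <- c(4.2, 5.0, 4.4, 4.9, 4.1, 4.6)
df1 <- length(x) - 1
df2 <- length(y) - 1

F_hand <- var(x) / var(y)                        # ratio of sample variances
p_low  <- pf(F_hand, df1, df2)
p_hand <- 2 * min(p_low, 1 - p_low)              # two-sided p-value

res <- var.test(x, y)
c(F = F_hand, p_value = p_hand)
```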
F-statistic
Hypotheses:
$H_0: \mu_1 = \mu_2 = \ldots = \mu_n$
$H_a$: not all of $\mu_1, \mu_2, \ldots, \mu_n$ are equal
The comparison is done using the analysis of
variance (ANOVA) technique, which uses the F
statistic. The ANOVA
technique will be covered in another class
session.