
Hotelling’s T-Square

(two-group MANOVA)

Haramaya University
March 30, 2109
Two-group MANOVA (1)
• Two groups of subjects
  ◦ experimental, comparison; or
  ◦ male, female
• Several dependent variables (DVs), correlated and conceptually related
  ◦ Could be different dimensions of a construct, like an achievement test
    ▪ Conceptual knowledge
    ▪ Procedural knowledge
  ◦ Could be different scales of an assessment, like school climate
    ▪ Democracy
    ▪ Teacher training scale
    ▪ Teacher morale, etc.
Two-group MANOVA (2)
• Test of interest is called Hotelling’s $T^2$
  ◦ Special case of MANOVA for two groups
    ▪ Just as the t-test is the special case of ANOVA for two groups
  ◦ Multivariate extension of the two-sample (independent-groups) t-test
• Why MANOVA instead of separate ANOVAs? (Here, that would be separate independent-groups t-tests.)
MANOVA vs. ANOVA
• Correlation among DVs needs to be accounted for in statistical tests
• Multiple univariate tests inflate the Type I error rate $\alpha$, so we want an omnibus test (of all DVs simultaneously), with follow-ups that can be adjusted using Bonferroni (or other methods)
• Groups might differ on the collective set of DVs, but we might not notice this if each DV is looked at individually
  ◦ Multivariate tests look at “joint” differences
• Allows for clarification of HOW the groups might differ on the DVs
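
To make the inflation point concrete, here is a minimal sketch of how the familywise Type I error rate grows with the number of separate tests, and how a Bonferroni adjustment pulls it back. It assumes the tests are independent, which is a simplification (the DVs here are correlated, so the true inflation is smaller but still present). The function names are illustrative, not from any library.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


alpha = 0.05
for n_dvs in (1, 2, 5, 10):
    unadjusted = familywise_error_rate(alpha, n_dvs)
    adjusted = familywise_error_rate(bonferroni_alpha(alpha, n_dvs), n_dvs)
    print(f"{n_dvs:2d} DVs: unadjusted {unadjusted:.3f}, Bonferroni {adjusted:.3f}")
```

With five DVs tested separately at .05, the unadjusted familywise rate is already above .22 under the independence assumption.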
Review two-sample t-test
$H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$
• One continuous DV, collected from both groups
  ◦ Q: do boys and girls differ on math achievement?
• Assumptions
  ◦ Normality
  ◦ Independence
  ◦ HOV (homogeneity of variance): implies $\sigma_1^2 = \sigma_2^2 = \sigma_p^2$
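t-test statistic

$$t_{obs} = \frac{(\bar{Y}_1 - \bar{Y}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

The second form holds under $H_0$, where $\mu_1 - \mu_2 = 0$. Here

$$S_p^2 = \frac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2}$$

is the pooled, or common, within-groups variance.

We compare $t_{obs}$ to $t_{crit} = t_{n_1 + n_2 - 2,\ \alpha}$.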
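
A minimal sketch of the pooled-variance t-test, computed by hand with NumPy. The scores are hypothetical numbers invented for illustration, and `pooled_t_test` is an illustrative name; SciPy is used only for the t distribution.

```python
import numpy as np
from scipy import stats


def pooled_t_test(y1, y2):
    """Two-sample t-test with a pooled variance (assumes HOV)."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    n1, n2 = y1.size, y2.size
    df = n1 + n2 - 2
    # Pooled within-groups variance S_p^2
    sp2 = ((n1 - 1) * y1.var(ddof=1) + (n2 - 1) * y2.var(ddof=1)) / df
    t_obs = (y1.mean() - y2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
    p_value = 2 * stats.t.sf(abs(t_obs), df)  # two-sided p-value
    return t_obs, df, p_value


# Hypothetical math-achievement scores, invented for illustration only.
boys = [52, 61, 58, 49, 66, 55]
girls = [60, 64, 59, 70, 62, 68, 57]

t_obs, df, p_value = pooled_t_test(boys, girls)
t_crit = stats.t.ppf(1 - 0.05 / 2, df)  # two-sided critical value at alpha = .05
print(f"t_obs = {t_obs:.3f}, df = {df}, t_crit = {t_crit:.3f}, p = {p_value:.4f}")
```

The result should agree with `scipy.stats.ttest_ind(boys, girls, equal_var=True)`, which runs the same pooled test.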


What does univariate data look like?
• Assuming two independent groups of individuals
• Let $y_{ij}$ = the $i$th person’s score in the $j$th group
• $n_1$ = sample size in group 1
• $n_2$ = sample size in group 2

$$Y_{(n_1 + n_2) \times 1} = \begin{bmatrix} y_{11} \\ y_{21} \\ \vdots \\ y_{n_1,1} \\ y_{12} \\ y_{22} \\ \vdots \\ y_{n_2,2} \end{bmatrix}$$
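Univariate Predictions

$$\bar{Y} = \begin{bmatrix} \bar{Y}_1 \\ \bar{Y}_2 \end{bmatrix} \quad \text{is used to estimate} \quad \boldsymbol{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}$$

A vector of means such that $\boldsymbol{\mu}^T = (\mu_1 \ \ \mu_2)$

$H_0: \mu_1 = \mu_2$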
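Multivariate? Imagine 2 DVs
$y_{ijq}$ = the $i$th person’s score in the $j$th group on the $q$th DV

$$Y_{(n_1 + n_2) \times 2} =
\begin{bmatrix}
y_{111} & y_{112} \\
y_{211} & y_{212} \\
\vdots & \vdots \\
y_{n_1,1,1} & y_{n_1,1,2} \\
y_{121} & y_{122} \\
y_{221} & y_{222} \\
\vdots & \vdots \\
y_{n_2,2,1} & y_{n_2,2,2}
\end{bmatrix}$$

Columns are DV1 and DV2; the first $n_1$ rows are Group 1 and the last $n_2$ rows are Group 2.
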
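Multivariate Predictions

$$\bar{Y}_{2 \times 2} = \begin{bmatrix} \bar{Y}_{11} & \bar{Y}_{12} \\ \bar{Y}_{21} & \bar{Y}_{22} \end{bmatrix} \quad \text{estimates} \quad \mu_{2 \times 2} = \begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix}$$

Rows are Group 1 and Group 2; columns are DV1 and DV2.

• Each group has a row vector that contains the means for each DV.
• Use the transpose of the row vectors to indicate how we compare means across the groups on the collective set of DVs.

$$H_0: \begin{bmatrix} \mu_{11} \\ \mu_{12} \end{bmatrix} = \begin{bmatrix} \mu_{21} \\ \mu_{22} \end{bmatrix}, \quad \text{i.e.} \quad H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$$

(left-hand vector: Group 1; right-hand vector: Group 2)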
Q DVs, 2 groups
• In general, this looks like:

$$H_0: \begin{bmatrix} \mu_{11} \\ \mu_{12} \\ \vdots \\ \mu_{1Q} \end{bmatrix} = \begin{bmatrix} \mu_{21} \\ \mu_{22} \\ \vdots \\ \mu_{2Q} \end{bmatrix}, \quad \text{i.e.} \quad H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$$

Each vector contains means for Q DVs. These mean column vectors are called “centroids.”

Centroids mark a single point in Q-space for each group. So essentially, we want to know if each group occupies nearly the same point in space.
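
As a small illustration of centroids, the sketch below computes the mean vector for each of two groups measured on Q = 2 DVs. The data are hypothetical values made up for this example; rows are people and columns are DVs.

```python
import numpy as np

# Hypothetical scores (rows = people, columns = DV1, DV2), for illustration only.
group1 = np.array([[4, 7], [6, 8], [5, 10], [7, 9]], dtype=float)
group2 = np.array([[7, 10], [9, 13], [8, 11], [10, 12], [6, 11]], dtype=float)

# Each centroid is the vector of DV means for one group: a point in Q-space.
centroid1 = group1.mean(axis=0)
centroid2 = group2.mean(axis=0)

# The multivariate null hypothesis says the population centroids coincide,
# so the sample difference vector is the quantity the test is built on.
difference = centroid1 - centroid2
print("centroid 1:", centroid1)
print("centroid 2:", centroid2)
print("difference:", difference)
```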
Assumptions for the multivariate test?
• Multivariate normality
  ◦ Want the DVs to be jointly normally distributed within each group.
  ◦ Bivariate normality between pairs of DVs shows up as an elliptical pattern on a scatterplot.
  ◦ Multivariate normality implies that all pairs of DVs are jointly normal
• Independence of observations (of course, or else we are doing the wrong analysis)
• HOC = Homogeneity of Variance AND Covariance across the groups
  ◦ Very hard to meet this assumption
  ◦ Box’s M test is very sensitive to non-normality
What is HOC?
• IVs are categorical (i.e., groups, or factors with J levels)
• Imagine just 2 DVs
• Total SSCP (sums of squares and cross-products) for Y?
  ◦ Y is a matrix of size $(n_1 + n_2 + \dots + n_J) \times 2$
  ◦ To get sums of squares and sums of cross products we look at differences from the grand means
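$SSCP_Y$

$$SSCP_Y = \begin{bmatrix} SS(Y_1) & SCP(Y_1, Y_2) \\ SCP(Y_2, Y_1) & SS(Y_2) \end{bmatrix}$$

This is a $2 \times 2$ matrix for 2 DVs; the subscript on Y refers to the DV index $q$.

Where

$$SS(Y_1) = \sum_{j=1}^{J} \sum_{i=1}^{n_j} (y_{ij1} - \bar{Y}_{..1})^2$$

(groups don’t matter here; we are just looking at all the $Y_q$ data together)

and

$$SCP(Y_1, Y_2) = \sum_{j=1}^{J} \sum_{i=1}^{n_j} (y_{ij1} - \bar{Y}_{..1})(y_{ij2} - \bar{Y}_{..2})$$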
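
The total SSCP matrix can be sketched in a few lines of NumPy: center every column at its grand mean, then multiply the deviation matrix by its own transpose. The data are hypothetical and `total_sscp` is an illustrative name.

```python
import numpy as np


def total_sscp(Y):
    """Total SSCP: deviations from the grand means, ignoring group membership."""
    Y = np.asarray(Y, dtype=float)
    deviations = Y - Y.mean(axis=0)  # subtract each DV's grand mean
    # Diagonal entries are SS(Y_q); off-diagonal entries are SCP(Y_q, Y_q').
    return deviations.T @ deviations


# Hypothetical data: all groups stacked, rows = people, columns = DV1, DV2.
Y = np.array(
    [[4, 7], [6, 8], [5, 10], [7, 9],             # group 1
     [7, 10], [9, 13], [8, 11], [10, 12], [6, 11]],  # group 2
    dtype=float,
)
sscp = total_sscp(Y)
print(sscp)
```

Dividing this matrix by N − 1 gives the ordinary sample covariance matrix of the stacked data.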
SSCP for each group?
• Imagine X has 3 levels (or groups)
  ◦ Treatment 1, treatment 2, comparison
  ◦ Then HOC says that the within-group pieces from partitioning the total SSCP for Y can reasonably be combined into a “pooled” SSCP (similar to $S_p^2$ in the univariate case)
• Why is HOC important?
  ◦ Makes detection of differences fair, in the sense that differences in means can truly be called differences in means and are not an artifact of one group having greater (or less) variability than another group!
  ◦ If one group has too much spread or variability relative to the other groups, we may not be able to detect differences in means.
SSCP (groups) = $W_j$
• Within each group, look at differences from group means.
• Notice size (order) of the W’s remains the same as the total SSCP matrix (still only 2 DVs!)

$$SSCP_{Y,\ group\ 1} = W_1 = \begin{bmatrix} SS(Y_1) & SCP(Y_1, Y_2) \\ SCP(Y_2, Y_1) & SS(Y_2) \end{bmatrix}$$

$$SSCP_{Y,\ group\ 2} = W_2 = \begin{bmatrix} SS(Y_1) & SCP(Y_1, Y_2) \\ SCP(Y_2, Y_1) & SS(Y_2) \end{bmatrix}$$

$$SSCP_{Y,\ group\ 3} = W_3 = \begin{bmatrix} SS(Y_1) & SCP(Y_1, Y_2) \\ SCP(Y_2, Y_1) & SS(Y_2) \end{bmatrix}$$
Pooled SSCP = W
• W = Pooled SSCP = $W_1 + W_2 + W_3$
  ◦ (plus … $W_J$ if there were J groups)
• Box’s M test asks if the $W_j$’s are all similar (i.e., all contribute equally to W)
  ◦ Actual test uses covariance matrices instead of SSCP matrices
  ◦ Covariance matrix is $\hat{\Sigma}_j = \frac{1}{df_j} W_j$
  ◦ To compare overall variability across the $W_j$’s, we use the determinants
  ◦ Box’s M compares the determinants of the separate W’s (or the covariances, really) to the determinant of the pooled W.
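
The sketch below builds each within-group SSCP matrix, pools them, and computes the Box's M statistic in its usual log-determinant form, M = (N − J) ln|S_pooled| − Σ (n_j − 1) ln|S_j|. This is only the raw statistic; the chi-square or F approximation used to get a p-value is left out. The data and function names are hypothetical.

```python
import numpy as np


def group_sscp(Yj):
    """Within-group SSCP W_j: deviations from that group's own means."""
    Yj = np.asarray(Yj, dtype=float)
    deviations = Yj - Yj.mean(axis=0)
    return deviations.T @ deviations


def box_m_statistic(groups):
    """Raw Box's M statistic (no p-value approximation)."""
    w_list = [group_sscp(g) for g in groups]
    df_list = [len(g) - 1 for g in groups]
    df_pooled = sum(df_list)                 # N - J
    s_pooled = sum(w_list) / df_pooled       # pooled covariance matrix
    # slogdet returns (sign, log|det|); index 1 is the log-determinant.
    log_det_pooled = np.linalg.slogdet(s_pooled)[1]
    log_det_groups = [np.linalg.slogdet(w / df)[1] for w, df in zip(w_list, df_list)]
    return df_pooled * log_det_pooled - sum(
        df * ld for df, ld in zip(df_list, log_det_groups)
    )


# Hypothetical data for three groups on 2 DVs, for illustration only.
treatment1 = [[4, 7], [6, 8], [5, 10], [7, 9]]
treatment2 = [[7, 10], [9, 13], [8, 11], [10, 12], [6, 11]]
comparison = [[5, 6], [6, 9], [8, 8], [7, 10]]

W = group_sscp(treatment1) + group_sscp(treatment2) + group_sscp(comparison)
print("pooled SSCP W:\n", W)
print("Box's M:", box_m_statistic([treatment1, treatment2, comparison]))
```

M is zero when every group has the same covariance matrix and grows as the group matrices diverge from the pooled one.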
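Hotelling’s $T^2$
• Matrix representation of the univariate two-sample t (independent-groups t-test); note that subscripts refer to groups

$$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

$$t^2 = \frac{(\bar{y}_1 - \bar{y}_2)^2}{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (\bar{y}_1 - \bar{y}_2)\left[S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right]^{-1}(\bar{y}_1 - \bar{y}_2) = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{y}_1 - \bar{y}_2)\left(S_p^2\right)^{-1}(\bar{y}_1 - \bar{y}_2)$$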
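Hotelling’s $T^2$
• Multivariate case: still want to subtract group means, but now on several variables (subscripts refer to groups).

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{\mathbf{Y}}_1 - \bar{\mathbf{Y}}_2)'\,\hat{W}^{-1}\,(\bar{\mathbf{Y}}_1 - \bar{\mathbf{Y}}_2)$$

where $\bar{\mathbf{Y}}_1$ and $\bar{\mathbf{Y}}_2$ are the column vectors of means for each group, and

$$\hat{W} = \frac{W_1 + W_2}{n_1 + n_2 - 2} = \frac{W}{df}$$

(note: the inverse is of a matrix)

Note: for three or more groups, add all the $W_j$’s together to find W; df is the sum of $(n_j - 1)$.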
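
A minimal NumPy sketch of the $T^2$ formula above. The data are hypothetical and `hotelling_t2` is an illustrative name; the linear system is solved directly instead of forming the matrix inverse explicitly, which gives the same quadratic form.

```python
import numpy as np


def hotelling_t2(Y1, Y2):
    """Two-group Hotelling's T^2 from raw data (rows = people, columns = DVs)."""
    Y1 = np.asarray(Y1, dtype=float)
    Y2 = np.asarray(Y2, dtype=float)
    n1, n2 = len(Y1), len(Y2)
    mean_diff = Y1.mean(axis=0) - Y2.mean(axis=0)  # difference of centroids
    dev1 = Y1 - Y1.mean(axis=0)
    dev2 = Y2 - Y2.mean(axis=0)
    W = dev1.T @ dev1 + dev2.T @ dev2              # pooled SSCP, W1 + W2
    W_hat = W / (n1 + n2 - 2)                      # pooled covariance matrix
    # mean_diff' W_hat^{-1} mean_diff, via solve() rather than inv()
    quad_form = mean_diff @ np.linalg.solve(W_hat, mean_diff)
    return (n1 * n2) / (n1 + n2) * quad_form


# Hypothetical scores on 2 DVs, for illustration only.
group1 = [[4, 7], [6, 8], [5, 10], [7, 9]]
group2 = [[7, 10], [9, 13], [8, 11], [10, 12], [6, 11]]
print("T^2 =", hotelling_t2(group1, group2))
```

With a single DV the function reduces to the squared pooled t statistic, which is a quick way to check it.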
Test using Hotelling’s $T^2$
• Hotelling showed that $T^2$ can be transformed to an exact F.

$$F_{obs} = \frac{n_1 + n_2 - (Q + 1)}{(n_1 + n_2 - 2)\,Q}\;T^2 \sim F_{Q,\ N - (Q + 1)}, \qquad N = n_1 + n_2$$

• Omnibus test is Hotelling’s $T^2$
• Need follow-up tests to see where (on which DV) the specific group differences might be
  ◦ With only two groups, the obvious choice is the familiar independent-groups t-test
• Note: some textbooks use “p” to represent the # of variables; here I’ve used “Q” to be consistent with the terminology we’ve used for # of DVs
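
Putting the pieces together, the sketch below runs the omnibus test (T² converted to F) and then Bonferroni-adjusted follow-up t-tests on each DV. The data, the function name, and the choice of α = .05 are assumptions for illustration; `scipy.stats.f.sf` and `scipy.stats.ttest_ind` are the only library routines relied on.

```python
import numpy as np
from scipy import stats


def hotelling_t2_test(Y1, Y2):
    """Omnibus two-group test: returns T^2, F_obs, (df1, df2), and the p-value."""
    Y1 = np.asarray(Y1, dtype=float)
    Y2 = np.asarray(Y2, dtype=float)
    n1, n2 = len(Y1), len(Y2)
    q = Y1.shape[1]                                # number of DVs
    mean_diff = Y1.mean(axis=0) - Y2.mean(axis=0)
    dev1, dev2 = Y1 - Y1.mean(axis=0), Y2 - Y2.mean(axis=0)
    W_hat = (dev1.T @ dev1 + dev2.T @ dev2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * (mean_diff @ np.linalg.solve(W_hat, mean_diff))
    df1, df2 = q, n1 + n2 - (q + 1)
    f_obs = df2 / ((n1 + n2 - 2) * q) * t2
    p_value = stats.f.sf(f_obs, df1, df2)          # upper-tail F probability
    return t2, f_obs, (df1, df2), p_value


# Hypothetical scores on 2 DVs, for illustration only.
group1 = np.array([[4, 7], [6, 8], [5, 10], [7, 9]], dtype=float)
group2 = np.array([[7, 10], [9, 13], [8, 11], [10, 12], [6, 11]], dtype=float)

t2, f_obs, dfs, p_value = hotelling_t2_test(group1, group2)
print(f"T^2 = {t2:.3f}, F{dfs} = {f_obs:.3f}, p = {p_value:.4f}")

# Follow-ups: one pooled t-test per DV, each judged at alpha / Q (Bonferroni).
alpha = 0.05
followups = stats.ttest_ind(group1, group2, axis=0)
for q_index, p in enumerate(followups.pvalue, start=1):
    verdict = "differs" if p < alpha / group1.shape[1] else "no clear difference"
    print(f"DV{q_index}: p = {p:.4f} ({verdict} at alpha/Q)")
```

With a single DV the F statistic equals the squared t statistic and the p-value matches the two-sided t-test, consistent with $T^2$ being the multivariate extension of t.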
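Omnibus test

$$H_0: \begin{bmatrix} \mu_{11} \\ \mu_{12} \end{bmatrix} = \begin{bmatrix} \mu_{21} \\ \mu_{22} \end{bmatrix}, \quad \text{i.e.} \quad H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$$

(left-hand vector: Group 1; right-hand vector: Group 2; the first subscript is the group, the second is the DV)

• In this example there are only 2 DVs (because there are only 2 rows in each column vector).
• This test can be used regardless of the number of DVs (more than 2)
• The $T^2$ test was developed for situations with only 2 GROUPS
• Use Wilks’ Lambda as the general test: 2 or more groups!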
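
To show how the general test connects back to $T^2$, here is a hedged sketch that computes Wilks' Lambda as |W| / |T|, where T is the total SSCP and W the pooled within-group SSCP. For exactly two groups, Lambda and $T^2$ carry the same information: Λ = 1 / (1 + T² / (n₁ + n₂ − 2)). The data and function names are hypothetical.

```python
import numpy as np


def wilks_lambda(groups):
    """Wilks' Lambda = |W| / |T| for any number of groups."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    stacked = np.vstack(groups)
    total_dev = stacked - stacked.mean(axis=0)
    T = total_dev.T @ total_dev                     # total SSCP
    W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    return np.linalg.det(W) / np.linalg.det(T)


def hotelling_t2(Y1, Y2):
    """Two-group Hotelling's T^2, used here only to check the relationship."""
    Y1, Y2 = np.asarray(Y1, dtype=float), np.asarray(Y2, dtype=float)
    n1, n2 = len(Y1), len(Y2)
    d = Y1.mean(axis=0) - Y2.mean(axis=0)
    dev1, dev2 = Y1 - Y1.mean(axis=0), Y2 - Y2.mean(axis=0)
    W_hat = (dev1.T @ dev1 + dev2.T @ dev2) / (n1 + n2 - 2)
    return (n1 * n2) / (n1 + n2) * (d @ np.linalg.solve(W_hat, d))


# Hypothetical scores on 2 DVs, for illustration only.
group1 = [[4, 7], [6, 8], [5, 10], [7, 9]]
group2 = [[7, 10], [9, 13], [8, 11], [10, 12], [6, 11]]

lam = wilks_lambda([group1, group2])
t2 = hotelling_t2(group1, group2)
df_error = len(group1) + len(group2) - 2
print("Wilks' Lambda:", lam)
print("1 / (1 + T^2 / df):", 1 / (1 + t2 / df_error))
```

The two printed values should agree up to floating-point error; `wilks_lambda` also accepts three or more groups, where $T^2$ no longer applies.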
