
Box’s M

Box’s M tests the null that the variance/covariance matrices are equal across two or more
groups. Such equality is assumed when doing discriminant function analysis or MANOVA. I shall
illustrate here using data from the Open Sex Role Inventory.

A discriminant function analysis was employed to predict gender (female, male, other) from
scores on Femininity and Masculinity. Here are the variance/covariance matrices from SAS Proc
Corr:

Group = Female

Covariance Matrix, DF = 2211


              Masculinity     Femininity
Masculinity   0.4295678228    -.0059879205
Femininity    -.0059879205    0.3028151803

Group = Male

Covariance Matrix, DF = 1585


              Masculinity     Femininity
Masculinity   0.4945919608    -.0767190632
Femininity    -.0767190632    0.4353688155

Group = Other

Covariance Matrix, DF = 2211


              Masculinity     Femininity
Masculinity   0.4295678228    -.0059879205
Femininity    -.0059879205    0.3028151803

SPSS arranges them a bit differently:

Covariance Matrices
gender                 Masculinity   Femininity
Male     Masculinity   .495          -.077
         Femininity    -.077         .435
Female   Masculinity   .430          -.006
         Femininity    -.006         .303
Other    Masculinity   .484          .003
         Femininity    .003          .348

Box’s M compares the natural logs of the determinants of the variance/covariance matrices.
Here they are for these data:

Log Determinants
gender                  Rank   Log Determinant
Male                    2      -1.563
Female                  2      -2.040
Other                   2      -1.780
Pooled within-groups    2      -1.815
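
As a rough check on the table above, here is a minimal Python sketch that recomputes two of the log determinants from the full-precision SAS covariance matrices. The helper name log_det_2x2 is my own, and the shortcut determinant formula only works because each matrix is 2 x 2.

```python
import math

# Covariance matrices copied from the SAS output above,
# stored as (variance Masculinity, covariance, variance Femininity).
cov = {
    "Female": (0.4295678228, -0.0059879205, 0.3028151803),
    "Male": (0.4945919608, -0.0767190632, 0.4353688155),
}


def log_det_2x2(var_a, cov_ab, var_b):
    """Natural log of the determinant of a symmetric 2 x 2 covariance matrix."""
    return math.log(var_a * var_b - cov_ab ** 2)


log_dets = {group: log_det_2x2(*m) for group, m in cov.items()}
for group, value in log_dets.items():
    print(f"{group}: {value:.3f}")
```

Rounded to three decimals, these values should agree with the SPSS log determinants for the Female and Male groups.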

By my eye, the determinants do not differ greatly here, but Box’s M is a powerful statistic, and
we have a large sample here (N = 4102), so even small differences in these determinants will be
statistically significant, as they are:

Test Results
Box's M          89.421
F    Approx.     14.881
     df1         6
     df2         5072276.876
     Sig.        .000
Tests null hypothesis of equal population covariance matrices.
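
To show where these numbers come from, here is a minimal Python sketch of Box's M and its usual F approximation. I am assuming SPSS uses the standard Box (1949) formulas. The degrees of freedom for the Other group (303) are not printed in the output; I inferred them as the remainder of the 4099 pooled degrees of freedom. Because the log determinants are rounded to three decimals, M recomputed from them is only approximate, so the F step uses the reported M.

```python
# Values read from the SPSS output above.
p = 2                               # variables: Masculinity, Femininity
df_groups = [2211, 1585, 303]       # Female, Male, Other (Other's df is assumed)
log_dets = [-2.040, -1.563, -1.780]  # same group order, rounded by SPSS
log_det_pooled = -1.815
m_reported = 89.421

g = len(df_groups)
df_pooled = sum(df_groups)          # 4099

# Box's M: pooled df times pooled log determinant, minus the df-weighted
# sum of the group log determinants. Approximate here because of rounding.
m_from_logs = df_pooled * log_det_pooled - sum(
    d * ld for d, ld in zip(df_groups, log_dets)
)

# Box's F approximation (assumed to be what SPSS reports).
c1 = (sum(1 / d for d in df_groups) - 1 / df_pooled) * (
    (2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
)
c2 = (sum(1 / d**2 for d in df_groups) - 1 / df_pooled**2) * (
    (p - 1) * (p + 2) / (6 * (g - 1))
)
df1 = p * (p + 1) * (g - 1) // 2
df2 = (df1 + 2) / abs(c2 - c1**2)

# This form of the approximation applies when c2 > c1**2, as it is here.
f_approx = m_reported * (1 - c1 - df1 / df2) / df1

print(f"M from rounded log determinants: {m_from_logs:.2f}")
print(f"F = {f_approx:.3f}, df1 = {df1}, df2 = {df2:.1f}")
```

The recomputed M comes out near 87.45 rather than 89.421, a gap that the rounding of the log determinants can account for. The F and both degrees of freedom should closely match the SPSS table.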

What to do? Tabachnick and Fidell (and others) have the following recommendations:
• If the sample sizes are equal, or nearly so, don’t worry about it. Our sample sizes here differ
greatly: 2,212, 1,586, and 303.
• If the cells with the larger sample sizes also have the larger variances and covariances, the p
values will be conservative; that is, it will be more difficult to reject the null, and if you can
reject the null, you can do so with confidence.
• If the cells with the smaller sample sizes have the larger variances and covariances, the p
values will be liberal, and rejections of the null are suspect, especially when p is not much lower
than .05.
• You could randomly discard data from the cells with large sample sizes, but that will reduce
power.
• You could simply resort to using Pillai’s trace instead of Wilks’ lambda as the test statistic.

Here are the variances and covariances by group. In parentheses is the rank of each
variance/covariance, where 1 = smallest and 3 = largest (covariances are ranked by absolute value).

Group    Variance Femininity   Variance Masculinity   Covariance
Female   .303 (1)              .430 (1)               -.006 (2)
Male     .435 (3)              .495 (3)               -.077 (3)
Other    .348 (2)              .484 (2)               .003 (1)

The group with the smallest sample size (Other) tends to have the smaller variances and
covariances, so our p values should be conservative, and we should not fret about rejecting the null.

SPSS Discriminant does not provide Pillai’s trace, but MANOVA does, so we simply run a
MANOVA to obtain Pillai’s trace in place of Wilks’ lambda.

Multivariate Tests
Effect                        Value   F         Hypothesis df   Error df   Sig.
gender   Pillai's Trace       .265    313.273   4.000           8198.000   .000
         Wilks' Lambda        .737    337.122   4.000           8196.000   .000
         Hotelling's Trace    .353    361.201   4.000           8194.000   .000
         Roy's Largest Root   .342    701.925   2.000           4099.000   .000
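
The four statistics in this table are all functions of the same eigenvalues. Here is a minimal Python sketch of that relationship. The eigenvalues are not printed in the output, so I back them out: with two discriminant functions, Hotelling's trace is the sum of the two eigenvalues, and I am assuming that SPSS reports Roy's largest root as the largest eigenvalue itself.

```python
# Values read from the SPSS Multivariate Tests table above.
hotelling_trace = 0.353
roy_largest_root = 0.342

# Assumed eigenvalues: the largest is Roy's root, the other is the remainder.
eigenvalues = [roy_largest_root, hotelling_trace - roy_largest_root]

# Pillai's trace sums ev / (1 + ev); Wilks' lambda multiplies 1 / (1 + ev).
pillai = sum(ev / (1 + ev) for ev in eigenvalues)
wilks = 1.0
for ev in eigenvalues:
    wilks /= 1 + ev

print(f"Pillai's trace: {pillai:.3f}")
print(f"Wilks' lambda:  {wilks:.3f}")
```

Because the inputs are already rounded to three decimals, Pillai's trace comes out near .266 rather than the tabled .265, while Wilks' lambda reproduces .737.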

PS: the assumption of equal variance/covariance matrices is necessary because part of the
discriminant function analysis or MANOVA involves pooling the within-cells variance/covariance
matrices, just like we pooled variances when doing t tests and ANOVAs.

Pooled Within-Groups Matrices (a)
                           Masculinity   Femininity
Covariance   Masculinity   .459          -.033
             Femininity    -.033         .357
a. The covariance matrix has 4099 degrees of freedom.
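
The pooling can be sketched in a few lines of Python: each element of the pooled matrix is the average of the group values, weighted by degrees of freedom. The group matrices are the three-decimal SPSS values shown earlier. The 303 degrees of freedom for the Other group are my assumption, chosen so the three groups sum to the 4099 reported in the footnote.

```python
# Degrees of freedom per group; the value for "Other" is assumed.
df = {"Male": 1585, "Female": 2211, "Other": 303}

# SPSS covariance matrices as (variance Masculinity, covariance, variance Femininity).
cov = {
    "Male": (0.495, -0.077, 0.435),
    "Female": (0.430, -0.006, 0.303),
    "Other": (0.484, 0.003, 0.348),
}

total_df = sum(df.values())

# df-weighted average of each matrix element across the three groups.
pooled = tuple(
    sum(df[group] * cov[group][i] for group in df) / total_df for i in range(3)
)

labels = ("variance Masculinity", "covariance", "variance Femininity")
for label, value in zip(labels, pooled):
    print(f"{label}: {value:.3f}")
```

Rounded to three decimals, the results should match the pooled within-groups matrix above.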

Return to 2 Group DFA
