
327.

Query: Non-Additivity in Two-Way Classifications with Missing Values


Author(s): D. A. Preece
Source: Biometrics, Vol. 28, No. 2 (Jun., 1972), pp. 574-577
Published by: International Biometric Society
Stable URL: http://www.jstor.org/stable/2556169
Accessed: 22-06-2016 23:03 UTC

574 BIOMETRICS, JUNE 1972

327 QUERY: Non-additivity in Two-way Classifications with Missing Values

Milliken and Graybill [1971] give a procedure (let us call it MG) for calculating the sum of squares for Tukey's one degree of freedom for non-additivity in two-way (say blocks × treatments) classifications with missing values. Intuition suggests the following alternative procedure (AP), which seems to be simpler in concept:
1. Estimate the missing values by a standard method for the model

$$y_{ij} = \mu + \beta_i + \tau_j + e_{ij}. \tag{1}$$

2. Estimate block and treatment parameters from the augmented data, i.e. from the non-missing values plus the estimated values. (This gives MG's parameters $\hat\beta_i$ and $\hat\tau_j$.)
3. Calculate Tukey's sum of squares using $\hat\beta_i$, $\hat\tau_j$, and the augmented data.
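The three AP steps can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function names, the use of `None` to mark a missing cell, the iterative refitting used for step 1, and the 3 × 3 table are all assumptions made for the example.

```python
def additive_fit(table):
    """Grand mean, block effects and treatment effects of a complete table."""
    b, t = len(table), len(table[0])
    mu = sum(sum(row) for row in table) / (b * t)
    beta = [sum(row) / t - mu for row in table]
    tau = [sum(row[j] for row in table) / b - mu for j in range(t)]
    return mu, beta, tau


def fill_missing(table, tol=1e-12, max_iter=10_000):
    """Step 1: replace every None by its estimate under the additive model (1).

    Assumed method: start from the mean of the observed values, then
    repeatedly refit the additive model to the augmented table and reset
    each missing cell to its fitted value until the values settle.
    """
    observed = [v for row in table for v in row if v is not None]
    start = sum(observed) / len(observed)
    holes = [(i, j) for i, row in enumerate(table)
             for j, v in enumerate(row) if v is None]
    aug = [[start if v is None else float(v) for v in row] for row in table]
    for _ in range(max_iter):
        mu, beta, tau = additive_fit(aug)
        change = 0.0
        for i, j in holes:
            fitted = mu + beta[i] + tau[j]
            change = max(change, abs(fitted - aug[i][j]))
            aug[i][j] = fitted
        if change < tol:
            break
    return aug


def ap_sum_of_squares(table):
    """Steps 1 to 3 of AP: Tukey's sum of squares from the augmented data."""
    aug = fill_missing(table)                      # step 1
    _, beta, tau = additive_fit(aug)               # step 2
    cross = sum(aug[i][j] * beta[i] * tau[j]       # step 3: y'g
                for i in range(len(beta)) for j in range(len(tau)))
    norm = sum(x * x for x in beta) * sum(x * x for x in tau)   # g'g
    return cross ** 2 / norm


# Hypothetical 3 x 3 table with one missing cell (not from the paper).
example = [[4, 6, 9],
           [5, 8, 12],
           [7, 11, None]]
augmented = fill_missing(example)
print("estimated missing value:", augmented[2][2])
print("AP sum of squares:", ap_sum_of_squares(example))
```

Only one incomplete analysis (the loop in `fill_missing`) is needed, which is the computational attraction of AP described in the text. For a table with a single missing cell the iteration should settle on the same value as the usual closed-form estimate $(bB + tT - G)/[(b-1)(t-1)]$.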

AP, if satisfactory, would seem to be more suited than MG for inclusion in computer programs. This is because all the AP calculations are on the same variate (at first incomplete, then augmented), whereas MG first analyses the incomplete variate comprising the values $y_{ij}$, and then the newly-defined incomplete variate comprising the values $f_{ij}$. Thus, without elaborate programming, MG requires storage for an extra variate. Also, if incomplete variates are analysed by iterative estimation of missing values, then MG requires two sets of iterations, whereas AP needs only one.
Thus one is led to ask
(a) Are MG and AP equivalent?

and, if they are not,

(b) Is AP a good approximation to MG? Does AP have any merit?

ANSWER:

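Let the plots be so ordered that those with missing values come last. Use Milliken and Graybill's notations $y_{ij}$, $b$, $t$, $C$, $\hat\beta_i$, $\hat\tau_j$, $f_{ij}$, $z_{ij}$, and $Q_1$ without any changes of meaning. Thus the MG sum of squares for Tukey's one degree of freedom is

$$Q_1 = \Big(\sum_{(i,j)\in C} y_{ij} z_{ij}\Big)^2 \Big/ \sum_{(i,j)\in C} z_{ij}^2 .$$

Let $y_1$ be the vector of the observations $y_{ij}$, and let $z_1$ be that of the quantities $z_{ij}$. Let

$$y = \begin{bmatrix} y_1 \\ \hat y_2 \end{bmatrix} \quad\text{and}\quad z = \begin{bmatrix} z_1 \\ 0 \end{bmatrix},$$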


where $\hat y_2$ is the vector of estimated missing values (estimated, that is, on the model without any interaction term) and $0$ is a zero vector of the same size. Let $g$ be the vector (of the same size as $y$ and $z$) whose elements are $\hat\beta_i \hat\tau_j$. Thus the AP sum of squares for Tukey's one degree of freedom is

$$Q_2 = (y'g)^2 / g'g.$$
We now find a relationship between Q1 and Q2 . We use matrix notation
based on the notations of Plackett [1960] and Draper [1961].
If variate $y$ had had no missing values, and we had used model (1) for the analysis, the residuals would have been $Hy$, say, where $H$ is a symmetric idempotent matrix independent of the data. Returning to the incomplete variate $y$, it is not hard to show that

$$Hg = g. \tag{2}$$

Partition $H$ conformably with $y$, $z$, and $g$ as

$$H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}.$$

We assume that $H_{22}$ is non-singular, which implies that the parameters of (1) can be estimated from the non-missing values (see Preece [1971]). It follows that $H_{22}$ is positive definite.
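After a little algebra, equation (2) gives

$$g_1 - H_{12} H_{22}^{-1} g_2 = (H_{11} - H_{12} H_{22}^{-1} H_{21})\, g_1 ,$$

where $g_1$ and $g_2$ are the sub-vectors of $g$ for the non-missing and missing plots respectively. Also

$$\hat y_2 = -H_{22}^{-1} H_{21}\, y_1 .$$

Thus

$$\begin{aligned}
\sum_{(i,j)\in C} y_{ij} z_{ij} &= y_1' z_1 = y'z \\
&= y_1' (H_{11} - H_{12} H_{22}^{-1} H_{21})\, g_1 \\
&= y_1' (g_1 - H_{12} H_{22}^{-1} g_2) \\
&= y_1' g_1 + \hat y_2' g_2 \\
&= y'g,
\end{aligned}$$

and similarly

$$\sum_{(i,j)\in C} z_{ij}^2 = g'g - g_2' H_{22}^{-1} g_2 .$$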


Thus the AP sum of squares for Tukey's one degree of freedom is

$$Q_2 = \frac{(y'g)^2}{g'g} = \frac{\Big(\sum_{(i,j)\in C} y_{ij} z_{ij}\Big)^2}{\sum_{(i,j)\in C} z_{ij}^2 + g_2' H_{22}^{-1} g_2}, \tag{3}$$

which is less than or equal to $Q_1$, the value produced by MG. Thus AP would be a satisfactory approximation to MG for small $g_2' H_{22}^{-1} g_2$. The null hypothesis that the data are additive will be rejected by AP only if also rejected by MG.
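Relation (3) can be checked numerically. The sketch below, which is an illustration rather than either paper's program, carries out MG directly (two incomplete analyses) and AP (one) on the 4 × 5 data set quoted later in this answer, and compares the results. It assumes a single missing cell, so that the usual single-missing-value formula $(bB + tT - G)/[(b-1)(t-1)]$ can be used for both incomplete variates and $H_{22}$ reduces to the scalar $(b-1)(t-1)/(bt)$; the function and variable names are invented for the example.

```python
def additive_fit(table):
    """Grand mean, block effects and treatment effects of a complete table."""
    b, t = len(table), len(table[0])
    mu = sum(sum(row) for row in table) / (b * t)
    beta = [sum(row) / t - mu for row in table]
    tau = [sum(row[j] for row in table) / b - mu for j in range(t)]
    return mu, beta, tau


def fill_single(table, i0, j0):
    """Copy of `table` with cell (i0, j0) replaced by its additive-model
    estimate (bB + tT - G) / ((b - 1)(t - 1)); the old entry is ignored."""
    b, t = len(table), len(table[0])
    B = sum(table[i0][j] for j in range(t) if j != j0)     # rest of the block
    T = sum(table[i][j0] for i in range(b) if i != i0)     # rest of the treatment
    G = sum(table[i][j] for i in range(b) for j in range(t)
            if (i, j) != (i0, j0))                         # all observed values
    aug = [list(row) for row in table]
    aug[i0][j0] = (b * B + t * T - G) / ((b - 1) * (t - 1))
    return aug


y = [[8, 12, 14, 19, 22],
     [10, 16, 22, 24, 28],
     [15, 21, 24, 31, 34],
     [19, 27, 32, 42, None]]      # None marks the missing plot
i0, j0 = 3, 4
b, t = len(y), len(y[0])
observed = [(i, j) for i in range(b) for j in range(t) if (i, j) != (i0, j0)]

# AP: one incomplete analysis, then Tukey's formula on the augmented variate.
y_aug = fill_single(y, i0, j0)
_, beta, tau = additive_fit(y_aug)
g = [[beta[i] * tau[j] for j in range(t)] for i in range(b)]
yg = sum(y_aug[i][j] * g[i][j] for i in range(b) for j in range(t))
gg = sum(g[i][j] ** 2 for i in range(b) for j in range(t))
Q2 = yg ** 2 / gg

# MG: a second incomplete analysis, of the variate g restricted to C,
# gives the residuals z_ij on the observed cells.
g_aug = fill_single(g, i0, j0)
mu_g, beta_g, tau_g = additive_fit(g_aug)
z = {(i, j): g[i][j] - (mu_g + beta_g[i] + tau_g[j]) for i, j in observed}
yz = sum(y[i][j] * z[i, j] for i, j in observed)
zz = sum(v * v for v in z.values())
Q1 = yz ** 2 / zz

h22 = (b - 1) * (t - 1) / (b * t)          # scalar H_22 for one missing plot
disturbance = g[i0][j0] ** 2 / h22         # g_2' H_22^{-1} g_2

print("sum y z =", yz, " y'g =", yg)
print("sum z^2 =", zz, " g'g - disturbance =", gg - disturbance)
print("Q1 =", Q1, " Q2 =", Q2)
```

The two numerators should coincide, and the two denominators should differ by exactly the disturbance term, so that `Q2 <= Q1`.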

The disturbance term $g_2' H_{22}^{-1} g_2$ in (3) is unlikely to be small when there are more than a very few missing values, in particular when the data come from an incomplete block design. (Milliken and Graybill use 'missing data' to cover any incomplete two-way classification, whether the incompleteness is planned or due to mishap.) Thus we are led to the question: is AP a good approximation to MG when there is only one missing value?
If the single missing value is for $i = i'$, $j = j'$, then

$$g_2' H_{22}^{-1} g_2 = [bt\,\hat\beta_{i'}^2 \hat\tau_{j'}^2] / [(b-1)(t-1)].$$
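The single-missing-value form of the disturbance term follows from the diagonal elements of $H$. A sketch of the step, writing $I_n$ for the identity and $J_n$ for the $n \times n$ matrix of ones (notation introduced here, not in the original), and assuming the complete design is the usual $b \times t$ two-way layout with one observation per cell:

```latex
% Residual projection for the complete b x t additive model,
% with the plots taken in block-by-treatment order:
H = \left(I_b - \tfrac{1}{b} J_b\right) \otimes \left(I_t - \tfrac{1}{t} J_t\right),
\qquad
h_{kk} = \left(1 - \tfrac{1}{b}\right)\left(1 - \tfrac{1}{t}\right) = \frac{(b-1)(t-1)}{bt}.

% With one missing plot (i', j'), H_{22} is the scalar h_{kk} and
% g_2 = \hat\beta_{i'} \hat\tau_{j'}, so
g_2' H_{22}^{-1} g_2 = \frac{g_2^2}{H_{22}}
  = \frac{bt\, \hat\beta_{i'}^2 \hat\tau_{j'}^2}{(b-1)(t-1)}.
```

Reordering the plots so that the missing one comes last permutes the rows and columns of $H$ but leaves its diagonal elements unchanged.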


For Milliken and Graybill's illustrative example [1971] with a single missing value,

$$g_2' H_{22}^{-1} g_2 = 0.119987,$$

whence

$$Q_2 = 0.661344/(1.796836 + 0.119987) = 0.3450;$$

thus the error sum of squares (2 D.F.) for AP is 0.3708 - 0.3450 = 0.0258 and the corresponding F value for non-additivity is 0.3450/0.0129 = 26.7, which is about one-tenth of the F value given by MG. Thus there is here a very large discrepancy between the two methods, even though $g_2' H_{22}^{-1} g_2$ is small in comparison with $\sum_{(i,j)\in C} z_{ij}^2$. The discrepancy arises because each of $Q_1$ and $Q_2$ constitutes such a large part of the residual sum of squares with 3 D.F.
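The arithmetic of this example can be retraced in a few lines. The sketch below uses only the figures quoted above (which are rounded, so the results are approximate); the variable names and the helper `f_ratio` are introduced here for illustration. The MG value of F is not quoted in the text, so the figure the code produces for it should be read as an approximation derived from $Q_1 = 0.661344/1.796836$.

```python
NUMERATOR = 0.661344      # (sum of y_ij * z_ij)^2
SUM_Z2 = 1.796836         # sum of z_ij^2 over the observed cells
DISTURBANCE = 0.119987    # g_2' H_22^{-1} g_2
RESIDUAL_SS = 0.3708      # residual sum of squares, 3 d.f.
ERROR_DF = 2              # 3 d.f. less 1 for non-additivity


def f_ratio(q):
    """F for non-additivity: q against the remaining error mean square."""
    return q / ((RESIDUAL_SS - q) / ERROR_DF)


q1 = NUMERATOR / SUM_Z2                   # MG
q2 = NUMERATOR / (SUM_Z2 + DISTURBANCE)   # AP, relation (3)

print(f"Q1 = {q1:.4f}, F = {f_ratio(q1):.1f}")
print(f"Q2 = {q2:.4f}, F = {f_ratio(q2):.1f}")
print(f"ratio of F values = {f_ratio(q1) / f_ratio(q2):.1f}")
```

Because no intermediate rounding is applied, the AP value of F may differ in the last digit from the 26.7 obtained in the text from the rounded quotient 0.3450/0.0129. The small change in the denominator moves $Q$ only slightly, but the error sum of squares, being the small difference between two nearly equal quantities, changes roughly tenfold.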

An example with a larger value of $bt$ (so that the proportion of observations missing is smaller), and with $\hat\beta_{i'}$ and $\hat\tau_{j'}$ large, is afforded by the data

     8   12   14   19   22
    10   16   22   24   28
    15   21   24   31   34
    19   27   32   42   Missing.

These data (devised for easy calculation) have large marginal effects, and a clear Tukey-type interaction. The MG F value for non-additivity is 32.3, whereas the corresponding AP value is 13.3. The estimated missing value used by AP is 40 (exactly). Allowance for the interaction, when fitting the missing value, is to be expected to correspond to a larger estimate than 40. Indeed, MG is equivalent to insertion of an estimate of approximately 42.3.
The examples just given indicate that AP cannot be recommended.


However, our algebra suggests that, when there is only one missing value, and when calculations are to be done on a desk calculating machine, the formula

$$Q_1 = \frac{(y'g)^2}{g'g - [bt\,\hat\beta_{i'}^2 \hat\tau_{j'}^2]/[(b-1)(t-1)]}$$

might to advantage be used either instead of Milliken and Graybill's formula or for checking the calculations.
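As an illustration of this check, the sketch below applies the formula to the 4 × 5 data set given earlier, obtaining $Q_2$ from the augmented data and $Q_1$ from the formula, and then forms the two F values. It is a minimal sketch under stated assumptions: one missing cell, the standard single-missing-value estimate $(bB + tT - G)/[(b-1)(t-1)]$, and an error term with $(b-1)(t-1) - 2$ degrees of freedom (one lost to the missing value, one to non-additivity); all names are invented for the example.

```python
y = [[8, 12, 14, 19, 22],
     [10, 16, 22, 24, 28],
     [15, 21, 24, 31, 34],
     [19, 27, 32, 42, None]]          # None marks the missing plot
b, t = len(y), len(y[0])
i0, j0 = 3, 4

# Estimate the missing value from block, treatment and grand totals.
B = sum(v for v in y[i0] if v is not None)
T = sum(row[j0] for row in y if row[j0] is not None)
G = sum(v for row in y for v in row if v is not None)
missing = (b * B + t * T - G) / ((b - 1) * (t - 1))
aug = [[missing if v is None else v for v in row] for row in y]

# Effects and residual sum of squares from the augmented data.
mu = sum(map(sum, aug)) / (b * t)
beta = [sum(row) / t - mu for row in aug]
tau = [sum(row[j] for row in aug) / b - mu for j in range(t)]
residual_ss = sum((aug[i][j] - mu - beta[i] - tau[j]) ** 2
                  for i in range(b) for j in range(t))

# y'g and g'g, where g has elements beta_i * tau_j.
yg = sum(aug[i][j] * beta[i] * tau[j] for i in range(b) for j in range(t))
gg = sum(x * x for x in beta) * sum(x * x for x in tau)

q2 = yg ** 2 / gg                                               # AP
correction = b * t * (beta[i0] * tau[j0]) ** 2 / ((b - 1) * (t - 1))
q1 = yg ** 2 / (gg - correction)                                # MG

error_df = (b - 1) * (t - 1) - 2
f_ap = q2 / ((residual_ss - q2) / error_df)
f_mg = q1 / ((residual_ss - q1) / error_df)
print(f"missing value = {missing:g}")
print(f"AP: Q2 = {q2:.3f}, F = {f_ap:.1f}")
print(f"MG: Q1 = {q1:.3f}, F = {f_mg:.1f}")
```

The printed values should agree with the estimate of 40 and the F values of 13.3 (AP) and 32.3 (MG) quoted in the text.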

REFERENCES
Draper, N. R. [1961]. Missing values in response surface designs. Technometrics 3, 389-98.
Milliken, G. A. and Graybill, F. A. [1971]. Tests for interaction in the two-way model with missing data. Biometrics 27, 1079-83.
Plackett, R. L. [1960]. Principles of Regression Analysis. Clarendon Press, Oxford.
Preece, D. A. [1971]. Iterative procedures for missing values in experiments. Technometrics 13, 743-53.
Tukey, J. W. [1949]. One degree of freedom for nonadditivity. Biometrics 5, 232-42.


D. A. PREECE

University of Kent

Canterbury, England

Key Words: Incomplete data; Missing values; Non-additivity; Tukey's one degree of
freedom; Two-way classifications.

328 NOTE: The Use of Non-Parametric Methods in the Statistical Analysis of the Two-Period Change-Over Design
GARY G. KOCH
Department of Biostatistics, University of North Carolina,

Chapel Hill, North Carolina 27514, U. S. A.

SUMMARY

The two-period change-over design is often used in clinical trials in which subjects serve as their own controls. This paper is concerned with the statistical analysis of data arising from such subjects when assumptions like variance homogeneity and normality do not necessarily apply. Test procedures for hypotheses concerning direct effects and residual effects of treatments and period effects are formulated in terms of Wilcoxon statistics as calculated on appropriate within-subject linear functions of the observations. Thus they may be readily applied to small-sample data.

1. A TWO-PERIOD CHANGE-OVER EXPERIMENT

Let us consider the data from an experiment undertaken at the Dental

Research Center, University of North Carolina. The design was as follows:
