John W. Tukey was an atypical intellectual for our times, a thinker of surprising inventiveness. He spent his whole academic career at Princeton University, almost from the start dividing his time between Bell Labs and the University. Each term he taught courses for undergraduates and doctoral students. He cherished teaching and was beloved and highly respected by his students.

He had many interests outside of science, but was such a consummate statistician that we thought it might bring him closer to you, the reader, if we offer you a glimpse of one of his courses.

A COURSE ON COMBINING DATASETS

In the spring term of 1982, John W. Tukey taught a graduate course entitled "Combining Datasets." At that time, John's teaching method was firmly established. John wrote transparencies by hand and Eileen Olszewski, for many years his secretary at Princeton, typed them for distribution to the students. During lectures John used two projectors, with each slide being first discussed and then moved to the second projector.

Combining Datasets was announced as follows:

    The purpose of this course will be to review methods of combining results. That is, the simplest problems of data analysis—treating single batches, regression, and analysis of variance—all involve a single (often internally structured) body of data. The next natural step is to put together—to combine—the results of such separate analyses.

    We do this in many ways, using as little, from each individual data body, as an apparent direction and as much as an estimated amount; together with an estimated variance for that estimate, and an indicated number of degrees of freedom for that estimated variance. We will try to work our way through the most general of these methods, beginning with the simplest and adding complexity step by step.

The chapter headings included:

    General outline; Combining independent results and assessing their significance; Combination of directionalities and indications of directionality; Combining tests of fit; Combining values to get a value; Combining values to get an interval; Combining intervals to get a value: group by group (scared, Paull-2, Paull-2F and PL combinations); Externally weighted combination of intervals to get an interval; Which combination when?

Chapter 12 discussed a statistical problem, the combination of intervals to get a value, that had fascinated John over a long period, and the remainder of this section consists essentially of an extract of the course notes.

    In this chapter and the next, we start from intervals—essentially values with estimated uncertainties—and combine them to get a value. Essentially, then, our problem is to choose the weights with which our values are to be combined, in the first instance on the basis of the given interval lengths (and, in the second, if we go over to resistive combination, with a view to a more precise result).

    The crucial consideration, which we met casually above, but which now drives and determines our choice of method, is the question of whether the values to be combined estimate the same thing, or whether each value to be combined estimates a different thing—and we want to estimate a summary of the different things that are to be combined. Exhibit 1 illustrates one extreme, where it is clear that both:

    • the intervals are NOT estimating the same thing, and

Stephan Morgenthaler is Professor, Institute of Mathematics, Swiss Federal Institute of Technology (EPFL), 1015 Lausanne, Switzerland (e-mail: stephan.morgenthaler@epfl.ch).
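The combination problem set up in these course notes, turning several interval estimates into one value by choosing weights from the interval lengths, starts from the classical inverse-variance rule. The sketch below illustrates that rule only; it is not code from the course, and the function name and sample numbers are invented here.

```python
import math

def combine_intervals(estimates, standard_errors):
    """Combine independent estimates into one value by inverse-variance
    weighting: each value is weighted by 1/se^2, the naive weight that
    the notes take as the first-instance choice.  Returns the weighted
    mean and its standard error."""
    weights = [1.0 / se**2 for se in standard_errors]
    total = sum(weights)
    combined = sum(w * x for w, x in zip(weights, estimates)) / total
    return combined, math.sqrt(1.0 / total)

# Example: three series that we hope estimate the same quantity.
value, se = combine_intervals([10.2, 9.8, 10.5], [0.5, 0.4, 1.0])
```

The rule is optimal only when every interval really does estimate the same thing and every standard error is trustworthy; the passages that follow are about what goes wrong when some standard errors are accidentally far too small.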
JOHN W. TUKEY AS TEACHER 343
EXHIBIT 1.   EXHIBIT 2.
If we weight by 1/s_i^2, some s_i^2 will be much too small, so that 1/s_i^2 will be much too big, so that we will have done just exactly what we should not!

Partial Weighting

For the last quarter century, the standard response to this challenge has been partial weighting, as expounded in Cochran's paper of 1954, in which, after a careful inquiry, Cochran confirmed the usefulness of the following procedure:

• find a naive weight for each series;
• order the series by these naive weights;
• take 1/2 to 2/3 of the series with the highest naive weights (no recipe for how to pick the exact fraction);
• for the series in this low-variability group, use the mean of all their naive weights as their weight, or, better, do this with the reciprocals;
• for the remaining series, use their naive weights.

This procedure of "partial weighting" meets the major difficulties head on, and overcomes them. So long as most series tend to have about the same accuracy, partial weighting will ensure that no individual series gets a catastrophically high weight. It is not surprising that it has been a standard for so many years.

If we want to look for difficulties, we are likely to focus upon:

1. the absence of a recipe for the size of the lower group, and
2. the possible misweighting of the series NOT in the lower group.

Grouped Weighting

Let us plan to take the natural variation of the naive weights (which we suppose to be 1/s_i^2's) into account. How can we do this? In the overutopian case, where the series summaries come from individual measurements that follow a Gaussian distribution, each s_i^2 is distributed like a multiple of χ_f^2/f. We could certainly try to allow for this much variability.

In the real world, where series summaries come from individual measurements that follow some stretched-tail distribution, each s_i^2 will be more widely dispersed than a multiple of χ_f^2/f. We get part way there by allowing for χ_f^2/f variability. Even part way is good! (We return, in the next section, to a way of going further.)

The Order Statistics of χ_f^2/f. The simplest way to deal with an unknown multiple of χ_f^2/f is to focus on ratios, and adjust by division. Suppose we have a number of s_i^2, for convenience based on a common number of df. How might we bring them more nearly to a common value?

If they are indeed from the distribution of

    k χ_f^2/f

for a common k, and if we order them, the ordered values will behave like order statistics from k χ_f^2/f or, equivalently, as k times the order statistics of a sample from χ_f^2/f, so that it is natural to look at

    s_i^2 / c(i|n, f),

where c(i|n, f) is a typical value for the i-th order statistic of n from χ_f^2/f. These ought all to behave as though they were estimating k—which in our context is the common σ^2 from which all the s_i^2 came.

A reasonable approximation to the c(i|n, f), good enough for many purposes, can be had from

    c(i|n) = Gau^{-1}( (3i − 1) / (3n + 1) )

and

    c(i|n, f) = ( 1 − 2/(9f) + sqrt(2/(9f)) · c(i|n) )^3,    f ≥ 3,

so that we have numbers for c(i|n, f) when we want them.

One way to proceed would be to replace the naive weight

    w_i^* = 1 / s_i^2

by the pushed-back weight

    W_i^* = 1 / ( s_i^2 / c(i|n, f) ) = c(i|n, f) / s_i^2.
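The grouped-weighting recipe above, approximating c(i|n, f) through a Gaussian quantile pushed through a cube, and then dividing each ordered s_i^2 by the typical order statistic it should resemble, can be sketched in a few lines. This is an illustrative sketch, not code from the course; the function names and the example data are invented here, and NormalDist().inv_cdf stands in for Gau^{-1}.

```python
import math
from statistics import NormalDist

def c_approx(i, n, f):
    """Approximate typical value of the i-th order statistic (i = 1..n)
    of a sample of n from chi^2_f / f, following the notes:
    c(i|n) = Gau^{-1}((3i - 1)/(3n + 1)), then the cube formula,
    intended for f >= 3."""
    z = NormalDist().inv_cdf((3 * i - 1) / (3 * n + 1))
    return (1 - 2 / (9 * f) + math.sqrt(2 / (9 * f)) * z) ** 3

def pushed_back_weights(s2, f):
    """Replace the naive weights 1/s_i^2 by c(i|n, f)/s_i^2: rank the
    s_i^2 and divide each by the typical order statistic it would
    resemble if all series shared a common variance."""
    n = len(s2)
    ranks = sorted(range(n), key=lambda j: s2[j])  # smallest s_i^2 first
    W = [0.0] * n
    for i, j in enumerate(ranks, start=1):         # i is the rank, 1..n
        W[j] = c_approx(i, n, f) / s2[j]
    return W

# Example: four series with 5 df each.  The suspiciously small first
# s_i^2 is divided by a small c(1|4, 5), so its weight is pushed back
# below the naive 1/s_i^2 = 5.
weights = pushed_back_weights([0.2, 1.0, 1.1, 4.0], f=5)
```

The design choice mirrors the notes: working with ratios s_i^2 / c(i|n, f) removes the unknown common multiple k, so no single series can claim a catastrophically high weight merely because its s_i^2 happened to come out small.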