Professional Documents
Culture Documents
Judea Pearl
Cognitive Systems Laboratory
Computer Science Department
University of California, Los Angeles, CA 90024
judea@cs.ucla.edu
if it is identied and not over-identied, that is, for this desirable feature may vary from parameter to pa-
every we can nd an instantiation mi such that
0 rameter became deeply entrenched in the literature.
(mi ) = .0 Paralleling the desirability of over-identi
ed models,
most researchers expect over-identi
ed parameters to
De
nition 3 highlights the desirable aspect of over- be more robust than just-identi
ed parameters. Typi-
identi
cation { testability. It is only by violating its cal of this expectation is the economists' search for two
implied constraints that we can falsify a model, and it or more instrumental variables for a given parameter
is only by escaping the threat of such violation that a Bowden and Turkington, 1984].
model attains our con
dence, and we can then state The intuition behind this expectation is compelling.
that the model and some of its implications (or claims) Indeed, if two distinct sets of assumptions yields two
are corroborated by the data. methods of estimating a parameter and if the two es-
Traditionally, model over-identi
cation has rarely been timates happen to coincide in data at hand, it stands
determined by direct examination of the model's con- to reason that the estimates are correct, or, at least
straints but, rather indirectly, by attempting to solve robust to the assumptions themselves. This intuition
for the model parameters and discovering parameters is the guiding principle of this paper and, as we shall
that can be expressed as two or more distinct1 func- see, requires a careful de
nition before it can be ap-
tions of , for example, p = f1() and p = f2 (). plied formally.
This immediately leads to a constraint f1 () = f2 () If we take literally the criterion that a parameter is
which, according to De
nition 3, renders the model over-identi
ed when it can be expressed as two or
over-identi
ed, since every for which f1 ( ) = f2 ( )
0 0 0
more distinct functions of the covariance matrix , we
must be excluded by the model.
6
whenever an over-identi
ed model induces a constraint identi
ed. However, this conclusion clashes violently
( ) = 0, it also yields (at least) two solutions for
g with intuition.
any identi
ed parameter = ( ), because we can
p f
To see why, imagine a situation in which is not mea-
always obtain a second, distinct solution for by writ- p
sured. The model reduces then to a single link ! ,
c
must be formulated to re
ne the notion of \two dis-
tinct functions." Such quali
cations will be formulated = b Ryx
in Section 4. But before delving into this formulation and would be classi
ed as just-identi
ed. In other
we present the diculties in de
ning robustness (or
b
which stands for the equations: of . The capacity to measure new causes of a vari-
y
tural parameters, we obtain: Our next section makes this distinction formal.
=
4 ASSUMPTION-BASED OVER-
Ryx b
=
Rzx
Rzy =
bc
c
IDENTIFICATION
where is the regression coecient of on . and
Ryx y x b
Rzx = Ryx Rzy (3) 2. each set induces a distinct function p = f (),
3. each assumption set is minimal, that is, no proper
If we take literally the criterion that a parameter is subset of those assumptions is sucient for the
over-identi
ed when it can be expressed as two or more derivation of p.
distinct functions of the covariance matrix , we get
De
nition 4 diers from the standard criterion in two (5) cov(ez ey ) = 0
important aspects. First, it interprets multiplicity of
solutions in terms of distinct sets of assumptions un- (6) cov(e1 ey ) = 0
derlying those solutions, rather than distinct functions
from to p. Second, De
nition 4 insists on the sets of This set yields the estimand: c = Rzy x = (Rzy ;
assumptions being minimal, thus ruling out redundant Rzx Ryx )=(1 ; Ryx
2 ),
be just-identi
ed if it is identied to the degree 1 (see Rzx Ryx )=(1 ; Ryx
2 ),
of (S W Z ).
Y is said to be maximal in G, if it is an IV-pair for
Note that the robust estimand of is 32 1 , not 32 . assumption in this case is irrelevant to the identi
ca-
This is because the pair ( = 2 = ), which yields
c R R
tion of the target quantity.
W X Z
32 , is not maximal there exists an edge-supergraph This section reformulate the notion of over-
of (shown in Fig. 3(b)) in which = fails to -
R
G Z d
identi
cation in terms of the whether the data
separate 2 from 3 , while = 1 does -separate
X X Z X d
can the set of relevant assumptions (for a given
2 from 3 . The latter separation quali
es ( = quantity) are testable.
= 1 ) as an IV-pair for 2 ! 3 , and yields
X X W
X 2 Z X X X
c = 32 1 .
R If the target of analysis is a parameter (or a set of
p
The question remains how we can perform the test support that the estimation of earns through con-
p
without constructing all possible supergraphs. frontation with the data, we need to assess the dis-
Every ( W Z) pair has a set ( ) of maximally
lled
S W Z parity between the data and the model assumptions,
graphs, namely supergraphs of to which we cannot G but we need to consider only those assumptions that
add any edge without spoiling condition (2) of Lemma are relevant to the identi
cation of , all other assump-
p
clude graphs that admit no IV-pair for X ! Y , yet permit, One simplistic de
nition would be to classify as rel-
nevertheless, the identication of c through the collective evant assumptions that are absolutely necessary for
action of k IV-pairs for k parameters (e.g., Pearl, 2000, the identi
cation of . In the model of Figure 1,
Fig. 5.11]). The precise graphical characterization of this p
class of graphs is currently under formulation, but will not since can be identi
ed even if we violate the as-
b
cient for identifying p" and let the symbol \j= p" discarding the portions of the model associated with
denote the relation \sucient for identifying p" (6j= p, z . Since Mb is saturated (that is, just identi
ed) it
its negation). If A is a member of some msa then, by has zero degrees of freedom, and we can say that b
de
nition, it is relevant. Conversely, if A is relevant, has zero degrees of freedom, or df (b) = 0. If q = c,
we will construct a msa of which A is a member. If Mc would be the entire model M , because the union
A is relevant to p, then there exists a set S such that of assumption sets A1 A2 and A3 span all the seven
S + A j= p and S 6j= p. Consider any minimal subset assumptions of M . We can therefore say that c has
S of S that satis
es the properties above, namely
0
one degree of freedom, or df (c) = 1.
S + A j= p and S 6j= p
0 0
b c
and, for every proper subset S of S , we have (from
00 0 x y z b c
minimality) d x y z
7 CONCLUSIONS
This paper gives a formal de
nition to the notion of
robustness, or over-identi
cation of causal parameters.
This de
nition resolves long standing diculties in
rendering the notion of robustness operational. We
also established a graphical method of quantifying the
degree of robustness. The method requires the con-
struction of maximal supergraphs sucient for render-
ing a parameter identi
able and counting the number
of such supergraphs with distinct estimands.
Acknowledgement
This research was partially supported by AFOSR
grant #F49620-01-1-0055, NSF grant #IIS-0097082,
and ONR (MURI) grant #N00014-00-1-0617.
References
Bollen, 1989] K.A. Bollen. Structural Equations with
Latent Variables. John Wiley, New York, 1989.
Bowden and Turkington, 1984] R.J. Bowden and
D.A. Turkington. Instrumental Variables. Cam-
bridge University Press, Cambridge, England,
1984.