
Matrices and Graphs
Theory and Applications to Economics

Proceedings of the Conferences on
Matrices and Graphs: Theory and Applications to Economics
University of Brescia, Italy, 8 June 1993 and 22 June 1995

Sergio Camiz
Dipartimento di Matematica "Guido Castelnuovo"
Università di Roma "La Sapienza", Italy

Silvana Stefani
Dipartimento Metodi Quantitativi
Università di Brescia, Italy

World Scientific
Singapore · New Jersey · London · Hong Kong

Published by

World Scientific Publishing Co. Pte. Ltd.


PO Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

MATRICES AND GRAPHS


Theory and Applications to Economics
Copyright © 1996 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

ISBN 981-02-3038-9

This book is printed on acid-free paper.

Printed in Singapore by Uto-Print



FOREWORD

The Editors

The idea to publish this book was born during the conference «Matrices
and Graphs: Computational Problems and Economic Applications», held back
in June 1993 at Brescia University. The conference was such a success that
the organizers, actually the present editors themselves, after a short talk with
the lecturers, decided on the spot to apply to the Italian Consiglio Nazionale
delle Ricerche (CNR) for a contribution towards publishing the conference
proceedings. The second editor did so, and the contribution came after a while.
In the meantime the editors organized another conference, «Matrices
and Graphs: Theory and Economic Applications», held like the previous one in
Brescia, during June 1995, partly with different invited lecturers. The confer-
ence was again a success, and therefore the first editor applied to the Italian
National Research Council and obtained a second contribution, which came only
recently.
While the lecturers of the first conference who were not at the second one
were a bit upset, having submitted their papers without seeing any proceedings
published at that time, the lecturers common to both conferences suggested
joining the contributions and publishing a single book for both. This is what
we did.
During all these years, both editors were very busy lecturing, researching,
publishing, and raising more funds to make their research possible. Most papers
arrived late and were carefully read by the editors; the search for suitable
referees was not easy, so the reviewing process also took a while, some
papers being sent back to the authors for corrections and then resubmitted
to the referees. A complete re-editing was necessary in order to achieve a
uniform editorial style... well, these are the reasons for such a delay, but
eventually here we are.
The book reflects our scientific research background: for academic and sci-
entific reasons both of us were drawn to different research subjects, both shift-
ing from pure to applied mathematics and statistics, with particular attention
to data analysis in many different fields for the first editor, and to operational
research and mathematical finance for the second. So, at each step of this
long way, we collected a bit of knowledge.
The fact that in most of our investigations we dealt with matrices and graphs
suggested that we investigate in how many different situations they may be used.

This was the reason that led to the conferences; as a result, this book looks
like a patchwork, as it is composed of different aspects. We submit it to the
readers, hoping that they will appreciate it, as we did.
In fact, the numerous contributions come from pure and applied math-
ematics, operations research, statistics, and econometrics. Roughly speaking, we
can divide the contributions by areas: Graphs and Matrices, from theoretical
results to numerical analysis; Graphs and Econometrics; and Graphs and
Theoretical and Applied Statistics.

Graphs and Matrices contributions begin with John Maybee: in his pa-
per New Insights on the Sign Stability Problem, he finds a new characterization
of a sign stable matrix, based on some properties of the eigenvectors associ-
ated to a sign semi-stable matrix. Zsolt Tuza in Lower Bounds for a Class
of Depth-Two Switching Circuits obtains a lower bound for a certain class of
(0,1) matrices. It is interesting to note that the problem can be formulated
in terms of a semicomplete digraph D, if one wants to determine the smallest
sum of the numbers of vertices in complete bipartite digraphs whose union is
the digraph D itself. Tiziana Calamoneri and Rossella Petreschi's Cubic
Graphs as Model of Real Systems is a survey on cubic graphs, i.e. regular
graphs of degree three, and at most cubic graphs, i.e. graphs with maximum
degree three, and shows a few applications in probability, military problems, and
financial networks. Silvana Stefani and Anna Torriero in Spectral Proper-
ties of Matrices and Graphs describe on the one hand how to deduce properties
of graphs through the spectral structure of the associated matrices and on the
other how to get information on the spectral structure of a matrix through the
associated graphs. New results are obtained towards the characterization of
real spectrum matrices, based on the properties of the associated digraphs.
Guido Ceccarossi in Irreducible Matrices and Primitivity Index obtains a
new upper bound for the primitivity index of a matrix through graph theory
and extends this concept to the class of periodic matrices. Sergio Camiz and
Vanda Tulli in Computing Eigenvalues and Eigenvectors of a Symmetric Ma-
trix: a Comparison of Algorithms compare Divide et Impera, a new numerical
method for computing eigenvalues and eigenvectors of a symmetric matrix, to
more classical procedures. Divide et Impera is used to integrate those proce-
dures based on similarity transformations at the step in which the eigensystem
of a tridiagonal matrix has to be computed.

Among contributions on Graphs and Econometrics we find Sergio Camiz's
paper I/O Analysis: Old and New Analysis Techniques. In this paper, Camiz
compares various techniques used in I/O analysis to reveal the complex struc-
ture of linkages among economic sectors: triangularization, linkages compari-
son, exploratory correspondence analysis, etc. Graph analysis, with such con-
cepts as centrality, connectivity, and vulnerability, turns out to be a useful tool
for identifying the main economic flows, since it is able to reveal the most impor-
tant information contained in the I/O table. Manfred Gilli in Graphs and
Macroeconometric Modelling deals with the search for a locally unique solution
of a system of equations and with necessary and sufficient conditions for this
solution to hold. He shows how, through a graph theoretic approach, the problem
can be efficiently investigated, in particular when the Jacobian matrix is large
and sparse, a typical case in most econometric models. Manfred Gilli and
Giorgio Pauletto in Qualitative Sensitivity Analysis in Multiequation Models
perform a sensitivity analysis of a given model when a linear approximation is
used, the sign is given, and there are restrictions on the parameters. They show
that a qualitative approach, based on graph theory, can be fruitful and lead
to conclusions which are more general than the quantitative ones, as they are
not limited to a neighborhood of the particular simulation path used. Mario
Faliva in Hadamard Matrix Product, Graph and System Theories: Motivations
and Role in Econometrics shows how the analysis of a model's causal struc-
ture can be handled by using Hadamard product algebra, together with graph
theory and system theoretical arguments. As a result, efficient mathematical
tools are developed to reveal the causal and interdependent mechanisms asso-
ciated with large econometric models. Finally, International Comparisons and
Construction of Optimal Graphs, by Bianca Maria Zavanella, contains an
application of graph theory to the analysis of the European Union countries
based on prices, quantities and volumes. Graph theory turns out to be a most
powerful tool to show which nations are more similar.

Graphs and Statistics papers are represented by three contributions. Gio-
vanna Jona Lasinio and Paola Vicard in Graphical Gaussian Models and
Regression review the use of graphs in statistical modelling. The relative mer-
its of the regression and graphical modelling approaches are described and compared,
both from the theoretical point of view and with applications to real data.
Francesco Lagona in Linear Structural Dependence of Degree One among
Data: a Statistical Model models the presence of some latent observations using
a linear structural dependence among data, thus deriving a particular Marko-
vian Gaussian field. Bellacicco and Tulli in Cluster Identification in a Signed
Graph by Eigenvalue Analysis establish a connection between clustering analysis
and graphs, by including clustering in the wide class of graph transforma-
tions in terms of cuts and insertion of arcs to obtain a given topology.

After this review, it should be clear how important the role of matrices
and graphs, and their mutual relations, is in theoretical and applied disciplines.
We hope that this book will contribute to this understanding.

We thank all the authors for their patience in revising their work. A
special thanks goes to Anna Torriero and Guido Ceccarossi for their constant
help; but especially we would like to thank Vanda Tulli, who did the complete
editing, trying to (and succeeding in) bringing order among the many versions
of the papers we received during the revision process. Last, but not least, thanks
to Mrs. Chionh of World Scientific Publishers in Singapore, whom we do not
know personally, but whose efficiency we had the opportunity to appreciate through
e-mail.

October, 1996.
Sergio Camiz and Silvana Stefani

The manuscripts by Sergio Camiz, Guido Ceccarossi, Manfred Gilli, Gio-
vanna Jona Lasinio and Paola Vicard, Francesco Lagona, and Bianca Maria
Zavanella, referring to the first Conference, were received at the end of
1993. The manuscripts by Antonio Bellacicco and Vanda Tulli, Tiziana Cala-
moneri and Rossella Petreschi, Sergio Camiz and Vanda Tulli, Mario Faliva,
Manfred Gilli and Giorgio Pauletto, John Maybee, Silvana Stefani and Anna
Torriero, and Zsolt Tuza, referring to the second Conference, were received at
the end of 1995.
This work was supported by contributions from the Consiglio Nazionale delle
Ricerche, n. A.I. 94.00967 (Silvana Stefani) and n. A.I. 96.00685 (Sergio
Camiz).

Sergio Camiz is professor of Mathematics at the Faculty of Architecture of
Rome University «La Sapienza». In the past, he was professor of Mathemat-
ics at the Universities of Calabria, Sassari, and Molise, of Statistics at the Benevento
Faculty of Economics of Salerno University, and of Computer Science at the
American University of Rome. He spent periods as visiting professor at the
Universities of Budapest (Hungary), Western Ontario (Canada), and Lille (France),
and at the Tampere Peace Research Institute of Tampere University (Finland);
contributed to short courses on numerical ecology at the Universities of Rome,
Rosario (Argentina), and Leon (Spain); held conferences on data analysis ap-
plications at various Italian universities, as well as at the Universities of New
Mexico (Las Cruces), Brussels (Belgium), Turku and Tampere (Finland), and
at IADIZA in Mendoza (Argentina); and contributed communications to vari-
ous academic congresses in Italy, Europe, and America. After a long activity in
the frame of computational statistics and data analysis for numerical ecology,
and in programming numerical computations in econometrics and in applied
mechanics, his present research topics concern the analysis, development, and
use of numerical mathematical methods for data analysis and applications in
different frames, such as economic geography, archaeology, sociology, and
political science. He was co-editor of two books, one concerning the analysis
of urban supplies and the other on pollution problems, and author of several
papers published in scientific journals.

Silvana Stefani is a Full Professor of Mathematics for Economics at the
University of Brescia. She got her Laurea in Operations Research at the Uni-
versity of Milano. She has been visiting scholar at various universities in War-
saw (Poland), Philadelphia (USA), Jerusalem (Israel), Rotterdam (the Nether-
lands), and New York and Chicago (USA). She was Head of the Department of
Quantitative Methods, University of Brescia, from November 1990 to October
1994 and is currently Coordinator of the Ph.D. Programme «Mathematics
for the Analysis of Financial Markets». She was co-editor of two books, one
concerning the analysis of urban supplies and the other on mathematical meth-
ods for economics and finance, and author of numerous articles published in
international journals in operations research, applied mathematics, and mathe-
matical finance.

Typeset by LaTeX
Edited by Vanda Tulli

AUTHORS' ADDRESSES

Antonio Bellacicco
Dipartimento di Teoria dei Sistemi e delle Organizzazioni
Università di Teramo
Via Cruccioli, 125, 64100 Teramo, Italia.

Tiziana Calamoneri
Dipartimento di Scienza dell'Informazione
Università "La Sapienza" di Roma
Via Salaria, 113, 00198 Roma, Italia.
E-mail: Calamo@dsi.uniroma1.it

Sergio Camiz
Dipartimento di Matematica "Guido Castelnuovo"
Università "La Sapienza" di Roma
P.le A. Moro, 2, 00185 Roma, Italia.
E-mail: Camiz@mat.uniroma.it

Luigi Guido Ceccarossi
Dipartimento Metodi Quantitativi
Università di Brescia
Contrada S. Chiara 48/b, 25122 Brescia, Italia.
E-mail: Ceccaros@master.cci.unibs.it

Mario Faliva
Istituto di Econometria e Matematica per le Decisioni Economiche
Università Cattolica di Milano
Via Necchi 9, 20100 Milano, Italia.

Manfred Gilli
Département d'Économétrie
Université de Genève
Boulevard Carl-Vogt 102, 1211 Genève 4, Switzerland.
E-mail: Manfred.Gilli@metri.unige.ch

Giovanna Jona Lasinio
Dipartimento di Statistica, Probabilità e Statistiche Applicate
Università "La Sapienza" di Roma
P.le A. Moro, 2, 00185 Roma, Italia.
E-mail: Iona@pow2.sta.uniroma.it

Francesco Lagona
Dipartimento di Statistica, Probabilità e Statistiche Applicate
Università "La Sapienza" di Roma
P.le A. Moro, 2, 00185 Roma, Italia.

John Maybee
University of Colorado
265 Hopi Pl., Boulder, CO 80303, USA.
E-mail: Maybee@newton.colorado.edu

Giorgio Pauletto
Département d'Économétrie
Université de Genève
Boulevard Carl-Vogt 102, 1211 Genève 4, Switzerland.
E-mail: Giorgio.Pauletto@metri.unige.ch

Rossella Petreschi
Dipartimento di Scienza dell'Informazione
Università "La Sapienza" di Roma
Via Salaria, 113, 00198 Roma, Italia.
E-mail: Petreschi@dsi.uniroma1.it

Silvana Stefani
Dipartimento Metodi Quantitativi
Università di Brescia
Contrada S. Chiara 48/b, 25122 Brescia, Italia.
E-mail: Stefani@master.cci.unibs.it

Anna Torriero
Istituto di Econometria e Matematica per le Decisioni Economiche
Università Cattolica di Milano
Via Necchi 9, 20100 Milano, Italia.
E-mail: Torriero@aixmiced.mi.unicatt.it

Vanda Tulli
Dipartimento Metodi Quantitativi
Università di Brescia
Contrada S. Chiara 48/b, 25122 Brescia, Italia.

Zsolt Tuza
Computer and Automation Research Institute
Hungarian Academy of Sciences
1111 Budapest, Kende u. 13-17, Hungary.
E-mail: tuza@lutra.sztaki.hu

Paola Vicard
Dipartimento di Statistica, Probabilità e Statistiche Applicate
Università "La Sapienza" di Roma
P.le A. Moro, 2, 00185 Roma, Italia.

Bianca Maria Zavanella
Istituto di Statistica, Facoltà di Scienze Politiche
Università Statale di Milano
Via Visconti di Modrone, 20100 Milano, Italia.
E-mail: Zavanell@imiucca.unimi.it

Contents

New Insights on the Sign Stability Problem  1
John Maybee

Lower Bounds for a Class of Depth-two Switching Circuits  7
Zsolt Tuza

Cubic Graphs as Model of Real Systems  19
Tiziana Calamoneri and Rossella Petreschi

Spectral Properties of Matrices and Graphs  31
Silvana Stefani and Anna Torriero

Irreducible Matrices and Primitivity Index  50
Luigi Guido Ceccarossi

A Comparison of Algorithms for Computing the Eigenvalues and the
Eigenvectors of Symmetrical Matrices  72
Sergio Camiz and Vanda Tulli

I/O Analysis: Old and New Analysis Techniques  92
Sergio Camiz

Graphs and Macroeconometric Modelling  120
Manfred Gilli

Qualitative Sensitivity Analysis in Multiequation Models  137
Manfred Gilli and Giorgio Pauletto

Hadamard Matrix Product, Graph and System Theories: Motivations
and Role in Econometrics  152
Mario Faliva

International Comparisons and Construction of Optimal Graphs  176
Bianca Maria Zavanella

Graphical Gaussian Models and Regression  200
Giovanna Jona Lasinio and Paola Vicard

Linear Structural Dependence of Degree One among Data: A
Statistical Model  223
Francesco Lagona

Cluster Identification in a Signed Graph by Eigenvalue Analysis  233
Antonio Bellacicco and Vanda Tulli

NEW INSIGHTS ON THE SIGN STABILITY PROBLEM

J. MAYBEE
Program in Applied Mathematics
University of Colorado

We obtain a new characterization of when a matrix is sign stable. Our results
make use of properties of eigenvectors of sign semi-stable matrices. No classical
stability theorems are required in proving our results.

1 Introduction

We deal with n × n real matrices. Such a matrix A is called semistable (stable)
if every λ in the spectrum σ(A) of A lies in the closed (open) left half of the
complex plane. The real matrix sgn(A) = [sgn aij] is called the sign pattern of
A, and two real matrices A and B are said to have the same sign pattern if either
aij bij > 0 or both aij and bij are zero, for all i and j. When A is a real matrix
we let Q(A) be the set of all matrices having the same sign pattern as A. We
will also write A in the form A = Ad + Ā, where Ad = diag[a11, a22, ..., ann]
and Ā = A − Ad.
Let u = (u1, u2, ..., un) be a complex vector. We say that u is q-
orthogonal to Ad if aii ≠ 0 implies ui = 0. Notice that if u is q-orthogonal to
Ad, then u is q-orthogonal to Bd for every matrix B ∈ Q(A).
Let A be a real matrix satisfying aij ≠ 0 if and only if aji ≠ 0, for all i ≠ j.
Then A is called combinatorially symmetric, and we may associate with A the
graph G(A) having n vertices and an edge joining vertices i and j if and only if
i ≠ j and aij ≠ 0. The graph G(A) is a tree if it is connected and acyclic. We
also use, for any matrix, the directed graph D(A) defined in the usual way.
The real matrix A is called sign semi-stable (sign stable) if every matrix
in Q(A) is semi-stable (stable). We will deal only with the case where A is
irreducible, in order to keep the arguments simple (Gantmacher, 1964). All of
our results can be readily extended to the reducible case.
We will prove the following results about sign semi-stable matrices.

Theorem 1 The following are equivalent statements:

1. The matrix A is sign semi-stable.

2. Matrix A satisfies

(i) ajj ≤ 0 for all j,

(ii) aij aji ≤ 0 for all i and j, and

(iii) every product of the form ai(1)i(2) ai(2)i(3) ··· ai(k)i(1) = 0 for k ≥ 3,
where {i(1), i(2), ..., i(k)} is a set of distinct integers in N =
{1, 2, ..., n}.

3. There exists a positive diagonal matrix D = diag[d1, d2, ..., dn], di >
0, i = 1, 2, ..., n, such that the matrix DAD⁻¹ = Ad + S, where S is skew-
symmetric (Gantmacher, 1964) and satisfies (ii) and (iii).
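Conditions (i)-(iii) are purely combinatorial, so they can be checked directly from the sign pattern. The following sketch is our own illustration, not part of the paper: the function name is hypothetical, and condition (iii) is tested by brute-force enumeration of cyclic index sequences, which is only practical for small n.

```python
from itertools import permutations

def is_sign_semi_stable_pattern(A):
    """Brute-force check of conditions (i)-(iii) of Theorem 1.

    A is a square sign pattern given as a list of lists; only the signs
    of the entries matter.  Exponential in n: for illustration only.
    """
    n = len(A)
    if any(A[j][j] > 0 for j in range(n)):                   # (i)  a_jj <= 0
        return False
    if any(A[i][j] * A[j][i] > 0
           for i in range(n) for j in range(n) if i != j):   # (ii) a_ij a_ji <= 0
        return False
    for k in range(3, n + 1):                 # (iii): every product over a
        for cyc in permutations(range(n), k): # cycle of >= 3 distinct
            prod = 1                          # indices must vanish
            for t in range(k):
                prod *= A[cyc[t]][cyc[(t + 1) % k]]
            if prod != 0:
                return False
    return True

# The graph of this pattern is a path (a tree), so (iii) holds trivially:
A = [[-1,  1,  0],
     [-1,  0,  1],
     [ 0, -1,  0]]
print(is_sign_semi_stable_pattern(A))   # prints True
```

A pattern containing a 3-cycle of nonzero entries, by contrast, fails (iii) and is rejected.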

Theorem 2 The following are equivalent statements about a sign semi-stable
matrix:

1'. The matrix A has λ = 0 as an eigenvalue.

2'. Every matrix in Q(A) has λ = 0 as an eigenvalue.

3'. There is an eigenvector u satisfying Au = 0 which is q-orthogonal to Ad.

Theorem 3 The following are equivalent statements about a sign semi-stable
matrix:

1''. The matrix A does not have a purely imaginary eigenvalue.

2''. No matrix in Q(A) has a nonzero purely imaginary eigenvalue.

3''. There is no eigenvector u satisfying Au = iμu which is q-orthogonal to
Ad.

The equivalence of conditions (1) and (2) of Theorem 1 is a well known
result due to Maybee, Quirk, and Ruppert (see Jeffries et al., 1977 for one
proof of this result). All the known proofs of this equivalence make use of one
of the classical stability theorems. By proving that (1) ⇒ (2) ⇒ (3) ⇒ (1) we
can avoid the use of any stability theorem, a fact of some independent interest.
A consequence of Theorem 1 is that the family of sign semi-stable matrices
can be identified with the family of matrices of the form A = Ad + S, where
S is skew-symmetric and A satisfies (i), (ii), and (iii). This fact is used in an
essential way to prove Theorem 3.
Our proofs of Theorems 2 and 3 lead directly to simple algorithms for
testing whether a given matrix satisfying conditions (i), (ii), and (iii) is sign
stable.
Finally, given Theorems 1, 2, and 3, we can state the following sign stability
result.

Theorem 4 The real matrix A is sign stable if and only if the following four
conditions are satisfied:

(i) ajj ≤ 0 for all j;

(ii) aij aji ≤ 0 for all i and j;

(iii) every product of the form ai(1)i(2) ai(2)i(3) ··· ai(k)i(1) = 0 for k ≥ 3,
where {i(1), i(2), ..., i(k)} is a set of distinct integers in N = {1, 2, ..., n};

(iv) the matrix A does not have an eigenvector q-orthogonal to Ad.
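Theorem 4 can also be spot-checked numerically: draw random magnitudes for a pattern satisfying (i)-(iv) and verify that every sampled member of Q(A) is stable. The sketch below is our illustration only; the particular tridiagonal pattern and the sampling range are arbitrary choices, and finitely many stable samples are of course a sanity check, not a proof.

```python
import numpy as np

# A 3 x 3 sign pattern whose graph is a path (a tree) with one negative
# diagonal entry; conditions (i)-(iv) of Theorem 4 hold for it.
S = np.array([[-1,  1,  0],
              [-1,  0,  1],
              [ 0, -1,  0]])

def random_member_of_Q(S, rng):
    """Draw a matrix with the same sign pattern as S (a member of Q(S))."""
    magnitudes = rng.uniform(0.5, 2.0, S.shape)
    return np.sign(S) * magnitudes

rng = np.random.default_rng(0)
worst = max(np.linalg.eigvals(random_member_of_Q(S, rng)).real.max()
            for _ in range(500))
print(worst < 0)   # every sampled member of Q(S) is stable
```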

2 The proof of Theorem 1

Suppose first that the matrix A is sign semi-stable. The fact that (i), (ii),
and (iii) are then true follows by a familiar continuity argument, which we omit.
Hence (1) ⇒ (2). Given that (2) is true and A is irreducible, it follows that,
if aij ≠ 0 then aji ≠ 0 also. For suppose aij ≠ 0 and aji = 0. Since A is
irreducible there is a path from j to i in D(A), which together with the arc
from i to j forms a cycle through at least three distinct indices, so (iii) is
violated. Thus A is combinatorially symmetric. But then G(A), the graph of A,
exists, is connected, and has no cycles. Hence G(A) is a tree. Then, by a
theorem of Parter and Youngs (1962), there exists a positive diagonal matrix
D such that DAD⁻¹ = Ad + S, where Ad = diag[a11, a22, ..., ann] and
S = [sij], with sii = 0 for i = 1, 2, ..., n and sij = −sji for all i ≠ j. Thus
(2) ⇒ (3). Now replace A by DAD⁻¹ (a similarity, so the spectrum is
unchanged) and suppose Au = λu. Taking scalar products on the right and on
the left with u yields u·Au = u·Adu + u·Su = u·λu = λ̄|u|² and
Au·u = Adu·u + Su·u = λ|u|². We have u·Adu = Adu·u and u·Su = −Su·u.
Hence 2Adu·u = (λ + λ̄)|u|², so we obtain

Re(λ) = Adu·u / |u|².   (1)

But Adu·u = Σ_{i∈I0} aii |ui|², where I0 = {j | ajj ≠ 0}. Hence condition (i)
implies that for any λ in σ(A), Re(λ) ≤ 0. Thus (3) implies (1) and Theorem
1 is proved.
Now a sign semi-stable matrix is sign stable if and only if it has no eigen-
values on the imaginary axis in the complex plane. On the other hand, if u is
an eigenvector of A belonging to an eigenvalue on the imaginary axis, then we
must have ui = 0 for i ∈ I0 by (1), i.e. u is q-orthogonal to Ad.
Note also that it follows from the proof of Theorem 1 that, if aii = 0, i =
1, ..., n, then A is skew-symmetric and all the eigenvalues of A are purely
imaginary, hence A is not sign stable. If aii < 0, i = 1, ..., n, then (1) shows
that every eigenvalue λ satisfies Re(λ) < 0, so A is sign stable. Hence the
interesting case for sign stability is 1 ≤ |I0| < n, which we assume to hold
henceforth.
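The first boundary case of this remark is easy to see numerically: with a zero diagonal and a tree pattern with aij aji < 0 on every edge, every member of Q(A) is similar to a skew-symmetric matrix and so has a purely imaginary spectrum. A small check (our illustration; the pattern and the magnitudes are arbitrary):

```python
import numpy as np

# Zero-diagonal path pattern with a_ij * a_ji < 0 on each edge:
# by the remark above, its spectrum should lie on the imaginary axis
# for any choice of positive magnitudes p, q, r, s.
rng = np.random.default_rng(0)
p, q, r, s = rng.uniform(0.5, 2.0, 4)
A = np.array([[0.0,   p, 0.0],
              [ -q, 0.0,   r],
              [0.0,  -s, 0.0]])
print(np.abs(np.linalg.eigvals(A).real).max() < 1e-10)
```

Indeed, the characteristic polynomial here is λ³ + (pq + rs)λ, with roots 0 and ±i√(pq + rs).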

3 The proof of Theorem 2

Let A be a sign semi-stable matrix. By conditions (i) and (ii), every term in
the expansion of det A has the same sign. Therefore if det A = 0, it must
be combinatorially equal to zero, and hence every matrix in Q(A) also has
determinant equal to zero. It follows that (1') implies (2'). Our task is to
discover when there exists a non-zero vector u such that Au = 0. Now u must
vanish on the (nonempty) set I0, so we partition the components of a candidate
vector u initially into the sets Z(I0), N(I0), where Z(I0) = {i | i ∈ I0}, ui = 0
if i ∈ Z(I0), and ui ≠ 0 if i ∈ N(I0). Now suppose given a set Ip ⊇ I0 and a
partition of the components of u such that ui = 0 if i ∈ Z(Ip) and ui ≠ 0 if
i ∈ N(Ip). We look at the equations

Σ_{j∈N(Ip)} sij uj = 0.   (2)

If such an equation has exactly one nonzero term, it has the form sik uk = 0
for some fixed value of k. Since sik ≠ 0 and k ∈ N(Ip), this is a contradiction.
Hence we must place k ∈ Ip+1. We do this for each such occurrence. Thus
Ip+1 ⊇ Ip and Z(Ip) ⊆ Z(Ip+1), N(Ip) ⊇ N(Ip+1). If the system (2) contains
no equation having only a single non-zero term, then Ip+1 = Ip and Z(Ip+1) =
Z(Ip), N(Ip+1) = N(Ip). We will examine this case below. Suppose that
Ip+1 = N. Then Z(Ip+1) = N and u = 0, i.e. no matrix in Q(A) has zero as
an eigenvalue. It remains to consider the case where we have some Ip = Ip+1
with |Ip| < n, so ui = 0 for i ∈ Z(Ip) and ui ≠ 0 for i ∈ N(Ip). Clearly
every equation in system (2) at this point contains either no non-zero terms
or at least two non-zero terms. We have |N(Ip)| ≥ 2, and the induced graph
⟨N(Ip)⟩ is a forest. We claim that this forest consists of isolated vertices
(trivial trees), i.e. S(N(Ip)) = 0. For suppose ⟨N(Ip)⟩ has a nontrivial tree T0.
This tree has a vertex of degree one, and there would then exist an equation
in the subsystem S(N(Ip))u = 0 having exactly one nonzero term, a
contradiction. Next let |N(Ip)| = q, and suppose there exist r rows in the
subsystem Σ_{j∈N(Ip)} sij uj = 0, i ∈ Z(Ip), having two or more non-zero
entries. We have r ≥ 1, so the set of such rows is nonempty. Let this set be
Z0(Ip), and consider the submatrix S(N(Ip) ∪ Z0(Ip)). The graph of this
submatrix is a forest on the q + r vertices. If |Z0(Ip)| ≥ q, then the number
of edges in this forest is at least 2r ≥ r + q, a contradiction. Similarly, there
cannot be two directed paths from vertex k to
vertex l in the directed graph D(S0), where S0 is the matrix of the subsystem
(2) for i ∈ Z0(Ip). It follows that the subsystem S0u = 0 uniquely determines
one or more eigenvectors u belonging to λ = 0. Hence each matrix in Q(A)
has at least one eigenvector u belonging to λ = 0 and vanishing on the set
Z(Ip) ⊇ I0. Thus Theorem 2 is proved.
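The index propagation in this proof is already an algorithm: starting from I0, repeatedly adjoin to Ip any index forced to vanish by an equation of (2) with a single nonzero term. A minimal sketch of the resulting test for condition (1') follows; the function name and the list-of-lists representation of the sign pattern are our own choices, and the input is assumed to satisfy conditions (i)-(iii).

```python
def zero_is_never_an_eigenvalue(A):
    """Propagation test from the proof of Theorem 2 (illustrative sketch).

    A is a sign pattern assumed to satisfy conditions (i)-(iii) of
    Theorem 1.  Returns True when the propagation forces u = 0, i.e.
    no matrix in Q(A) has 0 as an eigenvalue.
    """
    n = len(A)
    zero = {i for i in range(n) if A[i][i] != 0}   # u vanishes on I_0
    changed = True
    while changed:
        changed = False
        live = [j for j in range(n) if j not in zero]
        for i in range(n):
            # row i of (2), restricted to the still-nonzero components u_j
            nz = [j for j in live if j != i and A[i][j] != 0]
            if len(nz) == 1:           # a single term forces u_k = 0
                zero.add(nz[0])
                changed = True
                break                  # recompute the live set
    return len(zero) == n

# Path pattern: propagation reaches every index, so u = 0 is forced.
path = [[-1, 1, 0], [-1, 0, 1], [0, -1, 0]]
# Star pattern: propagation stalls (the center row keeps two live terms),
# so 0 is an eigenvalue of every matrix in Q(A).
star = [[-1, 1, 0, 0], [-1, 0, 1, 1], [0, -1, 0, 0], [0, -1, 0, 0]]
print(zero_is_never_an_eigenvalue(path), zero_is_never_an_eigenvalue(star))
```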

4 The proof of Theorem 3

We look for an eigenvector u such that u is q-orthogonal to Ad and Au = iμu,
for some μ ≠ 0. As in the proof of Theorem 2, we partition a candidate vector
u initially into the sets Z(I0), N(I0), with ui = 0 if i ∈ Z(I0) and ui ≠ 0 if
i ∈ N(I0). Now, given Ip ⊇ I0 and a partition of u into Z(Ip) and N(Ip), we
look at the equations

Σ_{k∈N(Ip)} sjk uk = 0, j ∈ Z(Ip),   (3)

and

Σ_{k∈N(Ip)} sjk uk = iμ uj, j ∈ N(Ip).   (4)

If any equation in the subsystem (3) contains exactly one nonzero term,
we have sjk uk = 0 and, as in the proof of Theorem 2, we adjoin each such k to
Ip. Similarly, if any sum on the left hand side of an equation in the subsystem
(4) contains no nonzero term, we have iμuj = 0, and this contradiction compels
us to add the index j to Ip. Doing this for every such occurrence produces
the new set Ip+1 ⊇ Ip and thereby the new partition Z(Ip+1), N(Ip+1). If the
subsystem (3) contains no equation having a single term and the subsystem
(4) contains no empty sums, then Ip+1 = Ip. We examine this case below.
Suppose that Ip+1 = N. Then Z(Ip+1) = N and u = 0. Since every matrix
in Q(A) has the same zero-nonzero pattern, it follows that no matrix in Q(A)
has a purely imaginary nonzero eigenvalue.
It remains to consider the case where we have some Ip = Ip+1 with |Ip| < n,
so ui = 0 for i ∈ Z(Ip) and ui ≠ 0 for i ∈ N(Ip). At this point every equation
in the subsystem (3) has either no nonzero terms or at least two nonzero terms.
Also, the induced subgraph ⟨N(Ip)⟩ is a forest and contains no trivial trees,
since every sum in the subsystem (4) contains at least one term. This forest
must contain at least two trees, because if it were a single tree the subsystem
(3) would have an equation containing exactly one nonzero term. Moreover, if
a tree in the forest is adjacent to a vertex j ∈ Z(Ip), there must also be another
tree in the forest adjacent to vertex j, for the same reason. We must therefore
have |N(Ip)| ≥ 4. Let j0 be the index of a row in subsystem (3) containing
q ≥ 2 nonzero terms. Then vertex j0 in G(A) must be adjacent to q distinct
trees in the forest ⟨N(Ip)⟩.
Now choose a pair of trees in ⟨N(Ip)⟩ adjacent to vertex j0. Set uk = 0
if k is not a vertex of one of these trees. Let the trees be T1 and T2, re-
spectively. Then the submatrices S(T1) and S(T2) are disjoint nonzero skew-
symmetric submatrices of S. Hence they have nonzero purely imaginary eigen-
values iμ1 and iμ2. Let v1 and v2 be nonzero vectors satisfying S(T1)v1 =
iμ1 v1, S(T2)v2 = iμ2 v2. Then any vectors αv1 and βv2 also satisfy these equa-
tions, where α and β are nonzero constants. If μ1 = μ2 then we choose α and
β to satisfy the equation of row j0 in subsystem (3),

where k1 is a vertex in one tree and k2 is a vertex in the other. Thus αv1 and
βv2 determine a vector u such that Au = iμ1 u, with u q-orthogonal to Ad. If
μ1 ≠ μ2, then choose α0 such that α0 μ2 = μ1 and modify S by multiplying
S(T2) by α0 to obtain S'(T2). The resulting matrix is in Q(A) and has iμ1 as
an eigenvalue.
This proves that some matrix in Q(A) has an eigenvector q-orthogonal
to Ad belonging to a purely imaginary eigenvalue. Hence Theorem 3 is
proved.

References

Jeffries C., V. Klee, and P. van den Driessche, 1977. «When is a
matrix sign stable?» Can. J. Math., 29: 315-326.

Gantmacher, F. R., 1964. The Theory of Matrices, Vols. 1, 2. Chelsea,
New York.

Parter S. and J. W. T. Youngs, 1962. «The symmetrization of matrices
by diagonal matrices» J. Math. Anal. Appl., 4: 102-110.

LOWER BOUNDS FOR A CLASS OF DEPTH-TWO
SWITCHING CIRCUITS

Z. TUZA
Computer and Automation Institute
Hungarian Academy of Sciences

Let M = (aij) be an (m × m) matrix with zero diagonal and aij + aji > 0 for
all i ≠ j, 1 ≤ i, j ≤ m. For a set R of rows and a set C of columns, denote by
R × C the set of the |R|·|C| entries lying in the intersection of those rows and
columns. We prove that if R1, ..., Rℓ and C1, ..., Cℓ are ℓ sets of rows and ℓ sets
of columns of M, respectively, such that the set ∪_{k=1}^{ℓ} (Rk × Ck) is identical
to the set of nonzero entries of M (i.e., aij ≠ 0 if and only if the i-th row is in Rk
and the j-th column is in Ck for some k ≤ ℓ), then Σ_{k=1}^{ℓ} (|Rk| + |Ck|) ≥ m log2 m.

1 The problem

In this note we investigate an extremal problem on a class of m by m 0-1


matrices, motivated by switching theory. The particular case in question can
be formulated in several equivalent ways, as follows.

• Suppose that the square matrix M = (aij) E {o,l}mxm has zero diag-
onal, and at least one of aij and aji is 1 for all i f. j, 1 :::; i, j :::; m.
Minimize the total number of rows and columns in a collection of 1-cells
(submatrices with no 0 entry) such that each aij = 1 occurs in at least
one of those I-cells.

• Let B = (X U Y, E) be a bipartite graph with vertex classes X =


{Xl, ... , xm} and Y = {Yl. ... , Ym}, and edge set E, such that XiYi ~ E
for aliI:::; i :::; m, and at least one of XiYj and XjYi belongs to E for all
if. j, 1 :::; i,j :::; m. Find the smallest total number IV(Bl)I+·· .+IV(Be)1
of vertices in a collection of complete bipartite subgraphs Bi C B, Bi =
(Xi U Yi,Ei ), Ei = {xy I X E Xi, Y E Yi} (1 :::; i :::; e), such that
El U···uEe =E.

• Given a semi-complete directed graph D = (V, E) on m vertices, without
loops and parallel edges (i.e., each pair x, y ∈ V is adjacent either by
just one oriented edge, or by precisely two oppositely oriented edges
xy, yx ∈ E), determine the smallest sum of the numbers of vertices in
complete bipartite digraphs Di ⊆ D (with all edges oriented in the same
direction between the two vertex classes in each Di) whose union is D.
• Suppose that a circuit has to be designed with inputs x1, ..., xm and
outputs y1, ..., ym, where a set of conditions cij prescribes whether there
exists a directed path of length 2 from xi to yj (written as cij = 1;
otherwise we put cij = 0). Assuming cii = 0 for all 1 ≤ i ≤ m, and
cij = 1 or cji = 1 (or both) for all i ≠ j, 1 ≤ i, j ≤ m, minimize the
number of links (adjacencies) in such a circuit.

The equivalence of the matrix problem and the two types of graph-
theoretical formulations is established by the corresponding adjacency matrices:
in the bipartite case we define aij := 1 if and only if xi is adjacent to yj; or,
conversely, we join xi to yj if and only if aij = 1. For digraphs, the entry
aij = 1 of the matrix corresponds to the edge oriented from vertex i to vertex
j.
To see that the switching circuits also give an equivalent formulation, no-
tice first that each link involved in a path of length 2 verifying cij = 1 for some
pair i, j either starts from an input node or ends in an output node. Now, each
internal node zk of a length-2 path connects a set Xk of inputs with a set Yk
of outputs, and the number of links incident to zk is |Xk| + |Yk|. Therefore,
Xk × Yk must be a 1-cell in the 0-1 matrix (cij). Conversely, each 1-cell R × C
of r rows and c columns in a 0-1 matrix M can be represented by an internal
node z connected to r input nodes and c output nodes in the circuit to be
constructed.
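To make the equivalence concrete, here is a small sketch (the toy matrix and the variable names are ours, not from the paper) that checks the two defining conditions of the matrix problem through the bipartite-graph view:

```python
# Toy 4 x 4 example (ours): zero diagonal and aij + aji > 0 for i != j.
M = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]
m = len(M)

# Bipartite edge set: x_i adjacent to y_j  iff  aij = 1.
E = {(i, j) for i in range(m) for j in range(m) if M[i][j] == 1}

# Condition 1: zero diagonal, i.e. no edge x_i y_i.
no_loops = all((i, i) not in E for i in range(m))

# Condition 2: for each i != j, at least one of x_i y_j, x_j y_i is an edge.
semi_complete = all(
    (i, j) in E or (j, i) in E
    for i in range(m) for j in range(m) if i != j
)
print(no_loops, semi_complete)  # True True
```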

Notation
We denote R × C := {aij | ri ∈ R, cj ∈ C}, where R ⊆ {r1, ..., rm} is a
set of rows and C ⊆ {c1, ..., cm} is a set of columns. (We may also view
the 0-1 matrices as subsets of {r1, ..., rm} × {c1, ..., cm}.) The shorthand
∪_{k=1}^{l} (Rk × Ck) = M means that the entry aij has value 1 in M if and only if
ri ∈ Rk and cj ∈ Ck for some k, 1 ≤ k ≤ l (and aij = 0 otherwise). The
complexity, σ(M), of M is defined as

σ(M) := min { Σ_{k=1}^{l} (|Rk| + |Ck|) : ∪_{k=1}^{l} (Rk × Ck) = M } ,

where the value of l is unrestricted.


Obviously, the definition of σ(M) can be extended to arbitrary (not nec-
essarily square) 0-1 matrices, but in this paper we do not consider the more
general case; i.e., M ∈ {0,1}^{m×m} will be assumed throughout.
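The definition can be sketched in code. The following illustration (helper names ours) evaluates the cost Σ_k (|Rk| + |Ck|) of a candidate collection of 1-cells and checks that it covers exactly the nonzero entries; for the 2 × 2 upper triangle matrix, the single cell {r1} × {c2} achieves cost 2 = m log2 m:

```python
# Sketch (ours) of the quantities behind sigma(M).
def is_valid_cover(M, cells):
    """cells: list of (R, C) pairs of row/column index sets.
    Every cell must contain only 1-entries, and together the cells
    must cover every 1-entry of M."""
    m = len(M)
    covered = set()
    for R, C in cells:
        for i in R:
            for j in C:
                if M[i][j] != 1:      # a 1-cell may contain no 0 entry
                    return False
                covered.add((i, j))
    ones = {(i, j) for i in range(m) for j in range(m) if M[i][j] == 1}
    return covered == ones

def cost(cells):
    return sum(len(R) + len(C) for R, C in cells)

# Upper triangle matrix of order 2: one cell {r1} x {c2} suffices.
T1 = [[0, 1],
      [0, 0]]
cells = [({0}, {1})]
print(is_valid_cover(T1, cells), cost(cells))  # True 2
```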

2 The results

It can be shown (Tuza, 1984) that

σ(M) < c m^2 / log m

holds for every (m × m) matrix M, for some constant c, and also that this
upper bound is best possible apart from the actual value of c. For some re-
stricted classes of matrices, however, the complexity can be much smaller. This
is the case, for example, in the following two particular sequences, as proved
by Tarjan (1975).

Theorem 1 If m = 2^n and M = (aij) is the upper triangle matrix (aij = 1
for 1 ≤ i < j ≤ m and aij = 0 otherwise), then

σ(M) = n · 2^n = m log2 m .

Theorem 2 If m = (n choose ⌊n/2⌋), where ⌊x⌋ is the lower integer part of x, i.e. the
largest integer not exceeding x, and M = (aij) is the matrix J − I with aii = 0
and aij = 1 for all i ≠ j, 1 ≤ i, j ≤ m, then

σ(M) = (1 + o(1)) m log2 m .

Our main goal is to show that the lower bound of m log2 m in Theorem 1 is
valid for a much larger class of (m x m) matrices. Namely, we will prove the
following result:

Theorem 3 If an (m × m) matrix M = (aij) (aij ∈ {0, 1}) has zero diagonal
and aij + aji > 0 for all i ≠ j, then

σ(M) ≥ m log2 m .

Theorems 1 and 3 are best possible in general, as is discussed in the conclud-
ing section. On the other hand, we are going to observe that the complexity
σ(M) of a typical member of the class of matrices involved in Theorem 3 is
much larger than O(m log m). To formulate this assertion more precisely, de-
note

Mm := {M = (aij) ∈ {0,1}^{m×m} | aii = 0, aij + aji > 0 for j ≠ i} ,

M'm := {M = (aij) ∈ {0,1}^{m×m} | aii = 0, aij + aji = 1 for j ≠ i} .

Theorem 4 There is a constant c > 0 such that

σ(M) ≥ c m^2 / log m

holds for (1 − o(1)) |Mm| matrices M ∈ Mm and (1 − o(1)) |M'm| matrices
M ∈ M'm as m → ∞.

The proofs of Theorems 3 and 4 are given in Sections 3 and 4, respectively.


Some open problems are mentioned in Section 5.

3 The general lower bound

The subject of this section is to prove Theorem 3, i.e., that σ(M) ≥ m log2 m
holds for all matrices M = (aij) of order m with zero diagonal, containing at
least one nonzero entry in each {aij, aji}, i ≠ j.
Suppose that an optimal collection of all-1 submatrices Rk × Ck ⊆ M
has been chosen, ∪_{k=1}^{l} (Rk × Ck) = M, Σ_{k=1}^{l} (|Rk| + |Ck|) = σ(M). Let X =
{x1, ..., xl}, and let us define the following two sets for i = 1, 2, ..., m:

Ai := {xk | ri ∈ Rk} ,
Bi := {xk | ci ∈ Ck} .

Notice that the ordered pairs (i, j) with Ai ∩ Bj ≠ ∅ correspond to precisely
those entries aij of M which occur in at least one Rk × Ck, therefore the
assumption ∪_{k=1}^{l} (Rk × Ck) = M and the initial conditions on M imply that

(i) Ai ∩ Bi = ∅ for all 1 ≤ i ≤ m, and

(ii) Ai ∩ Bj ≠ ∅ or Aj ∩ Bi ≠ ∅ for all i ≠ j, 1 ≤ i, j ≤ m.


Let us recall now the following inequality from Tuza (1987).

Lemma 5 Suppose that {(Ai, Bi) | 1 ≤ i ≤ m} is a collection of pairs of finite
sets Ai, Bi satisfying the conditions (i) and (ii) above. If p is an arbitrary
real number with 0 < p < 1, then

Σ_{i=1}^{m} p^{|Ai|} (1 − p)^{|Bi|} ≤ 1. (1)

Variants of this inequality have been considered in several papers; see Katona
and Szemerédi (1967), Tarjan (1975), Alspach et al. (1975) and Perles (1984)
for its more particular versions, and Tuza (1989), Caro and Tuza (1991) for
generalizations. The various applications of these inequalities, and also those
of further similar types of set-pair collections, are discussed in the two-part
survey (Tuza, 1994, 1996).
In order to make the present proof self-contained, we describe a short
argument verifying (1) in the way it has been done in Tuza (1994). Let us
choose a subset Y = Y(p) ⊆ {x1, ..., xl} at random, by the rule

Prob(xk ∈ Y) = p ,

where the choice for xk is done independently of those for all the other elements
of X. For i = 1, 2, ..., m, denote by Ei the event

Ai ⊆ Y and Bi ∩ Y = ∅ .

The condition (i) implies that the events Ei are nonempty. More explicitly, by
the random choice of Y, we have

Prob(Ei) = p^{|Ai|} (1 − p)^{|Bi|}

for all 1 ≤ i ≤ m. Moreover, the simultaneous occurrence of two events Ei, Ej
would imply

Ai ∪ Aj ⊆ Y and (Bi ∪ Bj) ∩ Y = ∅ ,

hence

Ai ∪ Aj ⊆ X \ (Bi ∪ Bj)

would follow. This possibility is excluded by the condition (ii), however, there-
fore

Prob(E1) + ... + Prob(Em) ≤ 1 ,

completing the proof of (1).


By what has been said above, the conditions of Lemma 5 hold for the sets
Ai, Bi. Consequently, putting p = 1/2 we obtain

Σ_{i=1}^{m} 2^{−(|Ai|+|Bi|)} ≤ 1. (2)

Moreover, 2^{−x} is a convex function, therefore (2) implies

m · 2^{−(1/m) Σ_{i=1}^{m} (|Ai|+|Bi|)} ≤ 1. (3)

Finally, row ri (column ci) occurs in precisely |Ai| sets Rk (in |Bi| sets Ck,
respectively), thus

Σ_{i=1}^{m} (|Ai| + |Bi|) = Σ_{k=1}^{l} (|Rk| + |Ck|) = σ(M). (4)

The substitution of (4) into the left-hand side of (3) now implies

m · 2^{−σ(M)/m} ≤ 1 ,

from which the required inequality σ(M) ≥ m log2 m follows.
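The convexity step from (2) to (3) can be spot-checked numerically; this small sketch (ours, not part of the proof) verifies m · 2^{−(Σ si)/m} ≤ Σ 2^{−si} on random inputs:

```python
# Numeric sanity check (ours) of the convexity (Jensen) step: for the
# convex function 2^(-x),
#   m * 2^(-(s_1+...+s_m)/m)  <=  2^(-s_1) + ... + 2^(-s_m).
import random

random.seed(0)
ok = True
for _ in range(1000):
    m = random.randint(1, 10)
    s = [random.uniform(0, 20) for _ in range(m)]
    lhs = m * 2 ** (-sum(s) / m)
    rhs = sum(2 ** (-si) for si in s)
    ok = ok and lhs <= rhs + 1e-12
print(ok)  # True
```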

4 The bound for almost all matrices

In this section we prove Theorem 4. The argument will be quite similar for
Mm and M'm, therefore we can handle these two classes together.
Instead of counting, we are going to select a matrix M at random from
the corresponding class, and show that

lim_{m→∞} Prob (σ(M) < c m^2 / log m) = 0 ,

provided that the value of c is chosen appropriately. The probabilistic model
for Mm is

Prob(aij = 1 ∧ aji = 0) = 1/3 ,
Prob(aij = 0 ∧ aji = 1) = 1/3 ,
Prob(aij = 1 ∧ aji = 1) = 1/3 ,

for each pair i, j (1 ≤ i < j ≤ m) independently, while for M'm the corre-
sponding probabilities are

Prob(aij = 1 ∧ aji = 0) = 1/2 ,
Prob(aij = 0 ∧ aji = 1) = 1/2 .

These probability spaces represent each member of Mm and M'm, respectively,
with the same probability (namely, 3^{−m(m−1)/2} or 2^{−m(m−1)/2}). In the first case,

Prob(aij = 1) = Prob(aji = 1) = 2/3

for all i ≠ j, while in the second (antisymmetric) case each non-diagonal
element of M has value 1 with probability 1/2. Notice further that the values
of aij and aji are correlated, but they are independent of ai'j' for all {i', j'} ≠
{i, j}. Denoting

p := Prob(aij = 1) ,

one further essential fact is that p ≤ 1 − δ holds for some fixed δ > 0 (in both
probabilistic models).
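The two probabilistic models can be sketched as follows (an illustration of ours; the function name is hypothetical). Drawing one allowed pattern per unordered pair yields a uniform member of Mm, with empirical Prob(aij = 1) close to 2/3:

```python
# Sketch (ours) of the two models: for each unordered pair {i, j} the
# pattern (aij, aji) is drawn uniformly from the allowed patterns, giving
# Prob(aij = 1) = 2/3 in M_m and 1/2 in M'_m.
import random

def random_matrix(m, antisymmetric, rng):
    M = [[0] * m for _ in range(m)]
    patterns = [(1, 0), (0, 1)] if antisymmetric else [(1, 0), (0, 1), (1, 1)]
    for i in range(m):
        for j in range(i + 1, m):
            M[i][j], M[j][i] = rng.choice(patterns)
    return M

rng = random.Random(1)
m = 40
M = random_matrix(m, antisymmetric=False, rng=rng)
# Membership in M_m: zero diagonal and aij + aji > 0 for i != j.
in_Mm = all(M[i][i] == 0 for i in range(m)) and all(
    M[i][j] + M[j][i] > 0 for i in range(m) for j in range(m) if i != j
)
ones = sum(M[i][j] for i in range(m) for j in range(m) if i != j)
p_hat = ones / (m * (m - 1))   # empirical estimate of p, close to 2/3
print(in_Mm, round(p_hat, 2))
```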
We claim that, with probability 1 − o(1) as m → ∞, every 1-cell R × C ⊆ M
satisfies

min {|R|, |C|} < c' log m (5)

for some constant c'. Indeed, denoting m' := c' log m, arbitrarily chosen m'
rows and m' columns generate a 1-cell with probability precisely p^{(m')^2}
if they do not induce a diagonal element, because the presence of two dependent
entries {aij, aji} ⊆ R × C would also yield {aii, ajj} ⊆ R × C. Moreover, the
probability to get a 1-cell R × C is zero if a diagonal element is included. On
the other hand, the number of m' × m' submatrices is

(m choose m')^2 ≤ (em/m')^{2m'} ,

therefore the probability that some of those submatrices contains no zero entry
is less than

(m choose m')^2 p^{(m')^2} ≤ ((em/m') p^{m'/2})^{2m'} = (e m^{1 − (c'/2) log(1/p)} / m')^{2m'} .

Choosing c' := 2 / log(1/p), this probability will tend to zero, since m' → ∞
as m → ∞. Thus, all (m' × m') submatrices of M contain at least one zero
entry, with probability 1 − o(1).
Assume now ∪_{k=1}^{l} (Rk × Ck) = M, and suppose that this is an optimal
choice of 1-cells, i.e., Σ_{k=1}^{l} (|Rk| + |Ck|) = σ(M). For each ordered pair (i, j)
with aij = 1, choose a cell Rk × Ck containing aij, and define

cij := 1/|Rk| + 1/|Ck| .

For aij = 0, we simply define cij := 0. Now we have

Σ_{i=1}^{m} Σ_{j=1}^{m} cij ≤ Σ_{k=1}^{l} |Rk| · |Ck| · (1/|Rk| + 1/|Ck|) = Σ_{k=1}^{l} (|Rk| + |Ck|) = σ(M).

On the other hand, applying (5) for the 1-cell Rk × Ck containing aij (the
1-cell that has been chosen in the definition of cij), we obtain that

cij > (c' log m)^{−1}

holds almost surely for every i, j with aij ≠ 0. Moreover, the number of
nonzero entries is at least m(m − 1)/2, as at least one of aij and aji equals 1.
(If M ∈ M'm, then precisely one of them is 1.) Consequently,

σ(M) ≥ Σ_{i=1}^{m} Σ_{j=1}^{m} cij ≥ (m(m − 1)/2) · min {cij | aij = 1} > m(m − 1) / (2 c' log m)
with probability 1 − o(1) as m → ∞. Thus, taking c = (2c')^{−1}, the assertion
follows.

5 Concluding remarks and open problems

Finally, we discuss the tightness of the results proved above, and mention some
related questions which remain open.

5.1 Tightness of the lower bound m log2 m

Both Theorems 1 and 3 are tight, and in fact the upper triangle matrices of
order m = 2^n involved in Theorem 1 are the simplest extremal examples for
Theorem 3, too. Let us denote them by Tn (where the order is 2^n). Tarjan
(1975) proved the inequality

σ(Tn) ≤ n · 2^n (6)

by an explicit construction. A simple alternative way to prove (6) is to consider
the following recursive procedure. Clearly, for n = 1, {r1} × {c2} is the subma-
trix required to decompose T1. For n ≥ 2, we can take R1 := {r1, ..., r_{m/2}}
and C1 := {c_{m/2+1}, ..., cm}, i.e., the 1-cell generated by the first 2^{n−1} rows
and the last 2^{n−1} columns. Then |R1| + |C1| = m = 2^n, and if we remove
those 4^{n−1} (nonzero) entries of R1 × C1 from Tn, the remaining nonzeros form
two triangle matrices isomorphic to T_{n−1}, which then can be decomposed
separately, by induction.

Consequently, denoting by s(n) the total number of rows and columns in the
collection of 1-cells obtained recursively, we conclude

s(n) = 2 s(n − 1) + 2^n ,

from which s(n) = n · 2^n follows.
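The recursive procedure above can be carried out explicitly; this sketch (ours) builds the collection of 1-cells for Tn and confirms both the covering property and s(n) = n · 2^n for n = 4:

```python
# Sketch (ours) of the recursive decomposition of the upper triangle
# matrix T_n: split off the top-right 2^(n-1) x 2^(n-1) block, then
# recurse on the two remaining triangles.
def decompose(rows, cols):
    """rows, cols: index lists of a copy of a triangle matrix.
    Returns a list of (R, C) 1-cells covering its strict upper triangle."""
    m = len(rows)
    if m == 1:
        return []
    half = m // 2
    cell = (set(rows[:half]), set(cols[half:]))
    return [cell] + decompose(rows[:half], cols[:half]) \
                  + decompose(rows[half:], cols[half:])

n = 4
m = 2 ** n
cells = decompose(list(range(m)), list(range(m)))
covered = {(i, j) for R, C in cells for i in R for j in C}
upper = {(i, j) for i in range(m) for j in range(m) if i < j}
total = sum(len(R) + len(C) for R, C in cells)
print(covered == upper, total == n * 2 ** n)  # True True
```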


Many further examples can be given which also show the tightness of
Theorem 3, but we do not have a characterization of those matrices.
We should also note at this point that the structural description of the
collections {(Ai, Bi) | 1 ≤ i ≤ m} of set-pairs attaining equality in Lemma 5
is another interesting open problem for further research.

5.2 Other types of well-structured matrices

It would be worth investigating in greater detail what kinds of structural
properties of a matrix imply small or large complexity. The results above
illustrate how some conditions imposed on the pairs of entries of M can restrict
the range of σ(M). As regards relationships between pairs of rows or columns,
the class of Hadamard matrices is one of the interesting examples to consider.
A lower bound is given in Tarjan (1975), but as far as we know, the exact value
of σ(M) has not yet been determined for those matrices.

5.3 Classes of dense matrices

It is quite natural to ask how σ(M) changes if the probability p occurring in
the proof of Theorem 4 takes different values. More precisely, suppose that
M = (aij) is an (m × m) matrix with zero diagonal, and let each entry aij
(i ≠ j) be equal to 0 or 1 at random, independently of the other entries (or
possibly depending just on the value of aji), by the rule

Prob(aij = 1) = p .

Here we allow that p = p(m) may depend on m. The argument given in Section
4 shows that

σ(M) ≥ c m^2 / log m

holds for an appropriately chosen constant c > 0 with probability 1 − o(1) as
m → ∞, provided that p(m) ≤ 1 − δ (for an arbitrarily fixed δ > 0, and for all
sufficiently large m ≥ m0).
On the other hand, according to Theorem 2, for p = 1 we have

σ(M) = (1 + o(1)) m log2 m .

Hence, writing p in the form p(m) = 1 − q(m), the speed of the convergence of
q(m) to zero determines the expected asymptotic value,

σ(m) = σ(m, p) := E(σ(M)) ,

of the complexity of M. The current methods are not strong enough to describe
the exact relationship between q(m) and σ(m), and we do not even know how
quickly q(m) must approach 0 to ensure σ(M) = O(m log m).
Note that σ(m, p) is small also in the case where p(m) itself tends to zero
at a sufficiently large speed. From this point of view it would be interesting
to see which pairs of small and large probabilities (tending to 0 and to 1,
respectively) yield the same asymptotics for the expected value of σ(M).
5.4 Circuits of depth 2

The problem for 0-1 matrices is equivalent to the depth-two circuit problem
only if paths of length precisely 2 have to connect the prescribed input/output
pairs. On the other hand, allowing a link from the input directly to the output
would mean that the corresponding weight |Rk| + |Ck| = 1 + 1 = 2 associated
with a degree-2 internal node is reduced to 1; or, more generally, a star from an
input node to a set of c output nodes (or from a set of r input nodes to an output
node) has weight one smaller than that of the corresponding (r × 1) or (1 × c)
submatrix. This change, however, does not influence the asymptotic behavior
of σ(M), because there can be no more than O(m) such star configurations in
any optimal decomposition (covering) of a matrix of order m into 1-cells.

Acknowledgments

The research was supported in part by the OTKA Research Fund, grant
no. 7558.

References

Alspach, B., L.T. Ollmann and K.B. Reid, 1975. «Mutually disjoint
families of 0-1 sequences» Discrete Math., 12: p. 205-209.
Caro, Y. and Z. Tuza, 1991. «Hypergraph coverings and local colorings»
J. Combinatorial Theory Ser. B, 52: p. 79-85.
Katona, G.O.H. and E. Szemerédi, 1967. «On a problem of graph
theory» Studia Sci. Math. Hungar., 2: p. 23-28.
Perles, M.A., 1984. «At most 2^{d+1} neighborly simplices in E^d» Annals
of Discrete Math., 20: p. 253-254.
Tarjan, T.G., 1975. «Complexity of lattice-configurations» Studia Sci.
Math. Hungar., 10: p. 203-211.
Tuza, Z., 1984. «Covering of graphs by complete bipartite subgraphs; com-
plexity of 0-1 matrices» Combinatorica, 4: p. 111-116.
Tuza, Z., 1987. «Inequalities for two set systems with prescribed inter-
sections» Graphs and Combinatorics, 3: p. 75-80.
Tuza, Z., 1989. «Intersection properties and extremal problems for set
systems» In G. Halász and V.T. Sós (eds.), Irregularities of Partitions,
Algorithms and Combinatorics Vol. 8, Springer-Verlag, p. 141-151.
Tuza, Z., 1994. «Applications of the set-pair method in extremal hyper-
graph theory» In P. Frankl et al. (eds.), Extremal Problems for Finite
Sets, Bolyai Society Mathematical Studies Vol. 3, János Bolyai Math.
Soc., Budapest, p. 479-514.
Tuza, Z., 1996. «Applications of the set-pair method in extremal problems,
II.» In D. Miklós et al. (eds.), Combinatorics, Paul Erdős is Eighty,
Bolyai Society Mathematical Studies Vol. 2, János Bolyai Math. Soc.,
Budapest, p. 459-490.

CUBIC GRAPHS AS MODEL OF REAL SYSTEMS

T. CALAMONERI, R. PETRESCHI
Dipartimento di Scienze dell'Informazione
Università di Roma "La Sapienza"

In this paper we deal with cubic graphs, i.e. regular graphs of degree 3, and with
at most cubic graphs, i.e. graphs with maximum degree 3. We recall two basic
transformation techniques that are used to generate these graphs starting from a
smaller graph, either cubic or general. Moreover we show some applications. To
complete this brief survey we present the state of the art of a specific problem on
these graphs: their orthogonal drawing.

1 Introduction

Any system consisting of discrete states or sites and connections between
them can be modelled by a graph. This fact makes graphs a natural model for
many problems arising from different fields.
For instance, the psychologist Lewin proposed (Lewin, 1936) that the life
space of a person can be modelled by a planar graph, in which the faces rep-
resent the different environments.
In probability, a Markov chain is a graph in which events are vertices and
a positive probability of direct succession of two events is an edge connecting
the corresponding vertices (Hoel et al., 1972).
Military problems like mining operations or destruction of targets may be
reduced to the maximum weight closure problem (Ahuja et al., 1993).
Different processes, such as manufacturing, currency exchanges and the trans-
lation of human resources into job requirements, find natural models in networks,
i.e. directed weighted graphs (Evans and Minieka, 1992).
This interpretation is also applied to financial networks, in which nodes
represent various equities such as stock, current deposits, certificates of deposit
and so on, and arcs represent various investment alternatives that convert one
type of equity into another.
The search for solutions to problems in such different fields justifies the
existence of many types of graphs and many basic notions that capture as-
pects of the structure of graphs. Moreover, many applications require efficient
algorithms that operate on graphs.
In this paper we deal with cubic graphs, i.e. regular graphs of degree 3,
and with at most cubic graphs, i.e. graphs with maximum degree 3. We recall
two basic transformation techniques that are used to generate these graphs
starting from a smaller graph, either cubic or general. Moreover we show some
applications. To complete this brief survey we present the state of the art of a
specific problem on these graphs: their orthogonal drawing.
Throughout this paper we use the standard graph-theoretical terminology of
Hartsfield and Ringel (1994).

2 Cubic graphs

A graph G is said to be regular of degree k, or k-regular, if every vertex of G has
degree equal to k. A graph is called cubic if it is regular of degree 3. When the
degree of the vertices is less than or equal to 3, we have a more general class:
at most cubic graphs (see Figure 1).

Figure 1: An at most cubic graph and a cubic graph

Notice that restricting a problem to cubic graphs sometimes makes its
solution easier to find than in the case in which the problem holds for
at most cubic graphs. On the other hand, there are problems for which this
restriction does not help.
For example, let us consider the chromatic index problem (CIP) and the
minimum maximal matching problem (MMMP).
CIP: "Given a graph G = (V, E) and an integer K, can E be partitioned
into disjoint sets E1, ..., Ek with k ≤ K such that, for 1 ≤ i ≤ k, no two edges
in Ei share a common endpoint in G?"
MMMP: "Given a graph G = (V, E) and an integer K, decide if there
is a subset E' of E with |E'| ≤ K such that E' is a maximal matching of G."
The first problem is open for at most cubic graphs (Garey and Johnson,
1979) while it is polynomially solvable for cubic graphs (Johnson, 1981). The
second problem is proved to be NP-complete by a transformation from vertex
cover for cubic graphs, and it remains NP-complete for at most cubic planar
graphs and for at most cubic bipartite graphs (Garey and Johnson, 1979).
The orthogonal drawing that we present in the last section is an example
in which the more general problem is related to at most cubic graphs. On
the contrary, the regularity of the degree is fundamental when a graph is the
model of an interconnection network, as we show in subsection 3.1.
The first time that cubic graphs appeared in the literature was in an in-
formal manner in Tait (1878) and in a more formal way in Petersen (1891),
dealing with factorizations of graphs and related colourings.
Many specific theoretical results on cubic graphs are known, but they
require a background that it is not possible to give here. For deeper insight into
these topics, see Ore (1967), Hartsfield and Ringel (1994) and Greenlaw and
Petreschi (1996).
In a cubic graph the number n of vertices is always even and the number
of edges is 3n/2. If the cubic graph is plane, the number of faces is 2 + n/2.
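These counts are easy to verify on the smallest cubic graph, K4, which is also planar; the following check (our illustration, not from the paper) uses the handshake lemma and Euler's formula:

```python
# K4, the smallest cubic graph, as an adjacency dictionary (example ours).
K4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
n = len(K4)
assert all(len(nb) == 3 for nb in K4.values())   # 3-regular
edges = sum(len(nb) for nb in K4.values()) // 2  # handshake lemma
faces = 2 - n + edges                            # Euler's formula v - e + f = 2
print(n % 2 == 0, edges == 3 * n // 2, faces == 2 + n // 2)  # True True True
```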
In the following we just recall two basic transformation techniques that
are used to generate cubic (or at most cubic) graphs starting from a smaller
graph, either cubic or general.
The following construction method is due to Johnson (1963) and is based
on the concept of H-expansion. We call H-graph the graph with 6 vertices and
5 edges shown in Figure 2.

Figure 2: H-graph

Let G = (V, E) be a cubic graph on n vertices and let e1 = (v2, v4) and
e2 = (v3, v5) be two edges in G where all endpoints are distinct.
The H-expansion of G with respect to e1 and e2 is obtained by eliminating
e1 and e2 and adding two vertices v1 and v6 with edges:

(v6, v1), (v6, v2), (v6, v5), (v1, v3), (v1, v4)
or (v6, v1), (v6, v2), (v6, v3), (v1, v4), (v1, v5).

Theorem 1 (Expansion theorem)
For n ≥ 6, every connected cubic graph on n + 2 vertices is an H-expansion of
a connected cubic graph on n vertices.

In Figure 3 the 8-vertex cubic graphs derived from a 6-vertex one are shown,
when edges (v2, v3) and (v4, v5) are removed.

Figure 3: A 6-vertex cubic graph H-expanded into two 8-vertex ones

The second method we consider allows general graphs to be transformed into
cubic ones, and it is given in Ore (1967).
Let G = (V, E) be a graph on n vertices and let ni be the number of
vertices of G with degree i.
The cubic transformation of G is obtained by enclosing each vertex v having
degree not less than 4 in a circle, small enough to intersect neither any
other circle nor any edge-crossing. Each intersection of the circle with an edge
emanating from v becomes a dummy vertex in G (see Figure 4).

Theorem 2 (Cubic Transformation)
Each graph G = (V, E) with n vertices such that ni is the number of vertices
with degree i can be transformed into an at most cubic graph with N = n1 +
n2 + n3 + 4n4 + 5n5 + ... + (n − 1)n_{n−1} vertices.
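The vertex count of Theorem 2 can be sketched directly from the degree sequence (the helper function and the wheel example are ours): each vertex of degree d ≥ 4 is replaced by the d dummy vertices where its small circle meets the d incident edges, while vertices of degree at most 3 survive unchanged.

```python
# Sketch (ours) of the vertex count N = n1 + n2 + n3 + 4*n4 + 5*n5 + ...
def cubic_transform_size(degrees):
    return sum(d if d >= 4 else 1 for d in degrees)

# Example: a wheel W_5 (hub of degree 5, five rim vertices of degree 3).
degrees = [5, 3, 3, 3, 3, 3]
N = cubic_transform_size(degrees)
print(N)  # 5 rim vertices + 5 dummies for the hub = 10
```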
Figure 4: Scheme of the transformation from general to cubic graphs

This transformation is sometimes useful for solving problems on general
graphs by utilizing properties of cubic graphs. An example is the case of
the four colour problem: "every map (or equivalently planar graph) is 4-
colourable." It can be proved that this conjecture is true for any map if it
is true for any planar cubic graph (Ore, 1967: 117-118). If this cubic map is
4-colourable, the colouring of the original graph G is obtained simply by the
contraction of all the circles introduced in the cubic transformation.
In the next section we will present some problems that have a cubic or at
most cubic graph as their natural model and that utilize the transformation
techniques just presented.

3 When the graph model is cubic

We will present only three different applications that cover different fields and
seem particularly significant.

3.1 Cube-Connected-Cycles network

A communication network is a collection of processors executing in parallel.
The processors are nodes of a fixed graph. Communication between processors
is via the edges of the network. In general, the network imposes a limit on the
communication per edge per step. For example, there may be a requirement
that at most a single value may be communicated across each edge on each
step. Generally speaking, it would be desirable that each pair of processors
be connected, but the number of connections out of the same processor is
limited by physical characteristics.
There are many useful examples of graphs which are used for communica-
tion networks with limited degree, like the tree connection and the d-cube con-
nection. In particular, we present the Cube-Connected-Cycles network (CCC)
introduced in Preparata and Vuillemin (1981). This network is modelled by
a cubic graph derived from a hypercube all of whose nodes have degree d. To
obtain the CCC model, the cubic transformation is applied to the hypercube,
as shown in Figure 5 when d = 3.

Figure 5: A CCC interconnection network of dimension 3

The cubicity of the CCC allows it to be presented as a feasible substitute for
other networks, both for its efficiency and for its more compact and regular
VLSI layout.
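The resulting network, usually denoted CCC(d), can also be built directly; in this sketch (node-labelling convention ours) each hypercube node w is replaced by a cycle of d nodes (w, i), and node (w, i) keeps the hypercube edge along dimension i, which is equivalent to applying the cubic transformation to the d-dimensional hypercube:

```python
# Sketch (ours) of the Cube-Connected-Cycles network CCC(d), cubic for d >= 3.
from itertools import product

def ccc(d):
    adj = {}
    for w in product((0, 1), repeat=d):
        for i in range(d):
            # Two neighbours on the local cycle replacing hypercube node w.
            cycle = {(w, (i - 1) % d), (w, (i + 1) % d)}
            # One hypercube neighbour: w with bit i flipped.
            flipped = tuple(b ^ (1 if k == i else 0) for k, b in enumerate(w))
            adj[(w, i)] = cycle | {(flipped, i)}
    return adj

G = ccc(3)
print(len(G), all(len(nb) == 3 for nb in G.values()))  # 24 True
```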

3.2 Interaction of particles

Let P1 and P2 be two sub-atomic particles having two trajectories from x1 to
x2 and from y1 to y2, respectively. Let x and y be the points of the trajectories
in which the particles interact because of magnetic attraction or repulsion. The
graph obtained by connecting x1 and x2 to x, y1 and y2 to y, and x to y is an
H-graph (Figure 6).
The H-expansion technique allows one to show (Bjorken and Drell, 1994) that
the interaction between different sub-atomic particles can be modelled by a
general cubic graph.

3.3 The mosaic problem

The mosaic problem is related to biology, chemistry and graphics in general. It
consists of covering the plane with copies of the same shaped polygon.

Figure 6: Interaction between two particles

The only regular polygons that can be used in a mosaic covering of the plane are
hexagons, squares and triangles (Ore, 1967). It is easy to see that hexagons
induce a graph that is cubic except along the border of the external face (Figure
7).
We want to conclude this paper by presenting the state of the art of a particu-
lar problem related to cubic graphs. In view of the fact that the graphical
representation of a graph is not unique (see e.g. Figure 8), and that a "good"
drawing may be either a starting or an arriving point of different problems, the
next section will survey the orthogonal grid drawing of at most cubic graphs.

4 Efficient drawing of at most cubic graphs

An orthogonal drawing of a graph G = (V, E) consists of a graphical represen-
tation of G on a grid such that:
- vertices are represented by points and lie on the crosses of the grid;
- edges are represented by sequences of alternating horizontal and vertical
segments running along the lines of the grid.
Notice that the definition of orthogonal drawing limits G to be a graph of
maximum degree 4.
A point of the grid where the drawing of an edge changes its direction
(from horizontal to vertical or vice versa) is called a bend of this edge.
We call such a drawing an embedding if no edges have intersections different
from their endpoints.

Figure 7: Hexagon covering

Figure 8: Moving from an unpleasant to a nice drawing

A k-bend graph is an embedding of a planar graph in which every edge has
at most k bends.
A planar cubic graph, except the tetrahedron, is a 1-bend graph (Liu et
al., 1992).
If the drawing can be enclosed by a quadrangle of width w and height h,
we call it a drawing with gridsize w × h.
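The definitions above suggest a simple representation: an edge is a polyline through grid points, and a bend is a turn between horizontal and vertical. A small helper (ours, not from the surveyed papers) makes the bend count concrete:

```python
# Sketch (ours): count the bends of an edge drawn as a polyline through
# grid points; a bend is a point where the drawing turns from horizontal
# to vertical or vice versa.
def bends(points):
    count = 0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        horiz_in = (y0 == y1)    # incoming segment is horizontal
        horiz_out = (y1 == y2)   # outgoing segment is horizontal
        if horiz_in != horiz_out:
            count += 1
    return count

# An edge drawn right, then up, then right again: two bends.
edge = [(0, 0), (2, 0), (2, 3), (5, 3)]
print(bends(edge))  # 2
```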
Depending on the requirements of the different applications, the quality of an
orthogonal drawing generally increases with the minimization of the functions
tied to the drawing, representing the number of bends and the area.
In Table 1 we report the most recent results about orthogonal drawings
of at most cubic graphs.

Table 1: Results concerning the orthogonal drawing of at most cubic graphs.

DLV93: input: 3-connected planar at most cubic graph; output: orthogonal
drawing with the minimum number of bends; time: O(n^4 m log n); gridsize
and number of bends per edge: not investigated; total number of bends:
minimum.

LMP94: input: 2-connected at most cubic graph; output: orthogonal planar
drawing if G is planar, nothing otherwise; time: O(n) amortized; gridsize:
O(n^2); total number of bends: n/2 + 1; planarity test: yes.

BK94: input: connected at most cubic graph (and a layout of G if it is
planar); output: orthogonal drawing (planar if G is planar); time: O(n);
gridsize: (n − 1) × (n − 1); total number of bends: 2n.

PT94: input: 2-connected at most cubic graph; output: orthogonal drawing
(not necessarily planar); time: O(n); gridsize: (n/2 + 1) × n/2; total number
of bends: n/2 + 3; bends per edge: 1, except one edge bending twice.

K94: input: planar at most cubic graph; output: orthogonal planar drawing;
time: O(n); gridsize: ⌈n/2⌉ × ⌈n/2⌉; total number of bends: ⌈n/2⌉ + 1.

CP95a: input: at most cubic graph; output: orthogonal drawing (not neces-
sarily planar); time: O(n); gridsize: (n/2) × (n/2); total number of bends:
n/2 + 1.

CP95b: input: at most cubic graph; output: orthogonal drawing (not neces-
sarily planar); time: O(log^2 n) in parallel, using n/log n processors on a
CRCW PRAM; gridsize: (3/4 n + 1/2)^2 or (n + 1)^2; total number of bends:
n + 3 or 3/2 n (for the two variants, respectively).


In the first column, an acronym of the references is given, so that it is
possible to distinguish the algorithms; for example, DLV93 stands for Di Battista
et al. (1993) and CP95b for Calamoneri and Petreschi (1995b). Observe that
in the table all the algorithms are sequential except the last one, which is the
only known parallel algorithm to draw a cubic graph on a grid without the
restriction of planarity.
In the second and third columns the input and output of the respective
algorithms are reported.
The algorithm BK94, described in Biedl and Kant (1994), is the only one
making a difference between planar and non-planar graphs: if G is planar, then
an embedding in the plane is required, and in this case the output drawing is
without crossings. The algorithms PT94 (Papakostas and Tollis, 1994) and
CP95a (Calamoneri and Petreschi, 1995a), compared with the others, accept
the most general input and do not distinguish between planar and non-planar
graphs.
The fourth column, labelled "time", presents the computational complexity
of each algorithm.
Then, the values of the three most important optimization functions are
listed: the achieved maximum gridsize, the maximum number of bends present
in the whole drawing and the number of bends per edge.
Concerning the optimization criteria, the algorithm DLV93 may appear
the worst. Its importance lies in refuting the conjecture stating that the prob-
lem of finding the drawing with the minimum number of bends is NP-complete.
In the row corresponding to CP95b, two values are written, because in the
same paper two slight variants of the same algorithm are presented: the first
one puts at most 2 bends on each edge, while the second one guarantees that
at most one bend is on every edge, but worse results about gridsize and total
number of bends are achieved.
Finally, only the algorithm LMP94 has a planarity test, in the sense that,
given in input a 2-connected at most cubic graph G (no information about its
planarity), the algorithm decides whether G is planar while it tries to embed
it on the grid. So the output is both the answer to the question whether G is
planar and the drawing of G, if it is.

References

Ahuja R. K., T. L. Magnanti and J. B. Orlin, 1993. Network Flows,
Prentice-Hall, London.

Biedl T. and G. Kant, 1994. «A Better Heuristic for Orthogonal Graph
Drawings» Proc. 2nd European Symposium on Algorithms (ESA '94),
LNCS, Springer-Verlag, Berlin, 855: p. 24-35.

Bjorken J. D. and S. D. Drell, 1994. Quantum Electrodynamics,
McGraw-Hill, New York.

Calamoneri T. and R. Petreschi, 1995a. «An Efficient Orthogonal Grid
Drawing for Cubic Graphs» Proc. COCOON '95, LNCS, Springer-Verlag,
Berlin, 959: p. 31-40.

Calamoneri T. and R. Petreschi, 1995b. «A Parallel Algorithm for Or-
thogonal Drawings of Cubic Graphs» Proc. FICTCS '95.

Di Battista G., G. Liotta and F. Vargiu, 1993. «Spirality of Orthogo-
nal Representations and Optimal Drawings of Series-Parallel Graphs and
3-Planar Graphs» Proc. WADS '93, LNCS, Springer-Verlag, Berlin, 709:
p. 151-162.

Evans J. R. and E. Minieka, 1992. Optimization Algorithms for Networks
and Graphs, Marcel Dekker Inc., New York.

Garey M. R. and D. S. Johnson, 1979. Computers and Intractability: A
Guide to the Theory of NP-Completeness, W. H. Freeman and Company,
New York.

Greenlaw R. and R. Petreschi, 1996. «Cubic Graphs» ACM Comput-
ing Surveys, 27(4): p. 471-495.

Johnson D. S., 1981. «The NP-completeness column: An ongoing guide»
Journal of Algorithms, 2(4): p. 393-405.

Johnson E. L., 1963. «A proof of the 4-coloring of the edges of a regular
3-degree graph» Tech. Rep. ORC 63-28 (RR), University of California,
Operations Research Center.

Hartsfield N. and G. Ringel, 1994. Pearls in Graph Theory, Academic
Press, New York.

Hoel P. G., S. C. Port and C. J. Stone, 1972. Introduction to Stochas-
tic Processes, Houghton Mifflin Company, London.

Kant G., 1994. «Drawing Planar Graphs Using the Canonical Ordering»
To appear in Algorithmica, special issue on Graph Drawing.

Lewin K., 1936. Principles of Topological Psychology, McGraw-Hill, New
York.

Liu Y., P. Marchioro, R. Petreschi and B. Simeone, 1992. «Theo-
retical Results on At Most 1-Bend Embeddability of Cubic Graphs» Acta
Mathematicae Applicatae Sinica, 8(2): p. 188-192.

Ore O., 1967. The Four Color Problem, Academic Press, New York.

Papakostas A. and I. G. Tollis, 1994. «Improved Algorithms and Boun-
ds for Orthogonal Drawings» Proc. Graph Drawing '94, LNCS, Springer-
Verlag, Berlin, 894: p. 40-51.

Petersen J., 1891. «Die Theorie der regulären Graphen» Acta Mathematica,
15: p. 193-220.

Preparata F. P. and J. Vuillemin, 1981. «The Cube-Connected Cycles:
A Versatile Network for Parallel Computation» Communications of the
ACM, 24(5): p. 300-309.

Tait P. G., 1878. «On the colouring of maps» Proc. Roy. Soc. Edinb.,
10: p. 501-503.

SPECTRAL PROPERTIES OF MATRICES AND GRAPHS

S. STEFANI
Dipartimento Metodi Quantitativi
Facoltà di Economia, Università di Brescia

A. TORRIERO
Istituto di Econometria e Matematica per le Decisioni Economiche
Università Cattolica, Milano

Matrices and graphs are characterized by their spectral properties. Useful infor-
mation on graphs can be derived from their associated matrices, while, on the other
hand, many interesting results on matrices can be proved by using their associ-
ated graph structure. Through known bounds for the spectrum of a matrix,
based on the distribution of its eigenvalues, conditions for connectivity and estimates
for some invariant measures of a non-oriented graph will be obtained. Furthermore,
a class of real spectrum matrices is investigated through the cyclic structure of the
associated graph. Some results on diagonal similarity are applied to provide
conditions for a strongly combinatorially symmetric matrix to have real eigenvalues.

1 Introduction

This paper is concerned with a characterization of graphs based on the spectral


properties of the associated matrix, like the Laplacian matrix, and with the
spectral properties of matrices based on a graph theoretic approach.
In order to assess properties of a given graph, limitations on the eigenvalues
of the Laplacian matrix can be used: thus, bounds for some graph invariants
and for connectivity can be found, which are in general very difficult to com-
pute. The results are presented in Section 3.
Furthermore, the relationship between the cyclic structure of the asso-
ciated graph and the diagonal similarity with a suitably defined symmetric
matrix is investigated.
Characterizations of diagonal similarity in terms of equality between cor-
responding cycle products have been discussed by many authors (Basset et al.,
1968; Fiedler and Ptak, 1969; Engel and Schneider, 1973, 1980). Using these
results, in section 4 we establish sufficient conditions for a strongly combina-
torially symmetric matrix to have real eigenvalues.
The next section contains the preliminary definitions, for unoriented and
oriented graphs, that shall be used in the following.

2 Some basic definitions on graphs

In this section we recall some basic graph theoretic definitions and some of the
main relationships between graphs and matrices.

2.1 Undirected graphs


A graph on a non-empty finite vertex set V is G(V, E), where the edge set is
E ⊆ V × V. The graph is said to be undirected if and only if (u, v) ∈ E implies
(v, u) ∈ E, and with loops if (u, u) ∈ E for at least one u ∈ V.
The elements of V are called vertices and the elements of E are called
edges or arcs. The order of G is the number of vertices and is denoted by IGI.
The size of G is the number of arcs and is denoted by e( G).
A weighted (undirected) graph is a graph in which a real number (called
weight) is assigned to each edge.
The weights w_ij, usually positive, must satisfy the following conditions:

(i) w_ij = w_ji, i, j ∈ V

(ii) w_ij ≠ 0 if and only if i and j are adjacent in G, namely (i, j) ∈ E.

Unweighted graphs can be viewed as special cases of weighted graphs,
once the weight assigned to each pair i, j ∈ V is the number of edges
between i and j.
G' = (V', E') is said to be a subgraph of G = (V, E) if V' ⊆ V and E' ⊆ E.
A sequence of edges of the form (i_0 i_1, i_1 i_2, ..., i_{r-1} i_r) in which all vertices
are distinct is called a path. A graph is connected if there is a path joining
each pair of vertices. There is a similar definition of connectivity for digraphs,
which will be given later.
Let d(i) denote the degree of i ∈ V, i.e. the number of edges incident to
the vertex i, d(i) = Σ_j w_ij, and let D = D(G) = diag(d(1), d(2), ..., d(n)) be
the diagonal matrix indexed by V. The matrix Q = Q(G) = D(G) - A(G),
where A(G) is the adjacency matrix, is called the Laplacian matrix of G. If
the graph is weighted, A(G, w) is the matrix of weights and Q is called the
weighted Laplacian matrix of G. It is easy to check that Q is a singular M-
matrix, symmetric, positive semidefinite, and that its smallest eigenvalue is zero.
The last two results hold if G does not have loops (Grone, 1991; Friedland,
1992; Mohar, 1992). We call {λ_1, λ_2, ..., λ_n} the spectrum of Q, with λ_1 ≥
λ_2 ≥ ... ≥ λ_n = 0. It is important to note that the multiplicity of 0 as an
eigenvalue of Q(G) is equal to the number of components of G. In particular,
λ_{n-1} > 0 if and only if G is connected.
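These definitions are easy to exercise on a toy example. The sketch below (plain Python; `laplacian` and `components` are our own hypothetical helpers, not code from the paper) builds Q = D - A for an unweighted graph and counts components by breadth-first search, so the stated link between connectivity and the multiplicity of the zero eigenvalue can be checked against an independent computation.

```python
from collections import deque

def laplacian(n, edges):
    # Build A, D and Q = D - A for an undirected, loopless graph
    # with vertices 0..n-1 and unweighted edges.
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] += 1
        A[v][u] += 1
    D = [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]
    return [[D[i][j] - A[i][j] for j in range(n)] for i in range(n)]

def components(n, edges):
    # Number of connected components, found by breadth-first search.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, count = set(), 0
    for s in range(n):
        if s in seen:
            continue
        count += 1
        seen.add(s)
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
    return count

edges = [(0, 1), (1, 2), (0, 2), (3, 4)]   # a triangle plus one separate edge
Q = laplacian(5, edges)
assert all(sum(row) == 0 for row in Q)     # rows sum to zero: 0 is an eigenvalue
assert all(Q[i][j] == Q[j][i] for i in range(5) for j in range(5))  # symmetric
print(components(5, edges))                # 2, the stated multiplicity of 0
```

The graph has two components, so by the statement above 0 is a Laplacian eigenvalue of multiplicity 2.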

Let G be a graph of order n and size m. If the graph is unoriented, we
can orient its edges arbitrarily, i.e. for each edge we choose the initial and the
terminal vertex. The incidence matrix B = B(G) = [b_qj] is the m × n matrix
defined by:

b_qj = 1 if i_j is the initial vertex of the arc g_q,
b_qj = -1 if i_j is the terminal vertex of the arc g_q,
b_qj = 0 otherwise.

A very interesting result is that for any incidence matrix B = B(G), Q(G)
can be factored as Q(G) = BᵀB. When the graph is weighted, Q(G) =
BᵀWB, where W is the m × m diagonal matrix of weights (Friedland, 1992).
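The factorization can be verified directly on a small instance. Below is a minimal sketch (our own Python helpers, assuming the +1/-1 orientation convention above) that builds B for an arbitrarily oriented triangle and checks that BᵀB reproduces D - A.

```python
def incidence(n, arcs):
    # m x n incidence matrix for an arbitrary orientation of the edges:
    # +1 at the initial vertex of arc q, -1 at the terminal vertex.
    B = [[0] * n for _ in range(len(arcs))]
    for q, (i, j) in enumerate(arcs):
        B[q][i], B[q][j] = 1, -1
    return B

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# A triangle on vertices 0, 1, 2, oriented arbitrarily.
arcs = [(0, 1), (1, 2), (2, 0)]
B = incidence(3, arcs)
Q = mat_mul(transpose(B), B)   # B^T B
assert Q == [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]   # = D - A for the triangle
```

Note that the result is independent of the orientation chosen for each edge, since flipping a row of B changes no entry of BᵀB.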

2.2 Directed graphs


A directed graph, or digraph, G(V, E) is an ordered pair of two finite sets,
with E ⊆ V × V; the elements of V are called vertices and the elements of E
are called arcs.
A chain of length s in a digraph G is a sequence γ = (i_0, e_1, i_1, e_2, i_2, ...,
i_{s-1}, e_s, i_s) where either e_k = 1 and (i_{k-1}, i_k) is an arc of G, or e_k = -1
and (i_k, i_{k-1}) is an arc of G, k = 1, ..., s. γ is called a simple chain if all
vertices are distinct. A simple chain whose first and last vertex coincide, i.e.
i_0 = i_s, is said to be a cycle. A simple chain γ where e_1 = ... = e_s = 1 is
called a path. A circuit is a path that is a cycle, and it is denoted shortly
by γ = (i_0, i_1, i_2, ..., i_{s-1}, i_0). A circuit is said to be of length k, or a k-
circuit, if it consists of k arcs. γ̄ = {i_0, ..., i_s} is called the support of γ. If
γ = (i_0, i_1, i_2, ..., i_{s-1}, i_0) is a circuit of G, then γ⁻¹, if it exists, is the circuit
(i_0, i_{s-1}, ..., i_1, i_0).
A directed graph G is strongly connected if for every pair (i_m, i_k) of distinct
vertices there is a path from i_m to i_k. A component of G is a maximal strongly
connected subgraph of G, i.e. it is strongly connected and is not properly
contained in any other strongly connected subgraph of G.
A chord of a circuit γ is an arc (i, j) of the digraph G such that i and j
are distinct vertices of γ and neither (i, j) nor (j, i) belongs to γ. A circuit of
G having no chords is called chordless.
Let γ be a cycle in G and let g_q = (i_j, i_{j+1}) be an arc of G. If g_q is an
arc of γ oriented from i_j to i_{j+1} then z_γ(g_q) = 1, q = 1, ..., m; if g_q is
an arc of γ oriented from i_{j+1} to i_j then z_γ(g_q) = -1; if g_q is not an arc
of γ then z_γ(g_q) = 0. Then every cycle γ can be identified with the vector
z_γ = [z_γ(g_q)], q = 1, ..., m. The cycle space Z(G) is the space generated
by all cycles of G, and dim Z(G), called the cyclomatic number of G, is equal

to m - n + k, where m, n and k are respectively the size, the order and
the number of components of G. If γ is a cycle and B(G) is the incidence matrix,
then z_γᵀB = 0; that is, the cycle space Z(G) is the kernel of Bᵀ (Bollobas, 1990).
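The kernel property is easy to confirm numerically. In the sketch below (plain Python, our own variable names), the circuit 0 → 1 → 2 → 0 of a triangle, with all three arcs traversed in their given direction, is encoded by the cycle vector z = (1, 1, 1), and zᵀB is checked to vanish.

```python
# Arcs of a triangle digraph, listed in the order used for the rows of B.
arcs = [(0, 1), (1, 2), (2, 0)]

# Incidence matrix: +1 at the initial vertex of each arc, -1 at the terminal one.
B = [[0] * 3 for _ in arcs]
for q, (i, j) in enumerate(arcs):
    B[q][i], B[q][j] = 1, -1

# The cycle traversing arcs 0, 1, 2 in their given orientation.
z = [1, 1, 1]
zTB = [sum(z[q] * B[q][v] for q in range(len(arcs))) for v in range(3)]
assert zTB == [0, 0, 0]   # z lies in the kernel of B^T
```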
Let A be an n × n matrix. Then the directed graph of A, denoted by
G(A), is the directed graph on n vertices i_1, i_2, ..., i_n such that there is an arc
in G(A) from i_r to i_s if and only if a_rs ≠ 0.
Let γ = (i_1, ..., i_s, i_1) be a circuit of G(A); then the circuit product Π_γ(A)
is defined by Π_γ(A) = a_{i_1 i_2} ··· a_{i_s i_1}.
G(A) is a symmetric graph if a_ij ≠ 0 implies a_ji ≠ 0 for all i, j. G(A) is a
sign-symmetric graph if it is symmetric and a_ij a_ji ≥ 0 for all i, j. A is combina-
torially symmetric if G(A) is a symmetric graph. A is strongly combinatorially
symmetric if G(A) is a sign-symmetric graph.
A is said to be completely reducible if every arc of G(A) is the arc of a
circuit of G(A) or, equivalently, if there exists a permutation matrix P such
that PAPᵀ is the direct sum of irreducible matrices (Engel and Schneider,
1973).
Notice that if G(A) is a symmetric graph and there exists a cycle through
s vertices, there always exists a circuit through the same vertices. Hence, if we
are concerned with a symmetric graph, the cycle space may always be replaced
by the circuit space.

3 Graphs and their characterization through associated matrices

In this section we present some conditions based on the spectral properties of
the Laplacian matrix as defined in the previous section. These properties are
of practical interest for obtaining information on the graph structure. The relevance
of the Laplacian matrix Q(G) in graph theory has been stressed by many
authors (Anderson and Morley, 1985; Grone, 1991; Friedland, 1992; Merris,
1994; Mohar, 1992). We recall here some limitations and properties of the
Laplacian eigenvalues:

(a) if the graph has no loops or multiple edges, then 0 ≤ λ_i ≤ n for each i,
and λ_1 = n iff Ḡ, the complement of G, is not connected;

(b) λ_1 ≤ max{d(u) + d(v) : (u, v) ∈ E(G)};

(c) Σ_{v=1}^{n} d(v) = Σ_{i=1}^{n} λ_i;

(d) λ_1 ≥ (n/(n-1)) max{d(v) : v ∈ V(G)};

(e) 0 ≤ λ_{n-1} ≤ (n/(n-1)) min{d(v) : v ∈ V(G)};

(f) the spectrum of a complete graph is λ_1 = n with multiplicity n - 1,
λ_n = 0 with multiplicity 1;

(g) if G has order n and λ = n is a Laplacian eigenvalue, then G is connected.

This allows us to prove the following

Theorem 1 The average degree of a graph is not higher than the largest Lapla-
cian eigenvalue.

Proof The average degree is (Σ_{v=1}^{n} d(v))/n. Since (Σ_{v=1}^{n} d(v))/n =
(Σ_{i=1}^{n} λ_i)/n (property (c)), from the associativity property of the mean we get

(Σ_{i=1}^{n} λ_i)/n ≤ λ_1.   □

See also Brouwer (1995) for an analogous result valid for the adjacency matrix.
Note that, from property (c), (Σ_{v=1}^{n} d(v))/n = tr(Q)/n, where tr(Q) is the
trace of the Laplacian.
The multiplicity of 0 as an eigenvalue of Q(G) is related to the graph con-
nectivity: the method we propose here allows one to check rapidly for connectiv-
ity without computing the eigenvalues of Q(G) directly, working only on the
trace of Q(G) and of its square Q²(G). The bounds we compute for the
spectrum of Q(G) are drawn from the statistical properties of the eigenvalue
distribution, whose mean and variance are μ and σ² respectively (Wolkowicz
and Styan, 1980; Stefani and Torriero, 1994 and 1995):

(i) the whole spectrum belongs to the interval

[μ - σ√(n-1), μ + σ√(n-1)]   (1)

(ii)

μ - σ√((i-1)/(n-i+1)) ≤ λ_i ≤ μ + σ√((n-i)/i)   for each i = 1, ..., n   (2)

The interesting fact is that μ and σ are related respectively to the trace of
Q and of its square:

μ = (1/n) Σ_{i=1}^{n} λ_i = tr(Q)/n,    σ² = (1/n) Σ_{i=1}^{n} λ_i² - μ² = tr(Q²)/n - μ².
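The trace relations and the bounds (1)-(2) can be packaged in a few lines. The following sketch (the function name `trace_bounds` is ours, not from the paper) computes μ and σ from tr(Q) and tr(Q²) only, and is checked on the complete graph K4, whose Laplacian spectrum {4, 4, 4, 0} gives μ = σ² = 3 and the interval [0, 6].

```python
from math import sqrt

def trace_bounds(Q):
    # mu and sigma from tr(Q) and tr(Q^2) only, then the interval (1)
    # and the per-eigenvalue bounds (2) of Wolkowicz and Styan.
    n = len(Q)
    tr = sum(Q[i][i] for i in range(n))
    tr2 = sum(Q[i][k] * Q[k][i] for i in range(n) for k in range(n))  # tr(Q^2)
    mu = tr / n
    sigma = sqrt(tr2 / n - mu * mu)
    whole = (mu - sigma * sqrt(n - 1), mu + sigma * sqrt(n - 1))      # (1)
    per_i = [(mu - sigma * sqrt((i - 1) / (n - i + 1)),
              mu + sigma * sqrt((n - i) / i))
             for i in range(1, n + 1)]                                # (2)
    return mu, sigma, whole, per_i

# Complete graph K4: Laplacian spectrum {4, 4, 4, 0}.
Q = [[3, -1, -1, -1], [-1, 3, -1, -1], [-1, -1, 3, -1], [-1, -1, -1, 3]]
mu, sigma, whole, per_i = trace_bounds(Q)
assert abs(mu - 3) < 1e-12 and abs(sigma ** 2 - 3) < 1e-12
assert abs(whole[0] - 0) < 1e-9 and abs(whole[1] - 6) < 1e-9
```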

When the graph is complete of order n (without loops or multiple edges),
the Laplacian matrix is such that tr(Q) = n(n - 1) and tr(Q²) = n²(n - 1).
Thus we have μ = n - 1 and σ² = n - 1. Since the Laplacian spectrum
of a complete graph of order n is distributed as in (f), by the properties of
concentration indexes (Frosini, 1987) it turns out that this is the case of
maximum variance. Therefore σ², or better σ/μ, the variation coefficient, can
be taken as a measure of the departure of a graph from completeness. In fact,
the variation coefficient of a complete graph of order n is

σ/μ = √(n-1)/(n-1) = 1/√(n-1).

It is easy to see that σ/μ is a non-decreasing function of the degrees of the vertices.
The maximum is reached when the graph is complete.
Furthermore, some invariant measures, like the diameter diam(G) or the
isoperimetric number i(G), which are very difficult to compute, can be bounded
through the second smallest eigenvalue λ_{n-1} (Mohar, 1992):

diam(G) ≥ 4/(n λ_{n-1}),    i(G) ≥ λ_{n-1}/2.

Using our bounds (2) we get a first approximation of the two invariants:

diam(G) ≥ 4 / ( n ( μ + σ√(1/(n-1)) ) ),    i(G) ≥ (1/2) ( μ - σ√((n-2)/2) ).
The following examples show how the use of the inequalities above can
improve the bounds for the Laplacian eigenvalues and give substantial infor-
mation on the connectivity of G.

Example 1

We consider a weighted graph with 4 vertices and 5 arcs.

B =
[ -1   1   0   0
   0  -1   1   0
   0   0  -1   1
  -1   0   0   1
  -1   0   1   0 ],    W = diag(4, 4.5, 6, 7.2, 3)


Figure 1: A graph with 4 vertices and 5 arcs

The weighted Laplacian matrix Q is

Q = BᵀWB =
[ 14.2  -4.0  -3.0  -7.2
  -4.0   8.5  -4.5   0
  -3.0  -4.5  13.5  -6.0
  -7.2   0    -6.0  13.2 ]

μ = 6.5625, σ = 5.9001.
The bounds for the second smallest eigenvalue are (n = 4):

• λ_3 ≤ 4 by property (a)

• 0 ≤ λ_3 ≤ 2.6667 by property (e)

• 0.6624 ≤ λ_3 ≤ 9.9689 by (2)

Since λ_3 > 0, the graph is connected. The method we propose (equations
(1) and (2)) improves substantially the information about connectivity. Note
again that to get those bounds we only need to compute tr(Q) and tr(Q²).
The spectrum of the weighted Laplacian is:
{15.964, 6.5136, 3.7722, 2.9995 × 10⁻⁶}.

Example 2


Figure 2: A connected graph

A =
[ 0  1  1  0  0  0
  1  0  1  0  0  0
  1  1  0  1  0  0
  0  0  1  0  1  1
  0  0  0  1  0  1
  0  0  0  1  1  0 ],    D = diag(2, 2, 3, 3, 2, 2)

Q = D - A =
[  2  -1  -1   0   0   0
  -1   2  -1   0   0   0
  -1  -1   3  -1   0   0
   0   0  -1   3  -1  -1
   0   0   0  -1   2  -1
   0   0   0  -1  -1   2 ]

Bounds for the second smallest eigenvalue are:

• λ_5 ≤ 6 by property (a)

• 0 ≤ λ_5 ≤ 2.4 by property (e)

• 0.0724 ≤ λ_5 ≤ 3.0478 by (2)

Since the second smallest eigenvalue is strictly positive, by making use of
the third inequality we can conclude that the graph is connected.
The spectrum of the Laplacian is:
{4.5616, 3.0, 3.0, 3.0, 0.43845, -5.0 × 10⁻¹⁰}
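The numbers of this example can be reproduced from the trace computations alone. The sketch below (plain Python, our own variable names) recovers the bounds for λ_5 from tr(Q) and tr(Q²); up to rounding it agrees with the 0.0724 and 3.0478 quoted above.

```python
from math import sqrt

Q = [[ 2, -1, -1,  0,  0,  0],
     [-1,  2, -1,  0,  0,  0],
     [-1, -1,  3, -1,  0,  0],
     [ 0,  0, -1,  3, -1, -1],
     [ 0,  0,  0, -1,  2, -1],
     [ 0,  0,  0, -1, -1,  2]]
n = len(Q)
tr = sum(Q[i][i] for i in range(n))                          # tr(Q) = 14
tr2 = sum(Q[i][k] ** 2 for i in range(n) for k in range(n))  # tr(Q^2); Q symmetric
mu, sigma = tr / n, sqrt(tr2 / n - (tr / n) ** 2)
i = n - 1                                  # index of the second smallest eigenvalue
low = mu - sigma * sqrt((i - 1) / (n - i + 1))
high = mu + sigma * sqrt((n - i) / i)
assert low > 0                             # hence the graph is connected
assert abs(low - 0.0724) < 1e-2 and abs(high - 3.0478) < 1e-2
```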

Example 3

Consider a graph with two pendant vertices, i.e. vertices that are adjacent to
only one vertex.


Figure 3: A graph with two pendant vertices

A =
[ 0  0  1  0  1  0
  0  0  1  0  0  0
  1  1  0  1  0  0
  0  0  1  0  1  0
  1  0  0  1  0  1
  0  0  0  0  1  0 ],    D = diag(2, 1, 3, 2, 3, 1)

Q = D - A =
[  2   0  -1   0  -1   0
   0   1  -1   0   0   0
  -1  -1   3  -1   0   0
   0   0  -1   2  -1   0
  -1   0   0  -1   3  -1
   0   0   0   0  -1   1 ]

μ = 2.0, σ = 1.633.

Through (1) we find bounds for the whole spectrum, [-1.6515, 5.6515], which is
definitely better than the [0, 6] resulting from (a). In fact, from the former bound
we can conclude that the complement of G is connected.
The spectrum of the Laplacian is:
{4.7321, 3.4142, 2.0, 1.2679, 0.58579, -9.2539 × 10⁻¹⁰}.
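The interval (1) for this example is again a pure trace computation. The sketch below (plain Python) recovers [-1.6515, 5.6515] from tr(Q) = 12 and tr(Q²) = 40.

```python
from math import sqrt

Q = [[ 2,  0, -1,  0, -1,  0],
     [ 0,  1, -1,  0,  0,  0],
     [-1, -1,  3, -1,  0,  0],
     [ 0,  0, -1,  2, -1,  0],
     [-1,  0,  0, -1,  3, -1],
     [ 0,  0,  0,  0, -1,  1]]
n = len(Q)
mu = sum(Q[i][i] for i in range(n)) / n                        # 2.0
tr2 = sum(Q[i][k] ** 2 for i in range(n) for k in range(n))    # 40
sigma = sqrt(tr2 / n - mu * mu)                                # about 1.633
low, high = mu - sigma * sqrt(n - 1), mu + sigma * sqrt(n - 1)  # interval (1)
assert abs(low + 1.6515) < 1e-3 and abs(high - 5.6515) < 1e-3
```

Since the upper end 5.6515 is strictly below n = 6, property (a) confirms that the complement of G is connected.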

Example 4

Let us consider the following two different graphs (without loops or multiple
edges), G_1 and G_2, taken from Cliff et al. (1979); G_2 describes the Argentinian
airline network for the seven main cities. A_1 and A_2 are the corresponding
adjacency matrices.

Figure 4: Two connected graphs



A_1 =
[ 0  1  1  1  1  1  1
  1  0  1  1  1  1  1
  1  1  0  1  1  1  1
  1  1  1  0  0  0  0
  1  1  1  0  0  0  0
  1  1  1  0  0  0  0
  1  1  1  0  0  0  0 ],    A_2 =
[ 0  1  1  1  1  1  1
  1  0  1  0  0  1  0
  1  1  0  0  0  0  1
  1  0  0  0  1  0  0
  1  0  0  1  0  0  0
  1  1  0  0  0  0  0
  1  0  1  0  0  0  0 ]

D_1 = diag(6, 6, 6, 3, 3, 3, 3),    D_2 = diag(6, 3, 3, 2, 2, 2, 2)

Q_1 = D_1 - A_1 =
[  6  -1  -1  -1  -1  -1  -1
  -1   6  -1  -1  -1  -1  -1
  -1  -1   6  -1  -1  -1  -1
  -1  -1  -1   3   0   0   0
  -1  -1  -1   0   3   0   0
  -1  -1  -1   0   0   3   0
  -1  -1  -1   0   0   0   3 ],    Q_2 = D_2 - A_2 =
[  6  -1  -1  -1  -1  -1  -1
  -1   3  -1   0   0  -1   0
  -1  -1   3   0   0   0  -1
  -1   0   0   2  -1   0   0
  -1   0   0  -1   2   0   0
  -1  -1   0   0   0   2   0
  -1   0  -1   0   0   0   2 ]

We computed the variation coefficient for the two graphs, in order to assess
the departure from completeness of both graphs, i.e. which of the two graphs
is closer to completeness.
μ_1 = 4.2857, σ_1 = 2.5475, σ_1/μ_1 = 0.59442;
μ_2 = 2.8571, σ_2 = 0.88449, σ_2/μ_2 = 0.30958.
Note that σ_1/μ_1 > σ_2/μ_2, and it is immediate to check that G_1 is closer
to completeness than G_2 (in Stefani and Torriero (1995) an analogous mea-
sure is proposed, based on the Pearson asymmetry coefficient of the spectrum
distribution).
The spectrum of G_1 is:
{7.0, 7.0, 7.0, 3.0, 3.0, 3.0, -4.5774 × 10⁻¹⁰}.
The spectrum of G_2 is:
{7.0, 4.4142, 3.0, 3.0, 1.5858, 1.0, -1.4838 × 10⁻¹⁰}.

4 A class of real spectrum matrices and its characterization through


graphs

In this section we present some conditions based on the cyclic structure of
strongly combinatorially symmetric matrices, which are of practical interest
for obtaining information on the reality of their eigenvalues (Hearon, 1953;
Goldberg, 1958).
We start this section by proving the following

Theorem 2 If A is combinatorially symmetric then A is completely reducible.

Proof Since G(A) is a symmetric graph, each arc of G(A) belongs to a 2-
circuit of G(A). It follows, by definition, that A is completely reducible. □

Let A, B ∈ ℂ^{n×n}. A and B are diagonally similar if there exists a nonsin-
gular diagonal matrix D such that A = D⁻¹BD.
Let A ∈ ℂ^{n×n}. A is diagonally symmetrizable if it is diagonally similar to a
symmetric matrix.
Obviously a diagonally symmetrizable matrix has real eigenvalues; therefore
such matrices generalize the concept of symmetric matrices.
Let A, B ∈ ℂ^{n×n}. A and B are c-equivalent if G(A) = G(B) and for each
circuit γ of G(A) we have Π_γ(A) = Π_γ(B).

Theorem 3 (Basset et al., 1968) Let A and B be completely reducible matri-
ces; then A and B are diagonally similar if and only if they are c-equivalent.

The next Theorem 4 provides a sufficient condition for A to have real
eigenvalues. As mentioned before, this result was first obtained by Hearon
(1953) and Goldberg (1958). Here we present an alternative proof based on
the concept of diagonal symmetrizability.
First we prove the following lemma:

Lemma 1 Let A be strongly combinatorially symmetric and B = [b_ij] =
sign(a_ij)√(a_ij a_ji). Then Π_γ(A) = Π_{γ⁻¹}(A) if and only if Π_γ(A) = Π_γ(B)
for each circuit γ of G(A).

Proof Let Π_γ(A) = Π_{γ⁻¹}(A), that is

a_{i_1 i_2} a_{i_2 i_3} ··· a_{i_k i_1} = a_{i_1 i_k} a_{i_k i_{k-1}} ··· a_{i_2 i_1}.   (3)

We first observe that G(A) = G(B) and that the elements of A belonging to
the circuits are, by definition, necessarily different from zero.
Then we have:

i) sign(a_ij) = 1/sign(a_ij)

ii) sign(a_ij) = sign(a_ji)

Multiplying each element a_ij in (3) by sign(a_ij) and extracting the square
root of both sides, in virtue of i) and ii), we get:
√a_{i_1 i_2} √a_{i_2 i_3} ··· √a_{i_k i_1} = √a_{i_1 i_k} √a_{i_k i_{k-1}} ··· √a_{i_2 i_1}.
Now multiplying both sides of the above equality by
√a_{i_1 i_2} √a_{i_2 i_3} ··· √a_{i_k i_1} yields:
a_{i_1 i_2} a_{i_2 i_3} ··· a_{i_k i_1} = √(a_{i_1 i_k} a_{i_k i_1}) √(a_{i_k i_{k-1}} a_{i_{k-1} i_k}) ··· √(a_{i_2 i_1} a_{i_1 i_2}),
which, by i), implies Π_γ(A) = Π_γ(B).
The proof of the sufficient condition is analogous. □
Theorem 4 Let A = [a_ij] be a strongly combinatorially symmetric matrix and
let Π_γ(A) = Π_{γ⁻¹}(A) for each circuit γ of G(A). Then all eigenvalues of A
are real.
Proof Let B = [b_ij] = sign(a_ij)√(a_ij a_ji).
By Theorem 2, Lemma 1 and Theorem 3 it follows that A is diagonally similar
to B, whose eigenvalues are real, B being symmetric. Hence the thesis follows.
□

An easy consequence of this theorem is that matrices whose directed graph
is sign-symmetric and patterned as a tree or a forest certainly have real eigen-
values. In fact, by definition, they have no cycles of length greater than 2.
Obviously, as n increases, the number of circuits to be checked increases.
More precisely, if A is an n × n matrix, the maximum number of circuits is equal

to Σ_{k=3}^{n} (n choose k).
In the next theorems we prove that it is possible to decrease the number
of circuits to be checked, by considering only either the chordless circuits of
G(A) or the circuits forming a basis of the circuit space Z(G) of dimension
s = m - n + p, m being the size of G(A), n the order of G(A) and p the
number of components.
The first result will appear in Theorem 5 and is based on a characterization
of the diagonal similarity of two matrices A and B in terms of a certain class
of circuits of G(A), given by Engel and Schneider (1980, Theorem 3.5). More
precisely, they proved that if A is completely reducible and B is symmetric, then
the matrices A and B are diagonally similar if and only if det A[γ̄] = det B[γ̄],
where γ is a 1- or 2-circuit of G(B) or γ is a chordless circuit of G(A).
Det A[γ̄] is the principal minor of A whose rows and columns are identified
by the indices in γ̄.
Finally, Theorem 6 gives a further condition for a strongly combinatorially
symmetric matrix to have real spectrum. From Corollary 2.4 and Remark
5.2 in Saunders and Schneider (1978) it follows, in particular, that the diagonal
similarity of A and B = [b_ij] = sign(a_ij)√(a_ij a_ji) can be established by checking
the condition Π_γ(A) = Π_γ(B) only for the circuits γ forming a basis of the
circuit space Z(G).
First of all we state the following:

Lemma 2 Let A be a strongly combinatorially symmetric matrix and let B =
[b_ij] = sign(a_ij)√(a_ij a_ji). If Π_γ(A) = Π_γ(B) then det A[γ̄] = det B[γ̄] for each
chordless circuit γ of G(A), where γ̄ is the support of γ and det A[γ̄] is the
principal minor of A whose rows and columns are identified by the indices
in γ̄.
Proof: Clearly, 1- or 2- or 3-circuits are always chordless.
We prove the lemma by complete induction on the length of the circuit γ.
For n = 3 the lemma is true. In fact, for the 1- and 2-circuits in G(A), we
obtain immediately:
det A[i_1] = det B[i_1] and det A[i_1, i_2] = det B[i_1, i_2]. For 3-circuits, based on the
fundamental determinant formula (Basset et al., 1968), we get:

det A[i_1, i_2, i_3] = a_{i_3 i_3} det A[i_1, i_2] + (-1)^{3+1-1} (det A[i_1] a_{i_2 i_3} a_{i_3 i_2} +
+ det A[i_2] a_{i_1 i_3} a_{i_3 i_1}) +
+ (-1)^{3+1-0} det A[Φ] (a_{i_1 i_2} a_{i_2 i_3} a_{i_3 i_1} + a_{i_1 i_3} a_{i_3 i_2} a_{i_2 i_1}),

where det A[Φ] = 1.
But, since a_{i_n i_n} = b_{i_n i_n} and a_{i_j i_k} a_{i_k i_j} = b_{i_j i_k} b_{i_k i_j} by definition, and Π_γ(A) =
Π_γ(B), γ being a 3-circuit, it follows that det A[i_1, i_2, i_3] = det B[i_1, i_2, i_3].
Then let n > 3 and suppose that the thesis holds for every k such that 3 ≤
k ≤ n - 1. First we observe that each arc (i, j) of γ lies on a 2-circuit and
each loop of γ is a 1-circuit. Furthermore, if γ is a chordless circuit of G(A) of
length l (l ≤ n), then there does not exist any other r-circuit γ_r (3 ≤ r < l)
of G(A), except γ⁻¹, all of whose vertices belong to γ̄; otherwise γ would not
be chordless. This implies that Π_{γ_r}(A) = 0 (3 ≤ r < l).
Now, applying the fundamental determinant formula we have:

det A[γ̄] = a_{i_n i_n} det A[J] + Σ_{r=0}^{n-2} (-1)^{n+1-r} Σ_{K ∈ D_{r,J}} det A[K] A(K′)

where J = (i_1, i_2, ..., i_{n-1}), D_{r,J} denotes the set of all increasing sets
K = (k_1, k_2, ..., k_r) of distinct indices in J, K′ is the complement of the set K,
and A(K′) is the sum of all (n - r)-cycles in A[K′].
Now, taking into account the above remark, it follows that A(K′) is always
equal to zero except for K = (k_1, k_2, ..., k_{n-2}) or K = (Φ).
But we also have a_{i_n i_n} = b_{i_n i_n}, a_{i_j i_k} a_{i_k i_j} = b_{i_j i_k} b_{i_k i_j} and Π_γ(A) =
Π_γ(B). Hence, based on the complete induction hypothesis, det A[J] = det B[J],
so that det A[γ̄] = det B[γ̄]. □

Theorem 5 Let A be a strongly combinatorially symmetric matrix. If, for
each chordless circuit γ of G(A), Π_γ(A) = Π_{γ⁻¹}(A), then all eigenvalues of
A are real.

Proof By the previous lemmas we get det A[γ̄] = det B[γ̄], with B =
[b_ij] = sign(a_ij)√(a_ij a_ji). Hence, because A is completely reducible and B
is symmetric, based on Engel and Schneider (1980: Theorem 3.5) the
matrices A and B are diagonally similar, so that the thesis follows. □

We observe that if the directed graph of A is a complete graph, the chord-
less circuits are only the 1-, 2- and 3-circuits. Then it suffices to check the

condition of Theorem 5 for at most (n choose 3) circuits instead of the Σ_{k=3}^{n} (n choose k) as ex-

pressed in Theorem 4.
The following theorem relates real spectrum matrices to the circuit space
of their directed graph.
Theorem 6 Let A be a strongly combinatorially symmetric matrix and let
γ_1, γ_2, ..., γ_s be a basis for the circuit space of G(A). If Π_{γ_i}(A) = Π_{γ_i⁻¹}(A),
i = 1, ..., s, then all eigenvalues of A are real.

Proof: The proof follows easily from Theorem 2, Lemma 1 and from Saun-
ders and Schneider (1978: Corollary 2.4, Remark 5.2). □

Example 5

Let us consider the following strongly combinatorially symmetric matrix:

A =
[ 0   2   1   0   1   5
  6   0   3   0   0   0
  4   4   0  10   9   0
  0   0   5   0   2   0
  4   0   9   4   0  10
  6   0   0   0   3   0 ]

We observe that, G(A) being a sign-symmetric graph, to each cycle through
s vertices there always corresponds a circuit through the same vertices.
By choosing an orientation of the arcs of G(A) arbitrarily, we get the
digraph G of Figure 5.
Then the incidence matrix is the following 9 × 6 matrix:

1 -1 0 0 0 0
0 1 -1 0 0 0
0 0 1 -1 0 0
0 0 0 1 -1 0
B(G) = 0 0 0 0 1 -1
-1 0 0 0 1 0
-1 0 0 0 0 1
1 0 -1 0 0 0
0 0 1 0 -1 0


Figure 5: Digraph G

It is easy to check that rank(B) = 5 and consequently the dimension of
the kernel of Bᵀ, i.e. Z(G), is equal to 4.
A possible basis of Z(G) is:
z_γ1 = [1, 1, 1, 1, 0, 1, 0, 0, 0]ᵀ,
z_γ2 = [-1, -1, 0, 0, 0, 0, 0, 1, 0]ᵀ,
z_γ3 = [0, 0, 0, 0, 1, -1, 1, 0, 0]ᵀ,
z_γ4 = [1, 1, 0, 0, 0, 1, 0, 0, 1]ᵀ,
which corresponds to:
γ_1 = (1, 2, 3, 4, 5, 1),
γ_2 = (1, 2, 3, 1),
γ_3 = (1, 5, 6, 1),
γ_4 = (1, 2, 3, 5, 1).
Now, by Theorem 6, Π_{γ_j}(A) = Π_{γ_j⁻¹}(A), j = 1, 2, 3, 4, as is easily checked,
so all eigenvalues of A are real.
By computing them we have: -11.768, -6.6326, -3.7136, 1.5121, 5.24, 15.362.
On the other hand, Theorem 5 is also applicable to this example. In fact, if we
consider the condition Π_γ(A) = Π_{γ⁻¹}(A) to be checked on every chordless
circuit (the four triangles), we find that it is always satisfied.
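The circuit-product checks of this example are pure arithmetic and can be scripted. The sketch below (`circuit_product` is our own helper; circuits are written 1-based, as in the text) verifies Π_γ(A) = Π_{γ⁻¹}(A) for the four basis circuits.

```python
A = [[0, 2, 1,  0, 1,  5],
     [6, 0, 3,  0, 0,  0],
     [4, 4, 0, 10, 9,  0],
     [0, 0, 5,  0, 2,  0],
     [4, 0, 9,  4, 0, 10],
     [6, 0, 0,  0, 3,  0]]

def circuit_product(A, circuit):
    # circuit is a 1-based vertex sequence (i_0, i_1, ..., i_0).
    p = 1
    for u, v in zip(circuit, circuit[1:]):
        p *= A[u - 1][v - 1]
    return p

basis = [(1, 2, 3, 4, 5, 1), (1, 2, 3, 1), (1, 5, 6, 1), (1, 2, 3, 5, 1)]
for g in basis:
    rev = g[::-1]                       # the inverse circuit
    assert circuit_product(A, g) == circuit_product(A, rev)
print([circuit_product(A, g) for g in basis])   # [480, 24, 60, 216]
```

For instance, along γ_1 both orientations give 2·3·10·2·4 = 1·4·5·4·6 = 480, so the hypothesis of Theorem 6 holds.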

Acknowledgments

We thank Guido Ceccarossi and two anonymous referees for useful suggestions.
All errors are our own.
Research supported by MURST 60% 1993 (S. Stefani) and MURST 60%
1994 (A. Torriero).
Although the whole paper is attributable to both authors, Silvana Stefani
has written in particular Section 3 and Anna Torriero Section 4; Sections 1 and 2
have been jointly written.

References

Anderson, W. N. and T. D. Morley, 1985. «Eigenvalues of the Laplacian of a Graph» Linear and Multilinear Algebra, 18: p. 141-145.

Basset, L., J. Maybee and J. Quirk, 1968. «Qualitative economics and the scope of the correspondence principle» Econometrica, 36 (3-4): p. 544-563.

Bollobas, B., 1990. Graph Theory. An Introductory Course. Springer-Verlag, New York.

Brouwer, A. E., 1995. «Toughness and Spectrum of a Graph» Linear Algebra and Its Applications, 226-228: p. 267-271.

Cliff, A. D., P. Haggett and J. K. Ord, 1979. «Graph Theory and Geography» In R. J. Wilson and L. W. Beineke (eds.), Applications of Graph Theory. Academic Press, London.

Engel, G. M. and H. Schneider, 1973. «Cyclic and diagonal products on a matrix» Linear Algebra and Its Applications, 7: p. 301-335.

Engel, G. M. and H. Schneider, 1980. «Matrices Diagonally Similar to a Symmetric Matrix» Linear Algebra and Its Applications, 29: p. 131-138.

Fiedler, M. and V. Pták, 1969. «Cyclic products and an inequality for determinants» Czechoslovak Math. J., 19: p. 428-450.

Friedland, S., 1992. «Lower Bounds for the First Eigenvalue of Certain M-Matrices Associated With Graphs» Linear Algebra and Its Applications, 172: p. 71-84.

Frosini, B., 1987. Lezioni di statistica, Vita e Pensiero, Milano.

Goldberg, K., 1958. «A matrix with real characteristic roots» J. Res. Nat. Bur. Standards, 56: p. 87.

Grone, R., 1991. «On the Geometry and Laplacian of a Graph» Linear Algebra and Its Applications, 150: p. 167-178.

Hearon, J. Z., 1953. «The kinetics of linear systems with special reference to periodic reactions» Bull. Math. Biophys., 15: p. 121-141.

Merris, R., 1994. «Degree Maximal Graphs are Laplacian Integral» Linear Algebra and Its Applications, 199: p. 381-389.

Mohar, B., 1992. «The Laplacian Spectrum of Graphs» In Y. Alavi, G. Chartrand, O. R. Oellermann and A. J. Schwenk (eds.), Graph Theory, Combinatorics and Applications, J. Wiley and Sons, New York, Vol. 2: p. 871-898.

Parter, S., 1960. «On the eigenvalues and eigenvectors of a class of matrices» J. Soc. Indust. Appl. Math., 8(2), June: p. 376-388.

Saunders, B. D. and H. Schneider, 1978. «Flows on graphs applied to diagonal similarity and diagonal equivalence for matrices» Discrete Mathematics, 24: p. 205-220.

Stefani, S. and A. Torriero, 1994. «Localizzazione di autovalori di matrici a spettro reale» Rapporto di ricerca n. 6, Istituto di Econometria e Matematica per le Decisioni Economiche, Università Cattolica del S. Cuore, Milano.

Stefani, S. and A. Torriero, 1995. «Limiting Intervals for Spectrum Distribution» Cahier 95.03, Département d'Économétrie, Faculté des Sciences Économiques et Sociales, Université de Genève (submitted).

Wolkowicz, H. and G. P. H. Styan, 1980. «Bounds for Eigenvalues Using Traces» Linear Algebra and Its Applications, 29: p. 471-506.

IRREDUCIBLE MATRICES AND PRIMITIVITY INDEX

G. CECCAROSSI
Dipartimento Metodi Quantitativi
Facoltà di Economia, Università di Brescia

The aim of this paper is to highlight the existing relations between the primitivity
index and the class of irreducible matrices. We consider both subclasses of
irreducible matrices, primitive and periodic (the latter well described in
Seneta, 1981), in order to obtain some original results: lowering the upper
bound for the primitivity index as stated by Berman and Plemmons (1979) or
Seneta (1981) through the application of graph theory, and extending the idea to
the subclass of periodic matrices.

1 Preliminary remarks

In this section we recall some useful definitions and theorems about graphs
and non-negative matrices (we assume henceforth that all matrices we deal
with are non-negative), presenting only the proofs that explain the methodology
followed in the paper and omitting the others, which readers may find in the
original texts (Berman and Plemmons, 1979; Bollobas, 1979; Seneta, 1981;
Buckley and Harary, 1990).

1.1 Some useful definitions about graphs and matrices


Starting with graphs, first of all it must be pointed out that we are working
with digraphs, i.e. graphs with oriented edges. In the following, for the sake of
simplicity, we will speak about graphs, denoted D, meaning the oriented ones.
We call a walk joining two nodes (vertices, indices) v_0 and v_n an alternating
sequence of nodes and edges with v_0 as first vertex and v_n as last; if such a
walk exists, then we say that v_n is reachable from v_0. A walk is closed if v_0 = v_n,
and its length is the number of edges involved in it. A path is a walk whose
nodes are distinct, and a closed path of length n ≥ 3 is a circuit; moreover, if
a path or a circuit contains all the nodes of D, it is called spanning. Again, the
girth of a graph, denoted g(D), is the length of the shortest circuit in D; the
circumference c(D) is the length of any longest circuit. The distance d(u, v)
between two nodes u and v is the minimum length of a path joining them,
where, in general, a shortest u-v path is called a u-v geodesic. The diameter
of a connected graph, d(D), is the length of any longest geodesic. In general we
will denote by V(D) the set of vertices of D.
Concerning connectivity, we say that a graph is: strongly connected if
there exists a path joining each pair of nodes; unilaterally connected if for
each pair of nodes, at least one is reachable from the other; otherwise the
graph is disconnected. A component of a graph is a maximal strongly connected
subgraph, where maximal means with the maximum number of nodes.
Two other concepts we will use in the following are the eccentricity and the
cut-node. The first is defined for a node v as the distance from the farthest
node in V(D), as illustrated for a generic (non-oriented) graph in figure 1,
and formally written as

e(v) = max {d(u, v) : u ∈ V(D)}     (1)

As for the second, we define a cut-node as a node whose removal raises the
number of components in a graph; in particular, if the graph is strongly
connected, the removal of a cut-node disconnects it.

Figure 1: A graph and its eccentricities (in parentheses)

From the definitions we can directly state the following theorem about
cut-nodes:

Theorem 1 If v is a node of a connected graph D, then v is a cut-node if and
only if there exist two other distinct nodes u and w such that v belongs to
every u-w path.
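The eccentricity (1) and the diameter can be computed by breadth-first search; the following minimal sketch for digraphs is ours (the helper names and the example graph are assumptions, not the graph of figure 1):

```python
from collections import deque

def bfs_dist(adj, source):
    """Shortest walk lengths from `source` in a digraph (adjacency lists)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, v):
    """e(v) = max { d(u, v) : u in V(D) }: distances are measured *to* v,
    so the search runs on the reversed digraph."""
    rev = {u: [] for u in adj}
    for u in adj:
        for w in adj[u]:
            rev[w].append(u)
    return max(bfs_dist(rev, v).values())

def diameter(adj):
    """d(D) = length of a longest geodesic = maximum eccentricity."""
    return max(eccentricity(adj, v) for v in adj)

# A strongly connected digraph: the circuit 0 -> 1 -> 2 -> 3 -> 0
# with the extra chord 2 -> 0 (a hypothetical example).
adj = {0: [1], 1: [2], 2: [3, 0], 3: [0]}
print(eccentricity(adj, 3), diameter(adj))   # 3 3
```

Note that in a digraph the direction matters: e(3) = 3 here because the geodesic 0 → 1 → 2 → 3 has length 3.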

About matrices, we recall the standard notation for non-negative matrices
(we will work only on such matrices in the rest of the paper) and the
classification of these into the two classes of reducible (decomposable)
and irreducible (indecomposable) matrices. Moreover, the class of irreducible
matrices can be divided into the two subclasses of primitive and periodic
matrices; this will be discussed in the third subsection. Finally, we will
define the adjacency matrix of a graph to show how we can link the theory of
non-negative matrices and the theory of graphs.
About inequalities involving two matrices A and B we will use the following
notation:

• A ≥ B if a_ij ≥ b_ij for all i and j;

• A > B if A ≥ B and A ≠ B;

• A >> B if a_ij > b_ij for all i and j.

If we take B = 0, then we say that A is non-negative if A ≥ 0 and that A is
positive if A >> 0.
Before proceeding with the classification we need to introduce the canonical
form of a matrix. Let C be a non-negative matrix; then its canonical form is a
matrix A similar to C, where the similarity transformation is obtained through
a permutation matrix P, that is A = PCP' (about similarity transformations
see for example Graham, 1987). The permutation consists of rearranging the
index entries of the rows and columns of C in the same way in order to obtain,
if possible, a matrix A in block diagonal form or in block triangular (upper
or lower) form. The usefulness of investigating the canonical form of a matrix
arises from the classification of the index entries well known in input-output
analysis and in Markov chain theory (see for the first topic Yan, 1972;
Leontief, 1986; Pasinetti, 1989; and for the second Seneta, 1981; Revuz, 1984).

Now we can distinguish between reducible and irreducible matrices referring
to their canonical forms: if there exists a permutation matrix P such that a
matrix C is similar to a block triangular (diagonal) matrix A, then C (and
also A) is called reducible; otherwise it is called irreducible. In any case,
we will see in subsection 1.3 that there exists a canonical form also for
irreducible matrices.
Finally, we define the adjacency matrix of a graph as a square matrix
whose index entries coincide with the nodes of the graph and whose generic
element in position i, j equals 1 if there is an edge from i to j and equals 0
otherwise. It is easy to check that if we replace the positive elements of a
non-negative square matrix with ones, we can see it as an adjacency matrix and
study the associated graph; in the following we will denote by D(A) the graph
associated to the matrix A. This operation is useful when we are interested
in a qualitative analysis of the entries of a matrix, that is to say, when we
look for the relations between indices.
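The replacement of positive entries by ones described above can be sketched as follows (the example matrix and the helper name are hypothetical):

```python
import numpy as np

def qualitative(A):
    """Adjacency matrix of D(A): 1 where a_ij > 0, 0 elsewhere."""
    return (np.asarray(A) > 0).astype(int)

# A hypothetical non-negative matrix: actual magnitudes are discarded,
# only the zero / non-zero pattern (the graph structure) is kept.
A = [[0.0, 2.5, 0.0],
     [0.0, 0.0, 1.2],
     [3.1, 0.0, 0.0]]
print(qualitative(A))
# [[0 1 0]
#  [0 0 1]
#  [1 0 0]]
```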

1.2 Connectivity and reducibility


Applying the definition of connectivity we can characterize the irreducibility
or reducibility of a matrix, detailed further in the next subsection, through
the following

Theorem 2 Let A be a matrix and D(A) its associated graph, then

1. D(A) is strongly connected if and only if A is irreducible;


2. D(A) is unilaterally connected if and only if A is reducible in a block
triangular form;
3. D(A) is disconnected if and only if A is reducible in a block diagonal
form.

In other words, if for each pair of matrix indices i, j there exists a positive
integer k, function of i, j, such that a^k_ij > 0, then the matrix is
irreducible. This statement leads to the following consideration, useful for
proving the next theorem: in the graph associated to an irreducible matrix
there exists a walk joining each pair of nodes i, j (i, j = 1, ..., n, where
card[V(D(A))] = n and n is also the order of A) and, by strong connectivity,
we can always choose the path of length d(i, j) such that d(i, j) = k;
moreover, by definition of path, we have k = d(i, j) ≤ n and d(D(A)) ≤ n.
A particular case is when D(A) is a unique spanning circuit, which implies
d(i, i) = n for each i and d(D(A)) = n. The following theorem, here presented
with an original proof based on graph theory, is the logical sequence of the
previous consideration.

Theorem 3 A matrix A is irreducible if and only if:

1. A + A^2 + ... + A^n >> 0;

2. (I + A)^(n-1) >> 0.

Proof

1. Suppose A reducible and, at the same time, that 1 holds:
∑_{k=1}^{n} A^k >> 0 implies the existence of a path (closed if i = j) of
length k, with 1 ≤ k(i, j) ≤ n, joining each pair of nodes, i.e. D(A) is
strongly connected; by Theorem 2, A is irreducible, contradicting the
assumption. Conversely, if 1 does not hold then, as above, for at least one
pair of nodes no path exists joining them, thus implying the reducibility of A.

2. First of all we remark that the diagonal elements of the matrix (I + A)
are positive, and so are those of every power of such a matrix; concerning
the graph, D(I + A) is obtained from D(A) by adding a loop to each node.
Again by contradiction, let us suppose A reducible, so that (I + A) is also
reducible, and that 2 holds. Recalling that powers of reducible matrices are
also reducible, 2 can never be satisfied under this assumption (a strictly
positive matrix is obviously irreducible), so that (I + A) and A must both
be irreducible. Conversely, let A, and consequently (I + A), be irreducible
and suppose that 2 does not hold. D(I + A) is strongly connected with
d(i, i) = 1 for each i and, more generally, d(D(I + A)) ≤ n - 1. This means
that in (I + A)^s the diagonal elements are positive for all s and the
extradiagonal ones are positive for some s ≤ n - 1, because of the existence
of a path of length s ≤ n - 1 joining each pair of nodes. Let us now choose a
node i; for any other node j, d(i, j) ≤ n - 1 must hold. If d(i, j) = n - 1
then the i,j-element of (I + A)^(n-1) is positive; instead, if
d(i, j) = l < n - 1, it is the i,j-element of (I + A)^l that is positive, but
j (or i, or any other node) has a loop, so for each k ≥ l there exists a walk
of length k joining i and j (the path of length l plus k - l times the loop
on j), with the consequence that the i,j-element of (I + A)^k is positive
for all k ≥ l and hence for k = n - 1. This shows that with A irreducible, 2
must always hold. □
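Criterion 2 of Theorem 3 translates directly into a computational test for irreducibility; a minimal sketch (function and variable names are ours), working on the zero pattern only:

```python
import numpy as np

def is_irreducible(A):
    """Criterion 2 of Theorem 3: A is irreducible iff (I + A)^(n-1) >> 0.
    Only the zero pattern matters, so entries are clipped to 0/1."""
    B = (np.asarray(A) > 0).astype(int)
    n = B.shape[0]
    M = np.minimum(np.eye(n, dtype=int) + B, 1)   # I + A, qualitatively
    P = np.eye(n, dtype=int)
    for _ in range(n - 1):                         # build (I + A)^(n-1)
        P = np.minimum(P @ M, 1)
    return bool((P > 0).all())

# A single spanning circuit on 3 nodes is irreducible ...
C = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
print(is_irreducible(C))                           # True
# ... while a block triangular matrix is reducible.
T = np.array([[1, 1], [0, 1]])
print(is_irreducible(T))                           # False
```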

1.3 The subclasses of irreducible matrices

In this subsection we consider only the class of irreducible matrices in which


we distinguish between two subclasses: the primitive matrices and the periodic
matrices. Moreover we give the method to conveniently rearrange the indices
of a matrix in order to obtain a canonical form (Seneta, 1981).
If the node i is reachable from itself, p(i) is called the period of i if it
is the greatest common divisor (GCD) of those k such that a^k_ii > 0 or,
equivalently, if it is the GCD of the lengths of the closed walks on i; if
a_ii > 0, i.e. i has a loop, then p(i) = 1. An irreducible matrix A is called
periodic of period p if the period of some of its indices (and then of all its
indices) is p > 1; A is called primitive if p = 1. An equivalent definition of
primitivity states that a matrix A is primitive if there exists a positive
integer k such that A^k >> 0 (the equivalence is shown in Seneta, 1981).
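Under this definition, the period of an index can be computed as a GCD over return times; a small sketch (the n·n scan bound is our own conservative choice, not from the paper):

```python
import numpy as np
from math import gcd

def period(A, i=0):
    """p(i): the GCD of the lengths k of closed walks on index i, i.e. of
    the k for which (A^k)_ii > 0.  Scanning k up to n*n is a generous
    bound that is safe for small irreducible matrices."""
    B = (np.asarray(A) > 0).astype(int)
    n = B.shape[0]
    p, P = 0, np.eye(n, dtype=int)
    for k in range(1, n * n + 1):
        P = np.minimum(P @ B, 1)        # zero pattern of A^k
        if P[i, i]:
            p = gcd(p, k)
    return p

# A single spanning circuit of length 4: periodic with p = 4 ...
C4 = np.roll(np.eye(4, dtype=int), 1, axis=1)   # edges 0->1->2->3->0
print(period(C4))                                # 4
# ... while a loop on one node makes the matrix primitive (p = 1).
print(period(C4 + np.diag([1, 0, 0, 0])))        # 1
```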
Before describing the canonical form of a periodic matrix, we define an
equivalence relation between the matrix indices. Let a, b and x be positive
integers; a and b are called congruent modulo x, i.e. a ≡ b mod x, if a and b
can be written as a = q_a x + r_a and b = q_b x + r_b with r_a = r_b.
Theorem 4 Let i be any index belonging to the index set {1, 2, ..., n} of A.
Then for each other index j there exists a unique integer r_j in the interval
0 ≤ r_j ≤ p - 1, where p is the period of A, such that:

1. if a^s_ij > 0 then s ≡ r_j mod p;

2. a^{kp + r_j}_ij > 0 for k ≥ N(j), where N(j) is some positive integer
depending on j.

This theorem says that the indices can be grouped into p disjoint classes,
called residual classes, whose elements are reachable from one another through
walks of length l ≡ r_j mod p. The subset of matrix indices, taken from
{1, 2, ..., n}, belonging to the same residual class mod p is noted C_r.
The canonical form of an irreducible matrix A is suitably obtained (see
Seneta, 1981) by permuting the matrix entries in order to have all those
belonging to the same residual class in adjacent positions and ranking the
residual classes in ascending order. The result will be a matrix like the one
below

        |    0       A_{0,1}    0       ...     0         |
        |    0         0      A_{1,2}   ...     0         |
PAP' =  |   ...       ...      ...      ...    ...        |      (2)
        |    0         0        0       ...  A_{p-2,p-1}  |
        | A_{p-1,0}    0        0       ...     0         |
where p is the period and the A_{i,j} are matrices in which the set of row
indices corresponds to the subset C_i and the set of column indices to C_j.
Such a matrix may be studied in terms of powers of primitive matrices, as we
can easily show for the simple case p = 3. Starting from the canonical form

        |    0      A_{0,1}     0      |
PAP' =  |    0         0      A_{1,2}  |      (3)
        | A_{2,0}      0        0      |

we have

            |       0              0        A_{0,1}A_{1,2} |
(PAP')^2 =  | A_{1,2}A_{2,0}       0              0        |      (4)
            |       0        A_{2,0}A_{0,1}       0        |

and

            | A_{0,1}A_{1,2}A_{2,0}          0                      0            |
(PAP')^3 =  |          0            A_{1,2}A_{2,0}A_{0,1}           0            |      (5)
            |          0                     0            A_{2,0}A_{0,1}A_{1,2}  |

The diagonal blocks in (PAP')^3, and more generally in (PAP')^p, are
square and primitive, as is [(PAP')^p]^k with k a positive integer; that is to
say, powers that are multiples of the period can be studied in terms of
primitive matrices.
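As a quick numerical illustration (a hypothetical example, not from the paper), the following sketch builds a period-3 matrix from two circuits of length 3 sharing one node and shows that the zero pattern of A^3 becomes block diagonal once the indices are grouped by residual class:

```python
import numpy as np

# A hypothetical periodic matrix of period p = 3: two circuits of length 3
# sharing node 0, with residual classes C_0 = {0}, C_1 = {1, 3}, C_2 = {2, 4}.
A = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]:
    A[i, j] = 1

# Zero pattern of A^3: walks of length p join only indices of the same
# residual class, so grouping indices class by class exposes the blocks.
A3 = np.minimum(np.linalg.matrix_power(A, 3), 1)
order = [0, 1, 3, 2, 4]                      # indices grouped by class
print(A3[np.ix_(order, order)])
# [[1 0 0 0 0]
#  [0 1 1 0 0]
#  [0 1 1 0 0]
#  [0 0 0 1 1]
#  [0 0 0 1 1]]
```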

2 Primitivity index

The primitivity index, γ(A), of a primitive matrix A is the smallest positive
integer k such that A^k >> 0.
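Directly from this definition, γ(A) can be computed by brute force on the zero pattern of the powers of A; the sketch below is ours (it uses the classical Wielandt bound (n-1)^2 + 1, which is not stated in the paper, only as a stopping rule):

```python
import numpy as np

def primitivity_index(A, kmax=None):
    """gamma(A): smallest k with A^k >> 0 (every entry positive).
    Wielandt's bound (n-1)^2 + 1 caps the search; if it is exceeded,
    A is not primitive and None is returned."""
    B = (np.asarray(A) > 0).astype(int)
    n = B.shape[0]
    if kmax is None:
        kmax = (n - 1) ** 2 + 1
    P = np.eye(n, dtype=int)
    for k in range(1, kmax + 1):
        P = np.minimum(P @ B, 1)        # zero pattern of A^k
        if P.all():
            return k
    return None

# Circuit 0 -> 1 -> 2 -> 0 with a loop on node 0: primitive.
A = np.array([[1, 1, 0], [0, 0, 1], [1, 0, 0]])
print(primitivity_index(A))   # 4
```

A pure circuit, by contrast, is periodic and the search runs out: the function then returns None.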
In the first part we will present some known results about upper bounds
for the primitivity index of a matrix, omitting proofs except for two of them,
Lemma 2 and Theorem 5, given with original proofs based on graph theory.
This new approach allows us to restate those results in order to obtain better
bounds by applying graph properties; moreover, keeping in mind the mentioned
proofs, we do not need to show them formally.

The notation follows Berman and Plemmons (1979). Let N = {1, 2, ..., n}
be the index set of A; for L ⊆ N, F^h(L) is the set of indices i for which in
D(A) there exists a walk of length h from i to j for some j ∈ L, and we set
F^0(L) = L. F^h(j) stands for F^h({j}), that is the set of indices i such that
a^h_ij > 0. Obviously: if F^h(L) = ∅, also F^{h+g}(L) = ∅ for each positive
integer g; if A is irreducible and L is a proper subset of N, then F^1(L)
contains some index not in L.

Lemma 1 If A is irreducible of order n, j ∈ N and h ≤ n - 1, then
∪_{l=0}^{h} F^l(j) contains at least h + 1 elements.

Lemma 2 Let k be a non-negative integer, j ∈ N and A be irreducible of
order n. Suppose that for every l ≥ k, D(A) contains a closed walk of length l
on j. Then F^{n-1+k}(j) = N.

Proof (Lemma 2) In this case we give an intuitive and original proof based
on the irreducibility of A. If A is irreducible, d(i, j) ≤ n - 1. If
d(i, j) = n - 1 then surely a walk from i to j of length n - 1 + k exists; it
is the path of length n - 1 joined to the closed walk of length k existing by
assumption. If d(i, j) = l < n - 1 then the walk from i to j is formed by the
path of length l joined to the closed walk of length k + (n - 1 - l) > k,
existing by assumption. So, j is reachable from any other index i in n - 1 + k
steps and also after s steps for all s > n - 1 + k. A particular case is when
j has a loop; if this condition is fulfilled then F^{n-1}(j) = N. Note that k
could be set equal to 1 but, as by definition j ∈ F^0(j), it may be considered
equal to 0. □

Theorem 5 Let A be irreducible of order n and k be a non-negative integer.
Suppose that there exist at least d elements in N, j_1, ..., j_d, such that
for every l ≥ k and s = 1, ..., d, a^l_{j_s j_s} > 0 (there exist d nodes in
D(A) such that for all l ≥ k there exists a closed walk of length l on them).
Then A is primitive and γ(A) ≤ 2n - d - 1 + k.

Proof. It must be shown that for every j ∈ N, F^{2n-d-1+k}(j) = N. By
Lemma 1, for all j ∈ N there exist 0 ≤ h ≤ n - d and 1 ≤ s ≤ d such that
j_s ∈ F^h(j). Then

N ⊇ F^{2n-d-1+k}(j) = F^{n-d-h}{F^{n-1+k}[F^h(j)]} ⊇ F^{n-d-h}(N) = N.

We now remark on the meaning of F^{n-d-h}{F^{n-1+k}[F^h(j)]}. If F^h(j)
contains j_s then there exists a walk of length h leading from j_s to j.
Then F^{n-1+k}[F^h(j)] may be written as F^{n-1+k}{..., j_s, ...} which, by
Lemma 2, equals N. We can conclude that for the pairs (j_s, j_m) with
s, m = 1, ..., d satisfying the hypothesis of Lemma 2, the upper bound for
γ(A) derives from the same lemma; for the other indices, the minimal distance
from some j_s must be added, a distance surely not greater than n - d, and
then we proceed as above. □

Theorem 6 Let A be primitive of order n. If for some positive integer h,
(A + A^2 + ... + A^h) has at least d > 0 positive diagonal elements, then
γ(A) ≤ n - d + h(n - 1).

Corollary 1 Let A be primitive of order n such that a_ij > 0 if and only if
a_ji > 0. Then γ(A) ≤ 2(n - 1).

Corollary 2 Let A be irreducible of order n with d positive diagonal elements.
Then A is primitive and γ(A) ≤ 2n - d - 1.

Theorem 7 Let A be primitive of order n and s be the length of the shortest
circuit in D(A). Then γ(A) ≤ n + s(n - 2).

In order to obtain better bounds for γ(A), we rewrite the theorems, lemmas
and corollaries stated above using graph definitions. Before proceeding, it is
useful to explain the methodology followed to achieve the mentioned bounds.
From Lemma 1 we introduced an upper bound for h, and it referred, in both
cases h = n - 1 and h = n - d, to the maximal distance between a generic
index i and the index j, knowing that either d(i, j) ≤ n - 1 or
d(i, j) ≤ n - d. From graph theory we know that the distance of a node j from
whatever other node i is d(i, j) ≤ e(j) ≤ n - 1, so we have to replace n - 1
with d(D(A)) and n - d with max_{s=1,...,d} e(j_s). Proofs are obvious
following the approach used to prove Lemma 2 and Theorem 5, given the graph
definitions of diameter and eccentricity.

Lemma 3 If A is irreducible of order n, j ∈ N and h ≤ e(j), then
∪_{l=0}^{h} F^l(j) contains at least h + 1 elements.

Lemma 4 Let k be a non-negative integer, j ∈ N and A be irreducible of
order n. Suppose that for every l ≥ k, D(A) contains a closed walk of length l
on j. Then F^{e(j)+k}(j) = N.

Theorem 8 Let A be irreducible of order n and k be a non-negative integer.
Suppose that there exist at least d elements in N, j_1, ..., j_d, such that
for every l ≥ k and s = 1, ..., d, a^l_{j_s j_s} > 0 (there exist d nodes in
D(A) such that for all l ≥ k there exists a closed walk of length l on them).
Then A is primitive and γ(A) ≤ d(D(A)) + max_{s=1,...,d} e(j_s) + k.

Theorem 9 Let A be primitive of order n. If for some positive integer h,
(A + A^2 + ... + A^h) has at least d > 0 positive diagonal elements, then
γ(A) ≤ max_{s=1,...,d} e(j_s) + h·d(D(A)).

Corollary 3 Let A be primitive of order n such that a_ij > 0 if and only if
a_ji > 0. Then γ(A) ≤ 2d(D(A)).

Corollary 4 Let A be irreducible of order n with d positive diagonal elements.
Then A is primitive and γ(A) ≤ max_{s=1,...,d} e(j_s) + d(D(A)).

Theorem 10 Let A be primitive of order n and s be the length of the shortest
circuit in D(A). Then γ(A) ≤ max_{s=1,...,d} e(j_s) + s·d(D(A)).

3 Primitivity index and periodic matrices

In this section we will propose an original method to relate the primitivity
index to periodic matrices. The idea stems from the possibility, stated above,
of studying periodic matrices in terms of powers of primitive matrices. As
shown in subsection 1.3, kp-powers of a periodic matrix may be written as

          | A_{1,1}    0      ...     0      |
A^{kp} =  |    0     A_{2,2}  ...     0      |      (6)
          |   ...      ...    ...    ...     |
          |    0        0     ...   A_{p,p}  |

where the diagonal blocks A_{1,1}, ..., A_{p,p} are primitive. The aim is to
calculate the first k satisfying A_{1,1} >> 0, ..., A_{p,p} >> 0; in the
following we will note it γ'(A), using again the terminology of "primitivity
index" given the similarity of concepts. In terms of the previous notation,
γ'(A) is the smallest k for which F^{kp}(C_r) = C_r, where C_r is an index
class.

It seems trivial to find an upper bound for γ'(A), because it is
sufficient to apply the theorems we stated in the previous section to each
A_{i,i} (i = 1, ..., p), observing that max_{i=1,...,p} γ(A_{i,i}) ≥ γ'(A).
A more difficult problem is to calculate its exact value, and here it is
possible given the regularity shown by the graph associated to periodic
matrices. In order to solve this problem we need a detailed analysis of the
associated oriented graph and its features, which allows us to find algorithms
for the computation of γ'(A). In the first subsection we introduce some
hypotheses on the graph structure of a periodic matrix and in the others we
gradually analyze the problem from the easiest to the most complex case.

3.1 Preliminary hypotheses

The oriented graph associated to an irreducible matrix can always be drawn in
a way that circuits are clearly observable (it will be clear what we mean by
looking at figure 2) and, for our purpose, we need to introduce two new
concepts. The first is similar to the path, but concerns circuits viewed as
nodes: given two circuits, we call chain the path joining them in which all
circuits involved are different; their number l is the length of the chain.
We note that a node belonging to two or more adjacent circuits is a cut-node.
In order to compute γ'(A), the interesting chains are those linking extreme
circuits, that is to say circuits with only one cut-node; this is because the
most distant nodes belong to those circuits, so k is higher in such cases.
The number of chains joining extreme circuits (here supposed to be E in
number) equals the number of possible combinations of them taken two at a
time, the binomial coefficient C(E, 2); in the following we will need a
criterion to select the most binding chain. The second concept we need is the
definition of jump for a node: we say that a jump is allowed to a node
belonging to a circuit if, through a path of length p, it may reach a node
belonging to the same class in a circuit at distance 2 from its own. This is
possible if and only if the described path contains two cut-nodes.

Assumption 1 Every index i belongs to one or more circuits, all of the
same length h. We have the following implications: i) a^h_ii > 0 for each i;
ii) h = p, the period of the matrix; iii) every index of each circuit belongs
to a different residual class. We remark that i) derives from the circuit
definition, ii) from the period definition and iii) from Theorem 4.
Hereinafter we will assume the first hypothesis always verified.

Assumption 2 There exists only one cut-node, belonging to all circuits.

The implications are: i) the length of every chain is 2 (all circuits are
extreme); ii) the cut-node forms a residual class. In this case i) is trivial
and ii) stems from Theorem 4 and from iii) of assumption 1: since the cut-node
belongs to all circuits and all indices of a circuit belong to different
classes, the cut-node must form a class by itself. In another way, calling t
the cut-node, we note that under assumption 1 e(t) = h - 1 = p - 1, so that
the only index congruent modulo p with t is t itself; moreover, the other
p - 1 classes contain as many indices as the number of circuits.

Assumption 3 Choosing any chain of length l in D(A), and ordering the
cut-nodes by calling t_1 the one belonging to the first and second circuits
and t_{l-1} the one belonging to circuits l - 1 and l, the distance between
two of them, say t_i and t_j with i < j, may be written as the product of the
distance between two adjacent (i = j - 1) cut-nodes, assumed to be constant
and noted x, and the non-negative integer function z(i, j) = j - i, being the
distance computed in circuits from i to j; i.e. d(t_i, t_j) = x·z(i, j). We
note that under assumption 1 this notation may be applied in the same form
also if i > j: the function z becomes z(i, j) = i - j, the distance between
two adjacent cut-nodes is again constant and, if we call h the length of all
circuits, the new constant satisfies y = h - x. The implication here is that
the optimal criterion to select the binding chain is its length. Again, we
note that assumption 2 implies assumption 3.

3.2 How to compute the "primitivity index" under two assumptions

We start with the simplest case, presenting the result obtained when the first
two assumptions are verified.

Theorem 11 Let A be irreducible, of period p and of order n. If assumptions
1 and 2 are verified then

        | A_{1,1}    0     ...     0      |
A^p  =  |    0    A_{2,2}  ...     0      |      (7)
        |   ...     ...    ...    ...     |
        |    0       0     ...   A_{p,p}  |

with A_{1,1} >> 0, ..., A_{p,p} >> 0, that is, γ'(A) = 1. (We remark that the
equal sign in equation 7 stems from the application of Boolean algebra to the
calculus of the powers of A, since we are interested in checking whether the
entries of A are different from zero and not in their actual magnitude.)

Proof We have to show that, for every choice of an index pair i, j belonging
to the same class, including the case i = j, d(i, j) = p. If i = j,
d(i, j) = p by assumption 1; if i ≠ j, assumption 2 implies that i and j
belong to different circuits and that t (the cut-node) is equidistant from all
elements of the same class. Generalizing, let us assume t ∈ C_0 and
i, j ∈ C_r with 1 ≤ r ≤ p - 1, and note d(t, C_r) = kp + r the distance
between t and a generic element of C_r. Since t belongs to all circuits,
d(t, C_r) = r and, since the length of each circuit is p, we have also
d(C_r, t) = p - r. Joining the two paths we obtain d(C_r, C_r) = p as
stated. □

Now we show an example, satisfying assumptions 1 and 3, in which all the
chains are of length l = 3, useful to illustrate the idea of jumps and to
state the next theorem.

Example 1

Let us consider the matrix


0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 1 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0
A= 0 0 0 0 1 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 1 0 0 0 0 0 0

Figure 2: A graph satisfying assumptions 1 and 3 with l = 3 for all chains.

whose graph is like in figure 2. We may deduce:

• there are four linked circuits (a, c, d extreme) of length 4, so p = 4;

• the classes are: C_0 = {1, 3, 7}, C_1 = {5, 6, 12}, C_2 = {2, 8, 10, 11},
C_3 = {4, 9, 13};

• the cut-nodes are 5, 7, 13;

• the canonical form of A is obtained via the following permutation:

( 1  2  3  4  5  6  7  8  9 10 11 12 13 )
( 1  3  7  5  6 12  2  8 10 11  4  9 13 )

which means that the first row and column are unchanged, the second ones are
taken from the third, the third from the seventh, and so on.

Suppose we check what happens in the chain a → c with respect to the
nodes in C_2, the only class without cut-nodes. From assumption 1, every node
is reachable from itself and at least from another node belonging to the same
class; moreover, we know that d(C_2, t) < 4. In more detail, moving from a
to c, for 11 we have d(11, 5) = 3 because 5 ∈ C_1; again d(5, 8) = 1, then
F^4(11) = {11, 8}. Going on, F^4(8) = C_2 because b contains two cut-nodes, 5
and 13, and by d(C_2, t) < 4 the node 8 can reach the nodes of its class in
both a and c. Adding these first results we have
F^{2·4}(11) = F^4(F^4(11)) = C_2 and, obviously, also F^{2·4}(8) = C_2.
Concerning 2, taking the chain in the other way, c → a, we remark that
d(2, 13) = 1 and d(13, 8) = 3, so F^4(2) ⊇ {2, 8}; but we have also
d(13, 5) = 2 and d(5, 11) = 1, then F^4(2) = C_2. We conclude that 2 makes a
jump reaching 11, which belongs to the circuit a at distance 2 from c in the
chain; this is possible because with a path of length 4 starting from 2 we
can reach two different cut-nodes, here 5 and 13. As a result, if a jump is
allowed to each node belonging to an extreme circuit, γ'(A) decreases by 1.
In our example, from F^{2·4}(C_2) = C_2, we have γ'(A) = 2 (11 cannot make a
jump), as it is easy to verify by computing the A^{kp} powers and noting that
A_{1,1} >> 0, A_{2,2} >> 0, A_{3,3} >> 0, A_{4,4} >> 0 for k ≥ 2.
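The claim γ'(A) = 2 can be checked numerically; the following sketch (0-based indices and helper names are ours) rebuilds the matrix of Example 1 from its edge list and searches for the smallest k with F^{kp}(C_r) = C_r for every class:

```python
import numpy as np

# The 13x13 matrix of Example 1, given as the edge list i -> j (1-based).
edges = [(1, 5), (2, 13), (3, 6), (4, 7), (5, 8), (5, 11), (6, 2), (7, 5),
         (7, 12), (8, 13), (9, 1), (10, 4), (11, 9), (12, 10), (13, 3),
         (13, 7)]
A = np.zeros((13, 13), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1

classes = [{1, 3, 7}, {5, 6, 12}, {2, 8, 10, 11}, {4, 9, 13}]  # C_0 .. C_3
p = 4

def gamma_prime(A, classes, p, kmax=20):
    """Smallest k such that every pair of indices in the same residual
    class is joined by a walk of length k*p."""
    P = np.minimum(np.linalg.matrix_power(A, p), 1)  # zero pattern of A^p
    Q = P.copy()
    for k in range(1, kmax + 1):
        if all(Q[i - 1, j - 1] for C in classes for i in C for j in C):
            return k
        Q = np.minimum(Q @ P, 1)
    return None

print(gamma_prime(A, classes, p))   # 2, as stated in the text
```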

We can now state the following theorem, without proof, and a corollary, to
introduce the more complex cases.

Theorem 12 Let A be irreducible, satisfying assumption 1 and with chains of
maximal length l_max = 3; then γ'(A) = l - 1.

Corollary 5 Taking chains of l_max > 3 we have γ'(A) ≤ l - 1.

Proof. The worst case is when no nodes can make jumps, but even under these
conditions, for the first three circuits Theorem 12 holds, implying the
thesis. □

In order to analyze cases for which γ'(A) < l - 1 (at least a jump for each
node in an extreme circuit), we first have to consider the cut-node positions,
and we begin by applying assumption 3 to discriminate between the values of x
(and of y) in order to find the interesting ones. We distinguish between
h(= p) even or odd: in the first case the set of x values is
X = {1, 2, ..., h/2}, because starting from one extreme of the chain, say 1,
to reach the other, say l, finding x = h - 1 is like starting from l to reach
1 with y = 1, and for x = h/2 it is the same starting from 1 or from l. In the
second case we have X = {1, 2, ..., (h-1)/2} since, given x and y satisfying
x + y = h relative to the chains taken as 1 → l and l → 1, if x = (h-1)/2 then
y = (h+1)/2; taking x + 1 we would obtain the same results but relative to the
inverted paths (l → 1 and 1 → l). Secondly, we have just remarked in Example 1
that the conditions for a jump are satisfied differently between nodes
belonging to a class, but it is most important to note that this is also true
between classes; moreover, it will be shown that this depends, under
assumptions 1 and 3, on the position taken by the node identifying the class
in the first circuit, relative to the first cut-node t_1. To show it, call
each node by its class index r, indexed by the circuit number: that is, the
notation r_i corresponds to the node belonging to the r-th class in the i-th
circuit. The distance between an r_1 and the first cut-node is then noted
d(r_1, t_1) and, by assumption 1, d(r_1, t_1) = b ≤ h, with equality for
r_1 = t_1. By assumption 3 we have, for the other circuits, d(r_i, t_i) = ρ
where ρ satisfies:

b + (i - 1)x = qh + ρ     (8)

For example, let us investigate the class r such that d(r_1, t_1) = x (other
classes behave similarly); we know the length of the path joining r_1 to r_2
to be h, and this path may be decomposed into two parts whose distances
between nodes are d(r_1, t_1) = x and d(t_1, r_2) = h - x. Obviously we have
also d(r_2, t_1) = x and, by assumption 3, d(t_1, t_2) = x, then
d(r_2, t_2) = 2x = x + (2 - 1)x. The same computation leads to the stated
property of d(r_i, t_i). Moreover, we consider only the remainder of the
division by h because of the periodicity stemming from assumption 1, which
implies that if we can reach a node through a path of length ρ, we can also
reach it through a path of length ρ + mh, where m is any positive integer.
The importance arises from the following remark: when in a circuit i ≠ 1
we have d(r_i, t_i) ≤ x, taking the chain in the opposite direction we note
that d(r_{i+1}, t_i) = d(r_i, t_i) ≤ x and, using the fact that
d(t_i, t_{i-1}) = y = h - x, we have d(r_{i+1}, t_{i-1}) ≤ h; i.e. the path
joining r_{i+1} to the node r_{i-1} contains two cut-nodes, or r_{i+1} makes
a jump. In the following we will say, more generally, that a jump is allowed
to the class r. The last remark is that the number of circuits needed to allow
a jump to a class r is a non-increasing function of d(r_1, t_1).

Divisibility of h by x
Our purpose here is to state a general formula to calculate γ'(A) when h is
divisible by x under assumptions 1 and 3. We will start by analyzing the
extreme cases relative to the values of x (1 and h/2; we consider h even in
the latter case, but results are similar otherwise), fully describing the
proofs and, afterwards, stating the mentioned formula from the obtained
results.

Theorem 13 Let A be irreducible satisfying assumptions 1 and 3 with x = 1.
Writing the chain length as l = mh + r, where m is the quotient and r the
remainder, then γ'(A) = l - c with

c = { m       if r = 0
    { m + 1   if r ≠ 0
Proof Under the hypothesis, the class r such that d(r_1, t_1) = x = 1 is, in
the h-th circuit, in a cut-node position (if an (h+1)-th circuit exists) and
in the (h+1)-th circuit it is in the same position as in the first; that is,
r_h ≡ t_h and d(r_{h+1}, t_{h+1}) = 1; moreover, from equation 8, this setting
is cyclically repeated. We remark that all other classes reach a cut-node
position in a lower number of circuits, which is consistent with the last
remark made above. It is important to check after how many circuits a cut-node
position is reached by the class r because, under the assumptions, if and only
if it is attained is a jump allowed to every node. The if part is obvious: if
r_h ≡ t_h we have r_h ≡ r_{h+1}, as it belongs to circuits h and h + 1, but
d(r_h, r_{h-1}) = h and then also d(r_{h+1}, r_{h-1}) = h. For the only if
part, recall that the jump is allowed when d(r_i, t_i) ≤ x. Here we have
x = 1, so d(r_i, t_i) ≤ x is satisfied only for d(r_i, t_i) = 1 which, from
the definition of x, equals d(t_i, t_{i+1}), or r_i ≡ t_i. Considering that
γ'(A) decreases by one each time a jump is allowed to each node and that this
situation is cyclically verified every h circuits, we will have:
γ'(A) = l - 1 for l ≤ h; γ'(A) = l - 2 for h + 1 ≤ l ≤ 2h and, more
generally, by induction on l, γ'(A) = l - c. □

Before stating the theorem relative to x = h/2, we note that, from the
divisibility of h by x, every node will take h/x different positions and only
one of them is either between two cut-nodes or is a cut-node itself. The last
classes fulfilling this condition, all for the same number of circuits, are
those for which d(r_1, t_1) < x; in the following we will choose r such that
d(r_1, t_1) = x - 1. Note that d(r_i, t_i) = x - 1 holds for the first time,
from equation 8, for i = h/x + 1 and, more generally, for each
i = m(h/x) + 1, where m is a positive integer. Under this setting, a jump is
allowed to such a class r if there exists a circuit i + 1, where i satisfies
the conditions above. This is because, as we just explained for x = 1, the
jump is allowed when d(r_i, t_i) < x, and the existence of t_i, from the
ordering of cut-nodes we carried out, implies the existence of an (i+1)-th
circuit.

Theorem 14 Let A be irreducible satisfying assumptions 1 and 3 with x = h/2.
Writing the chain length as l - 1 = 2m + r, then γ'(A) = l - c with

c = { m       if r = 0
    { m + 1   if r ≠ 0

Proof Following the previous approach, we note that each class takes only two
symmetric positions in the circuits; for example, the class for which
d(r_1, t_1) = x - 1 in the first circuit satisfies d(r_2, t_2) = x - 1 + h/2
in the second, d(r_3, t_3) = x - 1 in the third, and so on. This means that
the conditions for a jump are fulfilled for the first time at i = 3 and then
for each i = 2m + 1 (by equation 8), always assuming the existence of a
circuit i + 1 (this is the reason for which we consider the length of the
chain minus one^a). If we write h/x = 2 then, proceeding as in the previous
proof, we have γ'(A) = l - c as stated. □

Theorem 13 may also be rewritten in a similar way, setting l = mh + r;
then, in a more general way, we can state the next theorem.

Theorem 15 Let A be irreducible satisfying assumptions 1 and 3 with h divisible
by x. Writing

l − 1 = m(h/x) + r   for x > 1
l = m(h/x) + r       for x = 1

then

γ'(A) = l − c

with

c = { m       if r = 0
    { m + 1   if r ≠ 0

Indivisibility of h by x
We analyze cases for which h is not divisible by x and x > 1.

Theorem 16 Let A be irreducible satisfying assumptions 1 and 3 with h not
divisible by x. Writing

l − 1 = m · h/GCD(h, x) + r

^a This problem does not arise with x = 1 because the class r, after h circuits, covers a
cut-node position belonging at the same time to circuits h and h + 1.

then

γ'(A) = l − c

with

c = { m · x/GCD(h, x)       if r = 0
    { m · x/GCD(h, x) + 1   if r ≠ 0

Proof. Every class will take the positions congruent modulo GCD(h, x) to the
starting one; the first-circuit situation will be repeated for the first time in the
circuit whose distance from the first is 1 + h/GCD(h, x), and then cyclically in
circuits whose distances from the first may be written as 1 + m · h/GCD(h, x),
where m is a positive integer. Now, without divisibility, the uniqueness of the
intermediate or cut-node position reachable by each class also disappears. In
more detail, if x/GCD(h, x) > 1, the number of such positions is x/GCD(h, x),
taken as a multiplicative factor of the unitary increments of c every
h/GCD(h, x) circuits. □

As stated in the theorem, we know that every h/GCD(h, x) circuits the conditions
of the jump are verified x/GCD(h, x) times; now, in order to check every
single jump, we propose to decompose h/GCD(h, x) into x/GCD(h, x) addenda, each
one representing the number of circuits needed to obtain the next jump. In
other terms we have to write

h/GCD(h, x) = e_0 + e_1 + ... + e_{x/GCD(h,x) − 1}

where the e_i (i = 0, 1, ..., x/GCD(h, x) − 1) are the said addenda. Going on, we may
express the intermediate positions, reached in the first circuit by the node in x − 1,
in the other h/GCD(h, x) circuits, without considering the order, as

x − 1;  x − 1 − GCD(h, x);  ... ;  x − 1 − (x/GCD(h, x) − 1) · GCD(h, x).

Such positions may be linked to the e_i by saying that e_i is the lowest integer
multiplying x such that from position x − 1 − i · GCD(h, x) we come back to one
of the listed positions. For example, starting from x − 1 (i = 0), it will be
sufficient to find the first multiple of x lying between two cut-nodes, and the
multiplier of x will be the value of e_0. We have to repeat the operation to find all the
e_i values and their order, so as to restate the previous theorem in a more general way.

Theorem 17 Let A be irreducible satisfying assumptions 1 and 3 with h not
divisible by x. Writing

l − 1 = m · h/GCD(h, x) + r

then

γ'(A) = l − c

with, for m ≥ 1,

c = { m · x/GCD(h, x)           if r = 0
    { m · x/GCD(h, x) + 1       if 1 ≤ r ≤ e_0
    { m · x/GCD(h, x) + j + 1   if Σ_{i=0}^{j−1} e_i < r ≤ Σ_{i=0}^{j} e_i,  j ≥ 1

where the e_i are the addenda defined above.

3.3 Hints to compute the "primitivity index" under assumption 1

When only assumption 1 holds, the classes lack cyclicity in taking the
position necessary to jump; the consequence is that the binding chain is
not necessarily the longest one. This is easily verified by looking at the following simple
example, where we can apply, for simplicity, the previous results.

Example 2

Suppose h = 10 and only two chains with different l and x, but with x constant
in each chain (a necessary condition to apply the previous theorems); in particular
we choose {x = 2; l = 10} and {x = 5; l = 12}. Following the length criterion
we should select the second chain for which, h being divisible by x and applying Theorem
15, we have: 12 − 1 = m · (10/5) + r with m = 5 and r = 1, then c = m + 1 = 6
and so γ'(A) = 12 − 6 = 6. Applying the same theorem to the first chain (the
shorter one) we have: 10 − 1 = m · (10/2) + r with m = 1 and r = 4, then c = 2, so
γ'(A) = 8. In conclusion the first chain is more binding than the second even
if it is shorter.
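The computation of Example 2 can be checked mechanically; the following sketch (the function name `gamma_prime` is ours, not from the paper) encodes Theorem 15 for x > 1 under the divisibility assumption:

```python
# Hypothetical helper encoding Theorem 15 for x > 1: write l - 1 = m*(h/x) + r,
# then gamma'(A) = l - c with c = m if r == 0 and c = m + 1 otherwise.
def gamma_prime(l, h, x):
    assert x > 1 and h % x == 0      # divisibility of h by x is assumed
    m, r = divmod(l - 1, h // x)
    c = m if r == 0 else m + 1
    return l - c

# The two chains of Example 2 (h = 10):
print(gamma_prime(12, 10, 5))  # second chain: m = 5, r = 1, c = 6 -> 6
print(gamma_prime(10, 10, 2))  # first chain:  m = 1, r = 4, c = 2 -> 8
```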

As for the criterion to select chains, the methodology followed to compute
γ'(A) is also no longer valid, so we have to choose an alternative method. In
the following we will present an example showing this difference, which is also useful to
state the last theorem.

Example 3

First of all, we assume we work on the binding chain; alternatively, the proposed
method has to be applied to all possible chains to select the significant
one. In the single chain, the aim is always to find the maximal number of jumps
allowed to all classes, looking at the positions covered by each class in both chain
directions, going from the first circuit to the last and vice versa. We consider
the result allowing the fewer jump conditions, computed by dividing the number
of positions found by h: the quotient is the number of jumps allowed
to all classes and the remainder is the number of classes that may jump once
more.

Figure 3: A chain verifying assumption 1.

Suppose now to have a chain like that in Figure 3, with h = 6 and l = 6, where
we marked the nodes with residual class indices. We know that the interesting
positions are either those of a cut-node or those between two cut-nodes so, to
count them, we have to exclude the extreme circuits, and the figure is given
by the number of nodes forming the path joining the two cut-nodes belonging
to the extreme circuits. Calling g and b the a → f and f → a directions
respectively, and denoting by p_g and p_b the favorable positions in g and b, in our
example we have p_g = 12 and p_b = 14. Moreover, the number of jumps s to
use in the equation γ'(A) = l − (1 + s) is the lowest between s_g = (p_g − r_g)/h
and s_b = (p_b − r_b)/h, obtained from the mentioned relation p_• = s_• h + r_•; in the
example, s_g = 2 with r_g = 0 (all nodes can jump twice) and s_b = 2 with r_b = 2
(all nodes can jump twice and two of them, 3 and 4, three times in direction g), then
γ'(A) = 6 − 3 = 3.
Note that we can count the total favorable positions p_T by adding p_g and
p_b minus all the cut-nodes t, here counted twice, having p_T = p_g + p_b − t =
12 + 14 − 5 = 21. Alternatively and more directly, we could compute p_T, by
assumption 1, multiplying h by l − 2 minus t − 2, or 6·4 − 3 = 21. This is
useful because now it suffices to compute one of p_g and p_b, obtaining the
other by difference with p_T.
Following the notation introduced in the previous example, we can now
state the last theorem.

Theorem 18 Let A be irreducible satisfying assumption 1, and let the chain be
the binding one; then

γ'(A) = l − c with c = 1 + min(s_g, s_b).
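The counting of Example 3, together with Theorem 18, can be sketched as follows (function and variable names are ours):

```python
# Sketch of Theorem 18: gamma'(A) = l - (1 + s) with s = min(s_g, s_b), where
# p_g = s_g*h + r_g and p_b = s_b*h + r_b count the favorable positions in the
# two chain directions.
def gamma_prime_a1(l, h, p_g, p_b):
    s_g, _ = divmod(p_g, h)
    s_b, _ = divmod(p_b, h)
    return l - (1 + min(s_g, s_b))

# The chain of Figure 3: h = 6, l = 6, p_g = 12, p_b = 14
print(gamma_prime_a1(6, 6, 12, 14))  # -> 3
```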

4 Conclusions

As we expected, in the first part of this paper we arrive at lower upper bounds
for the primitivity index than those stated in the previous literature, by applying the
concepts of graph theory and using different techniques to show the results. In the
second part we highlight the relation between periodic and primitive matrices,
extending the concept of primitivity index, by defining γ'(A), to the class
of periodic matrices. The cases we analyze are simple and particular because
in this study we never drop assumption 1; they are intended to represent a
possible way to analyze the more complex cases by exploiting the demonstration
technique. The interest of this extension is, as we already mentioned, in
the qualitative analysis of the relations between indices: if, for example, we are concerned
with periodic matrices and our time-scheduling coincides with the period, it
could be interesting to know when the diagonal blocks of the canonical form
become strictly positive. We think of input-output analysis, where the powers
of the matrix describing the economy show relations between industries at
different levels of production; alternatively, we can consider a Markov chain
representing the market demand transitions between firms belonging to the
same industry, and so on.

Acknowledgements

I am thankful to Professor Silvana Stefani, Professor Anna Torriero and an
anonymous referee for their useful suggestions, comments and encouragement.
This work was supported by MURST 60% 1994 funds (A. Torriero).
All errors are my own.

References

Berman, A. and R. Plemmons, 1979. Nonnegative Matrices in the Mathematical Sciences. Academic Press, New York.
Bollobás, B., 1979. Graph Theory: An Introductory Course. Springer-Verlag, Berlin.
Buckley, F. and F. Harary, 1990. Distance in Graphs. Addison-Wesley, Redwood City.
Graham, A., 1987. Nonnegative Matrices and Applicable Topics in Linear Algebra. Ellis Horwood, Toronto.
Leontief, W., 1986. "Technological change, prices, wages and rates of return on capital in the U.S. economy." In Input-Output Economics. Oxford University Press, New York.
Pasinetti, L., 1989. Lezioni di teoria della produzione. Il Mulino, Bologna.
Revuz, D., 1984. Markov Chains. North-Holland, Amsterdam.
Seneta, E., 1981. Non-negative Matrices and Markov Chains. Springer-Verlag, New York.
Van, C., 1972. L'analisi delle interdipendenze strutturali. Il Mulino, Bologna.

A COMPARISON OF ALGORITHMS FOR COMPUTING THE


EIGENVALUES AND THE EIGENVECTORS OF
SYMMETRICAL MATRICES

S. CAMIZ
Dipartimento di Matematica
Università di Roma "La Sapienza"

V. TULLI
Dipartimento di Metodi Quantitativi
Facoltà di Economia, Università di Brescia

In this paper traditional methods used for computing the eigensystem of a symmetrical
matrix are compared with a procedure in which one stage, i.e. the computation
of the eigensystem of a symmetrical tridiagonal matrix, currently performed through LR,
QR or QL factorization or by the bisection method, is replaced with a technique
recently proposed in the literature, since it enables a parallel implementation. The
resulting global method reduces by one order the computational complexity as
well as the execution time, especially in a parallel implementation, despite a small loss
of precision. Better results are obtained dealing with high-order matrices.

1 Introduction

The problem of calculating the eigensystem of a symmetrical matrix has been widely
treated in the literature (Ralston, 1965; Wilkinson, 1965; Acton, 1970; Ortega,
1972, 1990; Smith et al., 1976; Fontanella and Pasquali, 1977; Atkinson, 1978;
Golub and Van Loan, 1983; Press et al., 1986; Bini et al., 1988), both from
the point of view of the definition of the methods and of the comparative analysis
of computation and time. The evaluation of the method most appropriate to
particular situations or requests (sparse matrices, block matrices, computation
of some eigenvalues, of all eigenvalues, of both eigenvalues and eigenvectors)
has also been studied.
The object of this study consists in evaluating the convenience of replacing
the usual procedures of the computational stage of the eigenvalues and eigenvectors
of a symmetrical tridiagonal matrix, which is part of some of the above-mentioned
existing methods, with another, more recent method called Divide et
Impera (Cuppen, 1981; Krishnakumar and Morf, 1986; Dongarra and Sorensen,
1987; Gill and Tadmor, 1990; Sorensen and Tang, 1991; Gu and Eisenstat,
1995). The advantage of this new method is estimated by comparing it with the
existing ones.
Various aspects of the Divide et Impera method can lead to an increased
or "different" efficiency in comparison with the other methods: on the one hand
it allows a reduction in computational complexity, from O(n³) to O(n²);
on the other hand it allows a reduction in time, particularly if it is implemented
in parallel.
Section 2 contains an overview of the existing methods for computing
the eigenvalues and the eigenvectors of a symmetrical matrix, with particular
reference to those involving the tridiagonalization of the original matrix. In
fact it is the spectral factorization of a symmetrical tridiagonal matrix that
can be replaced with the Divide et Impera method, described in section 3.
Section 4 describes the problem of the eigensystem of a ROM matrix (Rank
One Modification), i.e. a matrix obtained by modifying a diagonal matrix by a
rank-one matrix, which represents the fundamental part of the method. Section 5 shows the
results of some numerical examples and the relevant conclusions.

2 The Eigenvalue Problem of Symmetrical Matrices

The problem of computing the eigenvalues of a matrix of order n involves
calculating the zeros of a polynomial, a process that in the case of a high-degree
polynomial is necessarily infinite. This means that the solution is obtained
through a non-finite iterative process, stopped according to a specified convergence
criterion.
All the proposed methods, although generally not based on the compu-
tation of zeros, involve a non-finite number of iterations. Some of them are
based on the strategy of reducing the iteration complexity. More precisely, in
most cases instead of computing the zeros of a polynomial, the symmetrical
matrix is transformed into a similar one (thus having the same eigenvalues),
whose particular form makes the eigenvalues computation easier or someway
shorter.
The problem is commonly divided into stages, each of which can be developed
in different ways. The main stages are:

- similarity transformation of the original matrix;

- computation of the eigenvalues of the transformed matrix;

- computation of the corresponding eigenvectors.

These stages may be examined separately, bearing in mind, however, that the
complete solution of the problem requires in some cases the combination of
some of them, as shown in Figure 1.
The methods considered in this study (just some among the many developed
so far for solving this problem) are based on the following procedures:
Figure 1: Methods for computing eigenvalues and eigenvectors of a symmetrical matrix



1) Transformation of the Original Matrix to a Diagonal Form

(Jacobi's method: Bini et al., 1988; Golub and Van Loan, 1983; Ralston, 1965;
Wilkinson, 1965): this transformation, which can always be carried out for
symmetrical matrices, calculates both eigenvalues and eigenvectors at the same
time: the eigenvalues are set on the diagonal of the transformed matrix and the
eigenvectors form the columns of the transformation matrix. Such a method
is an extremely concise tool for the complete solution, but has the drawback
of being dependent on the matrix elements, i.e. it is particularly slow when the
diagonal elements are small compared to the other elements.
Given a symmetrical matrix A, the process of diagonalization on which
the Jacobi method is based consists in multiplying A by a sequence of orthogonal
transformation matrices P_i, called plane rotations or Jacobi's or Givens'
rotations, each of which turns into 0 a specified non-diagonal element of the
matrix. Since each rotation also causes other changes in the matrix, so that in
some cases elements already turned to 0 are modified, the diagonalization of
a matrix cannot be carried out in a finite number of iterations, but requires
appropriate convergence criteria.
Thus, the process involves the computation of the transformation matrices
P_i and, calling A_{i+1} the matrix obtained from A_i via P_i, it results:

A → P_1^{-1} A P_1 = A_1 → P_2^{-1} A_1 P_2 = A_2 → P_3^{-1} A_2 P_3 = A_3 → ... → Λ

where Λ = lim_{k→∞} A_k is a diagonal matrix.
The transformation is defined as follows:

A → Λ = X^{-1} A X

where Λ is the diagonal matrix whose elements on the diagonal are the
eigenvalues of A and X, obtained as the product of the matrices P_i, is the
matrix of the eigenvectors. The Jacobi method involves O(n³) operations.
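A minimal NumPy sketch of the process (a textbook cyclic version, deliberately unoptimized; not the implementation compared in the paper):

```python
import numpy as np

def jacobi_eigen(A, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi method: annihilate each off-diagonal entry in turn by a
    plane rotation P; A_k converges to the diagonal matrix of eigenvalues and
    the accumulated product X of the rotations to the eigenvector matrix."""
    A = A.astype(float).copy()
    n = A.shape[0]
    X = np.eye(n)
    for _ in range(max_sweeps):
        if np.sqrt((np.tril(A, -1) ** 2).sum()) < tol:
            break                     # off-diagonal mass small enough
        for p in range(n - 1):
            for q in range(p + 1, n):
                # rotation angle that turns A[p, q] into 0
                theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                P = np.eye(n)
                P[p, p] = P[q, q] = c
                P[p, q], P[q, p] = s, -s
                A = P.T @ A @ P       # similarity transformation
                X = X @ P
    return np.diag(A), X
```

Since A_k = X^t A X tends to the diagonal matrix Λ, at convergence the factorization A = X Λ X^t is recovered.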

2) Transformation of the Original Matrix to a Tridiagonal Form:

the tridiagonalization of a symmetrical matrix, developed through either the Givens
method (Givens, 1958) or the Householder method (Martin and Wilkinson, 1968a;
Wilkinson, 1960), offers a remarkable computational advantage since, in a finite
number of steps, the computation of the eigenvalues of the original matrix
can be replaced with the computation of the eigenvalues of a tridiagonal one.
The advantage thus obtained is evident since, while each manipulation of a generic
matrix needs O(n³) arithmetical operations (for instance the product
of two matrices), these are reduced to O(n) for a tridiagonal matrix.
The tridiagonalization process is carried out by a sequence of transformations:
Jacobi rotations in the Givens method and Householder transformations
in the homonymous method. (The Householder method transforms the original
symmetrical matrix A into a tridiagonal symmetrical matrix S = T^{-1} A T by
stable orthogonal transformations turning to 0 a certain number of the elements
in a row and in the corresponding column.) In the first case (n − 1)(n − 2)/2
rotations, for a total of 4n³/3 operations, are required, while in the second case
only (n − 2) transformations, for a total of 2n³/3 operations. Nevertheless,
the Givens method can take advantage of the possible presence of 0's in the
matrix A, requiring in this case fewer operations than the Householder method.
For the complete solution of the problem this stage has to be integrated
with a procedure for computing the eigenvalues of a tridiagonal symmetrical
matrix and with one for calculating the eigenvectors;
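A compact sketch of the Householder reduction (the textbook reflector construction, not Martin and Wilkinson's optimized routine):

```python
import numpy as np

def householder_tridiag(A):
    """Reduce a symmetric matrix to tridiagonal form S = Q^t A Q using n - 2
    Householder reflections, each zeroing one column below the subdiagonal."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = A[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue                  # column already in the desired form
        v = x.copy()
        # choose the sign that avoids cancellation
        v[0] += norm_x if x[0] >= 0 else -norm_x
        v /= np.linalg.norm(v)
        H = np.eye(n)
        H[k + 1:, k + 1:] -= 2.0 * np.outer(v, v)   # symmetric, orthogonal
        A = H @ A @ H
        Q = Q @ H
    return A, Q
```

Each reflector is symmetric and orthogonal, so accumulating them in Q yields S = Q^t A Q with S tridiagonal and with the same eigenvalues as A.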

3) Computation of the Eigenvalues of a Symmetrical Tridiagonal Matrix:

different methods can be applied to compute the eigenvalues of a tridiagonal
matrix:

a) one method (Conte and de Boor, 1980) consists in computing the zeros
of the associated characteristic polynomial, taking advantage of the following
efficient recursive relation, specific to the characteristic polynomial of a
tridiagonal matrix: given a_i, i = 1, ..., n, the diagonal elements,
and b_i and c_i, i = 1, ..., n − 1, the upper and lower co-diagonal elements
respectively (with b_i = c_i for i = 1, ..., n − 1 in the case of a symmetrical
tridiagonal matrix), it results:

p_0(λ) = 1
p_1(λ) = (a_1 − λ)
p_k(λ) = (a_k − λ) p_{k−1}(λ) − b_{k−1} c_{k−1} p_{k−2}(λ),  k = 2, ..., n

p_n(λ) is the characteristic polynomial.
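The recursion can be evaluated at any fixed λ with a few multiplications; a plain Python sketch (names are ours):

```python
# Evaluate p_n(lam) for a tridiagonal matrix with diagonal a[0..n-1] and
# co-diagonals b[0..n-2], c[0..n-2] via the three-term recursion above.
def tridiag_charpoly(a, b, c, lam):
    p_prev, p = 1.0, a[0] - lam              # p_0 and p_1
    for k in range(1, len(a)):
        p_prev, p = p, (a[k] - lam) * p - b[k - 1] * c[k - 1] * p_prev
    return p

# 3 x 3 example: diag(2, 3, 4) with unit co-diagonals
print(tridiag_charpoly([2, 3, 4], [1, 1], [1, 1], 0.0))  # det(T) = 18.0
```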


b) by the bisection method (Barth et al., 1967; Fontanella and Pasquali,
1977) an interval containing a single eigenvalue is detected; then it is
split into two subintervals and it is determined which one contains the eigenvalue.
This is again subdivided, and so on, until the root is isolated
in an interval whose width is determined according to the required
precision. This method is based on the following result^a: if v(p) is the
number of sign variations in the sequence p_n(p), p_{n−1}(p), ..., p_0(p)
(supposing we assign to p_r(p) the sign of p_{r−1}(p) if p_r(p) = 0), then v(p) is the
number of zeros of p_n(p), and therefore of the eigenvalues of A, strictly
greater than p. On the basis of this property, and supposing we have found
(e.g. through a localization method) an interval [α_0, β_0] containing all the
eigenvalues, the method of bisection allows us to improve the results.
For example, if λ_1 > λ_2 > ... > λ_n and v(α_0) ≥ k, v(β_0) < k,
then λ_k ∈ (α_0, β_0). Let μ_0 = (α_0 + β_0)/2 and define:
α_1 = μ_0, β_1 = β_0 if v(μ_0) ≥ k
α_1 = α_0, β_1 = μ_0 if v(μ_0) < k.
Repeating the procedure, after r steps λ_k is localized in an interval
(α_r, β_r) of width (β_0 − α_0)/2^r.

^a This result follows from two properties of the polynomial sequence defined in 3a):
i) the roots of p_i(λ) are all real (being eigenvalues of symmetrical matrices);
ii) two consecutive polynomials cannot have a common zero.
(Fontanella and Pasquali, 1977: pp. 300-301)
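A sketch of the bisection search in plain Python (assuming the symmetric case b_i = c_i; the count below tallies the eigenvalues smaller than λ, which is equivalent to the v(p) of the text, since the two counts sum to n):

```python
def sturm_count(a, b, lam):
    """Number of eigenvalues of the symmetric tridiagonal matrix with
    diagonal a and co-diagonal b that are smaller than lam, obtained from
    the sign variations of the sequence p_0(lam), ..., p_n(lam)."""
    count = 0
    p_prev, p = 1.0, a[0] - lam
    if p < 0:
        count += 1
    for k in range(1, len(a)):
        p_prev, p = p, (a[k] - lam) * p - b[k - 1] ** 2 * p_prev
        if p == 0.0:                  # convention: take the sign of p_{r-1}
            p = 1e-300 if p_prev > 0 else -1e-300
        if (p < 0) != (p_prev < 0):
            count += 1
    return count

def bisect_eig(a, b, k, lo, hi, tol=1e-10):
    """Isolate the k-th smallest eigenvalue (k = 0, 1, ...) in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sturm_count(a, b, mid) >= k + 1:
            hi = mid                  # at least k + 1 eigenvalues below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# diag(2, 2, 2) with unit co-diagonals: eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2)
print(bisect_eig([2, 2, 2], [1, 1], 1, -10.0, 10.0))  # close to 2.0
```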
c) another way, the most widely used, consists in factorizing the matrix as a
product of either:
a) (LR method) a lower triangular matrix with diagonal elements
equal to 1 (L) by an upper triangular matrix (R), or
b) (QR method) an orthogonal matrix (Q) by an upper triangular matrix
(R), or
c) (QL method) an orthogonal matrix (Q) by a lower triangular matrix (L).
This method is much more efficient than the previous ones, particularly
if all the eigenvalues are required; in fact the factorization of a matrix,
which usually involves O(n³) operations and therefore is not recommended,
requires in the case of a symmetrical tridiagonal matrix only O(n) operations.
The LR factorization, usually associated with the Gauss method (Bini et
al., 1988), cannot always be carried out, as stated by the following theorem:

Theorem 1 Given a matrix A of order n and A_k its leading principal submatrices
of order k, if A_k is non-singular for k = 1, ..., n − 1, then an LR
factorization of A exists and is unique (Bini et al., 1988: 144).

Let A_s be a square matrix verifying the theorem hypothesis and L_s R_s
the corresponding factorization: A_s = L_s R_s. By inverting the factors'
order the matrix A_{s+1} = R_s L_s is built, where L_s is non-singular,
being det(L_s) = 1. From R_s = L_s^{-1} A_s we obtain A_{s+1} = L_s^{-1} A_s L_s.
Through this method a sequence of matrices can then be created, each
matrix derived from the previous one through a similarity transformation.
This sequence converges to an upper triangular matrix (or, equivalently,
L_s converges to the identity matrix) showing on its main diagonal
the eigenvalues of the original matrix, since all the matrices in the
sequence are similar.
The LR algorithm for calculating the eigenvalues of a tridiagonal matrix
involves however some problems: the factorization is computed by the Gauss
elimination method which implies, for stability, that some rows of
the matrix usually have to be exchanged so that the largest elements (pivots)
are found on the diagonal. The interchanges cause however the loss of
the tridiagonal form (the lower-triangle zeros remain, while the upper-triangle
ones get gradually lost), so, for this reason, the algorithm cannot be
efficiently applied to tridiagonal matrices. The basic process can however
be used, after some required adjustments, to obtain a more efficient
algorithm. Each factorization of a matrix into the product of two others,
which are then multiplied in inverted order, corresponds to a similarity
transformation. Since the problem of the LR algorithm is linked to the
stability of the method, a better procedure may be obtained by using a
stable factorization, such as that provided by an orthogonal transformation.
The QR algorithm, developed by Francis (1961) and Kublanovskaya (1961),
is related to the LR algorithm and relies on the fact that each real matrix A can
be decomposed in the form A = QR, where Q is an orthogonal matrix
and R is an upper triangular matrix, as stated by the following theorem:

Theorem 2 Given a real matrix A, an orthogonal matrix Q
and an upper triangular matrix R exist such that A = QR.

The decomposition can be carried out by Householder transformations,
in order to turn to 0 the columns of A below the diagonal. To
examine the QR algorithm, consider a matrix A_s and its factorization
A_s = Q_s R_s; the matrix obtained by multiplying the matrices Q_s and R_s
in inverted order is:

A_{s+1} = R_s Q_s, and from R_s = Q_s^{-1} A_s = Q_s^t A_s one obtains
A_{s+1} = Q_s^t A_s Q_s.

A sequence of orthogonal similarity transformations (in this way very stable)
is obtained, which maintains symmetry and does not require the row exchanges
that may alter the original form of the matrix.
The sequence of matrices {A_s} converges to an upper triangular matrix.
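An unshifted QR iteration in NumPy illustrates the scheme A_{s+1} = R_s Q_s = Q_s^t A_s Q_s (a bare sketch: production codes use shifts and deflation, and work on the tridiagonal form):

```python
import numpy as np

def qr_eigenvalues(A, iters=500):
    """Unshifted QR iteration: factor A_s = Q_s R_s, form A_{s+1} = R_s Q_s.
    For a symmetric matrix with well-separated eigenvalues, A_s tends to a
    diagonal matrix carrying the eigenvalues."""
    A = A.astype(float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(A)
        A = R @ Q                     # similar to the previous iterate
    return np.sort(np.diag(A))

# symmetric test matrix with eigenvalues 1, 2, 5
rng = np.random.default_rng(2)
Q0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A0 = Q0 @ np.diag([1.0, 2.0, 5.0]) @ Q0.T
print(qr_eigenvalues(A0))  # approximately [1. 2. 5.]
```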

The QL algorithm is analogous to the QR and is based on the following
statement:

Theorem 3 Given a matrix A_s, an orthogonal matrix Q_s and a lower
triangular matrix L_s exist such that

A_s = Q_s L_s  (L_s = Q_s^{-1} A_s = Q_s^t A_s).

In this case the algorithm consists in calculating a sequence of orthogonal
transformations, so that:

A_{s+1} = L_s Q_s = Q_s^t A_s Q_s

so that {A_s} converges to a lower triangular matrix.

4) Computation of the Eigenvectors after Matrix Tridiagonalization:

When the eigenvalues are computed through a transformation of the original matrix,
the computation of the eigenvectors proceeds accordingly. As already said,
the method for computing the eigenvalues proceeds in two stages: a) the reduction
of the matrix to tridiagonal form by transformations which preserve the
eigenvalues, and b) the computation of the eigenvalues of the transformed matrix.
The eigenvectors are computed using the same procedure: first the eigenvectors
of the transformed matrix are obtained, then those of the given matrix are
derived.
Let B be a matrix obtained by a sequence of k similarity transformations {T_i}
such that T_1 T_2 ... T_k = T. It results:

T_k^{-1} ... T_2^{-1} T_1^{-1} A T_1 T_2 ... T_k = T^{-1} A T = B.

Defining Ty = x, where x is an eigenvector of A corresponding to the eigenvalue
λ, it results:

Ax = λx,

and premultiplying by T^{-1}:

T^{-1} A x = λ T^{-1} x,

i.e.

T^{-1} A T y = λ y  ⟹  B y = λ y.

Thus, if y is an eigenvector of the transformed matrix B, from Ty = x
the eigenvector of the original matrix A corresponding to the same eigenvalue
may be calculated.
From a computational point of view, the chain of products T_1 T_2 ... T_k
should be carried out in order to obtain the matrix T, but in practice the
products can be accumulated while transforming the matrix into a tridiagonal one.
The number of computations to be made is usually smaller than the number of
those required to calculate the matrices T_i. In this case there are no problems
of stability, and only one additional matrix is required for storage.

5) Computation of each Eigenvector from the Corresponding Eigenvalue (Inverse Iteration):

some methods allow calculating the eigenvectors one at a time, using the
corresponding known eigenvalue. One of these methods, called inverse iteration,
can also be efficiently used to compute all the eigenvectors.
By the method of inverse iteration, being S a general matrix, b any vector
and λ a specified value very close to the eigenvalue λ_k (the optimal condition being
λ = λ_k), the solution x of the system (S − λI)x = b tends to be very close to
the corresponding eigenvector u_k. Moreover, if the computed vector x, which
is an approximation of u_k and therefore may be indicated by x = u'_k, replaces
b, leading to the system (S − λI)x = u'_k, the new solution comes to be a better
approximation of the required eigenvector than the previous one.
The initial choice of the vector b is not critical: like all first approximations,
a good one reduces the number of required iterations, but a bad one
does not make the process much worse.
Iterations are computationally inexpensive, since S − λI can be inverted
only once and then used whenever required, so that in practice no more than
two or three iterations are usually required.
Let us consider the system (S − λI)x = b and express x and b as linear
combinations of the eigenvectors u_i:

x = Σ_i γ_i u_i  and  b = Σ_i β_i u_i.

It results:

(S − λI)(Σ_i γ_i u_i) = Σ_i β_i u_i  →  S Σ_i γ_i u_i − λ Σ_i γ_i u_i = Σ_i β_i u_i  →
Σ_i γ_i S u_i − λ Σ_i γ_i u_i = Σ_i β_i u_i  and, being S u_i = λ_i u_i,
Σ_i γ_i λ_i u_i − Σ_i λ γ_i u_i = Σ_i β_i u_i
Σ_i γ_i (λ_i − λ) u_i = Σ_i β_i u_i;

from the last expression the coefficients γ_i result: γ_i = β_i / (λ_i − λ), thus:

x = Σ_{i=1}^{n} [β_i / (λ_i − λ)] u_i.

This expression of x shows that, if λ is close to λ_k and far from the other
eigenvalues, and β_k is not too small, then x comes to be close to u_k as stated
(up to normalization). Moreover, if x replaces b, the denominator
λ_k − λ is squared, and its power increases at each iteration, making x more and
more similar to u_k.
This process can be used with any matrix S, but proves to be particularly
efficient with a tridiagonal one. In this case the system (S − λI)x = b can be
solved by stable and efficient algorithms: the Gauss method (triangularization)
can be used, for instance, after the required row permutations, and at the
k-th step only two rows must be considered, the k-th and the (k + 1)-th,
since all the lower elements in the k-th column are 0. In this process each
iteration requires only 6(n − 1) products.
If the eigenvalues are well separated, convergence is very rapid (cubic). In
the case of multiple or very close eigenvalues, the eigenvectors are not usually
orthogonal.
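A NumPy sketch of inverse iteration (using a dense solve for brevity where the text exploits the tridiagonal structure):

```python
import numpy as np

def inverse_iteration(S, lam, iters=3):
    """Solve (S - lam*I) x = b repeatedly, normalizing at each step; x
    converges to the eigenvector whose eigenvalue is closest to lam."""
    n = S.shape[0]
    b = np.ones(n)
    M = S - lam * np.eye(n)
    for _ in range(iters):
        x = np.linalg.solve(M, b)     # the factorization could be reused
        b = x / np.linalg.norm(x)
    return b

# eigenvalues of diag(1, 2, 3) are 1, 2, 3; lam = 2.001 picks the second one
v = inverse_iteration(np.diag([1.0, 2.0, 3.0]), 2.001)
print(np.round(np.abs(v), 6))  # close to [0. 1. 0.]
```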

The Divide et Impera method for computing both eigenvalues and eigen-
vectors of a symmetrical tridiagonal matrix can replace steps 3 (a, b or c) and
4 or 5.

3 The Divide et Impera Method for Computing Eigenvalues and Eigenvectors of a Symmetrical Tridiagonal Matrix

The Divide et Impera method (Cuppen, 1981; Krishnakumar and Morf, 1986;
Dongarra and Sorensen, 1987; Gill and Tadmor, 1990; Sorensen and Tang,
1991) is completely different from all the previously considered methods for
computing the eigensystem of a symmetrical tridiagonal matrix. Like all Divide et
Impera techniques, it starts by calculating both eigenvalues and eigenvectors of
a certain number of very small submatrices (of order 2, 3, or 4) of the original
matrix. The results are then used two by two to calculate the eigensystem of
bigger matrices. This procedure is iterated until the information required on
the original matrix is obtained.
Given a symmetrical tridiagonal matrix T of order N = 2^n, the algorithm
can be summarized as follows:

A) n − 1 partitions of the matrix T are performed, where each partition
consists in dividing each significant matrix block (with more than one
non-zero element) into four subblocks, obtaining 2^{n−2} pairs of 2 × 2
matrices (besides, the blocks in the lower-left or upper-right corners
of the resulting blocks each have only one non-zero element).

B) eigenvalues and eigenvectors are calculated for each of the 2^{n−1} matrices
of order 2, conveniently modified by the corresponding element outside
the block.

C) an iterative process is started (to be described in the following)
to calculate (by using at each iteration the results obtained in the previous
one, and by considering the corresponding elements outside the
blocks) the eigensystem of some matrices.

The method is based on the following steps:

I       eigensystem of 2^{n−2} matrices of order 2^2, i.e. of 2^{n−3} pairs.
II      eigensystem of 2^{n−3} matrices of order 2^3, i.e. of 2^{n−4} pairs.
III     eigensystem of 2^{n−4} matrices of order 2^4, i.e. of 2^{n−5} pairs.
...
k-th    eigensystem of 2^{n−(k+1)} matrices of order 2^{k+1}, i.e. of 2^{n−(k+2)} pairs.
...
(n − 2)-th  eigensystem of 2^1 matrices of order 2^{n−1}, i.e. of 1 pair.
(n − 1)-th  eigensystem of 1 matrix of order 2^n.

The algorithm clearly allows both a recursive serial implementation and a
parallel one.

The heart of this procedure (step C) lies in getting information on bigger
submatrices by using the results relative to couples of smaller submatrices.
This step is carried out by solving the so-called updating problem (Golub,
1973; Bunch et al., 1978; Cuppen, 1981; Dongarra and Sorensen, 1987), i.e.
the spectral decomposition of a matrix obtained by modifying a diagonal one by a
rank-one matrix (Rank One Modification).
To show this method, let us consider a partition of the matrix T into four
blocks and observe how, from the eigenvalues and the eigenvectors of the two
diagonal blocks, we can calculate the eigensystem of the whole matrix.
Let T be a symmetrical tridiagonal matrix, assume that N is even, N =
2m, and that T_N is already in unreduced form, i.e. t_{i,i+1} ≠ 0, 1 ≤ i ≤
N − 1 (otherwise T_N can be decoupled into smaller unreduced symmetrical
tridiagonal matrices); then the matrix

T_N, with entries t_{i,j}, t_{i,i+1} = t_{i+1,i} ≠ 0,    (1)

can be divided into the sum of a block-diagonal matrix and a rank-one term: the
first summand coincides with T_N except that t_{m,m} is replaced by t_{m,m} − β,
t_{m+1,m+1} by t_{m+1,m+1} − β, and the coupling entries t_{m,m+1} = t_{m+1,m} = β
are set to 0; the second summand is zero everywhere except for the 2 × 2 block of
β's in rows and columns m and m + 1.    (2)

That is (writing diag(·, ·) for a block-diagonal matrix)

T_N = diag(T^(1)_{N/2}, T^(2)_{N/2}) + β b_N b_N^t,   b_N = e_N^(m) + e_N^(m+1),    (3)

where the blocks T^(1)_{N/2} and T^(2)_{N/2} are matrices of order N/2, β = t_{m,m+1} ≠ 0
is the link between these two blocks, and e_N^(m) = (e_1, e_2, e_3, ..., e_N)^t with e_i =
0 for every i ≠ m, e_m = 1, while e_N^(m+1) = (e_1, e_2, e_3, ..., e_N)^t with e_i = 0 for every i ≠
m + 1, e_{m+1} = 1.
By the Divide et Impera algorithm, the problem of computing the eigensystem
of an N-dimensional symmetrical tridiagonal matrix is reduced to the
problem for N/2-dimensional symmetrical tridiagonal matrices. In particular,
if T^(1)_{N/2} and T^(2)_{N/2} have respectively the following spectral decompositions

T^(1)_{N/2} = P^(1)_{N/2} Λ^(1)_{N/2} P^(1)t_{N/2},   P^(1)_{N/2} P^(1)t_{N/2} = I_{N/2},
T^(2)_{N/2} = P^(2)_{N/2} Λ^(2)_{N/2} P^(2)t_{N/2},   P^(2)_{N/2} P^(2)t_{N/2} = I_{N/2},    (4)

then one can compute the spectral decomposition of the matrix T_N of order
N by the following procedure:

I) the N-dimensional vector (with unitary norm) ZJV is computed:

1 [ I')
ZJV == J2
}JJV/2

}J(2)
JV/2
j'b N bJV ==
1

1..
(5)

so that in (3), (4) and (5), TJV is unitarily similar to a matrix obtained
by modifying a diagonal matrix through a rank-one matrix (Rank One
Modification). It results:

[ pll) AI') pll) ,


JV/2 JV/2 JV/2
TJV ==
}J(2) A (2) }J(2) t
1+ /lbNbj.,
JV/2 JV/2 JV/2

[ pll)
JV/2
]( [ A~)2 1+ 2/lZNZj., )
}J(2) A (2)
JV/2 JV/2

[ pll)
JV/2
,

p(2) ,
JV/2
1
==
[ pili
JV/2 1 + 2,8zJVz~)
[ }JJV/2
11)'

}J(2)
JV/2
(DJV
p(2) ,
JV/2
1
85

(6)
A(2)
N/2
1
II) the updating problem is solved by computing the spectral decomposition of the ROM matrix through the methods that will be described in the following section:

$$
D_N + \sigma z_N z_N^{\,t} = Q_N \Lambda_N Q_N^{\,t},
\qquad Q_N Q_N^{\,t} = I_N, \quad \sigma = 2\beta.
\qquad (7)
$$

III) the unitary matrix

$$
P_N = \begin{bmatrix} P^{(1)}_{N/2} & \\ & P^{(2)}_{N/2} \end{bmatrix} Q_N
\qquad (8)
$$

is computed, and from (7) and (8) the spectral decomposition of T_N results:

$$
T_N =
\begin{bmatrix} P^{(1)}_{N/2} & \\ & P^{(2)}_{N/2} \end{bmatrix}
Q_N \Lambda_N Q_N^{\,t}
\begin{bmatrix} P^{(1)}_{N/2} & \\ & P^{(2)}_{N/2} \end{bmatrix}^{t}
= P_N \Lambda_N P_N^{\,t},
\qquad P_N P_N^{\,t} = I_N.
\qquad (9)
$$
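The three steps above can be sketched numerically as follows (an illustrative NumPy version, not the authors' Fortran 77 program; the ROM eigenproblem of step II is solved here by a dense eigensolver in place of the methods of the next section, and the function name is ours):

```python
import numpy as np

def merge_step(T):
    """One Divide et Impera merge: split T (unreduced symmetrical
    tridiagonal, even order N) as in (3), solve the two halves,
    then recombine via the rank-one update (5)-(9)."""
    N = T.shape[0]
    m = N // 2
    beta = T[m - 1, m]                       # link t_{m,m+1} (0-based indices)
    T1 = T[:m, :m].copy()
    T2 = T[m:, m:].copy()
    T1[-1, -1] -= beta                       # t_{m,m} - beta, eq (2)
    T2[0, 0] -= beta                         # t_{m+1,m+1} - beta
    l1, P1 = np.linalg.eigh(T1)              # spectral decompositions (4)
    l2, P2 = np.linalg.eigh(T2)
    D = np.concatenate([l1, l2])             # diagonal D_N of eq (6)
    b = np.zeros(N); b[m - 1] = b[m] = 1.0   # b_N = e^(m) + e^(m+1)
    P12 = np.zeros((N, N))
    P12[:m, :m], P12[m:, m:] = P1, P2
    z = P12.T @ b / np.sqrt(2.0)             # eq (5), ||z||_2 = 1
    lam, Q = np.linalg.eigh(np.diag(D) + 2.0 * beta * np.outer(z, z))  # eq (7)
    return lam, P12 @ Q                      # eqs (8)-(9)

# toy check on a random symmetrical tridiagonal matrix
rng = np.random.default_rng(0)
N = 8
d, e = rng.normal(size=N), rng.normal(size=N - 1)
T = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
lam, P = merge_step(T)
assert np.allclose(P @ np.diag(lam) @ P.T, T)    # T = P Lambda P^t
```

A full recursive implementation would apply `merge_step` to the two halves as well, down to blocks of trivial size, which is where the serial or parallel scheduling of the merges comes in.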

4 Eigenvalues of a Matrix Modified by a Rank-One Matrix (Rank One Modification)

Let D = diag(d_i) be a diagonal matrix of order n and u a unit vector of the same order. Let C = D + σuu^t be a ROM matrix, i.e. a matrix obtained by modifying D through a rank-one matrix (Golub, 1973; Bunch et al., 1978), let λ_1, λ_2, ..., λ_n be the eigenvalues of C, and assume λ_i ≤ λ_{i+1} and d_i ≤ d_{i+1}; it is proved (Wilkinson, 1965) that:

$$
\text{if } \sigma \ge 0: \quad d_i \le \lambda_i \le d_{i+1}, \quad i = 1, 2, \dots, n-1; \qquad d_n \le \lambda_n \le d_n + \sigma u^t u;
$$

$$
\text{if } \sigma \le 0: \quad d_{i-1} \le \lambda_i \le d_i, \quad i = 2, \dots, n; \qquad d_1 + \sigma u^t u \le \lambda_1 \le d_1.
\qquad (10)
$$

The eigenvalues of the matrix C satisfy the equation

$$
\det(D + \sigma u u^t - \lambda I) = 0
\qquad (11)
$$

which is proved to be equivalent to the characteristic equation:

$$
\varphi_n(\lambda) = \prod_{i=1}^{n} (d_i - \lambda)
+ \sigma \sum_{i=1}^{n} u_i^2 \prod_{\substack{j=1 \\ j \neq i}}^{n} (d_j - \lambda) = 0.
\qquad (12)
$$

Thus, from

$$
\varphi_k(\lambda) = \prod_{i=1}^{k} (d_i - \lambda)
+ \sigma \sum_{i=1}^{k} u_i^2 \prod_{\substack{j=1 \\ j \neq i}}^{k} (d_j - \lambda)
\qquad (13)
$$

it follows

$$
\varphi_{k+1}(\lambda)
= (d_{k+1} - \lambda) \prod_{i=1}^{k} (d_i - \lambda)
+ \sigma u_{k+1}^2 \prod_{j=1}^{k} (d_j - \lambda)
+ \sigma \sum_{i=1}^{k} u_i^2 \prod_{\substack{j=1 \\ j \neq i}}^{k+1} (d_j - \lambda)
= (d_{k+1} - \lambda) \varphi_k(\lambda) + \sigma u_{k+1}^2 \prod_{j=1}^{k} (d_j - \lambda)
$$

and, setting $\psi_k(\lambda) = \prod_{j=1}^{k} (d_j - \lambda)$, it results

$$
\varphi_{k+1}(\lambda) = (d_{k+1} - \lambda) \varphi_k(\lambda) + \sigma u_{k+1}^2 \psi_k(\lambda).
\qquad (14)
$$

Therefore

$$
\varphi_{k+1}(\lambda) = (d_{k+1} - \lambda) \varphi_k(\lambda) + \sigma u_{k+1}^2 \psi_k(\lambda),
\qquad k = 0, 1, \dots, n-1,
$$

$$
\psi_k(\lambda) = (d_k - \lambda) \psi_{k-1}(\lambda),
\qquad k = 1, 2, \dots, n-1,
$$

with ψ_0 = φ_0 = 1. (15)

The characteristic equation of the matrix C can thus be evaluated by these recursive formulas, while to solve the equation and calculate the eigenvalues other well-known methods can be used. For instance, by differentiating the recursions (15) with respect to λ we can calculate φ'_n(λ) for each value of λ, and then use Newton's method to compute the eigenvalues.
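A minimal sketch of this evaluation (illustrative code, with names of our choosing): the derivatives of φ_k and ψ_k are carried along the recursions (15), and a plain Newton iteration is run from a given starting guess:

```python
import numpy as np

def phi_and_dphi(lam, d, u, sigma):
    """Evaluate phi_n(lambda) and phi_n'(lambda) through the
    recursions (15), differentiated term by term w.r.t. lambda."""
    phi, dphi = 1.0, 0.0          # phi_0 = 1
    psi, dpsi = 1.0, 0.0          # psi_0 = 1
    for k in range(len(d)):
        # phi_{k+1} = (d_{k+1} - lam) phi_k + sigma u_{k+1}^2 psi_k
        phi_new = (d[k] - lam) * phi + sigma * u[k] ** 2 * psi
        dphi_new = -phi + (d[k] - lam) * dphi + sigma * u[k] ** 2 * dpsi
        # psi_{k+1} = (d_{k+1} - lam) psi_k
        psi, dpsi = (d[k] - lam) * psi, -psi + (d[k] - lam) * dpsi
        phi, dphi = phi_new, dphi_new
    return phi, dphi

def newton_eigenvalue(x0, d, u, sigma, tol=1e-12, maxit=100):
    """Newton iteration on phi_n(lambda) = 0 from a starting guess x0."""
    x = x0
    for _ in range(maxit):
        f, df = phi_and_dphi(x, d, u, sigma)
        step = f / df
        x -= step
        if abs(step) < tol * max(1.0, abs(x)):
            return x
    return x
```

In practice the interlacing bounds (10) supply one starting guess per eigenvalue, so each root can be targeted separately.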
Another technique to compute the eigenvalues of (D + σuu^t)x = λx is based on the fact that, if u_i ≠ 0, i = 1, 2, ..., n, then

$$
\det(D + \sigma u u^t - \lambda I)
= \det(D - \lambda I)\, \det\!\left( I + \sigma (D - \lambda I)^{-1} u u^t \right)
= \prod_{i=1}^{n} (d_i - \lambda) \left( 1 + \sigma \sum_{i=1}^{n} \frac{u_i^2}{d_i - \lambda} \right).
\qquad (16)
$$

The eigenvalues of D + σuu^t can be obtained by computing the zeros of the equation

$$
w(\lambda) = 1 + \sigma \sum_{i=1}^{n} \frac{u_i^2}{d_i - \lambda} = 0.
\qquad (17)
$$

To complete the study of the eigensystem of D + σuu^t one may verify (Cuppen, 1981) that the eigenvectors q_1, q_2, ..., q_n are given by

$$
q_i = \frac{(D - \lambda_i I)^{-1} u}{\left\| (D - \lambda_i I)^{-1} u \right\|_2},
\qquad i = 1, \dots, n.
\qquad (18)
$$
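A minimal sketch of this second technique (illustrative NumPy code, not the authors' program): each eigenvalue is bracketed by the interlacing bounds (10) and located by bisection on w(λ) of eq. (17), and the eigenvectors follow from eq. (18). The function name and the bisection strategy are this sketch's choices, assuming σ > 0, strictly increasing d_i and all u_i ≠ 0:

```python
import numpy as np

def rom_eigensystem(d, u, sigma, tol=1e-13):
    """Eigensystem of D + sigma*u*u^t: eigenvalues by bisection on
    w(lambda) of eq. (17), bracketed by the bounds (10); eigenvectors
    from eq. (18)."""
    n = len(d)
    w = lambda lam: 1.0 + sigma * np.sum(u ** 2 / (d - lam))
    uppers = np.append(d[1:], d[-1] + sigma * np.dot(u, u))   # bounds from (10)
    lams = np.empty(n)
    for i in range(n):
        lo, hi = d[i], uppers[i]
        for _ in range(200):               # w < 0 near lo, w > 0 near hi
            mid = 0.5 * (lo + hi)
            if w(mid) < 0.0:
                lo = mid
            else:
                hi = mid
            if hi - lo < tol:
                break
        lams[i] = 0.5 * (lo + hi)
    Q = u[:, None] / (d[:, None] - lams[None, :])   # columns (D - lam_i I)^{-1} u
    Q /= np.linalg.norm(Q, axis=0)                  # normalize, eq (18)
    return lams, Q
```

Bisection works here because w is strictly increasing between consecutive poles d_i, so each bracket contains exactly one root; faster zero-finders (e.g. rational interpolation, as in the literature cited above) only change the inner loop.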

5 Numerical Applications and Conclusions

A portable Fortran 77 program was developed to evaluate the effectiveness of the Divide et Impera method; more precisely, to estimate to what extent the Divide et Impera technique, used after tridiagonalization, may be more effective than the other currently used procedures.

The program creates a symmetrical matrix from given (or randomly generated) eigenvalues and eigenvectors. Then it computes the eigensystem of a given matrix, choosing among the described methods, reporting for each one the execution time and some measures of the precision of the result.

The program was tested on randomly generated symmetrical matrices of order 5, 10, 20, 30, 50, 100 (10 matrices for each order) on a PC 386 computer, thus with no parallelism implemented.
In particular each matrix was analyzed by the following procedures:

1) Jacobi diagonalization;

2) Givens tridiagonalization, followed by QL factorization and redefinition of the eigenvector matrix;

3) Householder tridiagonalization, followed by QL factorization and reconstruction of the eigenvector matrix;

4) Householder tridiagonalization, followed by QL factorization and inverse iteration;

5) bisection method followed by inverse iteration;

6) tridiagonalization followed by Divide et Impera.
In this non-parallel implementation, Divide et Impera proved to be faster than the Jacobi and bisection methods, and competitive with all the others. As an example, considering 20 × 20 matrices, the average time required by the Jacobi and bisection methods is around 25-30 seconds, while the others require 10-17 seconds and, among these, Divide et Impera requires 10-12 seconds. The improvement in execution time depends on the fact that the computational complexity is reduced to O(N²).

From the tests performed we can state that the Divide et Impera technique proves to be a valid alternative method even when implemented in a serial version, at the cost of a minor loss of precision, which shifted from the 14th decimal digit to the 11th.
It is evident that the best performance of the Divide et Impera technique may be obtained through a parallel implementation. By a mere theoretical inspection of the procedure, many steps may be run simultaneously (i.e. merging matrices of a certain order to obtain results on matrices of higher order), so that a parallel implementation should be highly significant in terms of execution time, particularly with matrices of high order. As an example, on a parallel computer with 8 processors the eigensystem of a matrix of order 128 may be obtained by computing simultaneously the eigensystems of 8 matrices of order 16, then those of four matrices of order 32, then of two of order 64, and finally that of the full matrix. This speeds up the process dramatically, compared to the serial evaluation of the 14 calculations to be carried out sequentially.
In recent years new methods have been proposed in the literature (Gu and Eisenstat, 1995) in order to improve some steps of Divide et Impera (for instance, to change the way the eigensystem of a ROM matrix is computed), which could reduce the computational complexity from O(N²) to O(N log N) and improve the technical performance in terms of speed: it is likely that in this way even the computation accuracy may be improved.

Acknowledgments

This study was supported by CNR grant n. 95.00641.CT15 (Sergio Camiz).

Although the whole paper is attributable to both authors, Vanda Tulli has written in particular Sections 3 and 4; Sections 1, 2 and 5 have been jointly written.

References

Acton F. S., 1970. Numerical Methods that Work, Harper International Edition, New York, Evanston, London.

Atkinson K. E., 1978. An Introduction to Numerical Analysis, John Wiley, New York.

Barth W., R. S. Martin, J. H. Wilkinson, 1967. «Calculation of the Eigenvalues of a Symmetric Tridiagonal Matrix by the Method of Bisection», Numer. Math., 9: p. 386-393.

Bini D., M. Capovani, O. Menchi, 1988. Metodi numerici per l'algebra lineare, Zanichelli, Bologna.

Bunch J. R., C. P. Nielsen, D. C. Sorensen, 1978. «Rank-One Modification of the Symmetric Eigenproblem», Numer. Math., 31: p. 31-48.

Conte S. D. and C. de Boor, 1980. Elementary Numerical Analysis: An Algorithmic Approach, 3rd ed., McGraw-Hill, New York.

Cuppen J. J. M., 1981. «A Divide and Conquer Method for the Symmetric Tridiagonal Eigenproblem», Numer. Math., 36: p. 177-195.

Dongarra J. J. and D. C. Sorensen, 1987. «A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem», SIAM J. Sci. Stat. Comput., 8 (2): p. 139-154.

Fontanella F. and A. Pasquali, 1977. Calcolo Numerico. Metodi e Algoritmi 1, Pitagora Editrice, Bologna.

Francis J. G. F., 1961. «The QR Transformation: A Unitary Analogue to the LR Transformation, Part 1», «The QR Transformation, Part 2», Comp. J., 4: p. 265-272, 332-345.

Gill D. and E. Tadmor, 1990. «An O(N²) Method for Computing the Eigensystem of N × N Symmetric Tridiagonal Matrices by the Divide and Conquer Approach», SIAM J. Sci. Stat. Comput., 11 (1): p. 161-173.

Givens W., 1958. «Computation of Plane Unitary Rotations Transforming a General Matrix to Triangular Form», SIAM J. App. Math., 6 (1): p. 26-50.

Golub G. H., 1973. «Some Modified Matrix Eigenvalue Problems», SIAM Review, 15 (2): p. 318-334.

Golub G. H. and C. F. Van Loan, 1983. Matrix Computations, The Johns Hopkins University Press, Baltimore, Maryland.

Gu M. and S. C. Eisenstat, 1995. «A Divide-and-Conquer Algorithm for the Symmetric Tridiagonal Eigenproblem», SIAM J. Matrix Anal. Appl., 16 (1): p. 172-191.

Krishnakumar A. S. and M. Morf, 1986. «Eigenvalues of a Symmetric Tridiagonal Matrix: A Divide-and-Conquer Approach», Numer. Math., 48: p. 349-368.

Kublanovskaya V. N., 1961. «On Some Algorithms for the Solution of the Complete Eigenvalue Problem», USSR Comput. Math. Math. Phys., 3: p. 637-657.

Martin R. S. and J. H. Wilkinson, 1968a. «Householder's Tridiagonalization of a Symmetric Matrix», Numer. Math., 11: p. 181-195.

Ortega J. M., 1972. Numerical Analysis: A Second Course, Academic Press, New York.

Ortega J. M., 1990. Classics in Applied Mathematics. Numerical Analysis: A Second Course, SIAM.

Press W. H., S. A. Teukolsky, B. P. Flannery, W. T. Vetterling, 1986. Numerical Recipes. The Art of Scientific Computing, Cambridge.

Ralston A., 1965. A First Course in Numerical Analysis, McGraw-Hill Book Co., New York.

Smith B. T., J. M. Boyle, B. S. Garbow, Y. Ikebe, V. C. Klema, C. B. Moler, 1976. Matrix Eigensystem Routines, EISPACK Guide, 2nd ed., Lecture Notes in Computer Science 6, Springer Verlag, Berlin.

Sorensen D. C. and P. T. P. Tang, 1991. «On the Orthogonality of Eigenvectors Computed by Divide-and-Conquer Techniques», SIAM J. Numer. Anal., 28 (6): p. 1752-1775.

Wilkinson J. H., 1960. «Householder's Method for the Solution of the Algebraic Eigenproblem», Comput. J., 3: p. 23-27.

Wilkinson J. H., 1965. The Algebraic Eigenvalue Problem, Clarendon Press, Oxford.

I/O ANALYSIS: OLD AND NEW ANALYSIS TECHNIQUES

S. CAMIZ
Dipartimento di Matematica "Guido Castelnuovo"
Università di Roma "La Sapienza"

In this paper, classical input/output analysis techniques are considered: triangularization (Simpson and Tsukui, 1965) and subdivision (Chenery and Watanabe, 1958) based on both direct and total linkages. In addition, the exploratory correspondence analysis technique (Benzecri, 1973; Hill, 1974; Lebart et al., 1984) proposed by Abbate and Bove (1992) is described. All of them turn out to be poor at revealing the complex structure of the flow of commodities among economic sectors, which is better represented by a graph (Lantner, 1974). The complete graph analysis, considering such concepts as node centrality, simple and strongly connected components, and vulnerability, allows a deeper insight into the input/output table, which may be very synthetically represented in graphical form. The use of thresholds allows the identification of the main flows, thus helping in the selection of the most interesting information. As an example, the considered techniques are applied to the 16 sectors of the 1988 Italian economy (Istat, 1992).

1 Introduction

For a good comprehension of the structure of an economic system, a major help came with the introduction of Input/Output (I/O) tables by Leontief (1953), in which the economic system is partitioned into homogeneous sectors, whose reciprocal relations are purchases and sales.

The construction of I/O tables is a delicate task, performed through investigations, surveys, estimates, calibration of results, etc. A clear outline with references may be found in Abbate and Bove (1992). It is to be noted that I/O table coefficients depend on the country's economic system, in the sense that sector exchanges may depend largely on technology, whereas value added and final demand may depend on salaries, general policy, and country wealth, respectively. For this reason, comparison of I/O tables corresponding to different periods may show the dynamics of a country's economy. One may distinguish between domestic and total production tables, the latter including imports (exports are always part of final demand), so that their comparison may be a good help in evaluating the dependence of the system on the outside. Finally, comparison of the I/O tables of different countries may help in focusing on their differences.

Like most investigations, the analysis of I/O tables may be performed at different stages of a multi-step research model (Tomassone, 1980; De Antoni, 1982; Camiz, 1994), where the aim becomes more precise and detailed, ranging from exploratory analysis, as in the previous references, to the most detailed econometric models.

models. In particular, exploratory analysis tools, to be used in I/O analysis,


should be able to synthesize the I/O tables and reveal their most important
features, in order to avoid the exam of thousands of individual figures, a task
very difficult to perform effectively.
In the I/O literature, several approaches were used so far, each one re-
ferring to some particular aspect of the tables. It is to be noted, however,
that the methods used first, less sophisticated, could not reveal sufficiently the
table structure, thus biasing even the theoretical interpretation of the results.
In the following, the most widely known methods are reminded, some more re-
cent exploratory analyses are described, and a more consistent representation
technique is proposed.

2 Technical coefficients matrices

An I/O table is a square n × n matrix F, whose elements f_ij are the value of the i-th sector goods sold to the j-th sector^a. In addition to the sectors, three vectors are taken into account: final demand Y, value added V, and total production X. The relations among the matrix and the vectors are described by the following equations:

X = Fu + Y (1)

X' = u'F + V' (2)

where u is a vector whose elements are all 1s and the apostrophe stands for vector or matrix transposition. In this way, complete information concerning each sector's total origin of expenses and total destination of production is provided.

Two matrices of particular interest may be obtained using equations (1) and (2), namely

A = FX^{-1} (3)

Q = X^{-1}F (4)

where X is here the diagonal (non-singular) matrix whose elements are X_ii = X_i. A and Q are called technical coefficients matrices and represent respectively the expense coefficients a_ij = f_ij/X_j, the value of sector i commodity necessary to produce one unit of value of sector j production, and the market shares q_ij = f_ij/X_i, the share of sector i production sold to sector j.

^a The I/O table may be either square, thus referring to the same sectors both as input and as output, or rectangular. In the latter case, some of the methods discussed in the following lose their meaning, in particular graph analysis.
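As an illustration of equations (1)-(4) with made-up figures (a sketch, not the Istat data analyzed later), the technical coefficient matrices may be computed as:

```python
import numpy as np

# toy 3-sector flow matrix F (f_ij = sales of sector i to sector j)
F = np.array([[10.0,  5.0,  2.0],
              [ 4.0, 20.0,  6.0],
              [ 1.0,  3.0, 15.0]])
Y = np.array([13.0, 20.0, 11.0])          # final demand
X = F.sum(axis=1) + Y                     # total production, eq (1): X = Fu + Y

Xhat_inv = np.diag(1.0 / X)
A = F @ Xhat_inv                          # eq (3): a_ij = f_ij / X_j
Q = Xhat_inv @ F                          # eq (4): q_ij = f_ij / X_i

assert np.allclose(A @ X + Y, X)          # production balance holds
```

Right-multiplying by the diagonal inverse scales columns (expense coefficients), left-multiplying scales rows (market shares).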

Table 1: The classification of industries according to Chenery and Watanabe (1958). LB, LF = mean linkage values.

    II. Manufacturing - final          I. Manufacturing - intermediate
        LF_i < LF, LB_i > LB               LF_i > LF, LB_i > LB

    III. Primary - final               IV. Primary - intermediate
        LF_i < LF, LB_i < LB               LF_i > LF, LB_i < LB

In order to compare the tables of different countries, Chenery and Watanabe (1958) started from a classification into four classes, according to the value of each sector's backward and forward linkages, «to focus on quite different roles played by various sectors in the total process of production» (p. 494). In order to evaluate the degree of interdependence of the sectors, they calculated direct indexes of linkage as follows:

$$
LF_i = \sum_j q_{ij}, \qquad LB_j = \sum_i a_{ij}.
\qquad (5)
$$

Here, LF_i is a forward linkage and measures the degree of interdependence of the i-th sector with its purchasers, whereas LB_j is a backward linkage, in that it measures the degree of interdependence of the j-th sector with its sellers. Considering the mean values of the backward and forward linkages, LB and LF respectively, Chenery and Watanabe partitioned the set of sectors into four blocks, according to whether linkages were above or below the corresponding mean. The partition may be interpreted considering that low backward linkages correspond to primary sectors, low forward ones to final sectors, whereas high linkages are typical of intermediate sectors: in Table 1 the four classes are represented.
Such a partitioning technique is quite reasonable, though limited to highlighting two properties of sector interdependences. Nevertheless, it must be said that such a partition depends only on the total amount of each sector's purchases and sales, and not necessarily on each sector's number of purchasers or sellers, so that it may not be a fully consistent tool. Chenery and Watanabe then tried to establish a hierarchy of sectors, based on a one-way interdependence model, because the previous method neglects «the fact that interindustry transactions may involve either one or many other sectors and that the resulting pattern of interdependence might, at least a priori, take an infinite variety of forms» (ibid.).
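The classification of Table 1 can be sketched as follows (illustrative code with toy figures; the function name and labels are ours, assuming the linkage definitions of eq. (5)):

```python
import numpy as np

def chenery_watanabe(F, X):
    """Classify sectors into the four classes of Table 1 from the
    direct linkages of eq. (5): LF_i = sum_j q_ij, LB_j = sum_i a_ij."""
    A = F / X[None, :]                 # a_ij = f_ij / X_j
    Q = F / X[:, None]                 # q_ij = f_ij / X_i
    LF = Q.sum(axis=1)
    LB = A.sum(axis=0)
    classes = []
    for lf, lb in zip(LF, LB):
        if lb > LB.mean():
            classes.append("I - Manufacturing, intermediate" if lf > LF.mean()
                           else "II - Manufacturing, final")
        else:
            classes.append("IV - Primary, intermediate" if lf > LF.mean()
                           else "III - Primary, final")
    return LF, LB, classes

F = np.array([[10.0, 5.0, 2.0], [4.0, 20.0, 6.0], [1.0, 3.0, 15.0]])
X = np.array([30.0, 50.0, 30.0])
LF, LB, classes = chenery_watanabe(F, X)
```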

3 The matrix triangularization

For a similar comparison, in order to «discover a productive structure which is common to all economic systems having a like technology», Simpson and Tsukui (1965) suggested a way of triangularizing an I/O table. Triangularizing a matrix means finding a permutation of both matrix rows and columns such that all non-zero cells lie above (or below) the main diagonal; in economic terms, sectors that sell to others never buy from them. In this way one may «single out a hierarchical pattern in the structures of the economies to be compared from primary to final production sectors» (Cassetti, 1994).

Triangularization algorithms are time consuming^b: as an alternative, Simpson and Tsukui (1965) proposed a reduction of the data table: from the F = [f_ij] matrix, they derived a matrix D = [d_ij] by first erasing the flows lower than an arbitrarily fixed threshold value; then they calculated the two values d_ij = f_ij − f_ji and d_ji = f_ji − f_ij, in order to retain the positive one and erase the other. In this way, they both reduced all two-way interactions between sectors to one-way ones, and reduced the number of non-zero values, and consequently the amount of computing time.
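The reduction may be sketched as follows (illustrative code; note that, as in this toy example, the reduced matrix may still contain circular flows, so triangularization may remain impossible):

```python
import numpy as np

def simpson_tsukui_reduction(F, threshold):
    """Reduce the flow matrix F to one-way net flows: erase flows
    below the threshold, then keep d_ij = f_ij - f_ji when positive."""
    G = np.where(F >= threshold, F, 0.0)    # thresholding
    D = G - G.T                             # net two-way flows
    return np.where(D > 0, D, 0.0)          # retain the positive direction only

F = np.array([[0.0, 8.0, 1.0],
              [3.0, 0.0, 6.0],
              [9.0, 2.0, 0.0]])
D = simpson_tsukui_reduction(F, threshold=2.0)
# the surviving edges 1->2, 2->3, 3->1 still form a cycle
```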
Such a technique is far from being a good representation of an I/O table, since a one-way ranking, performed by simply erasing most of the flows, seems too reductive to give a reasonable picture of the structure of the economy; in addition, it biases the interpretation by constraining it into a totally ordered set of sectors. As a matter of fact, Simpson and Tsukui noticed that a complete triangularization is impossible, so that they accepted a block triangularization, the blocks as a whole being ranked in a hierarchy, but not the industries inside each block, noting that their block triangularization is an a priori one, based upon the physical characteristics of the industries and not upon the tables (p. 437), and that «in the absence of an absolutely triangular and/or completely decomposable table, no unique criterion for the ordering of industries can be derived» (ibid.).

^b An experience with the Korte and Oberhofer (1971) algorithm as implemented by Lucev (1981) on low-order matrices (from 3 × 3 up to 12 × 12), derived by aggregation of the Italian 16-industry 1978 I/O matrix, was carried out (Camiz, 1987). In that context, on a Sperry 1100 mainframe, the computing time was shown to increase as 3^n, n being the matrix order.
Actually, it is impossible to imagine a complete a priori skeleton of the economic system; in addition, it must be understood that a triangularization procedure is just a rough approximation of the actual system structure, since the underlying hypothesis, that the system is totally orderable, is never satisfied. Cassetti (1994) points out that triangularization is performed «at the cost of disregarding a plain evidence of circular relationships», but Chenery and Watanabe (1958) themselves were perfectly aware of it when they wrote that «if there were no circular relations (..) in the economy, it would be possible to arrange the input-output matrix in a triangular form ...» (p. 494). Nevertheless, the myth of triangularization is still alive, although strongly reductive: an example of how technical limits may bias a theory. Actually, the total order of sectors is an ideal, unrealistic pattern.

4 Correspondence analysis of I/O tables

Recently, the use of correspondence analysis (Benzecri, 1973; Hill, 1974; Lebart et al., 1984) was proposed independently by Abbate and Bove (1992) and Mattioli (1993), the former stressing the identity between the rows and columns profiles analyzed by correspondence analysis and the technical coefficients, and the latter emphasizing that the distributional equivalence of the correspondence analysis chi-square distance minimizes the effects of the definition of sectors, since two sectors having similar profiles may be summed up without effects on the analysis, so that the structural analysis may be independent of the I/O model chosen for both purchases and sales (Bon and Bing, 1993).

Correspondence analysis is an exploratory technique aiming at representing both the rows and the columns of a data table in optimally chosen geometrical spaces, such that proximity among rows or among columns in such spaces may be interpreted as similarity of their profiles. The technique, applied to a data table K, is based on the scalar product

$$
(S' S)_{ij} = \sum_{k} S_{ki} S_{kj}.
\qquad (6)
$$

Here (S'S)_ij is the generic term of the matrix product S'S, where S is a centered matrix, obtained from K through centering:

$$
S_{ij} = \frac{K_{ij} - \dfrac{K_{i.} K_{.j}}{K_{..}}}{\sqrt{K_{i.} K_{.j}}},
\qquad (7)
$$

and F_ij = K_ij / K... Since in the case of stochastic independence of rows and columns

$$
K_{ij} = \frac{K_{i.} K_{.j}}{K_{..}},
\qquad (8)
$$

(S'S)_ij measures the covariance of the i-th and j-th columns with respect to stochastic independence.
The eigenanalysis procedure gives the eigenvalues in decreasing order, and it is proved that the corresponding eigenvectors represent the directions in which both the rows and columns profiles of the data table may be best represented, i.e. the first k eigenvectors span the k-dimensional space where the scattering of the points best approximates their scattering in the original n-dimensional space. It is proved that this optimum corresponds to the measure of inertia based on chi-square distances among both rows and columns, i.e. distances corresponding to the given scalar product:

$$
D_{ij} = \sum_{k} \frac{1}{F_{.k}} \left( \frac{F_{ik}}{F_{i.}} - \frac{F_{jk}}{F_{j.}} \right)^2.
\qquad (9)
$$

The chi-square distance has the property of distributional equivalence, which may be expressed as follows: if two rows having identical profiles are summed up, then the column distances are unchanged (and vice versa). In the case of an I/O table this means that rows and columns may be, to a certain extent, freely chosen without significant bias in the results of the analysis. In addition, if one takes into account the final demand and value added vectors, the profiles correspond exactly to the analysis of the technical coefficients' deviation from a mean profile.
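The technique can be sketched as follows (illustrative code, computing eq. (7) through relative frequencies and extracting the factors by singular value decomposition; function and variable names are ours):

```python
import numpy as np

def correspondence_analysis(K, k=2):
    """Sketch of correspondence analysis: SVD of the standardized
    residual matrix of eq. (7); returns principal coordinates."""
    P = K / K.sum()                          # F_ij = K_ij / K..
    r = P.sum(axis=1)                        # row masses F_i.
    c = P.sum(axis=0)                        # column masses F_.j
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = U[:, :k] * sv[:k] / np.sqrt(r)[:, None]   # row principal coordinates
    cols = Vt.T[:, :k] * sv[:k] / np.sqrt(c)[:, None]
    return rows, cols, sv[:k] ** 2           # coordinates and inertias
```

With all axes retained, squared Euclidean distances between row coordinates reproduce exactly the chi-square distances of eq. (9).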
The work of Mattioli (1993) is interesting in that it discusses the ability of correspondence analysis to detect both triangular and block-triangular structures, as hypothesized by Simpson and Tsukui (1965), as revealed through the so-called horse-shoe effect (Guttman, 1953; Greenacre, 1984). In the results of his applications he found a distribution of both rows and columns according to the Chenery and Watanabe (1958) partition. In Mattioli (1993) and in Abbate and Bove (1992), blocks are recognized as those subsets of sectors that can be represented close to each other, in that they have similar row and/or column profiles.

It seems, however, that neither the comparison of the positions on factor spaces of the same sector's profiles, considered once as input and once as output, has a particular meaning in I/O analysis, nor can any information be derived on the reciprocal flows between two sectors, in particular concerning their asymmetry. For this reason, Abbate and Bove (1992) discuss the problem of representing the asymmetry, suggesting suitable exploratory techniques, such as Greenacre's (1978) unfolding for the former and the technique described by Bove and Critchley (1989) for the latter. Actually, both methods fall far short of the main I/O analysis aim, that is, to get sufficient information concerning the whole structure of flows.

5 In search of total and circular relations

The values of the A and Q coefficients indicate the importance of the direct relationships among sectors, because the variations of one sector's production, final demand, or value added have an influence on its purchases and its sales, and we have seen that correspondence analysis may represent proximity in the structure of technical coefficients. Nevertheless, they do not take into account the fact that the effects of interdependence are larger than the direct flow relationships between sectors: if sector i purchases from sector j, and j purchases from k, the variations in the production of i have effects on the production of k too, even if no direct flow exists between k and i.

In order to determine the indirect relationships among sectors, one can derive from equations (1), (2), (3), and (4):

X = AX + Y (10)

X' = X'Q + V' (11)

and

(I − A)X = Y (12)

X'(I − Q) = V' (13)

Both matrices (I − A) and (I − Q) are non-singular, since parallel lines cannot be linearly dependent. In fact, technical coefficients depend upon production technology, and if two parallel lines were equal it would mean that they represent two parts of the same production sector, so that they would rather be fused together. In addition, both A and Q are non-negative, since all terms of F are non-negative, and their diagonal terms are less than 1, since otherwise a sector would absorb all its production (Hawkins and Simon (1949) condition). These conditions holding, the Perron-Frobenius theorem (Gantmacher, 1959) ensures that

0 < μ* < 1, where μ* is the maximum eigenvalue of A, and thus the following equality holds:

$$
(I - A)^{-1} = I + A + A^2 + A^3 + \cdots = \sum_{i=0}^{\infty} A^i.
\qquad (14)
$$

Here a^{(n)}_{ij}, the generic element of A^n, represents the n-th order interaction between two sectors, which tends to zero for n large enough. The same can be said for Q:

$$
(I - Q)^{-1} = I + Q + Q^2 + Q^3 + \cdots = \sum_{i=0}^{\infty} Q^i,
\qquad (15)
$$

so that one can eventually write:

X = (I − A)^{-1} Y (16)

X' = V'(I − Q)^{-1} (17)


Thus, the elements of the matrices Z = (I − A)^{-1} and W = (I − Q)^{-1} may be interpreted as follows: z_ij is the variation of production of the i-th sector due to a unitary variation of the j-th sector's final demand; w_ij is the variation of production of the j-th sector due to a unitary variation of the i-th sector's value added. These matrices thus give a measure of all the interdependences among sectors.

The same partition proposed by Chenery and Watanabe (1958) can be performed using total linkages (Yotopoulos and Nugent, 1973; Jones, 1976), this way considering the whole effects. Based on the Z and W matrices, the forward and backward total linkages are respectively:

$$
LF^{*}_i = \sum_j w_{ij}, \qquad LB^{*}_j = \sum_i z_{ij}.
\qquad (18)
$$

Although the analysis of the inverse matrices is a great step forward in the comprehension of the interdependences among sectors, it must be said again that the inspection of all coefficients is quite a heavy task, and the classification based on total linkages is not at all sufficient for a complete understanding of the structure of an I/O table, exactly as it was for the analysis of the direct matrices.
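These relations can be sketched numerically (toy figures; convergence of the series (14) rests on the maximum eigenvalue of A being below 1, as stated above):

```python
import numpy as np

F = np.array([[10.0,  5.0,  2.0],
              [ 4.0, 20.0,  6.0],
              [ 1.0,  3.0, 15.0]])
X = np.array([30.0, 50.0, 30.0])
A = F / X[None, :]                      # expense coefficients a_ij = f_ij / X_j
Q = F / X[:, None]                      # market shares q_ij = f_ij / X_i

assert np.max(np.abs(np.linalg.eigvals(A))) < 1.0   # Perron-Frobenius condition

Z = np.linalg.inv(np.eye(3) - A)        # eq (16): X = Z Y
W = np.linalg.inv(np.eye(3) - Q)        # eq (17): X' = V' W

LB_total = Z.sum(axis=0)                # total backward linkages, eq (18)
LF_total = W.sum(axis=1)                # total forward linkages

# the series (14) indeed converges to Z
S = sum(np.linalg.matrix_power(A, i) for i in range(200))
assert np.allclose(S, Z)
```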
In order to look for a better investigation method, let us return to equations (6), (14), (15) and think about their meaning: considering that a^{(n)}_{ij} = Σ_k a^{(n-1)}_{ik} a_{kj}, each a^{(n)}_{ij} represents the n-th order interaction between two sectors, i.e. the interaction due to a sequence of n connected sales, starting from sector i up to sector j, so that each coefficient of (I − A)^{-1} summarizes all these interactions.

We may look for a method aiming at revealing and synthesizing these interactions. This is of particular interest; let us consider what happens in the case of circular relations, that is, those cases in which the sequence of sales returns back to the starting sector: it is clear that a variation of demand of only one sector's goods in the sequence has an influence on all the involved sectors. Yan and Ames (1965) studied interactions through the interrelatedness function R, the sum of the diversification and indirect relatedness indexes, based on the order matrix B associated to A. In B, each b_ij indicates the lowest degree n such that a^{(n)}_{ij} ≠ 0 and, given a submatrix of A composed by rows i = (i_1, i_2, ..., i_r) and columns j = (j_1, j_2, ..., j_s), R is

$$
R(i,j) = \frac{1}{rs} \sum_{h=1}^{r} \sum_{k=1}^{s} \frac{1}{b_{i_h j_k}}
= \frac{n_1}{rs} + \sum_{k \ge 2} \frac{n_k}{k\,rs}.
\qquad (19)
$$

The diversification and indirect relatedness indexes are the two terms summed in the last member of (19), where n_k represents the number of the submatrix elements whose value is k. The first index, n_1/rs, represents the proportion of industries to which a given sector either sells or buys; the second summarizes the proportion of indirect relations.
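The order matrix B may be sketched as follows (illustrative code; `order_matrix` is our name, and only the zero/non-zero pattern of A matters):

```python
import numpy as np

def order_matrix(A, nmax=None):
    """Yan-Ames order matrix B: b_ij is the smallest n such that
    (A^n)_ij != 0, from boolean powers of the adjacency pattern of A;
    0 marks 'no relation' (within nmax steps)."""
    n = A.shape[0]
    nmax = nmax or n
    reach = (A != 0)
    B = np.zeros((n, n), dtype=int)
    power = reach.copy()                  # pattern of A^1
    for k in range(1, nmax + 1):
        newly = power & (B == 0)          # first reached at order k
        B[newly] = k
        power = (power.astype(int) @ reach.astype(int)) > 0   # pattern of A^(k+1)
    return B

# a 3-sector cycle: 1 sells to 2, 2 to 3, 3 to 1
A = np.array([[0.0, 0.2, 0.0],
              [0.0, 0.0, 0.3],
              [0.1, 0.0, 0.0]])
B = order_matrix(A)
```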
So far, no emphasis has been given to circular relations among sectors, although it is not wise to ignore such important components of an economic system as industrial blocks with strong circular relations among sectors are, probably because they did not fit properly into the a priori models used up to that time. Actually, only through the use of different mathematical tools may better analysis methods be derived, namely through graph theory, able to reveal the complete structure of the economic system as well as its skeleton, deemed as the set of the strongest sectors and of the main flows among them.

6 Graph theory for I/O analysis

The use of graph theory in I/O analysis may be carried out by representing all sectors as nodes and the commodity flows between two sectors as oriented edges connecting them. Although not explicitly stated, the Yan and Ames (1965) indexes are based on a minimum distance matrix among the nodes of a graph. The first explicit application of graph theory to I/O analysis is due to Ponsard (1967), who introduced signal-flow graphs in this frame: «transition instruments from a matrix analysis to topological interpretation [...], transfert graphs suggest an original formalisation, most elegant and most powerful» (p. 356). Far from being a simple graphical representation of the commodity flows that circulate through the sectors of the economy (which, in the author's opinion, would be enough to make it a most helpful tool; see also Carpano, 1980), graph theory enables one to define the influence relationships between two sectors, by showing the path that goods follow, going from one sector to the other. The meaning is clear: if one sector varies its production, all sectors that provide commodities to it vary their production accordingly, as well as all sectors that provide commodities to the providers, etc. Each variation in one sector is then propagated to the other sectors as a variation of the existing flows among them, that is, along paths composed of one or several edges. We know that the overall measure of variation, namely the global influence, is given by the coefficients of the Z and W matrices, but using flow graphs Lantner (1974) demonstrated that the influence between sectors may be partitioned according to the several paths that connect the two sectors. Given a couple of sectors, origin and destination of an economic stimulus, and all the paths connecting them, each path's influence may be computed. In addition, a demand multiplier is introduced, due to the path itself, as a coefficient that measures the rise of production of a sector per unit increase of final demand (Samuelson, 1987). In this way, a path or a circuit may be considered not only as the juxtaposition of adjacent edges having no effective relation among them, but as the real path of influence circulation among sectors, itself contributing to commodity exchanges.

It seems clear that the idea of studying I/O tables through graph-theory-
related methods has many advantages. In particular, it overcomes the re-
striction given by the Simpson and Tsukui (1965) triangularisation, based on an
a priori model that does not fit the variety of possible table structures (Benzecri,
1973): in fact, a graph may take an infinite variety of patterns. Secondly, it
allows the identification of strong blocks of the economy, as those whose circular
relationships are stronger; related to this, the study of vulnerability (Camiz and
Pucci, 1986; Camiz, 1987), i.e. the identification of the sectors most susceptible
to being cut off from the said industrial blocks, helps in the detection of strategic sectors.

For this reason, an exploratory technique in some sense connected with this
model may be designed, since the currently available techniques seem limited
at most to a comparison of direct flows between each pair of sectors
(Bove and Critchley, 1989; Abbate and Bove, 1992).

In Camiz and Pucci (1986) and Camiz (1987) an outline was given of the
use of graph theory for I/O table analysis, and Camiz (1993) describes a com-
puter program able to give a complete description of an economy's structure,
based on industrial blocks, and to analyze each block's structure in order to de-
termine the strategic importance of industries within them, in connection with
the striking power and vulnerability analysis.

7 Graph theory concepts

An (oriented) graph is a pair (N, E), where the elements of the set N are called
nodes and those of the set E are ordered pairs of nodes, called (oriented) edges
(in the following the word oriented will be dropped, should no ambiguity arise):
if e = (x, y) is such an edge, we shall say that e leaves x and enters y (or that
e connects x and y if the orientation is not taken into account). A graph is
usually represented by a drawing where nodes are dots or circles and edges are
straight or curved lines, with an arrow indicating the edge orientation, if any.
Two nodes are called adjacent if there exists an edge that connects them; a
node is adjacent to an edge (or vice versa) if the edge enters or leaves it; and
two edges are said to be adjacent if there exists a node adjacent to both. We
define as well:

(1) a chain connecting two nodes x and y: a totally ordered set of edges, each
one adjacent to the next, such that x is adjacent to the first edge and
y is adjacent to the last edge of the set, regardless of each edge's orientation;

(2) a path leading from x to y: a chain connecting x and y such that each
edge of the chain enters the node that the subsequent edge leaves;

(3) a circuit through x and y: a path leading from any node to itself,
containing x and y as nodes adjacent to some edge of the path.

We define the length of a chain (path, circuit) as the number of edges that com-
pose it. Given a path leading from x to y, we say that x is an ascendant of y
and y is a descendant of x. We will call them direct ascendant or descendant if
the shortest path length is 1. We define the inner degree of a node as the number
of its direct ascendants, and the outer degree as the number of its direct
descendants. On the basis of degree we can classify the nodes according to their
relative position in the graph, thus defining the node centrality:

(1) isolated: a node with no adjacent edges;


(2) source: a node with zero inner degree, that is with no entering edge;

(3) anti-periphery: a node with minimum inner degree;

(4) anti-center: a node with maximum inner degree;

(5) center: a node with maximum outer degree;

(6) periphery: a node with minimum outer degree;



(7) sink: a node with zero outer degree, that is with no leaving edge.
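These centrality attributes follow directly from the two degree counts, as the following sketch shows (Python for illustration only; the four-node graph is a hypothetical toy example, and a node may carry several attributes at once):

```python
def degree_classes(nodes, edges):
    """Compute each node's inner degree (entering edges) and outer
    degree (leaving edges), then assign the centrality attributes
    defined in the text."""
    inner = {v: 0 for v in nodes}
    outer = {v: 0 for v in nodes}
    for x, y in edges:
        outer[x] += 1
        inner[y] += 1
    max_in, min_in = max(inner.values()), min(inner.values())
    max_out, min_out = max(outer.values()), min(outer.values())
    cls = {v: [] for v in nodes}
    for v in nodes:
        if inner[v] == 0 and outer[v] == 0:
            cls[v].append('isolated')
        if inner[v] == 0:
            cls[v].append('source')          # no entering edge
        if inner[v] == min_in:
            cls[v].append('anti-periphery')  # minimum inner degree
        if inner[v] == max_in:
            cls[v].append('anti-center')     # maximum inner degree
        if outer[v] == max_out:
            cls[v].append('center')          # maximum outer degree
        if outer[v] == min_out:
            cls[v].append('periphery')       # minimum outer degree
        if outer[v] == 0:
            cls[v].append('sink')            # no leaving edge
    return inner, outer, cls

# Toy graph: node 1 feeds 2 and 3, node 2 feeds 3, node 3 feeds 4.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
inner, outer, cls = degree_classes(nodes, edges)
```

Here node 1 comes out as a source and the center, node 3 as the anti-center, and node 4 as the sink; in a simple graph the edge counts coincide with the numbers of direct ascendants and descendants used in the text.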

In a graph the following are equivalence relations between two nodes x, y:

(1) weak connection: there exists a chain connecting x and y, or x and y are
the same;

(2) strong connection: there exists a circuit through x and y, or x and y are
the same.

Clearly, strong connection is a subset of weak connection, so that the
partition induced by the strong relation is a refinement of the one induced by
the weak. The partition components containing more than one node are called
weakly connected components (or simply components) and strongly connected
components (or simply blocks), respectively. For each node of the graph, we can
define a degree of connection corresponding to the type of component it belongs
to: it can be strongly or weakly connected, or merely disconnected (isolated
node). Focusing on a node x, based on its relations with all the others, one may
partition them into five classes:

(1) strongly linked: these are the nodes connected to x through a circuit: with
x they form a block;

(2) ascendants: these are the nodes connected to x through a path leading to
x;

(3) descendants: these are the nodes reachable from x through a path origi-
nating from x;

(4) side connected: these are the nodes unreachable from x that, belonging to
the same graph component, are nevertheless linked to x through a chain;

(5) disconnected: these nodes do not belong to the same component as x, thus
no linkage exists between them and x.

This partition may help in describing in greater detail both the position
and the importance of a node within the graph. In fact, the existence of strongly
linked nodes means that those nodes belong to a strong block; the number of
ascendants and descendants may help in understanding the hierarchical position
of the node. Nevertheless, a better approach may be taken through the quotient graph.
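The five-class partition around a reference node x reduces to three reachability sweeps: along the edges (descendants), against them (ascendants), and ignoring orientation (the weak component). A minimal sketch (Python for illustration; the six-node graph is a hypothetical example):

```python
def classify_relative(nodes, edges, x):
    """Partition all nodes other than x into the five classes of the
    text, using forward, backward, and undirected reachability."""
    succ = {v: set() for v in nodes}   # successors (edge leaves v)
    pred = {v: set() for v in nodes}   # predecessors (edge enters v)
    und = {v: set() for v in nodes}    # neighbours, orientation ignored
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)
        und[a].add(b)
        und[b].add(a)

    def reach(nbrs, s):
        # Nodes reachable from s by a non-empty path.
        seen, stack = set(), [s]
        while stack:
            v = stack.pop()
            for w in nbrs[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    desc = reach(succ, x)              # reachable from x
    asc = reach(pred, x)               # from which x is reachable
    comp = reach(und, x) | {x}         # weak component of x
    out = {}
    for v in nodes:
        if v == x:
            continue
        if v in desc and v in asc:
            out[v] = 'strongly linked'  # on a circuit with x
        elif v in asc:
            out[v] = 'ascendant'
        elif v in desc:
            out[v] = 'descendant'
        elif v in comp:
            out[v] = 'side connected'
        else:
            out[v] = 'disconnected'
    return out

# Toy graph: 1 and 2 form a circuit; 3 feeds 1 and 5; 2 feeds 4; 6 is isolated.
result = classify_relative([1, 2, 3, 4, 5, 6],
                           [(1, 2), (2, 1), (3, 1), (2, 4), (3, 5)], 1)
```

With x = 1, node 2 is strongly linked, 3 an ascendant, 4 a descendant, 5 side connected, and 6 disconnected, exactly matching the five classes above.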

8 Quotient graphs

Given a graph, one may build the quotient graph with respect to the relation of
strong connection (which we shall call the condensed graph, obtained by collapsing
the original one), considering two nodes equivalent if they belong to the same
strong component, and two edges equivalent if they leave equivalent nodes and
enter equivalent nodes.
After collapsing, all nodes belonging to a strongly connected component
are merged into a single node, all edges connecting equivalent nodes disappear,
and all equivalent edges are merged into a single edge connecting the two
resulting nodes. Of course, different orientations of edges result in the presence
of two opposite edges.
The usefulness of the condensed graph is due to the fact that it has no circuits:
for this reason a level hierarchy can be defined, starting from the sources and
moving to the sinks, by defining each node's level as the length of the longest
path leading to it from a source node. One could as well move in the opposite
direction and find a slightly different hierarchy, due to the fact that two paths
leading from source to sink may have different lengths.
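The condensation and the level hierarchy it allows can be sketched as follows (a standard two-pass scheme, Kosaraju's algorithm, is used here for the strongly connected components; the five-node graph in the usage example is hypothetical):

```python
def condense(nodes, edges):
    """Strongly connected components and the quotient (condensed)
    graph they induce, via Kosaraju's two-pass algorithm."""
    succ = {v: [] for v in nodes}
    pred = {v: [] for v in nodes}
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)
    order, seen = [], set()

    def dfs(v):
        seen.add(v)
        for w in succ[v]:
            if w not in seen:
                dfs(w)
        order.append(v)                # record finishing order

    for v in nodes:
        if v not in seen:
            dfs(v)
    comp = {}
    for v in reversed(order):          # sweep the transposed graph
        if v not in comp:
            comp[v] = v
            stack = [v]
            while stack:
                u = stack.pop()
                for w in pred[u]:
                    if w not in comp:
                        comp[w] = v
                        stack.append(w)
    qnodes = set(comp.values())
    qedges = {(comp[a], comp[b]) for a, b in edges if comp[a] != comp[b]}
    return comp, qnodes, qedges

def levels(qnodes, qedges):
    """Node level in the circuit-free condensed graph: length of the
    longest path reaching it from a source, by a topological sweep."""
    indeg = {v: 0 for v in qnodes}
    succ = {v: [] for v in qnodes}
    for a, b in qedges:
        succ[a].append(b)
        indeg[b] += 1
    lvl = {v: 0 for v in qnodes}
    queue = [v for v in qnodes if indeg[v] == 0]   # sources sit at level 0
    while queue:
        v = queue.pop()
        for w in succ[v]:
            lvl[w] = max(lvl[w], lvl[v] + 1)
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return lvl

# Toy graph: blocks {1,2} and {3,4}, then a tail node 5.
comp, qnodes, qedges = condense([1, 2, 3, 4, 5],
                                [(1, 2), (2, 1), (2, 3),
                                 (3, 4), (4, 3), (4, 5)])
lvl = levels(qnodes, qedges)
```

In the toy example the two circuits collapse into single condensed nodes at levels 0 and 1, with node 5 at level 2; walking the levels in the opposite direction would, as noted above, give a slightly different hierarchy.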

9 Striking power and vulnerability

Let us consider a block and ask what happens to its structure if we extract
an edge e = (x, y): two cases may arise, according to whether or not there exists
a path leading from x to y that does not contain e. If such a path exists, then
all chains, paths, and circuits originally containing e can be modified by
introducing this path; otherwise, the connection structure of the whole component
is modified, in the sense that some of the nodes will lower their degree of
connection due to the elimination of the edge e. In the latter case, we shall say
that the edge e has striking power over the component and over all the nodes that
lower their degree of connection. These nodes will be defined as being vulnerable
to e. In the same way a node's striking power may be defined, as the effect of
the extraction of a node, equivalent to the simultaneous extraction of all its
adjacent edges.
If attention is drawn to a strong block, we can say that the striking power
is:

(1) null if no consequence derives on the block structure from the edge or node
removal;

(2) weak if some of the block's nodes are no longer strongly connected, but
the reduced block still exists;

(3) medium if the block is split into two or more blocks, plus some extra
nodes set apart, if any;

(4) strong if the whole block is destroyed, in the sense that no circuit
exists any longer among the block's nodes.

In the same way, a node is said to be weakly vulnerable if it remains weakly
or side connected with the remaining block after edge or node removal, and
strongly vulnerable if it gets disconnected.
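The four-way classification of an edge's striking power over a strong block can be tested mechanically: remove the edge, recompute the strongly connected components, and compare. The brute-force sketch below (Python for illustration; the test graphs are hypothetical, and the quadratic reachability check is adequate only for the small blocks of an I/O graph) assumes the input nodes form a single strong block:

```python
def sccs(nodes, edges):
    """Non-trivial strongly connected components (more than one node),
    found by pairwise mutual reachability."""
    succ = {v: set() for v in nodes}
    for a, b in edges:
        succ[a].add(b)

    def reach(s):
        # Nodes reachable from s by a non-empty path.
        seen, stack = set(), [s]
        while stack:
            v = stack.pop()
            for w in succ[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    r = {v: reach(v) for v in nodes}
    comps, done = set(), set()
    for v in nodes:
        if v not in done:
            # u and v share a block iff each reaches the other.
            c = frozenset(u for u in nodes if u in r[v] and v in r[u])
            if len(c) > 1:
                comps.add(c)
            done |= c | {v}
    return comps

def edge_striking_power(nodes, edges, e):
    """Effect of removing edge e from a graph whose nodes form one
    strong block: 'null', 'weak', 'medium', or 'strong'."""
    before = sccs(nodes, edges)
    after = sccs(nodes, [d for d in edges if d != e])
    if after == before:
        return 'null'      # the block survives unchanged
    if not after:
        return 'strong'    # no circuit remains at all
    if len(after) > 1:
        return 'medium'    # the block splits into several blocks
    return 'weak'          # a reduced block still exists
```

For instance, removing a chord of a circuit that can be bypassed yields 'null', while removing the only return edge of a two-node block yields 'strong'.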

10 Analysis of an input/output matrix through associated graph

Given an input/output matrix F, a graph may be associated to it, by con-
sidering each sector as a node, and each flow of commodities from sector i to
sector j as an edge leaving node i and entering node j.
For the sake of simplicity, an adjacency matrix M may be built, where a value
of 1 in the ij-th position means the existence of an edge leading from node i
to node j, that is, the existence of a flow of commodities from sector i to sector
j. In this case the powers M^h have as ij-th entry the number of h-length
paths leading from node i to node j (Harary, 1972), dropping the information about
the amount of flows, and we may build a minimum distance or shortest path
matrix D, where d_ij represents the number of edges of the shortest path going from
node i to node j. In fact, given an input/output matrix F, one may build
several boolean matrices M_a, derived from F using different threshold values
a: for a fixed a, we have then

    m_ij = 1  if f_ij > a
    m_ij = 0  otherwise          i, j = 1, ..., n        (20)

In this way a graph is built, where each node corresponds to an industrial
sector and each edge to a flow between sectors greater than a. The use of
several graphs based on different threshold values reduces the loss due to the
qualitative analysis, since a study based on increasing thresholds takes into
account a series of graphs whose edges correspond, at each step, to higher
flows among sectors, so that eventually only the highest flows
among the strongest sectors are taken into account.
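The construction of the boolean matrix from F (eq. 20) and of the shortest-path matrix D is straightforward; a sketch in Python, with a hypothetical 3-sector flow matrix and threshold:

```python
from collections import deque

def boolean_matrix(F, a):
    """m_ij = 1 if f_ij > a, 0 otherwise (eq. 20 in the text)."""
    n = len(F)
    return [[1 if F[i][j] > a else 0 for j in range(n)] for i in range(n)]

def distance_matrix(M):
    """d_ij = number of edges of the shortest path from i to j
    (None when j is unreachable from i), by BFS from each node."""
    n = len(M)
    D = [[None] * n for _ in range(n)]
    for s in range(n):
        D[s][s] = 0
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if M[i][j] and D[s][j] is None:
                    D[s][j] = D[s][i] + 1
                    queue.append(j)
    return D

# Hypothetical flows (illustrative values only) and threshold a = 1:
# only the flows strictly greater than 1 survive as edges.
F = [[0, 5, 1],
     [0, 0, 7],
     [2, 0, 0]]
M = boolean_matrix(F, 1)
D = distance_matrix(M)
```

With this threshold the surviving edges form the circuit 1 -> 2 -> 3 -> 1, so for instance the distance from sector 1 to sector 3 is 2; raising a further would progressively strip the weaker edges, as described above.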
The concepts defined in the previous section can be applied to industrial
structures: here each sector is a node and each flow greater than a is an
edge; consequently, the analysis of the graph structure reflects the structure of
the economy, once only flows greater than a are considered. The importance of a

sector in the economy may then be analyzed, by considering inner and outer
degrees and centrality attributes, as well as the kind of linkages that exist
between each node and all the others, the membership in a block and its level in
the hierarchy being pictured by the condensed graph. In particular, the position
of blocks is clearly shown.
The interest of blocks is due to the strong connection among the inner sectors
and the weak connection with the outer ones, so that much stronger influences
operate within the blocks than between them. It is evident that sectors belonging
to the same block mutually sustain each other, due to their circular relationships.
Nevertheless, not all sectors in a block are alike: to determine the relative
importance among a block's inner sectors one may evaluate the sectors' centrality,
now limited to the edges connecting inner nodes, but even more by evaluating the
striking power of edges and nodes and the vulnerability of nodes, for these mark
those edges or sectors whose removal from a block (due to the vanishing of some
flows or to their dropping below the threshold value) causes the detachment of
vulnerable nodes or, even worse, the complete annihilation of the block itself.
Special care must be used in considering striking power and vulnerability,
since their meaning strongly depends on the choice of a. Actually, vulnerability
may result even when the alternative path to a removed edge involves flows only
slightly below a, so that a real vulnerability analysis makes sense only for a = 0,
unless the threshold value used can be given a clear economic meaning.

11 Application to 1988 Italian table

As an example, let us consider the Italian 1988 I/O table (Istat, 1992), reduced
to 16 sectors, to be used to compare the different descriptive techniques. The
sectors and their labels, used in tables and graphics, are listed in Table 2. This
choice was made considering the previously described triangulation experiences
(Camiz, 1987): the dramatic improvement of hardware allowed us to triangu-
larize the 16-sector matrix (without manipulation) on a Unix Convex parallel
machine in about 43": lightning-fast compared with the estimated 4 h 30' in 1987,
but still too long to consider triangularizing the original 44-sector table.
The table is quite full: only 43 off-diagonal flows are zero, the other
flows ranging from .124 to 303865.22 billion liras, with an average of
3619.6373 billion liras (standard deviation 19838.161). The distribution is
strongly skewed, with median 681.464 and only 30 flows above the mean value.
Both the original table and the triangularized one are shown in Figure 1 (A) and
(B): the sequence of sectors shows first those that mostly sell commodities to
other sectors, such as Energy production, Rescue and restoring, Communi-
cation, Entreprises supplies, and House rentals, through the buyers or anyway

Table 2: The 16 sectors of Italian 1988 table and their labels

1 Agrzoofi Agriculture, livestock, and fishing


2 EnerProd Energy production
3 Tranprod Transformation production
4 Building Building and public constructions
5 Rescrepa Rescue and restoring
6 Tradserv Trade supplies
7 Hotelpub Hotels, restaurants, and bars
8 Transpor Transportation and transport supplies
9 Communic Communication
10 Bankinsu Banking, insurance
11 Entrserv Entreprises supplies
12 Housrent House rentals
13 Resescho Research and education
14 Privheal Private health
15 Amuscult Entertainment and culture
16 Nosalser Non-sale supplies

mostly oriented to final demand, such as Building and public constructions,
Hotels, restaurants, and bars, Private health, and Non-sale supplies. In the middle,
Entertainment and culture, Banking and insurance, Transportation, Trade sup-
plies, Research and education, Agriculture, livestock, and fishing, and Trans-
formation production seem to be the intermediate sectors, the most integrated both
as sellers and buyers. In Figure 1 (C) the same triangularized matrix is shown,
having set to zero all flows whose value is lower than the mean (c).
In Figure 2 classification tables are shown, based on both direct and to-
tal linkages: the subdivision is made into four groups, according to the
linkage strength (Table 1). Together with the linkages, both backward and
forward variances are reported, as well as each sector's rank according to each
index. The main difference between the two tables is that the Transportation
sector belongs to the first group for direct linkages (strong backward-strong
forward) whereas, based on total linkages, it moves to the fourth one (weak
backward-strong forward), meaning reduced indirect effects of the Transportation
sector's purchases. The classification sets in the Manufacturing-intermediate

(c) An independent triangularization of this reduced matrix was attempted, expecting a
different sector sequence. Unfortunately, the high number of zeros slows the procedure
down so much that no results were obtained after a few hours of computation.
Figure 1: The Italian 1988 input/output table reduced to 16 sectors. A) Original table; B) Triangularized table; C) Triangularized table with threshold (flows lower than the mean set to zero).

-----------------------------------------------------------------------------
A) DIRECT LINKAGES (sector rank under each index in parentheses)
-----------------------------------------------------------------------------
                          linkages                       variances
 n.  sector        backward        forward        backward       forward
-----------------------------------------------------------------------------
strong backward - strong forward
  3  Tranprod    .6332  ( 2)    .5745  ( 7)    2.471  ( 5)    2.740  ( 6)
  5  Rescrepa    .4645  ( 6)    .5858  ( 6)    3.198  ( 1)    2.185  ( 8)
  8  Transpor    .4729  ( 5)    .5741  ( 8)    1.319  (12)    1.873  ( 9)
 10  Bankinsu    .9242  ( 1)    .9381  ( 1)    2.913  ( 3)    2.780  ( 5)
strong backward - weak forward
  4  Building    .4792  ( 4)    .1824  (13)    2.669  ( 4)    1.332  (14)
  7  Hotelpub    .5014  ( 3)    .1233  (14)    2.072  ( 8)    1.335  (13)
weak backward - strong forward
  1  Agrzoofi    .3968  ( 7)    .8003  ( 4)    2.090  ( 7)    2.855  ( 4)
  2  EnerProd    .3344  ( 8)    .8113  ( 3)    2.929  ( 2)    1.541  (10)
  9  Communic    .2411  (12)    .6297  ( 5)    1.043  (14)    1.358  (12)
 11  Entrserv    .2147  (14)    .9191  ( 2)    1.035  (15)    1.314  (15)
weak backward - weak forward
  6  Tradserv    .2726  ( 9)    .3020  (11)    .9447  (16)    2.210  ( 7)
 12  Housrent    .1532  (16)    .1839  (12)    1.919  ( 9)    1.539  (11)
 13  Resescho    .2517  (11)    .3727  ( 9)    2.091  ( 6)    3.054  ( 3)
 14  Privheal    .1666  (15)    .0034  (15)    1.834  (10)    4.000  ( 2)
 15  Amuscult    .2148  (13)    .3394  (10)    1.130  (13)    1.217  (16)
 16  Nosalser    .2705  (10)    .0001  (16)    1.510  (11)    4.000  ( 1)
-----------------------------------------------------------------------------
B) TOTAL LINKAGES
-----------------------------------------------------------------------------
strong backward - strong forward
  3  Tranprod    2.347  ( 2)    2.139  ( 8)    2.931  ( 5)    3.236  ( 3)
  5  Rescrepa    2.048  ( 3)    2.190  ( 6)    2.246  (14)    2.079  (13)
 10  Bankinsu    4.121  ( 1)    4.514  ( 1)    2.901  ( 7)    2.632  ( 8)
strong backward - weak forward
  4  Building    2.045  ( 4)    1.290  (13)    2.215  (15)    3.164  ( 4)
  7  Hotelpub    2.001  ( 5)    1.261  (14)    2.109  (16)    3.149  ( 5)
weak backward - strong forward
  1  Agrzoofi    1.799  ( 7)    2.756  ( 4)    2.720  (10)    2.354  (10)
  2  EnerProd    1.574  ( 8)    2.787  ( 3)    3.374  ( 1)    2.031  (14)
  8  Transpor    1.919  ( 6)    2.157  ( 7)    2.381  (13)    2.289  (12)
  9  Communic    1.455  (11)    2.286  ( 5)    2.709  (11)    1.812  (15)
 11  Entrserv    1.403  (13)    3.077  ( 2)    2.970  ( 4)    1.584  (16)
weak backward - weak forward
  6  Tradserv    1.491  (10)    1.583  (11)    2.781  ( 8)    2.719  ( 7)
 12  Housrent    1.303  (16)    1.353  (12)    3.033  ( 2)    2.914  ( 6)
 13  Resescho    1.421  (12)    1.868  ( 9)    2.777  ( 9)    2.330  (11)
 14  Privheal    1.353  (15)    1.009  (15)    2.918  ( 6)    3.960  ( 2)
 15  Amuscult    1.375  (14)    1.588  (10)    3.012  ( 3)    2.607  ( 9)
 16  Nosalser    1.529  ( 9)    1.000  (16)    2.581  (12)    4.000  ( 1)
-----------------------------------------------------------------------------

Figure 2: Classification of sectors according to A) direct and B) total linkages.



group Banking and insurance, Transformation production, Rescue and restor-
ing, and Transportation (only for direct linkages); in the Manufacturing-final
group there are Building and public constructions and Hotels, restaurants, and
bars; in the Primary-final group there are Trade supplies, Research and educa-
tion, Entertainment and culture, Non-sale supplies, House rentals, and Private
health; in the Primary-intermediate group are set Agriculture, livestock, and
fishing, Energy production, Entreprises supplies, Communication, and Trans-
portation (only for total linkages). In the scatter diagrams shown in Figures
3 and 4 each sector is represented according to its backward (vertical) and
forward (horizontal) linkage values. In this way, the position of the sectors may
be better understood. It is easily seen that the main difference between direct
and total linkages is represented by the position of Banking and insurance,
whose total linkage values are by far higher than all the others.
Correspondence analysis was performed adding a Value added row and a
Final demand column, in order to get the total production table marginals, so
that the row and column profiles correspond exactly to the expense (columns)
and market share (rows) technical coefficients. Three axes explain more than
91% of the total table variation, the canonical correlations between row and
column factors being .73, .69, and .33 respectively. The first axis receives its
major contributions from the Final demand (54.5%) and Banking and insurance
(15.3%) columns and the Value added (33.7%), Non-sale supplies (18.3%), and
Banking and insurance (14.1%) rows. Along the first axis, all column sectors
(on the negative side) are opposed to Final demand (positive side), on the
negative side being the rows corresponding to Agriculture, livestock and fishing,
Energy production, Banking and insurance, Entreprises supplies, and Value added.
To the second axis both the Banking and insurance row and column contribute
most highly (77.3% and 82.2%, respectively). The Banking and insurance and
Final demand columns are on the negative side of the axis, opposed to all the
others, whereas on the positive side the Agriculture, livestock and fishing,
Transformation production, and Value added rows are opposed to all the others.
To the third axis the highest contributions are those of both the Transformation
production row and column (45.2% and 53.8%, respectively).
Comparison of the correspondence analysis results with triangularization does
not seem to show any relationship, whereas the factor coordinates along the first
two axes seem to have some agreement with the linkage rankings: in particular,
the rows' pattern along the first axis reproduces rather well the ranking of
forward linkages, whereas the columns' pattern along the second axis reflects
sufficiently well the backward linkages.
In Figure 5 the correspondence analysis scatter diagrams are represented, with
the sectors viewed once as columns, corresponding to backward linkages (left
column), and once as rows (forward linkages, right column). It is clear the

Figure 3: Representation of sectors according to forward (horizontal) and backward (vertical) direct linkages.

Figure 4: Representation of sectors according to forward (horizontal) and backward (vertical) total linkages.


Figure 5: Representation of backward (left) and forward (right) linkages on the correspon-
dence analysis planes spanned by factors 1-2, 1-3, and 2-3.

opposition of the Final demand column to all the other sectors on the plane
spanned by the first two axes, with the backward linkages of Banking and
insurance opposed to all the others on the second axis. The forward linkages of
both Entreprises supplies and Banking and insurance are likewise opposed to all
the other rows on the same plane, which form a kind of seriation. The third axis
clearly separates strong backward linkages from weak ones.
The graph analysis was performed first without threshold, then considering
only the 30 flows greater than the mean value. In the first case a strong block
composed of 15 sectors is the ascendant of the only sink, corresponding to the
Non-sale supplies sector. Within the block, most sectors are directly adjacent to
most of the others; only Non-sale supplies and the Private health sector, the
latter weakly vulnerable to its only reachable sector, namely Agriculture,
livestock and fishing, are not fully integrated in the strong economic structure,
and Research and education and Private health are the only sectors within the
block with fairly low outer degree. The graphical representation of the strong
block structure is difficult to read, since too many crossing edges would have to
be represented.
If we consider the graph limited to the 30 greatest flows, the structure
is much more varied, showing a pattern of higher interest. At this level, three
sectors are isolated, namely Research and education, Private health, and En-
tertainment and culture, and a strong block of 9 sectors results, the collapsed
graph being as shown in Figure 6 (left): Energy production and Communication
are both sources, both reaching the strong block A and, through it, both
sinks, Hotels, restaurants and bars and Non-sale supplies. The latter is
also reachable directly from Energy production. Within the block (Figure 6,
right), Transformation production is by far the most important sector, rep-
resenting both the graph's center and anti-center and having strong striking
power over the block itself. Other sectors having weak striking power are
Building and public constructions over House rentals, Trade supplies over both
Building and public constructions and House rentals, Entreprises supplies over
Banking and insurance, and House rentals over Building and public constructions.
The importance of Transformation production is stressed by considering that
of the 11 edges having weak striking power, 8 are adjacent to it: four are entering,
from Agriculture, livestock and fishing, Rescue and restoring, Trade supplies,
and Communication, and four are exiting, to Agriculture, livestock and fish-
ing, Rescue and restoring, Trade supplies, and Transportation; the other three
edges connect Building and public constructions with House rentals, Entreprises
supplies with Banking and insurance, and House rentals with Trade supplies.
The inspection of Figure 6 representation of A block, adds interesting con-
sideration to what was exposed up to now. In particular, the weak position of
sectors 10 (Banking and insurance) and 12 (House rentals), the latter included
Figure 6: The graph corresponding to the 30 flows greater than the mean. Left: collapsed
graph; right: the structure of block A.
in the circuits 3 → (11) → 4 → 12 → 6 → 3 and 3 → 11 → 4 → 12 → 6 → 3,
and the former included in the circuit 3 → 11 → 10 → (6) → 3 (the
sectors within parentheses may be bypassed). It may be noted that sector 11
(Entreprise supplies) has only one entering edge, from Transformation pro-
duction, whereas sector 6 (Trade supplies) has only one exiting edge, to Banking
and insurance.

12 Conclusions

Both triangularization and linkages tables are easily understandable, being
well-known I/O table analysis tools. Triangularization is strongly limited in
its application by the number of sectors and by the amount of zeros due to the
use of thresholds, so that one cannot expect to use it with ordinary tables.
In addition, it gives a very approximate picture of the true table structure,
completely ignoring the existence of circuits of flows. Correspondence analysis
does not seem to add much to linkages tables, so that its use may be limited
to an academic exercise, whereas linkages classification tables, in particular
those based on total linkages, seem much more appropriate for an investigation
based on technical coefficients. The usual ability of correspondence analysis to
provide graphical representations is of limited value here, since graphics based
on linkages are easier to understand.
The complete graph analysis, together with graph representation, provides
the user with a very powerful tool for the investigation of the structure of
input/output tables: in particular, in comparison with triangularization it is
much faster and able to give details of the true table structure, rather than
forcing the representation into an abstract, totally ordered model. It is evident
that, in comparison with the other techniques, only this one may give
information on the circuits of flows, otherwise impossible to detect. Striking
power and vulnerability analysis add to this study an idea of the dynamic
effects underlying input/output analysis. The use of thresholds gives a kind of
zooming effect, outlining the flows of major interest. The choice of the mean
as a threshold is not a must, although it makes the representation of the
major flows more readable. Further investigation of the distribution of the
coefficients, as well as of their meaning, is needed in order to suggest other
suitable threshold values. If critical thresholds could be identified, then in
combination with vulnerability analysis they might provide information on the
effects of technological innovation on the variation of the strong block
structure, in particular concerning the possible decrease below the threshold of
some strategic flows. If the thresholds were independent of the particular table
being studied, comparisons among tables of different countries could be made.
The improvement of graphical representation tools, although expected to be
rather difficult, may ease the readability of the graph structure, in particular
that of very complicated blocks.

Acknowledgments

The author is most indebted to Antonello Pucci, who helped in the initial
formulation of the graph methods in 1983. Thanks are due to both Marco Martini
and Silvana Stefani, who strongly encouraged and supported the present work.
The graph analysis programs were developed with the participation of Mariano
Patane, Marco Cellucci, and Roberto Granato.

References

Abbate, C.C. and G. Bove, 1992. «Modelli multidimensionali per l'analisi di
tavole input/output» Atti delle seconde giornate di studio su «Avanzamenti
metodologici e statistiche ufficiali», Istat, Roma.

Benzécri, J.P., 1973. L'analyse des données. Tome II: L'analyse des
correspondances. Dunod, Paris.

Bon, R. and X. Bing, 1993. «Comparative Stability Analysis of Demand-side and
Supply-side Input-Output Models in the UK» Applied Economics, 25: p. 75-79.

Bove, G. and F. Critchley, 1989. «Sulla rappresentazione di prossimità
asimmetriche» Atti delle giornate di studio del gruppo italiano aderente
all'IFCS, Società Italiana di Statistica, La Palma Editrice, Palermo.

Camiz, S., 1987. «The Analysis of Graph Structure as a Method for the Analysis
of the Economy Input/Output Tables» In: Heiberger, R.M. (ed.), Computer
Science and Statistics. Proceedings of the 19th Symposium on the Interface.
American Statistical Association: p. 169-178.

Camiz, S., 1993. «The Analysis of Input/Output Matrices through associated
Graphs» Atti del XVII Convegno A.M.A.S.E.S.: p. 287-299.

Camiz, S., 1994. «Strumenti metodologici per l'analisi dell'offerta» In:
S. Camiz and S. Stefani (eds.), Metodi di analisi e modelli di localizzazione
dei servizi urbani. Franco Angeli, Milano: p. 88-102.

Camiz, S. and A. Pucci, 1986. «Vulnerability and Striking Capacity in Strongly
Connected Components of Digraphs: an Application to Input/Output Analysis»
COMPSTAT'86, short communications, Dipartimento di Statistica, Probabilità e
Statistiche applicate, Università "La Sapienza", Roma: p. 43-44.

Carpano, M.J., 1980. «Automatic Display of Hierarchized Graphs for Computer
Aided Decision Analysis» IEEE Transactions on Systems, Man and Cybernetics,
SMC-10(11): p. 705-715.

Cassetti, M., 1994. «The Identification of the most Important Interindustry
Linkages of the European Economies through a Method for Ordering the
Input-Output Coefficients» Dipartimento di Scienze Economiche, Università di
Brescia, Discussion Paper n. 9401.

Chenery, H.B. and T. Watanabe, 1958. «International Comparisons of the
Structure of Production» Econometrica, 26(4): p. 487-508.

De Antoni, F. (ed.), 1982. Tavola rotonda su «I fondamenti dell'analisi dei
dati» Istituto di Statistica dell'Università di Roma, CISU.

Folloni, G., 1983. «Una tipologia delle caratteristiche settoriali di
attivazione e di dipendenza» Note economiche, 4: p. 54-73.

Gantmacher, F.R., 1959. The Theory of Matrices, vol. II. Chelsea Pub. Co., N.Y.

Greenacre, M.J., 1978. «Quelques méthodes objectives de représentation d'un
tableau de données» Thèse de doctorat de 3ème cycle, Université Pierre et
Marie Curie, Paris.

Greenacre, M.J., 1984. Theory and Applications of Correspondence Analysis.
Academic Press, London.

Guttman, L., 1953. «A Note on Sir Cyril Burt's Factorial Analysis of
Qualitative Data» British Journal of Statistical Psychology, 6: p. 1-4.

Harary, F., 1972. Graph Theory. Addison-Wesley, Reading, Mass.

Hawkins, D. and H.A. Simon, 1949. «Note: Some Conditions of Macroeconomic
Stability» Econometrica, 17(3): p. 245-248.

Hill, M.O., 1974. «Correspondence Analysis: a Neglected Multivariate Method»
Applied Statistics, 23: p. 340-354.

Istat, 1992. «Tavola intersettoriale dell'economia italiana per l'anno 1988
(versione a 44 branche)» Suppl. Bollettino mensile di statistica, 1992.

Jones, L.P., 1976. «The Measurement of Hirschmanian Linkages» Quarterly
Journal of Economics, 90: p. 323-333.

Korte, B. and W. Oberhofer, 1971. «Triangularizing Input-Output Matrices and
Structures of Production» European Economic Review, 3: p. 493-521.

Lantner, R., 1974. Théorie de la dominance économique. Dunod, Paris.

Lebart, L., 1984. «Correspondence Analysis of Graph Structures» Demandez le
Programme, 2(1-2): p. 5-19.

Lebart, L., A. Morineau and K.M. Warwick, 1984. Multivariate Descriptive
Statistical Analysis: Correspondence Analysis and Related Techniques for Large
Matrices. J. Wiley and Sons, New York.

Leontief, W., 1953. Studies in the Structure of the American Economy. Oxford
University Press, New York.

Lucev, D., 1981. «Analisi della struttura economica di una matrice
input-output. Il metodo di triangolarizzazione di Korte e Oberhofer»
Università di Napoli, Istituto di Statistica e Demografia. Internal note.

Mattioli, E., 1993. «La caratterizzazione della struttura delle
interdipendenze economiche mediante l'analisi delle corrispondenze»
Dipartimento di Scienze Economiche, Giuridiche e Sociali, Università del
Molise, Campobasso, Quaderni di metodi quantitativi, n. 11.

Ponsard, C., 1967. «Essai d'interprétation typologique des systèmes
interrégionaux» Revue Économique, 3: p. 353-373, 4: p. 543-575.

Samuelson, P.A., 1987. Economia. Zanichelli, Bologna.

Simpson, D. and J. Tsukui, 1965. «The Fundamental Structure of Input-Output
Tables: an International Comparison» The Review of Economics and Statistics,
47: p. 435-446.

Tomassone, R., 1980. «De l'analyse des données à la modélisation» In: Analyse
des données. Rencontre avec l'école française, Istituto di Statistica e
Demografia, Università di Napoli, Document Support 1.

Yan, C.S. and E. Ames, 1965. «Economic Interrelatedness» Review of Economic
Studies, 32(4): p. 299-310.

Yotopoulos, P.A. and J.B. Nugent, 1973. «A Balanced-Growth Version of the
Linkage Hypothesis» Quarterly Journal of Economics, 87: p. 157-172.

GRAPHS AND MACROECONOMETRIC MODELLING

M. GILLI
Département d'Économétrie
Université de Genève, Suisse

It is well known that the non-singularity of the Jacobian matrix is a necessary
and sufficient condition for the local uniqueness of the solution of a system of
equations. A necessary condition for this non-singularity is the existence of a
normalization of the equations. If the Jacobian matrix is large and sparse, as
is, for instance, the case for macroeconometric models, the verification of this
necessary condition is not immediate. The paper shows how this problem can be
efficiently investigated by means of a graph-theoretic approach. In particular, this
is done by seeking a maximum cardinality matching in a bipartite graph. The
case where a normalization does not exist often constitutes a heavy challenge to
the model builder and is a situation which, again, is analyzed using properties
connecting covers to matchings in bipartite graphs.

1 Introduction

The verification of a necessary condition for the local uniqueness of a solution
of a system of equations can be a non-trivial problem for certain large models.
This is particularly the case for macroeconometric models. The reason for this
is their particular building process, which proceeds as follows. First, one
chooses a set of variables which are explained by behavioral equations in an
explicit form. Second, the economic variables involved in these equations
verify the relations of the National Accounting System; these relations then
have to be included in the model. Finally, the set of equations is completed
by other definitional equations and/or equilibrium conditions. Thus, this
building process is very different from a process where by definition each
equation explains a different endogenous variable, which precisely constitutes
the necessary condition for local uniqueness. For a macroeconometric model,
this necessary condition therefore has to be investigated on the Jacobian matrix.
Our purpose is to show how this can be done efficiently using a graph-theoretic
approach.
In Section 2, we associate a bipartite graph to the Jacobian matrix. It is
then shown that the necessary condition for local uniqueness is equivalent to
the existence of a perfect matching in this bipartite graph.
Once the existence of a perfect matching is verified, we associate an ori-
ented graph to the Jacobian matrix. This graph then tells us about the logical
structure of the equations, information which proves useful when solving
the model numerically. This is discussed in Section 3.
The case where a perfect matching does not exist constitutes a heavy
challenge for the model builder. In Section 4, we cope with this problem using
minimum cardinality covers in the bipartite graph associated to the model's
Jacobian matrix.
Throughout the paper, we use examples to illustrate our approach and we
also suggest the algorithms which are able to solve these problems efficiently.

2 Models without explicit normalization

Modelling generally proceeds by specifying a set H = {h_1, ..., h_n} of functional
relations, involving a set X = {x_1, ..., x_p} of variables. This set of functional
relations then constitutes a system of implicit or explicit equations

h_i(x) = 0,   i = 1, ..., n .   (1)

The model builder has then to specify^a the set of endogenous variables, i.e.
the partition

X = Y ∪ Z

where Y is the set of endogenous variables and Z is the set of exogenous
variables. This then defines the model

h_i(y, z) = 0,   i = 1, ..., n   (2)

with y ∈ R^n and z ∈ R^m as the vectors of endogenous and exogenous variables.


The first property one expects from such a model is at least local unique-
ness of the solution y in the neighborhood of z. Therefore we derive from the
classical implicit function theorem that the Jacobian matrix ∂h/∂y', evaluated
in the neighborhood of z, has to be nonsingular, i.e.

|D| = |∂h/∂y'| ≠ 0 .   (3)

We can use (3) to establish a necessary condition on the structure of the
Jacobian matrix. To do this, we only need the incidence matrix of the Jacobian
matrix, for which we use the same notation D. The structure of matrix D is
then represented by a graph

G = (H, Y, E)   (4)

where H is the set of vertices representing the rows h_i, i = 1, ..., n, of the
Jacobian matrix. The set of vertices Y represents the columns of the Jacobian
a Clearly, as far as behavioral equations are concerned, the choice of the endogenous
variables has already been made.
matrix, i.e. the endogenous variables. E is the set of edges such that
[h_i, y_j] ∈ E iff ∂h_i/∂y_j ≠ 0.
In order to illustrate this, let us define the following variables:
BPORD   Net change in exchange rate reserves;
BPENC   Loans to public enterprises on favorable conditions;
BPMCA   Private capital entries;
BPLTP   Long term capital entries;
BOC     Balance of current transactions;
FLU     Exports plus taxes;
R       French interest rate;
RE      Foreign interest rate;
S       Exchange rate;
S^a     Anticipated exchange rate;
TRES    Index of companies' reserves;
TCG     Coverage rate of the balance of current transactions;
TCOCDE  Global coverage rate of OECD countries;
DM      Deutsche Mark exchange rate.
Consider a system of functions as specified in (2), given by the following 5
equations:

h1:  BPMCA/FLU = a0 + a1(ΔR − ΔRE − S + S^a) + a2 BPENC/FLU + a3 TRES
h2:  ΔR = d0 + d1 ΔRE + d2 (BPORD − BPENC)/FLU + d3(S^a − S) + d4 S
h3:  BPORD − BPENC = BPMCA + BPLTP + BOC
h4:  S^a = b1 S + b2(ΔTCG − ΔTCOCDE)
h5:  (BPORD − BPENC)/FLU = c0 + c1 S + c2 DM + c3(ΔTCG − ΔTCOCDE)
which constitute a submodel of a French macroeconometric model^b explaining
the exchange rate S, together with BPMCA, S^a, R and BPORD. The incidence
matrix of the Jacobian matrix of this model is given in Figure 1, together with
the adjacency matrix, as well as the picture representing the graph G defined
in (4).
One verifies that the set of vertices is partitioned into two subsets H and
Y, in such a way that all edges have one vertex in set H and the other vertex
in set Y. Such a graph is defined as a bipartite graph.
^b For a description of the model see Artus et al. (1989: p. 98).
[Figure 1 shows the incidence matrix of the Jacobian matrix, the corresponding
adjacency matrix, and the drawing of the bipartite graph.]

Figure 1: Bipartite graph G = (H, Y, E).

In the following, we will use the bipartite graph G defined in (4) to investi-
gate a necessary condition for matrix D to be nonsingular. The determinant
of matrix D can be written (Maybee et al., 1989: p. 501):

|D| = Σ_{p ∈ P(n)} s(p) Π_{i=1}^{n} d_{ip_i}   (5)

where P(n) is the set of n! permutations of the set J = {1, 2, ..., n}, p_i is the
i-th component of permutation p and s(p) is a sign function. We then immediately
conclude that a necessary condition for |D| ≠ 0 is that there exist at least one
permutation p such that the product in (5) is nonzero.
A set of n nonzero entries d_{ip_i}, i = 1, ..., n, corresponds in the bipartite
graph to a set of n non-adjacent edges. By definition, a set of non-adjacent edges
in a graph is called a matching, denoted by W, and a matching of cardinality
n, i.e. saturating all the vertices of the bipartite graph, is called a perfect
matching.

Theorem 1 A necessary condition for the nonsingularity of a Jacobian matrix
D ∈ R^{n×n} is the existence of a matching W, verifying card(W) = n, in the
bipartite graph G representing the structure of the Jacobian matrix.

The proof of Theorem 1 is given in the explanation that follows relation (5).
Whereas the identification of a permutation p verifying the existence of a
nonzero product in (5) is a very hard task, it is easy to establish the maximum
cardinality of the matchings in graph G.
2.1 Finding maximum-cardinality matchings in a bipartite graph

We consider a bipartite graph G = (H, Y, E) and W ⊆ E an arbitrary match-
ing^c for G. The matching W generates a partition H' ∪ H'' and Y' ∪ Y'' of
the set of vertices; H' and Y' are the sets of saturated vertices, i.e. the end
vertices of the edges of W; H'' and Y'' are the sets of unsaturated vertices, i.e.
all other vertices, none of which belongs to an edge of W.
An alternating path in G is an elementary path μ whose edges alternately
belong to W and E − W. We denote by μ_W the set of edges in the alternating
path belonging to W and by μ̄_W those not belonging to W. An augmenting
path with respect to W is an alternating path μ = μ_W ∪ μ̄_W between two
unsaturated vertices (one belonging to H'' and the other to Y''). If G contains
an augmenting path μ, then a matching W' can be found so that

card(W') = card(W) + 1   (6)

simply by exchanging the sets μ_W and μ̄_W in W. Thus we have W' =
(W − μ_W) ∪ μ̄_W. By construction card(μ̄_W) = card(μ_W) + 1 and therefore (6) holds.
We illustrate this with the bipartite graph in Figure 2, where the edges
belonging to the matching W = {[h1, R], [h2, BPORD], [h3, BPMCA], [h4, S]}
are drawn in dotted lines.

[Figure 2 shows the drawing of the bipartite graph with the matched edges dotted.]

Figure 2: Bipartite graph G = (H, Y, E) with matching W.

The sets of unsaturated vertices are then H'' = {h5} and Y'' = {S^a}, and
an augmenting path is given by

S^a − h1 − R − h2 − BPORD − h5

^c W may correspond to an empty set.


where μ̄_W = {[S^a, h1], [R, h2], [BPORD, h5]} and μ_W = {[h1, R], [h2, BPORD]}.
Exchanging the edges along the augmenting path then defines a new matching W'
with its cardinality augmented by one. Figure 3 shows the matching W'.

[Figure 3 shows the drawing of the bipartite graph with the edges of W' dotted.]

Figure 3: Bipartite graph G = (H, Y, E) with matching W'.
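The exchange just performed can be written directly as a set operation, W' = (W − μ_W) ∪ μ̄_W. A small sketch (the edge pairs are transcribed from the example above, written here as (equation, variable) tuples):

```python
# Matching of Figure 2 and the augmenting path S^a - h1 - R - h2 - BPORD - h5.
W = {("h1", "R"), ("h2", "BPORD"), ("h3", "BPMCA"), ("h4", "S")}
mu_W = {("h1", "R"), ("h2", "BPORD")}                    # path edges in W
mu_W_bar = {("h1", "Sa"), ("h2", "R"), ("h5", "BPORD")}  # path edges not in W

W_prime = (W - mu_W) | mu_W_bar                          # relation (6)
print(len(W), len(W_prime))  # 4 5 : the cardinality is augmented by one
```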

We then use the following theorem about the existence of augmenting
paths and the cardinality of a matching (Berge, 1973: p. 119; Thulasiraman
and Swamy, 1992: p. 225):

Theorem 2 (Berge, 1957) In a bipartite graph, a matching is of maximum
cardinality if and only if there does not exist an augmenting path.

2.2 An algorithm for finding a maximum-cardinality matching

In order to facilitate the search for alternating paths, it is convenient to
orient the edges of a bipartite graph verifying a matching W as follows:

• edges e_i ∈ W are oriented so that the starting vertex is in H' and the
ending vertex is in Y';

• edges e_i ∉ W are oriented in the opposite direction (starting vertex in Y
and ending vertex in H).

An alternating path in the bipartite graph G then corresponds, in the oriented
version of G, to a path^d going from a vertex of set Y to a vertex of set H.
^d In an oriented graph, a path corresponds to a sequence of oriented edges, where the final
vertex of an edge is the initial vertex of the next edge.
Figure 4 shows the orientation of the edges for the bipartite graph G with the
matching W given in Figure 2.

[Figure 4 shows the drawing of the oriented bipartite graph.]

Figure 4: Oriented bipartite graph G = (H, Y, E) with matching W.

Theorem 2 then immediately suggests the following algorithm for finding a
maximum cardinality matching in a bipartite graph.

Algorithm 1 Maximum cardinality matching in a bipartite graph
G = (H, Y, E).
1. Select an arbitrary matching W
2. repeat Construct H'', Y'' and orient G according to W
3.   while ∃ i ∈ Y'' and i not marked, do
4.     seek a path μ_ij, j ∈ H''
       if μ_ij = ∅, then mark vertex i, else goto 6
     enddo
5. |W| is maximum, no augmenting path exists in G, stop
6. Construct a new W (permutation of the edges in μ_ij)
end

The algorithm is polynomial^e. Statement 2 is executed at most n times (the
cardinality of the matching increases by 1 at each loop) and, in statement 3,
one explores at most m = |E| edges.
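Algorithm 1 can be sketched as follows (an illustrative implementation, not the author's code): a depth-first search looks for an augmenting path starting from each unsaturated equation and exchanges the edges along it. The incidence lists in the example are hypothetical, not transcribed from Figure 1.

```python
def max_matching(adj, n_h, n_y):
    """Augmenting-path method: adj[i] lists the variables y_j with a
    nonzero entry in row h_i of the incidence matrix."""
    match_h = [-1] * n_h   # equation -> matched variable (or -1)
    match_y = [-1] * n_y   # variable -> matched equation (or -1)

    def augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                # y_j is free, or its current equation can be re-matched.
                if match_y[j] == -1 or augment(match_y[j], seen):
                    match_h[i], match_y[j] = j, i
                    return True
        return False

    for i in range(n_h):
        augment(i, set())
    return match_h

# Hypothetical 5x5 incidence pattern (indices 0..4 could stand for the
# variables S, R, S^a, BPORD, BPMCA of the example).
adj = [[0, 1, 2, 4], [0, 1, 2, 3], [3, 4], [0, 2], [0, 3]]
m = max_matching(adj, 5, 5)
assert sorted(m) == [0, 1, 2, 3, 4]   # a perfect matching exists here
```

The search restores nothing on failure because a vertex already explored in one pass cannot become useful in the same pass; this is the standard augmenting-path argument behind Theorem 2.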

3 Models verifying a perfect matching

In this section we show that, for systems of equations verifying a perfect match-
ing, it is possible to associate an oriented graph whose vertices are the variables
in the equations. Such a graph proves useful in analyzing the logical structure
of the equations. In particular, we will show that some interesting properties
of this oriented graph are invariant with respect to the matching chosen.
Again we represent the structure of the system with a bipartite graph.
However, as we want to include the exogenous variables, we now derive the
bipartite graph from the system (1) and we get

G° = (H, X, E°)   (7)

where the sets H and X have already been defined in connection with (1) and
E° = {[h_i, x_j] | ∂h_i/∂x_j ≠ 0} is the set of edges.
We now consider that the model satisfies the necessary condition for the
local uniqueness of the solution, i.e. that there exists a matching W in G°
which saturates the vertices of set H and the vertices of set Y. We recall that
in the subgraph G = (H, Y, E) of G°, already defined in (4) and involving only
sets H and Y, this matching is a perfect matching.

^e The best known algorithm is O(n^{2.5}) (Hopcroft and Karp, 1975).
The matching W enables the definition of a particular orientation of the
edges of G° and we get the oriented version

G°_W = (H, X, U°)   (8)

of our bipartite graph G°. The set of arcs U° is constituted by U', the set of
edges belonging to W, which are oriented from H to X, and the set of arcs U,
i.e. all other edges, which are oriented from X to H. Formally we have

U° = U' ∪ U

with

U' = {(h_i, x_j) | [h_i, x_j] ∈ W}   and   U = {(x_j, h_i) | [h_i, x_j] ∈ E° − W}.

In the oriented bipartite graph G°_W, the set of arcs U' defines a biunivoque
correspondence between the sets of vertices H and Y. It is then possible to
obtain a condensed graph

G_W = (X, U)

involving only the set of vertices X and the set of arcs U, by contracting all
vertices of set H into set Y according to the correspondence given by the
arcs of set U'.

Theorem 3 Graph G°_W and the corresponding condensed graph G_W verify the
same reachability for the vertices of set X.

Proof. By definition, every vertex h_i in G°_W has only one outgoing arc, which
then is an ingoing arc for just one y_j. Thus, every path in the oriented bipartite
graph G°_W that reaches a vertex y_j goes over the arc h_i → y_j defined by the
matching W, and therefore the contraction of vertex h_i into vertex y_j cannot
create any new paths. □

The links between the variables of the system of equations, given a par-
ticular perfect matching W in the bipartite graph, define the causal structure
of the model. Our oriented graph G_W is consequently the graph associated
with the causal structure. A first step in analyzing the causal structure of
a model is the construction of the reduced graph, which describes the causal
links existing between the interdependent parts of the model.^f This includes,
among others, the so-called block-ordering, which consists in finding the block
triangular pattern of the model's Jacobian matrix. The vertices of the reduced
graph correspond to the strong components of G_W, and an arc in the reduced
graph corresponds to at least one arc between vertices of the corresponding
strong components.
In Figure 5, we illustrate by means of a small system of equations the steps
which lead to the definition of the reduced graph associated to the causal
structure of the equations. Graph G° has not been reproduced, as it corresponds
to graph G°_W without the orientation of the arcs. The arcs drawn with dotted
lines in G°_W form the matching.

[Figure 5 lists the system of equations

h1(x3, x4, x5) = 0
h2(x1, x2, x4, x5) = 0
h3(x2, x3, x7) = 0
h4(x1, x4, x5) = 0
h5(x2, x7) = 0
h6(x5, x7) = 0

and shows the oriented bipartite graph G°_W and the condensed graph G_W = (X, U).]

Figure 5: Graph of the causal structure of a system of equations.
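The contraction and the search for strong components can be sketched as follows (a toy illustration under assumed data, not the system of Figure 5):

```python
def condensed_graph(adj, match_h):
    """Contract each equation h_i into its matched variable: an arc
    x_k -> x_j exists if x_k appears in the equation matched to x_j."""
    arcs = set()
    for i, vars_i in enumerate(adj):
        j = match_h[i]
        arcs.update((k, j) for k in vars_i if k != j)
    return arcs

def strong_components(nodes, arcs):
    """Two vertices share a strong component iff each reaches the other."""
    def reach(src, edges):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            for a, b in edges:
                if a == u and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen
    rev = {(b, a) for a, b in arcs}
    return {frozenset(reach(x, arcs) & reach(x, rev)) for x in nodes}

# Toy 3-equation model (hypothetical): h1(x1, x2) matched to x1,
# h2(x1, x2) matched to x2, h3(x2, x3) matched to x3.
adj = [[0, 1], [0, 1], [1, 2]]
comps = strong_components({0, 1, 2}, condensed_graph(adj, [0, 1, 2]))
# x1 and x2 form an interdependent block; x3 is a component on its own.
assert comps == {frozenset({0, 1}), frozenset({2})}
```

The reduced graph would then have one vertex per component, with an arc whenever at least one condensed arc links two components.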

3.1 Invariance of the reduced graph with respect to the matchings

A fundamental property of the reduced graph is its invariance with respect to
the matching W on which the particular orientation of G°_W relies.
As the matching concerns only endogenous variables, we first need to con-
sider only the subgraph G = (H, Y, E) of G°.

Theorem 4 For any perfect matching W in a bipartite graph G, the corre-
sponding condensed oriented graph G_W verifies the same strong components
and the same reduced graph.

^f For a detailed discussion of the analysis of causal structures see Gilli (1992).
To demonstrate this, we consider a particular perfect matching and show
that different perfect matchings in G can only be generated by a permutation
of the edges along a circuit within a same strong component.
Given a perfect matching W in a bipartite graph, we know from Theorem 2
that there cannot exist any augmenting path in the bipartite graph and
that, by definition, all vertices are saturated by the matching. It is therefore
impossible to obtain a different perfect matching W' by a permutation of the
edges along an alternating chain.
The only way to generate a different perfect matching W' consists in a
permutation of the edges along an alternating circuit. This is evident if the
alternating circuit has no common edges with other alternating circuits; in fact
such a permutation is always possible. This can be immediately seen from the
graphs G_W and G_W' in Figure 6, where the dotted arcs represent the matching.

[Figure 6 shows the two graphs before and after the permutation.]

Figure 6: Permutation of edges along an alternating circuit in G_W.

For G_W let us consider the alternating circuit ··· − y_s − h_i − y_ℓ − h_r − ···
and the alternating circuit ··· − y_k − h_i − y_ℓ − h_j − ···, which have the edge
h_i − y_ℓ in common. A permutation of the edges of the first alternating circuit
produces a new matching W', which is perfect, as all vertices remain saturated
by the new matching.
Let us then show that the new perfect matching W' will not modify the
strong components in G_W'. By definition, two vertices in G_W belong to the
same strong component if and only if they are in at least one common circuit.
In other words, this means that each vertex in a strong component is
reachable from each other vertex in the same strong component. It is easy to
verify in Figure 6 that the reachability in G_W and G_W' is the same. Indeed,
for the first circuit, the permutation implies only a reversal of the orientation;
for the vertices on the second circuit, the reachability is the same, but simply
using different paths. We verify that in G_W we have a path y_k − h_i −
y_ℓ − h_j and a path h_j − ··· − y_k. In G_W' the path h_j − ··· − y_k remains
unchanged and the path y_k − ··· − h_j has been modified, as it now goes over
y_k − h_i − y_s − ··· − h_r − y_ℓ − h_j.
Finally, if the strong components in G°_W and G°_W' are identical, we con-
clude from Theorem 3 that the corresponding condensed graphs G_W and G_W'
also verify the same strong components.
The invariance of the reduced graph is then due to the fact that an edge
between different strong components in G_W, or an edge connecting an exoge-
nous variable, can by definition not belong to an alternating circuit. Therefore,
such edges remain unchanged for all different matchings.

3.2 Incidence of a particular matching on the causal structure

In general, models verify many different perfect matchings, except recursive
models, for which the matching is unique. Recursive models contain no inter-
dependent blocks, i.e. no strong components of cardinality greater than one.
The choice of a particular matching can become relevant when we come
to the numerical resolution of the model. Two types of methods are com-
monly used for the resolution: Newton-like methods and first-order iterative
techniques such as Gauss-Seidel (Ortega and Rheinboldt, 1970).
If a Newton-like method is chosen, one is faced with the problem of
repeated resolutions of a linear system involving the Jacobian matrix. This
resolution is in general achieved by Gaussian elimination, and the crucial point
is then the condition number of the Jacobian matrix. In this situation it does
not seem important to pay attention to a particular choice of the matching.
Nevertheless, one could think of improving the scaling of the Jacobian matrix
by means of a matching that maximizes the edge of lowest weight.
For first-order iterative resolution techniques, it is absolutely necessary to
specify the matching, which then defines the variable for which each equation
has to be solved, either explicitly or implicitly. In this context, it then be-
comes important to discuss the impact of a particular matching on the causal
structure.
In Section 3.1, we already showed that the composition of the interde-
pendent blocks of the causal structure does not depend on the choice of a
matching. In the proof of Theorem 4, we have seen how different matchings
define different circuits within a strong component. For the example in Fig-
ure 5, we verify six perfect matchings. There are two possibilities of assigning
a variable in equations h3 and h5 of the first interdependent block and three
possibilities for the equations h1, h2 and h4 of the second interdependent block.
In Figure 7, we show the oriented graphs G_W corresponding to the different
matchings, where the matching for the first block is [h3, x2], [h5, x7] and the
matching for the second block is indicated with each graph. It can be clearly
Figure 7: Graphs corresponding to different matchings.

seen that the structure differs from one graph to another. For instance, vari-
able x4 has a different position in all three graphs. Moreover, the arcs will also
be valued differently in each graph.
It is well known that the convergence of the Gauss-Seidel algorithm de-
pends on the eigenvalues of the matrix B = (L + diag(D))^{-1} U, defined by
splitting the Jacobian matrix D into L + diag(D) + U, where L is strictly lower
triangular, diag(D) is the diagonal of matrix D and U is strictly upper triangular.
The eigenvalues of B depend, of course, on the ordering of the equations,
which yields different splittings of the Jacobian matrix. This corresponds to si-
multaneous row and column permutations. Different matchings correspond to
independent permutations of the rows and columns of the Jacobian matrix and
therefore also influence the eigenvalues of matrix B.
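A small numerical sketch (with a made-up 3×3 Jacobian, not a model from the paper) shows how much the matching can matter: the spectral radius of the Gauss-Seidel iteration matrix is computed for two column orderings, i.e. two choices of the variable each equation is solved for.

```python
import numpy as np

def gs_radius(D):
    """Spectral radius of the Gauss-Seidel iteration matrix
    (L + diag(D))^{-1} U; the sign convention does not change the radius."""
    L = np.tril(D, -1)
    U = np.triu(D, 1)
    B = -np.linalg.solve(L + np.diag(np.diag(D)), U)
    return max(abs(np.linalg.eigvals(B)))

# Hypothetical Jacobian of a 3-equation model.
D = np.array([[4.0, 1.0, 1.0],
              [1.0, 0.5, 3.0],
              [2.0, 1.0, 5.0]])

r1 = gs_radius(D)                # original ordering
r2 = gs_radius(D[:, [0, 2, 1]])  # columns 1 and 2 exchanged: another matching
print(r1, r2)  # here r1 > 1 (Gauss-Seidel diverges) while r2 < 1 (it converges)
```

For this matrix one matching yields a divergent iteration and the other a convergent one, which is exactly why the choice of matching matters for first-order methods.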

4 System of equations without perfect matching

It follows from Theorem 1 that if the maximum cardinality p of a matching W
in the bipartite graph G = (H, Y, E), defined in (4) and representing a system
of n equations, verifies p < n, then the Jacobian matrix D is singular. In such a
situation one has to decide how to modify the system of equations
in order to achieve the uniqueness of the solution the model should provide.
To illustrate this, let us consider the following system of eight equations:

h1: f1(y1, y2, y3, y4, y5) = 0
h2: y6 = f2(y3)
h3: y3 = f3(y7)
h4: f4(y1, y2, y4, y6, y7, y8) = 0
h5: f5(y5, y6) = 0
h6: y6 = f6(y7)
h7: y7 = f7(y3)
h8: f8(y3, y5) = 0
which is supposed to represent the mix of explicit and implicit relations one frequently encounters in real macroeconometric models. One can verify that the maximum cardinality of a matching for these eight equations is 6. There exist 252 different matchings with cardinality 6, from which we select W = {[h1, y1], [h3, y3], [h4, y4], [h5, y5], [h6, y6], [h7, y7]}.
In order to remove the singularity of the system of equations, a simple strategy would consist in dropping the n - p equations which are not in the matching, i.e. equations h2 and h8 for our example. However, the resulting model will depend on the particular matching we selected before, which is, as indicated, not unique. Moreover, one might not wish to exogenize some of the n - p variables that are not in the matching, i.e. the variables y2 and y8.
This problem can be approached in a far more efficient way. To do this, we introduce the notion of a cover in the bipartite graph G = (H, Y, E) and we will be interested in a minimum cover^g, i.e. the smallest set of vertices which meets every edge in G. An interesting relation between matchings and covers is then given by the following theorem:

Theorem 5 (König, 1931) In a bipartite graph G = (H, Y, E) a maximum cardinality matching W and a minimum cover C verify

card(W) = card(C)

This means that the maximum number of edges in a matching equals the
minimum number of vertices in a cover.
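Theorem 5 can be checked by brute force on a tiny graph. A sketch (the two-equation, two-variable graph below is a made-up example, not one from the paper):

```python
from itertools import combinations

# Tiny bipartite graph: equations {a, b}, variables {x, y} (hypothetical example).
edges = [("a", "x"), ("a", "y"), ("b", "x")]
vertices = {v for e in edges for v in e}

def is_matching(S):
    used = [v for e in S for v in e]
    return len(used) == len(set(used))       # no two edges share an endpoint

def is_cover(C):
    return all(h in C or y in C for h, y in edges)

# Largest k for which some k edges form a matching,
# smallest k for which some k vertices form a cover.
max_matching = max(k for k in range(len(edges) + 1)
                   for S in combinations(edges, k) if is_matching(S))
min_cover = min(k for k in range(len(vertices) + 1)
                for C in combinations(vertices, k) if is_cover(C))

print(max_matching, min_cover)
```

On this graph both numbers equal 2, as König's theorem requires; the brute-force search is of course only viable for very small graphs.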
Given a bipartite graph G = (H, Y, E) with a maximum cardinality matching W, card(W) = p, p < n, we will use a minimum cover C to reorder the Jacobian matrix D. We denote by H_C the set of vertices in the cover belonging to H and by Y_C the set of those vertices in the cover belonging to Y. Let v1 = card(H_C) and v2 = card(Y_C), with obviously v1 + v2 = p. We now reorder the rows of the Jacobian matrix by taking first the n - v1 equations not in H_C and then the v1 equations from set H_C. The columns are reordered taking first the v2 variables from set Y_C and then the n - v2 variables not in Y_C. As a consequence, matrix D will show the following pattern:

            Y_C    not Y_C
not H_C  [   *        0    ]
    H_C  [   *        *    ]        (9)

^g In the literature a cover is sometimes also called a transversal.

where the dimensions of the zero matrix verify (n - v1) + (n - v2) > n. The Jacobian matrix as presented in (9) is then extremely useful to cope with the situation where the system of equations is singular.
Figure 8 shows the bipartite graph representing our example. The edges drawn in dotted lines belong to the matching W and the vertices in the boxes form the minimum cover C = H_C ∪ Y_C. We have H_C = {h1, h4} and Y_C = {y3, y5, y6, y7}. The figure also reproduces the reordered incidence matrix of the Jacobian matrix.

              y3 y5 y6 y7 | y2 y8 y1 y4
        h3  [  1  .  .  1 |  .  .  .  .  ]
        h5  [  .  1  1  . |  .  .  .  .  ]
        h6  [  .  .  1  1 |  .  .  .  .  ]
        h7  [  1  .  .  1 |  .  .  .  .  ]
        h2  [  1  .  1  . |  .  .  .  .  ]
        h8  [  1  1  .  . |  .  .  .  .  ]
  H_C{  h1  [  1  1  .  . |  1  .  1  1  ]
        h4  [  .  .  1  1 |  1  1  1  1  ]
              \---Y_C---/  \--not Y_C--/

Figure 8: Partitioned Jacobian matrix and bipartite graph of example.

The sets H_C and Y_C then clearly indicate where the modifications of the equations should occur. More precisely, the n - p equations which have to be modified must be chosen from the complement of H_C, and the variables which have to be added to these equations have to belong to the complement of Y_C. In case one is willing to exogenize the n - p variables not saturated by the matching, once again, the complement of H_C indicates where to choose the n - p equations one has to drop. In the case of the example, without the information given by the cover, one would automatically drop equations h2 and h8, whereas the complement of H_C allows us to choose two equations among a set of six equations.

4.1 Finding a minimum-cardinality cover in a bipartite graph


We consider the bipartite graph G = (H, Y, E) and W ⊆ E a matching of maximum cardinality supposed not to be a perfect matching, i.e. card(W) < card(H). Again, we consider the partitions H = H' ∪ H'' and Y = Y' ∪ Y'' generated by the matching W, with H' and Y' as the sets of saturated vertices and H'' and Y'' as the sets of unsaturated vertices.
We now need to define R(h_i), the set of proper descendants of h_i, i.e. the set of all vertices reachable along alternating paths starting from vertex h_i. By definition, the starting vertex h_i does not belong to R(h_i). We then have

R(H'') = ∪_{h_i ∈ H''} R(h_i)        (10)

as the set of proper descendants from the set H'' of unsaturated vertices. We also need the set of edges W_{H''} ⊆ W which belong to the different alternating paths starting from H''. We then have the subsets T_H ⊆ H' and T_Y ⊆ Y' of vertices saturated by W_{H''} and, of course, card(T_H) = card(T_Y).
The set R(H'') then verifies the following property:

R(H'') ∩ Y'' = ∅   and   R(H'') ∩ H'' = ∅

as there are no unsaturated vertices in R(H''), because W is of maximum cardinality. It then follows that

R(H'') = T_H ∪ T_Y        (11)

To find minimum-cardinality covers we then use the following theorem:

Theorem 6 (Roy, 1969) In a bipartite graph G = (H, Y, E) with a maximum-cardinality matching W and the corresponding partition H = H' ∪ H'' into saturated and unsaturated vertices, a minimum-cardinality cover C is given by

C = {H' - R(H'')} ∪ {R(H'') - H'}

where R(H'') is the set of proper descendants of H'' as defined in (10).
Substituting in the theorem the set of proper descendants R(H'') by the expression given in (11), a minimum-cardinality cover is also defined by

C = {H' - T_H} ∪ T_Y

This then defines the sets H_C and Y_C of the partition of matrix D given in (9), i.e.

H_C = H' - T_H   and   Y_C = T_Y
Let us illustrate this result with our example; the corresponding graph is given in Figure 8. W = {[h1, y1], [h3, y3], [h4, y4], [h5, y5], [h6, y6], [h7, y7]} is the maximum-cardinality matching, defining the partition H' = {h1, h3, h4, h5, h6, h7} and H'' = {h2, h8}. We then explore all the alternating paths starting from vertex h2 and vertex h8 and we get the sets of proper descendants R(h2) = {y3, h3, y7, h7, y6, h6} and R(h8) = {y5, h5, y6, h6, y7, h7, y3, h3}. The union of the sets R(h2) and R(h8) gives R(H'') = {y3, h3, y5, h5, y6, h6, y7, h7} and we easily verify that R(H'') partitions into T_H = {h3, h5, h6, h7} and T_Y = {y3, y5, y6, y7}. Thus we have H_C = H' - T_H = {h1, h4} and Y_C = T_Y = {y3, y5, y6, y7}, which defines our minimum cover C = H_C ∪ Y_C.

4.2 An algorithm for finding a minimum-cardinality cover

In order to facilitate the exploration of the alternating paths in the bipartite graph G = (H, Y, E) with matching W, we again orient the edges as we did for Algorithm 1. From the preceding presentation we then derive the algorithm which follows.

Algorithm 2 Minimum-cardinality cover C = H_C ∪ Y_C in a bipartite graph G = (H, Y, E).
1. Select a maximum cardinality matching W
2. Compute H', H'' and orient G according to W
3. Compute R(H''), the set of proper descendants
4. T_H = R(H'') ∩ H;  T_Y = R(H'') ∩ Y
5. H_C = H' - T_H;  Y_C = T_Y
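The steps above can be sketched in a few lines of Python. Step 1 is done here with a simple augmenting-path routine rather than the Hopcroft-Karp algorithm cited in the references; any maximum matching yields the same cover. Run on the eight-equation example, the sketch recovers H_C = {h1, h4} and Y_C = {y3, y5, y6, y7}:

```python
def maximum_matching(adj):
    """Kuhn's augmenting-path algorithm; returns {y: h} for the matched pairs."""
    match_y = {}
    def augment(h, seen):
        for y in adj[h]:
            if y not in seen:
                seen.add(y)
                if y not in match_y or augment(match_y[y], seen):
                    match_y[y] = h
                    return True
        return False
    for h in adj:
        augment(h, set())
    return match_y

def minimum_cover(adj):
    """Algorithm 2: cover C = H_C ∪ Y_C from alternating paths out of H''."""
    match_y = maximum_matching(adj)
    saturated_h = set(match_y.values())
    # R(H''): vertices reachable by alternating paths from unsaturated equations.
    frontier = [h for h in adj if h not in saturated_h]          # the set H''
    reached_y, reached_h = set(), set(frontier)
    while frontier:
        nxt = []
        for h in frontier:
            for y in adj[h]:
                if y not in reached_y:
                    reached_y.add(y)                 # non-matching arc h -> y
                    hm = match_y.get(y)
                    if hm is not None and hm not in reached_h:
                        reached_h.add(hm)            # matching arc y -> h
                        nxt.append(hm)
        frontier = nxt
    H_C = set(adj) - reached_h        # H' - T_H
    Y_C = reached_y                   # T_Y
    return H_C, Y_C

# The eight-equation example of section 4.
adj = {"h1": ["y1", "y2", "y3", "y4", "y5"],
       "h2": ["y3", "y6"],
       "h3": ["y3", "y7"],
       "h4": ["y1", "y2", "y4", "y6", "y7", "y8"],
       "h5": ["y5", "y6"],
       "h6": ["y6", "y7"],
       "h7": ["y3", "y7"],
       "h8": ["y3", "y5"]}

H_C, Y_C = minimum_cover(adj)
print(sorted(H_C), sorted(Y_C))
```

Note that the particular maximum matching found may saturate a different pair of equations than the one selected in the text, but the resulting cover is the same, in line with the invariance discussed above.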

5 Concluding remarks

The paper presents an operational approach to problems common in the practice of building large and sparse systems of equations whose Jacobian matrix does not verify a zero-free diagonal.
A first problem consists of finding a permutation of the Jacobian matrix such that its diagonal becomes zero-free. This is called a normalization of the equations and corresponds to a perfect matching in a bipartite graph, a problem which can be solved with a polynomial-time algorithm.
A second problem arises if the model does not admit a normalization, i.e.
has a Jacobian matrix which is structurally singular. In this case a minimum
cover of the same bipartite graph enables us to reorder the Jacobian matrix in
such a way as to show where to modify the structure of the equations in order
to obtain a normalization.
A way to find the block-recursive form of the Jacobian matrix is to compute the reduced graph of the oriented graph defined by a perfect matching. This can be done because the reduced graph does not depend on a particular matching. The oriented graph, however, will vary for different matchings, the number of which becomes very large even for graphs of relatively modest size.
Several questions then arise: How to describe the structural characteristics of
this very large set of graphs? How does a particular matching influence the
numerical behavior of solution algorithms?
For the first question, one could think that it might be possible to classify the different oriented graphs corresponding to the matchings into a much smaller number of classes of isomorphic graphs, and then analyze the structure of the graphs of each class. To answer the second question, one certainly needs to take into consideration the quantitative structure of the Jacobian matrix.

References

Artus, P., M. Deleau and P. Malgrange, 1986. Modélisation macro-économique, Economica, Paris.

Berge, C., 1967. «Two Theorems in Graph Theory». Proc. Nat. Acad. Sci. U.S., 43: p. 842-844.

Berge, C., 1973. Graphes et Hypergraphes, Dunod, Paris.

Gilli, M., 1992. «Causal Ordering and Beyond». International Economic Review, 33: p. 957-971.

Hopcroft, J. E. and R. M. Karp, 1975. «An n^{5/2} Algorithm for Maximum Matching in Bipartite Graphs». SIAM J. Computing, 2: p. 225-231.

König, D., 1931. «Graphs and Matrices» (in Hungarian). Mat. Fiz. Lapok, 38: p. 116-119.

Maybee, J. S., Olesky, D. D., Van den Driessche, P. and G. Wiener, 1989. «Matrices, Digraphs, and Determinants». SIAM J. Matrix Anal. Appl., 10 (4): p. 500-519.

Ortega, J. M. and W. C. Rheinboldt, 1970. Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York.

Roy, B., 1969. Algèbre moderne et théorie des graphes, Vol. I, Dunod, Paris.

Thulasiraman, K. and M. N. S. Swamy, 1992. Graphs: Theory and Algorithms, Wiley, New York.

QUALITATIVE SENSITIVITY ANALYSIS IN MULTIEQUATION MODELS

M. GILLI, G. PAULETTO
Département d'Économétrie
Université de Genève, Suisse

The analysis of the sensitivity of a given model to some perturbation is an important tool to investigate a model's behavior. This can be performed either by deterministic simulation or by using a linear approximation of the model. Both methods have disadvantages. A comprehensive evaluation of a model by simulation requires perturbation of a large set of parameters and variables and may quickly become cumbersome. On the other hand, a linear approximation of a nonlinear model will only hold for small deviations from the simulation path, which then limits the range of validity of linear analysis. Our approach explores the qualitative behavior of a model, given the sign and some restrictions on the interval of the parameters in the linear approximation. Such qualitative conclusions are, therefore, less limited to the neighborhood of a particular simulation path than quantitative ones.

1 Introduction

Qualitative analysis is not new and goes back essentially to Samuelson (1947), who used it in comparative statics. The problem has been discussed, among others, by Basset et al. (1968), Ritschard (1983), Lady (1995) and Lang et al. (1995). More recently, Artificial Intelligence research in connection with economic modelling seems to be paying a lot of attention to questions related to qualitative properties of a model (Kuipers, 1986; Iwasaki and Simon, 1986; Fairley and Lin, 1990; Berndsen, 1992).
We will use a graph-theoretic approach, which proves to be particularly efficient when dealing with the sparse matrices representing the linearized model. Moreover, such an approach will provide interesting information about structural properties of the model. Among others, it reveals the existence of qualitatively linked variables, i.e. pairs of variables which either always vary in the same direction or always vary in opposite directions.
To clarify notation, let us consider a model, formally represented by the system of n equations

f(y_t, y_{t-1}, z_t, β) = ε_t        (1)

where y and z are the vectors of endogenous and exogenous variables respectively, β is a vector of np parameters and ε is an error term.
Deterministic simulation is then the period by period solution of this equation system with ε_t ≡ 0, conditional on the parameters β, the exogenous variables z_t and the prior-period solution for lagged endogenous variables.


We then consider a linearization by taking the first-order Taylor series approximation to the model's nonlinear structural form around the baseline simulation

D_t Δy_t = E_t Δy_{t-1} + F_t Δz_t + G_t Δβ        (2)

where Δy_t, Δy_{t-1}, Δz_t and Δβ are deviations around the baseline simulation and the matrices D_t, E_t, F_t and G_t correspond to ∂f/∂y_t, -∂f/∂y_{t-1}, -∂f/∂z_t and -∂f/∂β respectively. In the following, we will consider these matrices constant and therefore drop the subscript t.
Assuming the existence of an appropriate normalization of the equations, i.e. that matrix D can be written in the form

D = B - I        (3)

we obtain the reduced form of the deviation model

Δy_t = D^{-1}E Δy_{t-1} + D^{-1}F Δz_t + D^{-1}G Δβ        (4)

The reduced form of the deviation model is then suitable for analyzing the dynamics^a (matrix D^{-1}E), the multipliers (matrix D^{-1}F) and the parameter perturbations (matrix D^{-1}G). Matrix D^{-1}, sometimes called the shift-multiplier matrix, is common to all three problems, and we therefore will focus our attention on it.
Our aim is to determine the sign of the elements of matrix D^{-1}, given only the sign of the elements of matrix B, where we recall that D = B - I. Thus, for an element d_ij, we define:

              +  if d_ij > 0
sign(d_ij) =  -  if d_ij < 0
              0  if d_ij = 0

and therefore the sign of a given matrix D is defined as sign(D) = [sign(d_ij)].


We now introduce a small system of 11 interdependent equations, which corresponds to an undecomposable matrix D in (2) and which will serve as an example throughout the paper. The corresponding matrix B is shown in Figure 1.
The entries "+" and "-" in matrix B correspond to the hypotheses made about the sign of the parameters. The entries with value +1 and -1 correspond

^a Properties deriving from the structure of matrices D^{-1} and E have been discussed by Garbely and Gilli (1991).

Figure 1: Matrix B.

to the parameters of identities. For these parameters, the interval has thus been reduced to a particular value. This first and very trivial reduction of a parameter's interval can be easily justified, as any deviation from such a value is meaningless.
We now associate to the model defined by matrix B an oriented graph^b G_B = (V, U). The set of vertices V corresponds to the endogenous variables and the set of arcs U is given by the non-zero entries of matrix B, i.e. to a non-zero entry b_ij in matrix B corresponds the arc j → i in the graph G_B. Figure 2 shows this oriented graph, where the arcs corresponding to the coefficients in the identities are valued +1 or -1. According to what is reported in Figure 1, we have for u3 a negative sign and for all other u_i a positive sign.

Figure 2: Graph G_B associated with matrix B.

This graph will help us take advantage of the sparse structure of matrix B in computing the elements of the inverse of matrix D.

^b A detailed discussion about the use of graphs in analysing complex systems is given in Gilli (1992; 1995).

2 Symbolic computation of D^{-1}

In order to find the sign of the elements of the inverse of matrix D, we will compute them symbolically. These symbolic expressions will later enable the identification of necessary restrictions on the intervals of the parameters. A graph-theoretic approach to the computation of the elements of matrix D^{-1} is particularly well suited for taking advantage of a sparse matrix. Depending on the structure of the graph associated to a matrix, there exists a variety of formulations for the expression of the determinant and the cofactors. A detailed presentation of such formulas for the computation of the determinant as well as the cofactors, involving paths and circuits of the graph representing the matrix, is given in Maybee et al. (1989).
We now introduce the notation used in all the subsequent developments involving determinants and cofactors. We consider a matrix D and its corresponding graph G_{B-I}, obtained by adding a loop of value -1 to every vertex of graph G_B. A sequence of adjacent arcs going from vertex j to vertex i is a path, for which we use the notation μ^k_{ij}, where k is the index of this path. The length of path μ^k_{ij}, i.e. the number of arcs in the sequence from j to i, is denoted by l_k. The particular path where vertex i is both starting and ending vertex is called a circuit and is denoted by c^i_k. All paths, and of course circuits, are assumed elementary, i.e. they do not go twice over the same vertex. In the following, we will use the notation μ^k_{ij} indifferently for the k-th path going from vertex j to vertex i and for the set of arcs forming this path. The same remark holds for a circuit c^i_k. Given a path μ^k_{ij}, we denote by P{μ^k_{ij}} the determinant of the subgraph obtained after deleting all vertices belonging to path μ^k_{ij}.
For the computation of the determinant, we will use its expansion in terms of the principal minors of the matrix. Let us consider an expansion relative to a given vertex i in the graph representing matrix D. Let {c^i_1, ..., c^i_q} be the set of all circuits to which vertex i belongs. We then have^c:

det(D) = Σ_{k=1}^{q} (-1)^{l_k+1} P{c^i_k} Π_{u ∈ c^i_k} u        (5)

To compute the inverse of a matrix D, one further needs the transposed matrix of cofactors Θ = [θ_ij]. Let {μ^1_{ij}, ..., μ^s_{ij}} be the set of all paths going

^c Definitions (5) and (6) together with a proof can be found in Maybee et al. (1989), p. 503 and p. 514.

from vertex j to vertex i in the graph representing matrix D. We then have

θ_ij = Σ_{k=1}^{s} (-1)^{l_k} P{μ^k_{ij}} Π_{u ∈ μ^k_{ij}} u     for i ≠ j        (6)

For the elements on the diagonal we have

θ_ii = P{i}        (7)

where, according to the notation introduced before, P{i} is the determinant of the subgraph where vertex i has been deleted.
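The diagonal relation (7) can be cross-checked numerically: for any matrix, the diagonal of the transposed cofactor (adjugate) matrix equals the determinants of the submatrices with one vertex deleted. A minimal numpy sketch on a generic random matrix (not the symbolic computation of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(5, 5))

# Transposed cofactor matrix: Theta = det(D) * D^{-1}.
Theta = np.linalg.det(D) * np.linalg.inv(D)

# Formula (7): theta_ii = P{i}, the determinant with row and column i deleted.
for i in range(5):
    minor = np.delete(np.delete(D, i, axis=0), i, axis=1)
    assert np.isclose(Theta[i, i], np.linalg.det(minor))
```

The off-diagonal formula (6) factorizes the same cofactors over the paths of the graph, which is what makes the symbolic computation tractable for sparse matrices.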

3 Condensation of vertices

The formulas used to evaluate the determinant and the cofactors require the enumeration of elementary circuits for the determinant and of elementary paths for the cofactors in the oriented graph G_{B-I} associated to matrix D = B - I. The complexity of this task can be reduced if it is possible to condense the graph G_B before resorting to the graph G_{B-I}.
The general situations which allow operating a condensation of the vertices in graph G_B are the following: the existence of a vertex i verifying only one outgoing arc i → j, situation a), or only one ingoing arc j → i, situation b).
In both cases, vertex i will be dropped and each path of length 2 crossing vertex i will be replaced by an arc whose value is the product of the two arcs forming the path. If parallel arcs are generated, they will be replaced by a single arc, the value of which corresponds to the sum of the parallel arcs.
The condensed graph may again contain vertices verifying situation a) or b), and the same rules for the elimination of a vertex i will be applied. Thus, the condensation is made in K steps, where K is the number of vertices which can be dropped from the original graph. For the condensed graph at step k, i.e. after the elimination of k vertices, we will use the notation G_{B^{(k)}}, and B^{(k)} for its associated matrix. We now need to know how determinants and cofactors of the condensed matrix B^{(k)} - I are related to those of matrix D = B - I.

Proposition 1 At every step k of the condensation, the determinant and the cofactors of matrix B - I verify

det(B - I) = (-1)^k det(B^{(k)} - I)        (8)
θ_ij = (-1)^k θ^{(k)}_ij        (9)

where θ_ij and θ^{(k)}_ij are elements of the transposed cofactor matrix of matrix B - I, respectively of matrix B^{(k)} - I.

Proof. Given situation a) or b) in graph G_B, we consider the partition {c^j_1, ..., c^j_p} ∪ {c^j_{p+1}, ..., c^j_q} of the set of all circuits going over vertex j in graph G_{B-I}, so that all p circuits in the first partition include vertex i and all q - p circuits in the second partition exclude vertex i. Due to the loop on vertex j, we have q ≥ p + 1. The circuits in the second partition verify

P{c^j_k} = (-1) P{c^j_k ∪ i}     k = p+1, ..., q

because vertex i, in the subgraph corresponding to P{c^j_k}, is involved in only one circuit, i.e. a loop of value -1.
We now examine graph G_{B^{(1)}}, where vertex i has been condensed into vertex j and parallel arcs have been preserved, in order to have the same number of circuits in G_{B-I} and G_{B^{(1)}-I}. The set of all circuits going over vertex j in G_{B^{(1)}-I} is {c'^j_1, ..., c'^j_p} ∪ {c^j_{p+1}, ..., c^j_q}. Denoting by l'_k and l_k the lengths of circuits c'^j_k and c^j_k respectively, and according to the rule for condensation, we have

• for k = 1, ..., q:  Π_{u ∈ c'^j_k} u = Π_{u ∈ c^j_k} u
• for k = 1, ..., p:  P{c'^j_k} = P{c^j_k}  and  l'_k = l_k - 1
• for k = p+1, ..., q:  P{c'^j_k} = P{c^j_k ∪ i}  and  l'_k = l_k

as these last circuits do not include vertex i, which is deleted in G_{B^{(1)} - I}.
Using definition (5), we expand the determinant relative to vertex j:

det(B^{(1)} - I) = Σ_{k=1}^{p} (-1)^{l'_k+1} P{c'^j_k} Π_{u ∈ c'^j_k} u + Σ_{k=p+1}^{q} (-1)^{l'_k+1} P{c'^j_k} Π_{u ∈ c'^j_k} u

and substituting the three relations above gives det(B^{(1)} - I) = -det(B - I), which proves (8) for k = 1; iterating the argument over the K condensation steps gives the general case.
To prove (9), we can proceed in a similar way, as done above, by considering in this case a partition of paths instead of a partition of circuits.
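Relation (8) can also be checked numerically on a small random graph satisfying situation a). A sketch with made-up values (the choice of vertex i = 3 and target j = 0 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
np.fill_diagonal(B, 0.0)       # no loops in G_B; the -1 loops come from forming B - I

# Situation a): vertex i = 3 has a single outgoing arc i -> j with j = 0.
# Since arc i -> j corresponds to entry b_{ji}, column 3 is zero except row 0.
B[:, 3] = 0.0
B[0, 3] = 0.7                  # the arc value u_a (arbitrary)

# Condense vertex 3: each path k -> 3 -> 0 of length 2 becomes an arc k -> 0
# valued b_{03} * b_{3k} (added to any existing parallel arc); then drop
# row and column 3.
Bc = B[:3, :3].copy()
Bc[0, :] += B[0, 3] * B[3, :3]

lhs = np.linalg.det(B - np.eye(4))
rhs = -np.linalg.det(Bc - np.eye(3))    # (-1)^k with k = 1 vertex dropped
print(np.isclose(lhs, rhs))
```

The identity holds exactly here because eliminating vertex i amounts to a Schur-complement step on the block of B - I containing its -1 loop.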
We now apply these condensations to the graph in Figure 2; it can easily be seen that the vertices y2, y3, y5, y6, y10 and y11 satisfy condition a) or b). Figure 3 shows the condensed graph, where the above-mentioned vertices have been dropped and every path of length 2 crossing these vertices has been replaced by an arc.

Figure 3: Condensed graph G_{B^{(6)}}.

Parallel arcs are replaced by a single arc representing their sum. For instance, there is now an arc y4 → y7 valued u4 - u5, which replaces the paths y4 → y5 → y7 (valued u4 · 1) and y4 → y6 → y7 (valued u5 · (-1)).
Continuing to apply successively the rules a) and b) for condensation, we obtain the condensed graph G_{B^{(8)}}. Figure 4 gives the graph G_{B^{(8)}-I} associated to matrix B^{(8)} - I, with the arc values

r1 = -1 + u3(u1 + u2)
r2 = u11(u1 + u2)
r3 = u4 - u5
r4 = u8
r5 = u9 + u10
r6 = u6 - 1
r7 = u7

Figure 4: Graph G_{B^{(8)}-I} associated to matrix B^{(8)} - I.

The determinant and the cofactors corresponding to the condensed graph G_{B^{(8)}-I} can now be easily computed.

4 Qualitatively linked variables

For the cofactors associated to the vertices which have been dropped during the condensation process, we can show that they are simple functions of the cofactors of the condensed graph.

Proposition 2 For two vertices i and j satisfying rule a) for condensation, i.e. with a single outgoing arc i → j of value u_a, the following relations between cofactors are verified:

θ_pi = u_a θ_pj     for p ≠ i        (10)

θ_ii = u_a θ_ij - det(D)        (11)
Proof. To compute θ_pi, we need {μ^1_{pj} ∪ u_a, ..., μ^s_{pj} ∪ u_a}, the set of all paths going from vertex i to vertex p. The length of the k-th path μ^k_{pj} ∪ u_a is l_k + 1. The determinants P{μ^k_{pj} ∪ u_a} for k = 1, ..., s verify

P{μ^k_{pj}} = (-1) P{μ^k_{pj} ∪ u_a}

as the subgraph corresponding to P{μ^k_{pj}} differs from the subgraph corresponding to P{μ^k_{pj} ∪ u_a} by vertex i, which is involved in only one circuit of length 1 and valued -1. Using definition (6), we write the cofactor

θ_pi = Σ_{k=1}^{s} (-1)^{l_k+1} P{μ^k_{pj} ∪ u_a} Π_{u ∈ μ^k_{pj} ∪ u_a} u
     = u_a Σ_{k=1}^{s} (-1)^{l_k} P{μ^k_{pj}} Π_{u ∈ μ^k_{pj}} u
     = u_a θ_pj

To prove (11), we consider {c^i_1} ∪ {μ^1_{ij} ∪ u_a, ..., μ^q_{ij} ∪ u_a}, the set of all circuits going over vertex i, where c^i_1 is the loop with value -1 and the k-th circuit is defined by the path μ^k_{ij} of length l_k and by the arc u_a. This circuit then goes over the same vertices as the path μ^k_{ij}, and therefore

P{μ^k_{ij} ∪ u_a} = P{μ^k_{ij}}

Using definition (5), we expand the determinant of D relative to vertex i:

det(D) = (-1)^2 θ_ii (-1) + Σ_{k=1}^{q} (-1)^{l_k+2} P{μ^k_{ij} ∪ u_a} Π_{u ∈ μ^k_{ij} ∪ u_a} u = -θ_ii + u_a θ_ij

which shows that (11) holds. □


Proposition 3 For two vertices i and j satisfying rule b) for condensation, i.e. with a single ingoing arc j → i of value u_b, the following relations between cofactors are verified:

θ_ip = u_b θ_jp     for p ≠ i        (12)

θ_ii = u_b θ_ji - det(D)        (13)

Relations (12) and (13) can be demonstrated in exactly the same way as relations (10) and (11).
Finally, in the particular situation where a vertex i satisfies both rules a) and b), i.e. with arcs h → i of value u_b and i → l of value u_a, it results from Propositions 2 and 3 that

θ_ii = u_a u_b θ_hl - det(D)        (14)
The relations between cofactors given in equations (10)-(13) define a fortiori a relation between their signs, and the corresponding variables are therefore said to be qualitatively linked.
For our application, we obtain the following classes of qualitatively linked variables according to the type of condensation:

Situation a): {y4, y2, y3, y1, y9, y10, y11}, {y7, y6, y5, y8}
Situation b): {y1, y2, y3}, {y4, y5, y6}, {y7, y8}, {y9, y10, y11}

In the transposed cofactor matrix Θ, these sets of qualitatively linked variables define columns which are proportional for the variables in a set defined by situation a), and proportional rows for the variables in a set defined by situation b). This means that, knowing the elements of Θ that correspond to the rows y1, y4, y7, y9 and to the columns y4, y7, we can compute all other elements of matrix Θ by using relations (10)-(13).

5 Formulation of constraints

An element d_ij in the inverse of the n × n matrix D = B - I is defined by the quotient

d_ij = θ_ij / det(D)        (15)

where θ_ij is an element of the transposed cofactor matrix Θ. Therefore, the sign of d_ij will depend upon both the sign of θ_ij and the sign of det(D). The classical formulation of the comparative static problem^d assumes that the underlying dynamic model of system (4) is stable, which then implies

sign{det(D)} = sign{(-1)^n}        (16)

which defines the sign of the divisor in (15), and therefore we admit that sign(d_ij) = sign{(-1)^n θ_ij}.

For our example, we will now compute the elements θ_ij and the determinant of matrix D = B - I by resorting to the reduced graph G_{B^{(8)}-I} given in Figure 4. According to definition (8), we have det(B - I) = (-1)^8 det(B^{(8)} - I). Due to the small size of matrix B^{(8)} - I, these elements can easily be computed^e, and we give them in the following table, which is a submatrix of matrix Θ:

            y4                 y7
  y4       -r6             r2 + r7 r5
  y7        r3            -r1 - r5 r4
  y9   r3 r7 - r4 r6     r4 r2 - r7 r1

det(B^{(8)} - I) = r2 r3 - r1 r6 + r5 r3 r7 - r4 r5 r6 .


According to the stability condition given in (16), with n = 11 and det(B - I) = (-1)^8 det(B^{(8)} - I), the following constraint must hold:

c0:  sign{r2 r3 - r1 r6 + r5 r3 r7 - r4 r5 r6} = sign{(-1)^3}        (17)

^d See Samuelson (1947) or Basset et al. (1968).
^e If the condensation leads to a graph of such small size, the cofactors and the determinant can be computed immediately without using formulas (5) and (6).

The definitions of the elements r_i, i = 1, ..., 7 are recalled hereafter, together with the sign implied by the hypotheses on the arcs u_i:

r1 = -1 + u3(u1 + u2)   (u3 negative; u1, u2 positive)   sign: -
r2 = u11(u1 + u2)       (all positive)                   sign: +
r3 = u4 - u5            (u4, u5 positive)                sign: ?
r4 = u8                 (positive)                       sign: +
r5 = u9 + u10           (positive)                       sign: +
r6 = u6 - 1             (u6 positive)                    sign: ?
r7 = u7                 (positive)                       sign: +

where we see that the sign of only r1, r2, r4, r5 and r7 is defined in an unambiguous way, given the sign of the original arcs u_i in G_B. This is already sufficient to determine that sign(θ_{y4,y7}) = + and sign(θ_{y9,y7}) = +.
The sign of θ_{y7,y4} depends upon the sign of r3. We then introduce the following constraint:

c1:  u5 > u4

From constraint c1 it follows that sign(r3) = -, and we then obtain sign(θ_{y7,y4}) = -.
In order to sign r6, let us consider the following additional constraint:

c2:  u6 < 1

from which it then follows that sign(r6) = -, and sign(θ_{y4,y4}) = +.
Let us now summarize these results in the following matrix, where it is indicated which constraints are necessary to sign a given element of matrix Θ:

         y4        y7
  y4   + (c2)      +
  y7   - (c1)      ?
  y9      ?        +
The qualitative links between endogenous variables established in section 4 partition the elements of matrix Θ into eight classes of qualitatively linked variables. These classes are shown in Figure 5.
To fill in all the elements of the matrix shown in Figure 5, we need to compute only two additional elements, which are θ_{y1,y4} and θ_{y1,y7}. To compute them, we use a condensed graph containing vertex y1, which is shown in Figure 6. Using definition (6), we easily compute the symbolic expressions given hereafter:

            y4                      y7
  y1   r3 u11 - r6 u3     r5 r7 u3 - u11(r4 r5 - 1)

Figure 5: Classes of qualitatively linked variables. (The rows y1, y4, y7, y9 and the columns y4, y7 carry the representative elements θ_{y1,y4}, θ_{y1,y7}, θ_{y4,y4}, θ_{y4,y7}, θ_{y7,y4}, θ_{y7,y7}, θ_{y9,y4}, θ_{y9,y7}; all remaining elements follow through the qualitative links.)

Figure 6: Condensed graph containing vertex y1.

and, given the constraints defined above, we obtain sign(θ_{y1,y4}) = -, while the sign of θ_{y1,y7} remains ambiguous.
The constraints needed to sign the remaining classes of elements, linked together with θ_{y1,y7}, θ_{y7,y7} and θ_{y9,y4}, as well as the two elements θ_{y1,y1} and θ_{y6,y6}, are either too complex or too stringent functions of the original parameters. Therefore we decide to leave these elements unsigned.
The sign of the elements of matrix D^{-1} concerning our example is given in Figure 7. This matrix describes the qualitative behavior of the endogenous variables of the model corresponding to our example. This means that, for any numerical values of the parameters verifying, first, the signs given in Figure 1 and, second, the constraints c0, c1 and c2, the elements of the shift-multiplier matrix will always bear the sign reported in the table.
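This kind of sign invariance can be illustrated on a deliberately tiny sign-restricted system (a two-equation sketch with a made-up sign pattern, not the paper's 11-equation example): sampling any parameter values that respect the sign hypotheses never changes sign(D^{-1}).

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical sign pattern: B = [[0, -u1], [u2, 0]] with u1, u2 > 0,
# so D = B - I and det(D) = 1 + u1*u2 > 0 = sign{(-1)^2} (stability holds).
expected = np.array([[-1, 1], [-1, -1]])      # sign pattern of D^{-1}

for _ in range(1000):
    u1, u2 = rng.uniform(0.1, 10.0, size=2)   # any positive parameter values
    B = np.array([[0.0, -u1], [u2, 0.0]])
    D = B - np.eye(2)
    signs = np.sign(np.linalg.inv(D)).astype(int)
    assert (signs == expected).all()

print("sign(D^{-1}) is invariant over the sampled parameter region")
```

For this pattern, D^{-1} = [[-1, u1], [-u2, -1]] / (1 + u1 u2), so the invariance can also be read off analytically; for larger systems the symbolic machinery of the paper replaces this kind of sampling.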

Figure 7: Sign of the elements in D^{-1} (entries +, -, or ? for unsigned; the entries marked ?^a become signed if c1 is not verified).

6 Further decomposition

The more a model increases in size and complexity, the less it will be possible to conclude about its qualitative responses. In the following, we suggest a further decomposition of the problem, which is likely to facilitate this task.
Let us consider a matrix D as defined in (2) which is undecomposable, i.e. which cannot be put into a block-triangular form. We then decompose matrix D into a sum of matrices

D = (R - I) + Q        (18)

where matrix R is such that R - I can be put into a block-triangular form. Considering Δy_{t-1} ≡ 0 and Δβ ≡ 0, expression (2) becomes

{(R - I) + Q} Δy = F Δz        (19)

which can be considered as the equilibrium relation for the following underlying dynamic model

(R - I) Δy_τ = -Q Δy_{τ-1} + F Δz        (20)

The response of this model after n iterations is given by the familiar convolution formula:

Δy_{τ+n} = (-1)^n (KQ)^n Δy_τ + {(-1)^n (KQ)^n + ... - KQ + I} KF Δz        (21)


where we have K = (R - I)^{-1}. An appropriate choice of matrix Q will guarantee lim_{n→∞} (KQ)^n = 0 and we get the following approximation for matrix D^{-1}:

{(-1)^n (KQ)^n + ... - KQ}K + K ≈ D^{-1}        (22)

The subsystems in the block-triangular matrix K are less complex than the original system. This simplifies the formulation of constraints, as they are independent from one subsystem to another. The decomposition given in (18) will yield a matrix Q with only a few non-zero columns. Then matrix KQ and (KQ)^i have the same simple pattern, i.e. the same zero columns as matrix Q. In the situation where matrix Q has a single non-zero column, the condition lim_{n→∞} (KQ)^n = 0 can easily be used to establish some necessary conditions to sign the remaining elements outside the diagonal blocks of matrix K.
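A numerical sketch of approximation (22): the matrices R - I and Q below are made up, with R - I triangular and Q carrying a single non-zero column as discussed above.

```python
import numpy as np

# Hypothetical decomposition D = (R - I) + Q.
R_minus_I = np.array([[-1.0,  0.0,  0.0],
                      [ 0.5, -1.0,  0.0],
                      [ 0.2,  0.3, -1.0]])
Q = np.array([[0.0, 0.0, 0.3],
              [0.0, 0.0, 0.2],
              [0.0, 0.0, 0.1]])
D = R_minus_I + Q

K = np.linalg.inv(R_minus_I)
KQ = K @ Q
# (KQ)^i keeps the zero columns of Q, and here its spectral radius is below 1,
# so the truncated series {(-1)^n (KQ)^n + ... - KQ + I} K approaches D^{-1}.
approx = np.zeros((3, 3))
term = np.eye(3)
for _ in range(30):
    approx += term @ K
    term = term @ (-KQ)

err = np.abs(approx - np.linalg.inv(D)).max()
print(err)
```

The series converges because D^{-1} = (I + KQ)^{-1} K and the spectral radius of KQ is smaller than one; each partial sum only involves the block-triangular factor K and powers of the sparse matrix KQ.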

7 Concluding remarks

This presentation suggests a qualitative approach to the analysis of structural sensitivity in a model. For sparse systems, as is the case for economic models, graph theory provides efficient tools for characterizing particular properties of the structure of the models and for computing the determinants and cofactors needed to establish the sign of some of the elements in the shift-multiplier matrix.
In particular, graph theory is used to reduce the original problem to an equivalent problem of much smaller size. This condensation reveals at the same time classes of qualitatively linked variables.
Dealing with the condensed problem significantly reduces the computational complexity, as it provides an appropriate factorization of the determinants and cofactors and allows the analytical development of all the computations. It is especially these factorized analytical expressions for determinants and cofactors that make it possible to identify necessary restrictions on the intervals of the parameters, in order to have these expressions signed.
This last point is the most important as, in practice, the existence of qualitative solutions will always require such restrictions on the parameters.


HADAMARD MATRIX PRODUCT, GRAPH AND SYSTEM


THEORIES: MOTIVATIONS AND ROLE IN ECONOMETRICS

M. FALIVA
Istituto di Econometria e Matematica per le Decisioni Economiche
Università Cattolica, Milano

In this paper it is shown that matching Hadamard product algebra with graph-
and system-theoretical arguments makes it possible to shed new light on a basic
econometric issue, namely the analysis of a model's causal structure. After out-
lining the problem, the paper develops an efficient mathematical toolkit, involving
advanced algebraic topics and giving several new results. This leads to a clear-cut
understanding of the causal and interdependent mechanisms associated with large
econometric models.

1 The econometric setting

As is well known, in an econometric model each equation represents a spe-
cific relation, drawn from an economic theory, expressing a link between an
endogenous variable - located on the left-hand side (LHS) of the equation -
and a set of variables - located on the right-hand side (RHS) of the equation -
which are considered as explicative of the former.
Each relationship specified by the model either represents a unidirectional
link among the LHS variable and the RHS ones - with an appended implicit causal
meaning - or mirrors a bidirectional link, with its feedback inheritance. The
phenomenon of interlacement, through feedback mechanisms, of LHS and RHS
endogenous variables is referred to, by econometricians, as interdependence. In
such a context, a resort to system theory concepts along with a reinterpreta-
tion of the model's connections in a graph-theoretic framework proves effective
in order to gain a thorough insight into the causal structure of econometric
models.
To state the matter more formally, let us consider the following economet-
ric model:

$$y = \Gamma y + Az + \varepsilon \qquad (1)$$
$$\varepsilon \sim N(0, \Sigma) \qquad (2)$$

where Γ, A and Σ are sparse parameter matrices - such that Γ is a hollow
matrixᵃ, with no unit root in its spectrum, and Σ is a diagonal matrix with
positive entries - and y, z and ε denote, respectively, a vector of L current
ᵃ By a hollow matrix we mean a square matrix whose diagonal entries are all zero.

endogenous variables, a vector of J predetermined variables and a vector of L


structural disturbances.
In order to highlight the structural characteristics of the mechanisms op-
erating in the model at a systematic level, let us examine the deterministic
counterpart of (1), namely:
$$y = \Gamma y + Az \qquad (3)$$
which can, more conveniently, be stated in the following form:

$$y = [\,I, A\,]\begin{bmatrix} 0 \\ z \end{bmatrix} + [\,I, A\,]\begin{bmatrix} \Gamma \\ 0 \end{bmatrix} y \qquad (4)$$

Specifying our model as in (4) leads - according to Faliva (1991) - to its


straightforward interpretation as a closed-loop system, as shown in Figure 1.

Figure 1: The model as a closed-loop system

Since, due to
$$\Gamma = [\,I, A\,]\begin{bmatrix} \Gamma \\ 0 \end{bmatrix} \qquad (5)$$

the matrix Γ plays the role of the system's feedback factor, the model's causal
structure actually turns out to depend crucially on the structure of the eigen-
values (λ_h) and the left eigenvectors (p′_h) of Γ. The point becomes clear
looking at the set of implicit auto-feedback single-equation relationships:

$$p'_h\,y = \lambda_h\,p'_h\,y + p'_h A z, \qquad h = 1, 2, \dots, L \qquad (6)$$

which ensue from (3) by premultiplying both sides by the left eigenvectors of the
matrix Γ.
By inspection of (6) the following conclusions, regarding the causal or
interdependent nature of the model, can easily be drawn:

a) whenever all the λ_h's turn out to be zero, which is tantamount to saying
that Γ is nilpotent, the feedback rebound - as evoked by the block di-
agram of Figure 1 - is only apparent and the equation system is of the
recursive (causal-chain) type;

b) if the λ_h's no longer vanish, then the matrix Γ turns out to be either de-
composable, with the feedback mechanisms operating at a local level and
a model causal pattern with block-recursive features, or indecomposable,
with the feedback mechanisms operating at a global level and a model
causal pattern with interdependence features.

Since decomposability, indecomposability and nilpotency - as far
as hollow matrices are concerned - turn out to be topological propertiesᵇ of a
matrix array, the foregoing remarks carry over from the feedback factor Γ to
its indicator matrixᶜ.
Another system of explicit auto-feedback equations can be deduced from
(3), namely:
$$e'_h\,y = \mu_h\,e'_h\,y + b'_h\,z \qquad (7)$$
where
$$b'_h = \{\gamma'_h(I - \Gamma_h)^{-1}J_h + e'_h\}\cdot A \qquad (8)$$
$$\mu_h = e'_h\,\Gamma(I - \Gamma)^{-1}e_h \cdot [\,e'_h(I - \Gamma)^{-1}e_h\,]^{-1} \qquad (9)$$
e_h is the h-th elementary vector and the b′_h's are coefficient vectors depending
on Γ and Aᵈ.
ᵇ With the term topological properties we mean - following Marimont (1969) - the proper-
ties of a matrix which depend exclusively on the density and the relative position of its null
(and non-null) entries.
ᶜ The point is a remarkable one since, unlike the parameter matrix Γ which needs to be
estimated, the knowledge of its indicator matrix - i.e. of the matrix whose entries are either
zero or one, depending on the values, null or non-null, respectively, taken on by the elements
of Γ - ensues from the model specification, as it simply mirrors the a-priori information
on which current endogenous variables do or do not play an explicative role in the various
equations.
ᵈ Let us derive, with no loss of generality, the expression (7) for h = 1.
Set:

In the light of (6) and (7) we call the eigenvalues λ_h of Γ characteristic
feedback factors, and the scalars μ_h - i.e. the ratios of the diagonal entries of
Γ(I - Γ)^{-1} to the diagonal entries of (I - Γ)^{-1} - intrinsic feedback multipliers.
Even if, at a first glance, by inspection of (6) and (7), the information
contents of the sets of coefficients λ_h and μ_h may look the same, the issue
is somewhat subtle and further investigation is required to grasp their very
meaning.
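The two sets of coefficients can be computed side by side; a minimal sketch, with a hypothetical 2×2 hollow feedback matrix (the value 0.12 below is just what this example happens to produce):

```python
import numpy as np

# Hypothetical hollow feedback matrix Gamma
G = np.array([[0.0, 0.4],
              [0.3, 0.0]])
inv = np.linalg.inv(np.eye(2) - G)

lam = np.linalg.eigvals(G)             # characteristic feedback factors
mu = np.diag(G @ inv) / np.diag(inv)   # intrinsic feedback multipliers

# Both diagonal ratios equal 0.12, while the eigenvalues are +/- sqrt(0.12)
assert np.allclose(mu, 0.12)
assert np.allclose(sorted(lam), [-0.12 ** 0.5, 0.12 ** 0.5])
```

The example already hints at the subtlety mentioned above: the μ_h coincide here while the λ_h differ in sign.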
Postponing the matter till Section 3, let us now move from a system-
anchored to a graph-oriented approach to causal structure analysis. The graph-
theoretic approach essentially rests on an interpretation of the interlacement of
endogenous variables inherent in the model in terms of a directed graph, with
the indicator matrix of Γ as adjacency matrix. The indicator matrix actually
and define:

$$e'_1\Gamma J'_1 \equiv \gamma', \qquad J_1\Gamma e_1 \equiv g, \qquad J_1\Gamma J'_1 \equiv \Gamma_1$$

Simple computations show that

$$e_1 e'_1 + J'_1 J_1 = I \qquad (i)$$
$$e'_1\Gamma = e'_1\Gamma J'_1 J_1 = \gamma' J_1 \qquad (ii)$$
$$e'_1(I - \Gamma)^{-1}e_1 = \{1 - \gamma'(I - \Gamma_1)^{-1}g\}^{-1} \qquad (iii)$$
$$J_1(I - \Gamma)^{-1}e_1 = (I - \Gamma_1)^{-1}\,g\,e'_1(I - \Gamma)^{-1}e_1 \qquad (iv)$$

where (iii) and (iv) arise from well known partitioned inversion formulas (see, e.g., Faliva,
1987). Let us now split the equation system (3) as follows:

$$\begin{bmatrix} e'_1 \\ J_1 \end{bmatrix} y = \begin{bmatrix} e'_1 \\ J_1 \end{bmatrix} \Gamma y + \begin{bmatrix} e'_1 \\ J_1 \end{bmatrix} A z \qquad (v)$$

After some algebraic manipulations (v) takes the form:

$$\begin{cases} e'_1 y = \gamma' J_1 y + e'_1 A z \\ J_1 y = g\,e'_1 y + \Gamma_1 J_1 y + J_1 A z \end{cases} \qquad (vi)$$

Solving the second equation of (vi) with respect to J_1 y gives:
$$J_1 y = (I - \Gamma_1)^{-1} g\,e'_1 y + (I - \Gamma_1)^{-1} J_1 A z \qquad (vii)$$
Replacing J_1 y in the first equation of (vi) by the RHS of (vii) gives:
$$e'_1 y = \gamma'(I - \Gamma_1)^{-1} g\,e'_1 y + b'_1 z$$
where:
$$b'_1 = \{\gamma'(I - \Gamma_1)^{-1} J_1 + e'_1\}\cdot A$$
and:
$$\gamma'(I - \Gamma_1)^{-1} g = e'_1\Gamma(I - \Gamma)^{-1}e_1 \cdot [\,e'_1(I - \Gamma)^{-1}e_1\,]^{-1}$$
bearing in mind (ii) and (iv) above.

mirrors the direct links from the RHS to the LHS endogenous variables, while
the positive integer powers of its transpose mirror the specular links among
such variables, i.e. the direct and indirect feedback rebounds.
From inspection of the zero-one pattern of the term-by-term product of
the adjacency matrix and of (the integer powers of) its transpose, it will then
be possible to gain a neat perception of the recursive and of the interdepen-
dent mechanisms operating in the model and, in short, of the model's causal
structure.
In order to master the issues raised so far we need an appropriate analytical
framework and toolkit, a topic to which the next section is devoted.

2 The algebraic apparatus

Let us introduce a few definitions and establish some basic results.

Definition 1 (Hadamard Product)


The Hadamard product A * B of two matrices A and B of the same order
is defined as the matrix of the term-by-term products of the elements of the
matrices being considered, namely:

$$A * B = [\,a_{ij}\,b_{ij}\,] \qquad (10)$$

The following properties of the Hadamard product can easily be estab-
lished (see Styan, 1973; Johnson and Shapiro, 1986; Faliva, 1987; Horn and
Johnson, 1991)ᵉ:

i) A * B = B * A

ii) (A * B)' = A' * B'

iii) (A + B) * C = A * C + B * C

iv) (A * B) · u = [(AB') * I] · u = [(BA') * I] · u, (11)
where u is a vector whose entries are all ones;

v) [A * (A^{-1})'] · u = u (12)

vi) u'(A * B) · u = tr(AB') (13)

vii) P(A * B) · P' = (PAP') * (PBP'), (14)
if P is either a permutation or a selection matrixᶠ;

ᵉ All matrices are supposed to have the appropriate dimensions and rank so that the
operations make sense.
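The listed properties are easy to verify numerically. A small sketch (random matrices; NumPy's `*` is exactly the Hadamard product, and the row-sum identity checked last is the one invoked later as property v) in the proof of Theorem 13):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
u = np.ones(4)
P = np.eye(4)[[2, 0, 3, 1]]            # a permutation matrix

assert np.allclose(A * B, B * A)                                      # i)
assert np.allclose((A * B).T, A.T * B.T)                              # ii)
assert np.allclose((A * B) @ u, np.diag(A @ B.T))                     # iv)
assert np.isclose(u @ (A * B) @ u, np.trace(A @ B.T))                 # vi)
assert np.allclose(P @ (A * B) @ P.T, (P @ A @ P.T) * (P @ B @ P.T))  # vii)
assert np.allclose((A * np.linalg.inv(A).T) @ u, u)                   # v)
```

The last check states that the rows of A * (A^{-1})' always sum to one, a fact well known from the relative gain array literature cited above.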

Definition 2 (Transition matrix from the Kronecker to the Hadamard prod-


uct)
The role of transition matrix from the Kronecker (⊗) to the Hadamard (*)
product is played by the selection matrices of the form:

$$J_N = \begin{bmatrix} e'_1(N) \otimes e'_1(N) \\ e'_2(N) \otimes e'_2(N) \\ \vdots \\ e'_N(N) \otimes e'_N(N) \end{bmatrix} \qquad (15)$$

where e_n(N) is the n-th elementary N × 1 vector. In fact the following equality
holds:
$$J_N(A \otimes B)J'_M = A * B \qquad (16)$$
for any pair of N × M matrices A and B.

The following properties of the matrix J are worth mentioning:

$$JJ' = I \qquad (17)$$
$$(J'J)^2 = J'J \qquad (18)$$
$$J \cdot \mathrm{vec}\,A = (A * I) \cdot u \qquad (19)$$
$$J'J \cdot \mathrm{vec}\,A = \mathrm{vec}(A * I) \qquad (20)$$
$$J \cdot (D \otimes D^{-1}) = J, \quad \text{if } D \text{ is a diagonal matrix,} \qquad (21)$$
$$PJ = J \cdot (P \otimes P), \quad \text{if } P \text{ is a permutation matrix.} \qquad (22)$$

ᶠ A permutation matrix is a full-rank square matrix whose rows (columns) are elementary
vectors. A selection matrix is a matrix obtained from an identity matrix by deleting one or
more of its rows.

Definition 3 (Hadamard idempotency)

A matrix A is said to be Hadamard idempotent if it satisfies the condition:

$$A * A = A \qquad (23)$$

Definition 4 (Hadamard orthogonality)

Two matrices A and B, of the same order, are said to be Hadamard-orthogonal
if they satisfy the condition:
$$A * B = 0 \qquad (24)$$
Theorem 1 If the matrices A and B are Hadamard-orthogonal then AB' is
a hollow matrix.

Proof. The proof is simple and thus omitted. □

Definition 5 (Binary matrix)

A binary matrix is a matrix whose entries are either one or zeroᵍ.

Theorem 2 A matrix A is Hadamard idempotent iff it is a binary matrix.

Proof. The proof is trivial. □


Definition 6 (Indicator matrix)
The indicator matrix A^b = [a^b_{ij}] of a given matrix A = [a_{ij}] is defined as the
matrix whose elements are given byʰ:

$$a^b_{ij} = \begin{cases} 0 & \text{if } a_{ij} = 0 \\ 1 & \text{otherwise} \end{cases} \qquad (25)$$

The following propositions can easily be verified:

$$(\alpha A)^b = A^b, \quad \alpha \neq 0 \qquad (26)$$
$$(A + B)^b \le (A^b + B^b)^b, \qquad (27)$$
$$(AB)^b \le (A^b B^b)^b, \qquad (28)$$
$$(PAP')^b = P A^b P', \quad \text{if } P \text{ is either a permutation or a selection matrix.} \qquad (29)$$

ᵍ Permutation and selection matrices are special cases of binary matrices.

ʰ The superscript b stands for the symbolic operator mapping the matrix into its binary
image.
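The indicator operator and the inequalities (27)-(28) can be sketched in a few lines of Python (the 2×2 matrices are illustrative; the entry 2 - 2 = 0 shows why (27) can be strict):

```python
import numpy as np

def indicator(A):
    """Binary image A^b: one where an entry is non-zero, zero elsewhere."""
    return (A != 0).astype(int)

A = np.array([[0.0, 2.0], [-1.0, 0.0]])
B = np.array([[1.0, -2.0], [0.0, 3.0]])

# (27): cancellation (here 2 - 2 = 0) can only remove non-zeros, never add them
assert np.all(indicator(A + B) <= indicator(indicator(A) + indicator(B)))
# (28): the same holds for matrix products
assert np.all(indicator(A @ B) <= indicator(indicator(A) @ indicator(B)))
```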

Definition 7 (Neutral matrix)

Given a matrix A, any binary matrix B such that:

$$A * B = A \qquad (30)$$

is said to be neutral towards A.

Theorem 3 If the binary matrix B is neutral towards A and C is a non-
negative matrix, then (B + C)^b is neutral towards A as well.

Proof. The proof is simple and thus omitted. □

Theorem 4 If the binary matrix B is neutral towards A, then (B^k)^b is neutral
towards A^k for any positive integer k.

Proof. The proof is simple and thus omitted. □

Corollary 1 The binary matrix (A^k)^b is neutral towards A^k for any positive
integer k.

Definition 8 (Nested and majorizing matrices)

If two matrices A and B, of the same order, satisfy the condition:

$$B^b \ge A^b \qquad (31)$$

and the indicator matrix of B is neutral towards A, then the matrix A is said
to be nested into B and B is said to majorize A.

Theorem 5 If the matrices A and B are Hadamard-orthogonal and C
is nested into B, then A and C are Hadamard-orthogonal as well.

Proof. The proof is simple and thus omitted. □


Definition 9 (Hadamard identity matrix)
The role of identity matrix with respect to the Hadamard product is played by
the binary matrix U whose entries are all ones. In fact the following equality
holds:
$$A * U = A \qquad (32)$$
for arbitrary A, provided that U has the appropriate order.

Definition 10 (Bounding binary matrix)

A bounding binary matrix F_B of a matrix F, depending on the matrices A,
B, ..., is a binary matrix, depending on the indicator matrices A^b, B^b, ...,
such that:
i) the matrix F_B is neutral towards F all over the arguments' range of
the latter;
ii) there exists a non-empty subset in the arguments' range of F where the
matrix F_B reduces to the indicator matrix of F.

Definition 11 (Similarity and cogrediency)

A similarity transformation of a square matrix A is a transformation of the
form:
$$A \mapsto B^{-1}AB \qquad (33)$$
where B is a non-singular matrix of the same order as A. If B is a permutation
matrix the similarity transformation is called a cogrediency transformationⁱ.

Lemma 1 A similarity transformation which does not affect - except possibly
for the order - the diagonal entries of the matrix at stake can be written as a
cogrediency transformation.

Proof. The invariance requirement - except possibly for the order - of the
diagonal entries of a matrix A under a similarity transformation, by a similarity
matrix B, can be stated, in algebraic terms, as follows:
$$J \cdot \mathrm{vec}(B^{-1}AB) = PJ \cdot \mathrm{vec}\,A \qquad (34)$$
where P is a permutation matrix and J is the matrix defined in (15).
According to a well known relationship among the vec operator, conventional and
Kronecker matrix products, (34) can be rewritten as:
$$J \cdot (B' \otimes B^{-1}) \cdot \mathrm{vec}\,A = PJ \cdot \mathrm{vec}\,A \qquad (35)$$
Since A is arbitrary, condition (35) implies that:
$$J \cdot (B' \otimes B^{-1}) = PJ \qquad (36)$$
Some computations, bearing in mind (21) and (22) above, show that equation
(36) holds true iff:
$$B' = PD, \quad \text{with } D^b = I \qquad (37)$$
and thus, in particular, when B' coincides with the permutation matrix P. □
ⁱ If two matrices are cogredient so are their indicator matrices.

Theorem 6 A matrix whose diagonal entries are equal to its eigenvalues is
cogredient to a triangular matrix.

Proof. As is well known, a square matrix can always be transformed into
a triangular matrix - with diagonal entries equal to the matrix eigenvalues -
by an orthogonal similarity transformation (see, e.g., Bellman, 1970). Should
the eigenvalues and the diagonal elements of the matrix coincide, then both
the original matrix and the derived triangular matrix will show the same array
of diagonal values, and the transformation at stake can be written - according
to the foregoing lemma - as a cogrediency transformation. □

Corollary 2 A hollow nilpotent matrix is cogredient to a hollow triangular
matrix.

The following theorems consider the issue of detecting the location of null
and non null entries of rational functions of matrices, and thus can provide
valuable information on the structure of the underlying indicator matrices
(Faliva, 1983).

Theorem 7 (Linear combination theorem)

Let A and B be two matrices of the same order and α and β two scalars; then
the binary matrix:
$$S = (A^b + B^b)^b \qquad (38)$$
is a bounding binary matrix of the linear combination:
$$\alpha A + \beta B \qquad (39)$$
Formally:
$$S \ge (\alpha A + \beta B)^b \qquad (40)$$
for arbitrary A, B, α, β, with the equality occurring if:
$$\alpha\beta\,(A * B) \ge 0, \qquad \alpha \neq 0, \; \beta \neq 0 \qquad (41)$$

Proof. Inequality (40) follows from (26) and (27) in a straightforward man-
ner. Concerning the latter part of the theorem, observe that under (41) the
following holds true for the entries of (39):
$$\begin{cases} \alpha a_{ij} + \beta b_{ij} = 0, & \text{if both } a_{ij} \text{ and } b_{ij} \text{ are zero,} \\ \alpha a_{ij} + \beta b_{ij} \neq 0, & \text{otherwise,} \end{cases} \qquad (42)$$
and thus (40) will be satisfied with the equality sign. □

Theorem 8 (Product theorem)

Let A and B be two matrices of order L × M and M × N respectively; then
the binary matrix:
$$\Pi = (A^b B^b)^b \qquad (43)$$
is a bounding binary matrix of the product:
$$AB \qquad (44)$$
Formally:
$$\Pi \ge (AB)^b \qquad (45)$$
for arbitrary A, B, with the equality occurring if every vector:
$$c'_{ln} = (e'_l(L)\,A) * (e'_n(N)\,B'), \qquad l = 1, 2, \dots, L; \; n = 1, 2, \dots, N; \qquad (46)$$
is sign-stable, i.e. it is either a non-negative or a non-positive vector.

Proof. Inequality (45) follows from inequality (28). Regarding the latter
part of the theorem, observe that under (46) the following holds true for the
elements of (44):
$$\begin{cases} e'_l(L)\,AB\,e_n(N) = 0 & \text{if } c'_{ln} = 0, \\ e'_l(L)\,AB\,e_n(N) \neq 0, & \text{otherwise,} \end{cases} \qquad (47)$$
and thus (45) will be satisfied with the equality sign. □

Theorem 9 (Inverse matrix theorem)

Let A be a square matrix of order N, with no unit root; then the binary matrix:

$$R = [\,I + A^b + (A^b)^2 + \dots + (A^b)^{N-1}\,]^b \qquad (48)$$

is a bounding binary matrix of the inverse matrix:

$$(I - A)^{-1} \qquad (49)$$
Formally:
$$R \ge [\,(I - A)^{-1}\,]^b \qquad (50)$$
for arbitrary A, with the equality occurring if A is a non-negative stable ma-
trixʲ.
ʲ A stable matrix is a matrix whose eigenvalues all lie inside the unit circle.

Proof. Starting from the Cayley-Hamilton theorem the following equalities
can easily be established:
$$(I - A)^{-1} = \alpha_0 I + \sum_{n=1}^{N-1}\alpha_n(I - A)^n = \beta_0 I + \sum_{n=1}^{N-1}\beta_n A^n \qquad (51)$$
where the scalars α_0, α_1, ..., α_{N-1}, β_0, β_1, ..., β_{N-1} are rational functions of
the coefficients of the characteristic polynomial of I - A (see, e.g., Rao and
Mitra, 1971; Miller, 1987). By repeated application of rules (26) and (27) we
get:

$$[\,(I - A)^{-1}\,]^b \le [\,I + A^b + (A^b)^2 + \dots + (A^b)^{N-1}\,]^b = R \qquad (52)$$

which, bearing in mind Corollary 1, leads to (50).

As far as the latter part of the theorem is concerned, it is well known that the
following power series expansion:

$$(I - A)^{-1} = I + \sum_{h=1}^{\infty} A^h \qquad (53)$$

holds true when A is a stable matrix, and all terms are non-negative if such is
A. Hence, by referring to Theorems 7 and 8, we get:

$$[\,(I - A)^{-1}\,]^b \ge [\,I + A^b + (A^b)^2 + \dots + (A^b)^{N-1}\,]^b = R \qquad (54)$$

A comparative examination of (52) with (54) leads to the conclusion that (50)
holds with the equality sign. □
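A numerical sketch of the bound (50), with R built as in (48); the 3×3 matrix below is an illustrative non-negative stable one, so equality obtains:

```python
import numpy as np

def indicator(A):
    return (A != 0).astype(int)

def bounding_R(A):
    """R of (48): binary image of I + A^b + (A^b)^2 + ... + (A^b)^(N-1)."""
    N = A.shape[0]
    Ab = indicator(A)
    S = np.eye(N, dtype=int)
    P = np.eye(N, dtype=int)
    for _ in range(N - 1):
        P = P @ Ab
        S = S + P
    return indicator(S)

A = np.array([[0.0, 0.5, 0.0],
              [0.0, 0.0, 0.4],
              [0.2, 0.0, 0.0]])         # non-negative, spectral radius < 1
R = bounding_R(A)
inv_b = indicator(np.round(np.linalg.inv(np.eye(3) - A), 12))

assert np.all(R >= inv_b)               # the bound (50)
assert np.array_equal(R, inv_b)         # equality for non-negative stable A
```

The rounding before taking the indicator is only there to suppress floating-point dust that would otherwise count as spurious non-zeros.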

On the basis of these premises we can now establish the following theorems
which play a significant role in sorting out the questions raised in Section 1.

Theorem 10 If A is a hollow matrix and R is the matrix defined by (48),
then:

i) A * R' and A - A * R' form a Hadamard-orthogonal pair;

ii) A - A * R' is nilpotent;

iii) A * R' and A are cospectral.

Proof of i). Bearing in mind that R is Hadamard idempotent, simple com-
putations give:

$$(A * R') * (A - A * R') = A * R' * A - A * R' * A * R' =$$
$$= A * R' * A - A * R' * R' * A = A * R' * A - A * R' * A = 0 \qquad (55)$$

Proof of ii). Let:

$$\Psi = A - A * R' = A * (U - R') \qquad (56)$$

and let N be the order of the matrices at stake.
The nilpotency of Ψ can be established by showing that (see, e.g., Gantmacher,
1960):
$$\mathrm{tr}\,\Psi^{n+1} = 0, \qquad n = 0, 1, \dots, N - 1 \qquad (57)$$
For n = 0, (57) is trivially true. For n > 0, let us observe that, bearing in
mind Theorems 3 and 4 and Corollary 1, the matrix R turns out to be neutral
towards A^n and (A^b)^n. Moreover, since Ψ and Ψ^b are nested into A and A^b
respectively, R is neutral towards Ψ^n and (Ψ^b)^n as well.
Hence:
$$\Psi' * \Psi^n = A' * (U - R) * \Psi^n = A' * (U * \Psi^n - R * \Psi^n) = A' * (\Psi^n - \Psi^n) = A' * 0 = 0 \qquad (58)$$
which in turn implies, according to property vi) of the Hadamard product,
that:
$$u' \cdot (\Psi' * \Psi^n) \cdot u = \mathrm{tr}\,\Psi^{n+1} = 0 \qquad (59)$$
and thus statement ii) is proved.

Proof of iii). The proof is similar. Let:

$$C = A * R' \qquad (60)$$

The cospectrality of C and A can be demonstrated by showing that:

$$\mathrm{tr}\,C^{n+1} = \mathrm{tr}\,A^{n+1}, \qquad n = 0, 1, \dots, N - 1 \qquad (61)$$

For n = 0, (61) is trivially true. For n > 0, observe that, since C is nested
into A, bearing in mind Theorems 3 and 4 and Corollary 1, R turns out to
be neutral towards C^p A^q, where p and q are non-negative integers such that
p + q < N.
Now observe that:

$$\mathrm{tr}(C^r A^s) = \mathrm{tr}[\,C(C^{r-1}A^s)\,] = \mathrm{tr}[\,(C^r A^{s-1})A\,] =$$
$$= u'\{C' * (C^{r-1}A^s)\}u = u'\{A' * (C^r A^{s-1})\}u \qquad (62)$$

but:
$$C' * (C^{r-1}A^s) = A' * R * (C^{r-1}A^s) = A' * (C^{r-1}A^s)$$
$$A' * (C^r A^{s-1}) = A' * (C^r A^{s-1}) * R = A' * R * (C^r A^{s-1}) = C' * (C^r A^{s-1}) \qquad (63)$$
which implies:
$$\mathrm{tr}(C^r A^s) = \mathrm{tr}(C^{r-1} A^{s+1}) \qquad (64)$$
Repeated application of the argument above leads to prove (61) and, in turn,
statement iii). □

Corollary 3 A hollow matrix A has the decomposition:

$$A = C + \Psi \qquad (65)$$

where the matrices C and Ψ have the representations:

$$C = A * R' \qquad (66)$$
$$\Psi = A * (U - R)' \qquad (67)$$

and have the following properties:

i) C and Ψ form a Hadamard-orthogonal pair,

ii) C and A are cospectral matrices,

iii) Ψ is a hollow nilpotent matrix.

Remark - Whenever A^b represents the adjacency matrix of a directed graph,
the decomposition shown in Corollary 3 splits the original graph into two sub-
graphs with adjacency matrices given by C^b and Ψ^b, respectively: the former
subgraph depicts the circuits of the original graph and the latter the simple
paths. Looking at the decomposition from a system-theoretic standpoint, one
is led to detect the feedback mechanisms of the model, on the one hand, and the
unidirectional links among the variables, on the other.
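The decomposition of Corollary 3 is easy to reproduce numerically; a sketch on a small hypothetical hollow matrix (a circuit on vertices 1 and 2 plus a pendant simple path towards vertex 3):

```python
import numpy as np

def indicator(A):
    return (A != 0).astype(int)

def bounding_R(A):
    """R of (48): binary image of I + A^b + ... + (A^b)^(N-1)."""
    N = A.shape[0]
    Ab, S, P = indicator(A), np.eye(N, dtype=int), np.eye(N, dtype=int)
    for _ in range(N - 1):
        P = P @ Ab
        S = S + P
    return indicator(S)

A = np.array([[0.0, 0.7, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.3, 0.0]])
R = bounding_R(A)
C = A * R.T                 # circuit part, (66)
Psi = A * (1 - R).T         # simple-path part, (67)

assert np.all(C * Psi == 0)                               # Hadamard-orthogonal
assert np.allclose(np.linalg.matrix_power(Psi, 3), 0)     # Psi nilpotent
assert np.allclose(np.sort(np.linalg.eigvals(C)),
                   np.sort(np.linalg.eigvals(A)))         # C, A cospectral
```

Here the circuit entries 0.7 and 0.5 land in C, the pendant entry 0.3 lands in Ψ, and the three properties of the corollary hold exactly as stated.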

Theorem 11 Let A be a hollow matrix and R the matrix defined by (48). If:

$$A * R' = 0 \qquad (68)$$

then A is nilpotent.

Proof. Saying that A * R' = 0 is tantamount to saying that:

$$A^b * R' = 0 \qquad (69)$$

Premultiplying and postmultiplying both sides of (69) by u' and u, respectively,
gives:

$$u' \cdot (A^b * R') \cdot u = 0 \qquad (70)$$

which, according to property vi) of the Hadamard product, corresponds to:

$$\mathrm{tr}\,A^b + \sum_{n=2}^{N} \mathrm{tr}\,(A^b)^n = 0 \qquad (71)$$

Since all the arguments of the trace operators in (71) are non-negative matrices,
equality (71) implies that:

$$\mathrm{tr}\,(A^b)^n = 0, \qquad n = 1, 2, \dots, N \qquad (72)$$

Thus A^b, and consequently A, is nilpotentᵏ. □
Theorem 12 If A is a hollow nilpotent matrix then the matrix A · (I - A)^{-1}
is hollow as well.

Proof. According to Corollary 2, the matrix A will be cogredient to a hollow
lower (upper) triangular matrix and accordingly the matrix (I - A')^{-1} will be
cogredient to an upper (lower) triangular matrix. Hence A and (I - A')^{-1}
turn out to be Hadamard-orthogonal and the product A · (I - A)^{-1} will be,
by Theorem 1, a hollow matrix. □
ᵏ According to Corollary 2, A^b is cogredient to a hollow triangular matrix and the property
carries over to the matrix A. Otherwise stated: the nilpotency of a hollow matrix is a
topological property of the matrix.

Theorem 13 If:
$$A * (I - A')^{-1} = 0 \qquad (73)$$
then:
$$(I - A)^{-1} * I = I \qquad (74)$$

Proof. According to property v) of the Hadamard product we have:

$$\{(I - A) * (I - A')^{-1}\} \cdot u = u \qquad (75)$$

Under (73) equality (75) becomes:

$$\{I * (I - A')^{-1}\} \cdot u = u \qquad (76)$$

which, since the matrix in brackets is diagonal, is equivalent to (74). □

Corollary 4 If A is a hollow nilpotent matrix then:

$$\mu = [\,(I - A)^{-1} * I\,]^{-1} J \cdot \mathrm{vec}[\,A \cdot (I - A)^{-1}\,] = 0 \qquad (77)$$

where J is the matrix defined by (15)ˡ.

3 The core of the issue enlightened

Let us reconsider the model seen in Section 1 in its deterministic version:

$$y = \Gamma y + Az \qquad (78)$$

Splitting the matrix Γ into a nilpotent term Ψ and a cospectral term C as
shown in Corollary 3, our reference model can be written as:

$$y = Cy + \Psi y + Az \qquad (79)$$

where:
$$C = \Gamma * R' \qquad (80)$$
$$\Psi = \Gamma * (U - R)' \qquad (81)$$
$$C * \Psi = 0 \qquad (82)$$
Rewriting (79) in the form:

$$y = [\,I, A\,]\begin{bmatrix} 0 \\ z \end{bmatrix} + [\,I, A\,]\begin{bmatrix} \Psi \\ 0 \end{bmatrix} y + [\,I, A\,]\begin{bmatrix} C \\ 0 \end{bmatrix} y \qquad (83)$$

Figure 2: The model as a multiple closed-loop system

leads to an interpretation of it as a multiple closed-loop system as shown in
Figure 2.

Now observe that:

a) from a graph perspective, the splitting of the indicator matrix Γ^b into the
Hadamard-orthogonal pair Ψ^b and C^b corresponds to distinguishing, within
the oriented graph associated with our model, the simple paths from the
circuitsᵐ.

ˡ The converse is not true: the class of matrices A satisfying (77) turns out to be somewhat
broader than the class of hollow nilpotent matrices.
ᵐ Actually, the (non-zero entries of the) indicator matrix Γ^b reflect the direct links from
the endogenous variables on the right-hand side to those on the left-hand side of the system.
Conversely, the transpose matrix (Γ^b)' shows the direct links in the opposite direction and
its integer powers reflect the set of the indirect links.
The binary matrix:
$$[\,(\Gamma^b)' + ((\Gamma^b)')^2 + \dots + ((\Gamma^b)')^{L-1}\,]^b$$
thus collects all the basic information on the overall (direct and indirect) links connecting
the left-hand side endogenous variables to the right-hand side ones.

b) from a system perspective, the splitting of the feedback factor Γ into the
matrix pair C and Ψ corresponds to separating the effective feedback loops,
with the appended interdependence meaning, from the unidirectional
links among variables, with the respective causal meaningⁿ.

An example may highlight the information content of the suggested de-
composition. Take for instance the following simple model:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
\gamma_{21} & 0 & 0 & \gamma_{24} & 0 & 0 \\
0 & \gamma_{32} & 0 & 0 & \gamma_{35} & 0 \\
0 & 0 & 0 & 0 & \gamma_{45} & \gamma_{46} \\
0 & 0 & 0 & \gamma_{54} & 0 & 0 \\
0 & \gamma_{62} & 0 & 0 & \gamma_{65} & 0
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} +
\begin{bmatrix}
a_{11} & 0 & 0 \\
0 & a_{22} & 0 \\
a_{31} & 0 & 0 \\
0 & a_{42} & a_{43} \\
a_{51} & 0 & 0 \\
0 & a_{62} & a_{63}
\end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$$

ⁿ As far as the structure of the matrix C is concerned, observe that (see Faliva, 1992):

i) direct feedback between two variables y_i and y_j corresponds to a straight two-way
connection joining the vertex pair "i" and "j" of the oriented graph associated with
the model. In algebraic terms direct feedback is identifiable by the existence of common
non-null elements in the matrix pair Γ and Γ', namely by the non-null elements of the
Hadamard product Γ * Γ'.
ii) Indirect feedback between two variables y_i and y_j is associated with a bidirectional
connection between the vertices "i" and "j" of the graph corresponding to the model:
one connection bridging straight "i" to "j", and the other bridging "j" to "i" through one or
more other vertices. In algebraic terms indirect feedback is identifiable by the existence
of common non-null elements in the matrix Γ and in positive powers of its transpose,
i.e. by the non-null elements of the Hadamard products Γ * (Γ')^r, r = 2, 3, ..., L - 1.
The overall feedback effects are therefore recognizable from the non-zero entries of the
matrix
$$\Gamma * [\,\Gamma' + (\Gamma')^2 + \dots + (\Gamma')^{L-1}\,]^b$$
Since C accounts for all feedback mechanisms, its Hadamard-orthogonal complement
Ψ is unaffected by them and thus mirrors the unidirectional (causal) mechanisms acting
on the model.

The three matrices Γ, Ψ and C are given by:

$$\Gamma = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
\gamma_{21} & 0 & 0 & \gamma_{24} & 0 & 0 \\
0 & \gamma_{32} & 0 & 0 & \gamma_{35} & 0 \\
0 & 0 & 0 & 0 & \gamma_{45} & \gamma_{46} \\
0 & 0 & 0 & \gamma_{54} & 0 & 0 \\
0 & \gamma_{62} & 0 & 0 & \gamma_{65} & 0
\end{bmatrix}$$

$$\Psi = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
\gamma_{21} & 0 & 0 & 0 & 0 & 0 \\
0 & \gamma_{32} & 0 & 0 & \gamma_{35} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

$$C = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \gamma_{24} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \gamma_{45} & \gamma_{46} \\
0 & 0 & 0 & \gamma_{54} & 0 & 0 \\
0 & \gamma_{62} & 0 & 0 & \gamma_{65} & 0
\end{bmatrix}$$

and the oriented graphs corresponding to their indicator matrices Γ^b, Ψ^b and
C^b are shown in Figure 3.
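The splitting above can be reproduced at the pattern level; a sketch that plants an illustrative value (0.1) in each γ position and recovers the printed Ψ and C:

```python
import numpy as np

def indicator(A):
    return (A != 0).astype(int)

# Gamma's zero pattern from the example: entry (i, j) holds gamma_ij
edges = [(2, 1), (2, 4), (3, 2), (3, 5), (4, 5), (4, 6),
         (5, 4), (6, 2), (6, 5)]
G = np.zeros((6, 6))
for i, j in edges:
    G[i - 1, j - 1] = 0.1               # illustrative value for gamma_ij

# R of (48), then the splitting of Corollary 3
Ab, S, P = indicator(G), np.eye(6, dtype=int), np.eye(6, dtype=int)
for _ in range(5):
    P = P @ Ab
    S = S + P
R = indicator(S)
C, Psi = G * R.T, G * (1 - R).T

# The recovered patterns match the printed Psi and C
assert [(i + 1, j + 1) for i, j in zip(*np.nonzero(Psi))] == \
       [(2, 1), (3, 2), (3, 5)]
assert [(i + 1, j + 1) for i, j in zip(*np.nonzero(C))] == \
       [(2, 4), (4, 5), (4, 6), (5, 4), (6, 2), (6, 5)]
```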
In the light of the arguments developed up to now, the following conclu-
sions - covering graph and system profiles and shedding light on the model's
causal structure - hold true:

Proposition 1 If Γ = Ψ, then:

* the underlying oriented graph, whose adjacency matrix is Γ^b, does not con-
tain any circuit;

* the linear system shown in Figure 1 is - despite its appearance - an open-loop
system, since there is no feedback loop bridging variables;

* the model is of the recursive, or causal-chain, type.

Proposition 2 If all the eigenvalues of Γ are zero, i.e. if Γ is nilpotent, the
conclusions are the same as above.

Figure 3: Oriented graphs corresponding to Γ^b (top) and to Ψ^b and C^b (bottom).
(The dotted lines correspond to the simple paths which are embodied in Ψ^b;
the solid lines correspond to the circuits which are embodied in C^b.)



Proposition 3 If Γ · (I - Γ)^{-1} is a hollow matrix, namely if the scalars
μ_1, μ_2, ... of formula (9) are zero, then:

* the underlying oriented graph, whose adjacency matrix is Γ^b, may contain
not only simple paths but also circuits; in the latter case there is anyhow
a multiplicity of circuits crossing the relevant vertices;

* the linear system shown in Figure 1 may be either an open or a closed-loop
system: in the latter case negative and positive feedbacks balance each
other;

* the model can be either recursive or it exhibits mutually compensating inter-
dependencies among variables.

Proposition 4 If neither Γ = Ψ nor Γ = C, then:

* the underlying directed graph, whose adjacency matrix is Γ^b, can be split
into two subgraphs with adjacency matrices Ψ^b and C^b, respectively, the
former depicting the simple paths of the original graph and the latter its
circuits;

* the linear system of Figure 1 is actually a closed-loop system with feedback
connections among (subsets of) variables;

* the model has a block-recursive causal structure.

Proposition 5 If Γ = C, then:

* the underlying directed graph, whose adjacency matrix is Γ^b, is either strongly
connected in its entirety or it is made up of a multiplicity of disjoint
strongly connected subgraphs;

* the system shown in Figure 1 is a closed-loop system with feedback rebounds
on the whole set of variables;

* the model is either interdependent in its entirety or it can be depicted as a
set of disjoint interdependent subsystems.

The flow-chart in Figure 4 mirrors the conclusions we have just drawn.
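The pattern-level branch of that decision can be sketched as a small classifier; a hypothetical helper that only inspects zero patterns (it does not cover the self-balancing case of Proposition 3, which requires the numerical values of Γ):

```python
import numpy as np

def indicator(A):
    return (A != 0).astype(int)

def causal_structure(G):
    """Classify a hollow feedback pattern along the lines of Figure 4 (sketch)."""
    N = G.shape[0]
    Ab, S, P = indicator(G), np.eye(N, dtype=int), np.eye(N, dtype=int)
    for _ in range(N - 1):              # R of (48)
        P = P @ Ab
        S = S + P
    R = indicator(S)
    C, Psi = G * R.T, G * (1 - R).T
    if not C.any():                     # Gamma = Psi: no circuits at all
        return "recursive"
    if not Psi.any():                   # Gamma = C: circuits only
        return "interdependent"
    return "block-recursive"

assert causal_structure(np.tril(np.ones((3, 3)), -1)) == "recursive"
assert causal_structure(np.array([[0.0, 1.0], [1.0, 0.0]])) == "interdependent"
```

A strictly triangular pattern is classified as recursive, a two-way loop as interdependent, and any mixture of the two as block-recursive.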

Acknowledgments

Support from the Italian Ministry of the University and Scientific Research
(MURST 40% 1994-1995, Faliva) is gratefully acknowledged.

Figure 4: The flow-chart depicting the model's causal structure (possible outcomes:
recursiveness; recursiveness or self-balancing interdependences; block
recursiveness; interdependence)

References

Bellman, R., 1970. Introduction to Matrix Analysis, McGraw-Hill, New
York.

Faliva, M., 1983. Identificazione e stima nel modello lineare ad equazioni
simultanee, Vita e Pensiero, Milano.

Faliva, M., 1987. Econometria: Principi e metodi, UTET, Torino.

Faliva, M., 1991. «L'analisi dei modelli econometrici nell'ambito della teo-
ria dei sistemi» In M. Faliva (ed.), Il ruolo dell'econometria nell'ambito
delle scienze economiche, Bologna, Il Mulino: p. 115-126.

Faliva, M., 1992. «Recursiveness vs. interdependence in econometric mod-
els: a comprehensive analysis for the linear case» Journal of the Italian
Statistical Society: p. 335-357.

Gantmacher, F. R., 1959. The Theory of Matrices, 2 vols., Chelsea Publ.
Co., New York.

Johnson, C. R. and H. M. Shapiro, 1986. «Mathematical aspects of
the relative gain array (A * A^{-T})» SIAM J. Alg. Disc. Meth., 7: p.
627-644.

Horn, R. A. and C. R. Johnson, 1991. Topics in Matrix Analysis, Cam-
bridge University Press, Cambridge.

Marimont, R. B., 1969. «System Connectivity and Matrix Properties»
Bull. of Math. Biophysics: p. 255-274.

Miller, K. S., 1987. Some Eclectic Matrix Theory, Krieger Publ. Co.,
Malabar, Florida.

Rao, C. R. and S. K. Mitra, 1971. Generalized Inverse of Matrices and
its Applications, Wiley, New York.

Styan, G. P. N., 1973. «Hadamard products and multivariate statistical
analysis» Linear Algebra and its Applications: p. 217-240.

For a general overview and further reading, see:

Basilevsky, A., 1983. Applied Matrix Algebra in the Statistical Sciences,
North-Holland, New York.

Faliva, M., 1995. «Causality and interdependence in linear econometric
models» In C. Dagum et al. (eds.), Quantitative Methods for Applied
Sciences, Nuova Immagine, Siena: p. 186-204.

Faliva, M. and M. G. Zoia, 1994. «Detecting and testing causality in
linear econometric models» Journal of the Italian Statistical Society: p.
61-76.

Fiedler, M., 1986. Special Matrices and their Applications in Numerical
Mathematics, Nijhoff, Dordrecht.

Huggins, W. H. and D. R. Entwisle, 1968. Introductory Systems and
Design, Blaisdell, Waltham, Mass.

Zoia, M. G., 1994. «ANRINT/1: software per l'analisi dei meccanismi ri-
corsivi ed interdipendenti dei modelli econometrici lineari» In G. Calzo-
lari (ed.), Software Sperimentale per la Statistica, Centro Pubbl. Offset,
Firenze.

INTERNATIONAL COMPARISONS AND CONSTRUCTION OF OPTIMAL GRAPHS

B. ZAVANELLA
Laboratorio Statistico-Informatico
Università degli Studi di Milano

In this paper an application of graph theory is proposed, by making comparisons among prices, quantities and volumes, surveyed in k different spatial situations. It is part of a wider analysis of private consumption in 12 countries belonging to the European Economic Community. Graph theory turns out to be a powerful tool to identify the optimal minimum path arising in the construction of a multilateral system of index numbers. The problem of optimal minimum path construction is solved through the Kruskal algorithm and applied to the 1990 data of the 12 EEC countries. Homogeneous groups of countries are also constructed, by using a particular graph condensation procedure.

1 Introduction

In this paper an application of graph theory is proposed by making comparisons among prices, quantities and values, surveyed in k different spatial situations; it is part of a wider analysis of private consumption in 12 countries belonging to the European Economic Community (EEC), carried out at the University of Milan. The data have been supplied by Eurostat and result from the 1990 prices survey. This analysis is articulated in three principal phases:

Phase I concerns the implementation of suitable statistical techniques to check the data quality, i.e. to identify and, possibly, correct outliers and wrong data.

Phase II regards a study (Zavanella, 1993a, 1993b) of the EEC economic structure, whose purpose is the construction of groups of countries characterised by homogeneous consumption models. The aim is reached by applying two multivariate statistical techniques (Multidimensional Scaling and Cluster Analysis), and three homogeneous areas are identified: North European countries (United Kingdom, Ireland and Denmark), Central European countries (Germany, France, Luxembourg, the Netherlands and Belgium) and South European countries (Italy, Greece, Spain and Portugal).

Phase III deals with a traditional comparison between the price and quantity levels observed in each country. The axiomatic theory of index numbers is applied in this phase (Martini, 1992); its main definitions and concepts are summarised in appendix A. Graph theory turns out to be a powerful tool to identify the optimal minimum path arising in the construction of a multilateral system of index numbers satisfying the imposed coherence conditions (see appendix A).

In the following section, the optimal minimum path problem is described; its formalization and solution in terms of graph theory is presented in the third section. Afterwards, the technique developed to solve the minimum path problem is applied in section 4 to the 12 EEC countries. The construction of homogeneous areas is considered in section 5 and is performed by using a graph condensation procedure. Results are discussed in section 6 and comparisons are made with results from multivariate statistical techniques.

2 Multilateral comparison and mixed system

Questions concerning multilateral comparisons are briefly sketched in Appendix A; in this section an example is shown, in order to clarify the problem discussed and solved in the next section by means of graph theory. A multilateral system of price indices satisfying the transitivity condition can be obtained by constructing a mixed system of direct and indirect indices, which is able to ensure the axiomatic properties. The formulas used in direct indices must satisfy base and factor reversibility. The choice of using base reversible indices follows from two important reasons. First, global coherence among comparisons can be obtained, e.g., by requiring that the Italy price index with France base be equal to the reciprocal of the France index with Italy base. Secondly, base reversibility allows the analysis to be limited to the k(k - 1)/2 indices lying in the upper triangle of the (k x k) base reversible indices matrix. A mixed system (see appendix A) is composed of (k - 1) direct indices P_t/b (t, b in K) that originate, once linked together, the following (k - 1)(k - 2)/2 indirect indices:

P*_t/s = P_t/b / P_s/b    (1)

Those indices are necessary to fill the upper triangle of the matrix; however, each situation has to be considered at least once in the (k - 1) direct indices. The main problem in constructing the mixed system regards the choice of the (k - 1) pairs of situations to compare by using direct reversible indices such as Fisher (1922) or Sato-Vartia (Sato, 1976; Vartia, 1976) (see appendix A). From a set of k situations it is possible to obtain k^(k-2) subsets of (k - 1) pairs; if k is relatively large the number of subsets is very high; in fact, when k = 3, the possible sets are 3, when k = 4 they are 16, when k = 5 they become 125,

etc. For this reason the search for the optimal set for the construction of the optimal graph is very complex. The optimal set to construct the mixed system, formed by (k - 1) pairs, can be uniquely identified among all the others by imposing the further condition of a minimal structural dissimilarity within each pair of situations.
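The counts quoted above follow Cayley's formula: the complete graph on k nodes has k^(k-2) spanning trees, i.e. k^(k-2) candidate sets of (k - 1) pairs. A quick numerical check (a sketch of ours, not from the paper):

```python
# Cayley's formula: the complete graph on k nodes has k**(k-2) spanning trees.
for k in (3, 4, 5):
    print(k, k ** (k - 2))   # 3 3, 4 16, 5 125
```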
That condition, among other properties, allows the main goal a mixed system is built for to be reached: the safeguard of the identity property, in those comparisons where it is required, by virtue of a similar structure in the countries compared. The Bortkiewicz formula (Bortkiewicz, 1922) (see appendix A):

B_t/b = |PP_t/b - LP_t/b| / LP_t/b    (2)

is treated as a dissimilarity measure between the economic structures of two situations t and b; it is calculated for all pairs of situations and the results can be ordered in a (k x k) matrix having the following characteristics:

1) the diagonal elements are zeros, for the dissimilarity between a situation and itself is zero;

2) the matrix is symmetric; this follows from the equality B_t/b = B_b/t.

Given the characteristics of the Bortkiewicz matrix, the mixed system can be found simply by considering the elements in the upper triangle. These elements shall be sorted in increasing order and the (k - 1) pairs to be compared are chosen starting from the smallest Bortkiewicz index by using the following procedure, proposed by Martini (1992):

1) the first two pairs must be chosen; the transitive indices for these two pairs are set equal to the corresponding direct indices, that is: P*_t/b = P_t/b;

2) all the possible indirect indices are built, as in equation (1), by linking together the direct indices of the chosen pairs;

3) the third pair is selected if and only if it has not been already indirectly compared in step 2);

4) the procedure ends when all the situations have been considered at least once in the direct comparisons; if every step is correctly executed, the pairs selected for the direct comparison are exactly (k - 1).

Table 1: Bortkiewicz Matrix.

  A B C D
A 0 0.05 0.15 0.90
B 0.05 0 0.40 0.85
C 0.15 0.40 0 0.95
D 0.90 0.85 0.95 0

Table 2: Fisher Matrix.

  A B C D
A 1 1.010 0.870 3.010
B 0.990 1 0.790 2.810
C 1.149 1.266 1 3.210
D 0.332 0.356 0.316 1

The procedure assures the identity of the price index for those pairs of situations where equality or proportionality of prices holds, implying Bortkiewicz indices close to zero. An example will illustrate the choice procedure described above.

Let us compare four situations A, B, C, and D by means of a mixed system. The related Bortkiewicz and Fisher matrices are in Table 1 and Table 2 respectively. The ordered Bortkiewicz values are in Table 3.
Note that "*" in Table 3 indicates the three pairs chosen for direct comparison. The first two pairs are chosen as usual; the third pair is rejected because the transitive index P*_C/B has already been computed by linking together the indices P*_C/A and P*_B/A; the fourth pair is chosen as it includes the situation D in the system. At this point the objective has been reached and the procedure can be stopped. Therefore the mixed system matrix is as in Table 4, where bold characters indicate indices calculated by using the direct formula, whereas the others are obtained as transitive indirect indices:

P*_B/A = P_B/A = 1.010
P*_C/A = P_C/A = 0.870

Table 3: Ordered Bortkiewicz values.

b t B_t/b

A B* 0.05
A C* 0.15
B C 0.40
B D* 0.85
A D 0.90
C D 0.95

Table 4: Mixed System matrix.

  A B C D
A 1 1.010 0.870 2.838
B 0.990 1 0.861 2.810
C 1.149 1.161 1 3.262
D 0.352 0.356 0.307 1

P*_D/B = P_D/B = 2.810
P*_D/A = P_D/B / P_A/B = 2.810 / 0.990 = 2.838
P*_C/B = P_C/A / P_B/A = 0.870 / 1.010 = 0.861
P*_D/C = P*_D/A / P_C/A = 2.838 / 0.870 = 3.262
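The computations above can be reproduced mechanically: starting from the three direct Fisher indices, base reversibility fills the lower triangle and transitive linking fills the rest. A minimal Python sketch (function and variable names are ours, not the paper's):

```python
# Sketch: fill the mixed-system matrix of Table 4 from the three direct
# Fisher indices, propagating transitive indirect indices until full.

direct = {('A', 'B'): 1.010, ('A', 'C'): 0.870, ('B', 'D'): 2.810}  # base first

def mixed_system(countries, direct):
    P = {(c, c): 1.0 for c in countries}
    for (b, t), v in direct.items():
        P[(b, t)] = v
        P[(t, b)] = 1.0 / v          # base reversibility
    changed = True
    while changed:                    # link indices until the matrix is full
        changed = False
        for b in countries:
            for s in countries:
                for t in countries:
                    if (b, s) in P and (s, t) in P and (b, t) not in P:
                        P[(b, t)] = P[(b, s)] * P[(s, t)]
                        changed = True
    return P

P = mixed_system('ABCD', direct)
print(round(P[('A', 'D')], 3))   # 2.838
print(round(P[('B', 'C')], 3))   # 0.861
print(round(P[('C', 'D')], 3))   # 3.262
```

Since the direct pairs form a tree, every indirect index is path-independent, so the order in which the products are taken does not matter.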

The indices in the lower triangle of the matrix in Table 4 are found by computing the reciprocals of the corresponding indices lying in the upper triangle, that is by using the base reversibility property of the Fisher index. The method suggested by Martini is easily applicable if the number of situations is small; on the other hand, when the number of situations increases, it becomes very hard to decide whether or not a pair has to be included. If the set of situations to compare is large (for example the k = 12 members of the EEC in 1990, the present k = 15 EU members, the k = 24 members of the OECD), the choice of pairs to compare requires quite a few steps. The choice can be performed by applying

graph theory: the problem is reformulated in terms of optimal minimum path choice, as described in the next section.

3 Construction of the optimal minimum path

Graph theory can help to choose the pairs to be compared in the mixed system. We associate a weighted digraph with the Bortkiewicz matrix. With reference to the previous example the digraph can be represented as in Figure 1.

Figure 1: Graph (A)

It is easily seen that it is a symmetric digraph of order k, where k is the number of situations compared (k = 4 in the example), and it has k(k - 1) edges symmetrically oriented in pairs; each edge is weighted with the Bortkiewicz value corresponding to the joined nodes. The k loops representing the values lying in the main diagonal of the matrix in Table 1 should also be added; actually they are excluded as their weights are all zero. In the example, the symmetric digraph representing the optimal minimum path for the mixed system is as in Figure 2.

It is a strongly connected digraph, with no cycles of length greater than 2, i.e. a tree. Furthermore, it is a spanning subgraph of the digraph (A), because it includes the nodes of (A). Hence, it is a symmetric spanning tree. Both the graph (A) and the symmetric spanning tree can be replaced by a simple valued graph, without losing any information,

Figure 2: Graph (B)

Figure 3: Graph (C)

and it is equivalent to limit the analysis to the upper triangle of the matrix, as in the previous section. The graph (C) is a complete graph, because each pair of nodes is joined by an edge.

To solve the optimal minimum path problem we need some requirements on the subgraph (D); afterwards it is possible to give an algorithm which enables its construction. The optimal minimum path can be represented by a subgraph of the complete graph found under the following conditions:

1) it must be a spanning subgraph of the complete graph, i.e. it must have k nodes, to guarantee that all countries are included;

2) the pairs of countries to be compared are (k - 1), so the spanning subgraph turns out to have (k - 1) edges;

3) the spanning subgraph must be a connected component of the complete graph, because the nodes have to be mutually reachable; this implies the existence of a path joining all pairs of nodes, as in Figure 4;

4) the spanning subgraph must have no circuits, because it is not necessary to compare directly pairs of situations already indirectly compared;

5) the edges to be inserted in the tree must have a weight as small as possible.

Figure 4: Graph (D)

If (C) = (N, E) is a connected graph, where N is a set of nodes and E is a set of edges, then a spanning subgraph (D) = (N, X) of (C) with no cycles is a tree that connects all the nodes in N. The cost of a spanning tree is simply the sum of its weights.

The subgraph (D) is a minimum cost spanning tree and will be called minimum optimal graph in the following.
In order to obtain the data in increasing order it is necessary to apply first a sorting algorithm to the elements lying in the upper triangle of the matrix. The objective is to find a spanning tree of minimum cost. The Kruskal algorithm is used here (Kruskal, 1956), which is characterised by the following steps (see Kingston, 1990: p. 257). Let Q, X and Y be sets of edges, with X and Y initially empty and Q = E sorted in increasing order of weight.

procedure Kruskal(N, E: graph): graph;
  var Q, X, Y: set of edges;
begin
  X := { }; Y := { };
  Q := E;
  while not Empty(Q) do
    Delete an edge {n, m} of minimum cost from Q;
    if n and m lie in different components of <N, X> then
      Insert({n, m}, X);
    else
      Insert({n, m}, Y);
    end;
  end;
  return <N, X>;
end Kruskal;

Table 5: Ordered Bortkiewicz values for the EEC countries

b t B_t/b | b t B_t/b | b t B_t/b

UK IRL 0.0262 | F NL 0.1229 | L IRL 0.1765
D B 0.0493 | F UK 0.1261 | B DK 0.1793
D NL 0.0532 | NL IRL 0.1286 | I IRL 0.1819
D F 0.0615 | F E 0.1301 | IRL P 0.1830
D IRL 0.0641 | NL DK 0.1316 | L UK 0.1882
D UK 0.0657 | I DK 0.1359 | IRL E 0.1899
F I 0.0760 | B IRL 0.1371 | I NL 0.1919
NL UK 0.0782 | I GR 0.1412 | UK E 0.1977
D L 0.0810 | F IRL 0.1436 | L DK 0.2044
B L 0.0835 | B E 0.1442 | F P 0.2067
F B 0.0864 | F DK 0.1456 | UK P 0.2146
B UK 0.0888 | I P 0.1462 | NL GR 0.2176
D E 0.0901 | I UK 0.1465 | DK P 0.2256
NL B 0.0973 | F GR 0.1473 | IRL GR 0.2291
GR E 0.1062 | L E 0.1573 | D P 0.2372
IRL DK 0.1080 | NL L 0.1611 | B P 0.2397
F L 0.1095 | DK E 0.1615 | GR P 0.2446
I E 0.1119 | NL E 0.1622 | B GR 0.2549
E P 0.1157 | I B 0.1641 | UK GR 0.2649
D DK 0.1187 | I L 0.1676 | NL P 0.2709
D I 0.1187 | D GR 0.1684 | L GR 0.2844
UK DK 0.1214 | DK GR 0.1703 | L P 0.3504

X and Y are empty at the beginning; when the algorithm ends, every edge of E is either in X or in Y. The edge {n, m}, deleted from Q, is inserted in X if n and m lie in different components; this means that there is no path (n, ..., m) in (N, X). On the other hand, the edge {n, m} is included in Y if n and m lie in the same component of (N, X).
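A runnable counterpart of the pseudocode, where the test "n and m lie in different components of <N, X>" is implemented with a small union-find structure (our addition; the data are those of the four-situation example, and all names are ours):

```python
# Runnable Kruskal sketch: edges are (weight, n, m) triples.

def kruskal(nodes, edges):
    """Return the edges X of a minimum cost spanning tree."""
    parent = {n: n for n in nodes}          # union-find forest over N

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    X, Y = [], []                           # accepted / discarded edges
    for w, n, m in sorted(edges):           # Q, sorted in increasing order
        rn, rm = find(n), find(m)
        if rn != rm:                        # different components of <N, X>
            parent[rn] = rm
            X.append((n, m))
        else:
            Y.append((n, m))
    return X

edges = [(0.05, 'A', 'B'), (0.15, 'A', 'C'), (0.40, 'B', 'C'),
         (0.85, 'B', 'D'), (0.90, 'A', 'D'), (0.95, 'C', 'D')]
print(kruskal('ABCD', edges))               # [('A', 'B'), ('A', 'C'), ('B', 'D')]
```

The output reproduces the three starred pairs of Table 3.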

4 An optimal graph for EEC countries

This section deals with the optimal minimum path construction for the 12 EEC countries. Table 5 contains the Bortkiewicz values for the EEC countries sorted in increasing order. The complete graph is in Figure 5.

Figure 5: The Complete Graph. Legend: D: Germany, F: France, I: Italy, NL: Netherlands, B: Belgium, L: Luxembourg, UK: United Kingdom, IRL: Ireland, DK: Denmark, GR: Greece, E: Spain, P: Portugal.

The graph includes 12 nodes and 66 edges. Note that the weights assigned to each edge are not included in the figure. The resulting mixed system optimal graph is in Figure 6.

The Bortkiewicz value is reported on each edge of the graph. The optimal graph shows the central position of Germany. Ireland and Spain also have many links with other countries. The dissimilarity between the United Kingdom and Ireland is the smallest one. The Central European countries (except Luxembourg) are very similar, while Denmark, on one side, and Portugal and Greece, on the other side, are very far from the centre of the graph, represented by Germany. Italy is linked with Germany through France, which acts as a bridge, while Ireland and Spain are the bridges for the North European and the South European countries respectively.

Figure 6: Mixed System Optimal Graph

5 Construction of homogeneous areas

In Phase II of the wider analysis, two multivariate statistical techniques have been applied (cluster analysis and multidimensional scaling) in order to identify groups of countries homogeneous in their consumption models. Both techniques are based on a distance matrix with elements d_tb such that:

d_tb = d_bt for all t, b = 1, ..., K;
d_ts + d_sb >= d_tb for all t, b, s = 1, ..., K;
d_tt = 0 for all t = 1, ..., K.

The Euclidean metric:

d_t,b = sqrt( sum_i (ln(x_t,i) - ln(x_b,i))^2 )

has been chosen among the metrics satisfying the conditions above, where i = 1, ..., n are the surveyed items in each country t = 1, ..., 12, referring to the generic positive variable x_t which represents the following particular variables:

1) prices (p_ti);

2) quantities (q_ti);

3) elementary values v_ti = p_ti q_ti;

4) relative values w_ti = v_ti / sum_i v_ti;

5) weighted prices p_ti Φ_i;

6) weighted quantities q_ti Φ_i.

The use of the logarithmic transformation yields dimensionless distances; in fact, it is possible to show that ln(p_t,i) - ln(p_b,i) = (p_t,i - p_b,i) / ML(p_t,i, p_b,i), where ML(p_t,i, p_b,i) is the logarithmic mean. Therefore, differences may be considered dimensionless, since they are normalised towards a mean of the two compared data. The logarithmic mean ML(x, y), defined as:

ML(x, y) = (x - y) / (ln(x) - ln(y)),

satisfies the monotonicity and linear homogeneity properties (for the bounds on the variables see Martini, 1992) and is bounded by the arithmetic and the geometric mean. The logarithmic mean of the relative values w_t,i and w_b,i is also computed, in order to weight prices and quantities, since it maintains the symmetry of the distance matrices; consequently, the weights Φ_i are:

Φ_i = (w_t,i - w_b,i) / (ln(w_t,i) - ln(w_b,i))
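The two ingredients above, the logarithmic mean and the log-Euclidean distance, are easy to compute; a short sketch (function names are ours), including a check of the geometric-arithmetic bounds just mentioned:

```python
import math

# Sketch of the Phase II distance machinery (names are ours, not the paper's).

def log_mean(x, y):
    """ML(x, y) = (x - y) / (ln x - ln y), with ML(x, x) = x by continuity."""
    return x if x == y else (x - y) / (math.log(x) - math.log(y))

def log_distance(xt, xb):
    """d_{t,b} = sqrt(sum_i (ln x_{t,i} - ln x_{b,i})^2): dimensionless."""
    return math.sqrt(sum((math.log(a) - math.log(b)) ** 2
                         for a, b in zip(xt, xb)))

# The logarithmic mean lies between the geometric and the arithmetic mean:
x, y = 2.0, 8.0
assert math.sqrt(x * y) < log_mean(x, y) < (x + y) / 2
```

Because the distance is built on log differences, rescaling every item of one country by a constant shifts all terms equally, which is why the comparison across variables with different units is meaningful.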

A complete graph can be associated with each distance matrix and this graph can be used to construct optimal graphs for the 6 considered variables, by using the Kruskal algorithm. These optimal graphs are able to supply important information about the links existing among the 12 countries. These graphs are not included in this paper owing to limited space (see Zavanella, 1993b). Moreover, the complete graphs which refer to the same variables can be condensed by observing proper rules, so that the searched homogeneous areas are easily identified. The condensed graph is built as follows:

1) the distances are sorted in increasing order;

2) all the edges are included in the graph to be condensed, starting from the smallest one, until the last country is reached;

3) as soon as a complete subgraph is formed, the nodes belonging to this subgraph are condensed into a single node;

4) the new condensed node replaces the previous nodes in all the following distances;

5) the procedure ends when the last country is included in the condensed graph for the first time.
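Steps 1)-5) can be sketched in code. The sketch below is our simplified reading of step 3): a group is condensed when a newly added edge closes a complete subgraph of three or more current (possibly already condensed) nodes; all function names and the data layout are ours, not the paper's.

```python
from itertools import combinations

# Simplified condensation sketch (our reading of steps 1-5 above).

def condense(nodes, distances):
    group = {n: frozenset([n]) for n in nodes}   # country -> its condensed node
    adj = {}                                      # adjacency between condensed nodes
    reached = set()
    for (a, b), _ in sorted(distances.items(), key=lambda kv: kv[1]):
        u, v = group[a], group[b]
        reached.update([a, b])
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            clique = {u, v} | (adj[u] & adj[v])   # candidate complete subgraph
            if len(clique) >= 3 and all(y in adj[x]
                                        for x, y in combinations(clique, 2)):
                merged = frozenset().union(*clique)
                for n in merged:                  # step 4: replace the old nodes
                    group[n] = merged
                adj[merged] = set().union(*(adj[c] for c in clique)) - clique
                for w in adj[merged]:
                    adj[w] = (adj[w] - clique) | {merged}
                for c in clique:
                    del adj[c]
        if reached == set(nodes):                 # step 5: last country included
            break
    return sorted({tuple(sorted(g)) for g in group.values()})

B = {('A', 'B'): 0.05, ('A', 'C'): 0.15, ('B', 'C'): 0.40,
     ('B', 'D'): 0.85, ('A', 'D'): 0.90, ('C', 'D'): 0.95}
print(condense('ABCD', B))   # [('A', 'B', 'C'), ('D',)]
```

On the four-situation example of section 2, the edge B-C closes the triangle A-B-C, which is condensed into one node, while D remains alone.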

Figure 7: Condensed Bortkiewicz Graph

The condensed nodes embody a set of countries homogeneous with respect to the variable considered, because the relative distances are similar (and small). Furthermore, it is possible to classify the closeness of their links as a function of the arithmetic mean d of the distances inside each component of the condensed graphs and their relative variability σ/d, where σ is the standard deviation of the distances. For example, the condensed graph referred to the Bortkiewicz matrix is in Figure 7. In this graph, the condensed components have a nested

structure. The first component is constituted by the United Kingdom, Germany and Ireland; these countries are similar and close. The mean of the structure difference for these countries (B = 0.052) is less than half the largest dissimilarity (B = 0.1157), the one that allows Portugal to be included in the condensed graph. The second component (C2) is composed of the first component (C1) and the Central European countries. The Netherlands is excluded from this component, as it is not linked with France and Luxembourg.

Finally, the third component links Spain and Italy to the other components.

Three countries, Portugal, Greece and Denmark, seem to remain separate in the condensed graph.
The next section refers to the condensed graphs of the six distance matrices.

6 Analysis of the condensed graphs

6.1 Prices

The condensed graph for prices is very compact; only Denmark, Portugal and Greece are isolated, and three complete components are built. The first component is formed by the Central European countries. Both the mean distance and the subgraph variability are very low; this component is therefore very homogeneous. The relative variability equals 4.5% of the mean distance. The weights do not modify the condensed graphs noticeably. The weighted condensed graph has only two nested components: the first one consists of the Central European countries, while the second one contains the remaining countries except Portugal, Greece and Denmark. On the contrary, the variability is increased: 11.6% for the first component and 12.6% for the second one. The price level depends mainly on the productivity structure existing in the market. Therefore, the results of the condensed graphs for prices highlight a homogeneous pattern in the EEC; particularly, in this respect, the Central European countries can be considered as a single market.

Figure 8: Prices Condensed Graph

6.2 Quantities

In the quantities graph two clearly separated components originate, while in the condensed graph Italy remains alone and acts as a bridge between those two components. On the contrary, the North European countries do not form any component of their own. The mean distances are higher than in the case of prices: the minimum mean distance is about 3 times the minimum price distance (the comparison is possible because the distances are dimensionless). The variability is 8% for the first component and 2% for the second one. Weights do not alter the condensed graph significantly: UK and DK are still linked with the Central countries, while the southern countries are included in the condensed graph alone. Variability does not increase significantly either: 10.5% for the first component. The structure of private consumption quantities is only a function of the habits and tastes of different peoples. The EEC is clearly separated into three distinct areas: the South, the North and the centre of Europe, where Italy acts as a bridge between the centre and the South European countries. The role played by Italy is perfectly coherent with the different patterns of habits and tastes in the different Italian regions.

6.3 Values

Finally, the distance matrices referred to the values and the relative values are considered.

Figure 9: Weighted Prices Condensed Optimal Graph

The condensed graph of values shows the situation already described for quantities; the variability is the same too. This happens because the quantity distances are very high compared to those of prices. The relative values condensed graph reflects the distributions of the expenditure rates over the different items which characterise the different European countries. It is very compact; three nested complete components originate and they have the usual aspect: the first one formed by the Central European countries, the second one composed of the South European countries, and then a third component formed by Italy together with the first and second components. The variability is similar to the case of quantities.

7 Concluding remarks

Figure 10: Quantities Condensed Graph

The application of graph theory to consumption structures in the EEC countries leads to interesting results. The proposed algorithm supplies the minimum optimal graph and helps in the construction of the mixed system; the goal is reached with a number of computations smaller than with any other statistical technique. Moreover, condensed graphs are a powerful tool of analysis to identify the existence and strength of relations among different countries. Generally speaking, it is remarkable that the 12 countries belonging to the EEC are very well integrated as far as prices are concerned; this means that the productivity structures of the various countries are similar. Only three countries (Portugal, Greece and Denmark) are isolated with respect to prices. On the contrary, quantities and relative values well represent the different habits of consumption in the different geographical European areas; note that with respect to those variables, Italy plays the role of a bridge between the Central and southern countries.
To confirm the quality of the results, it is interesting to compare them with those deriving from some multivariate statistical techniques (Zavanella, 1993a, 1993b). Cluster analysis and multidimensional scaling give concordant results. In particular, three groups of countries have been found, homogeneous with respect to the grouping variables:

1) Northern countries (Denmark, United Kingdom and Ireland);

2) Central countries (Germany, France, Belgium, Luxembourg and the Netherlands);

3) Southern countries (Italy, Greece, Spain and Portugal).

Figure 11: Weighted Quantities Condensed Graph

Note that, within the groups, three countries behave partly differently. For example, Italy has an intermediate position between the southern and Central countries as concerns quantities and values; Portugal and Denmark, on the contrary, tend to detach from their respective groups when the variable considered is price. Therefore it is possible to conclude that statistical techniques and graph theory give concordant results.

Appendix A

We define:

p_s the column vector of the positive prices p_s,i, referred to a set of n items, observed in s = 1, ..., k countries;

q_s the column vector of the positive quantities q_s,i, referred to the same set of items, observed in s = 1, ..., k countries;

v_s the column vector of the corresponding elementary values v_s,i = p_s,i q_s,i;

V_s = p_s' q_s the sum of the elementary values;

w_s the column vector of relative values w_s,i = v_s,i / V_s.

Figure 12: Values Condensed Graph

Bilateral comparison

Let t and b be two countries to be compared, where b is the country chosen as the base.

p_t/b is the vector of price ratios (or elementary price indices) p_t/b,i = p_t,i / p_b,i;

q_t/b is the vector of quantity ratios (or elementary quantity indices) q_t/b,i = q_t,i / q_b,i;

V_t/b = V_t / V_b is the value index; it must be factorized into two positive numbers P_t/b Q_t/b, where P_t/b is the synthetic price index number and measures the variation of prices, while Q_t/b is the cofactor and measures the quantity variations.

Figure 13: Relative Values Condensed Graph

The price index number P_t/b is defined to be the following function:

P_t/b = F[p_t, p_b, Φ]

which transforms the two n-dimensional vectors of the observed prices and the n-dimensional vector of positive weights Φ into a real positive number. The cofactor function Q_t/b is obviously defined as:

Q_t/b = V_t/b / P_t/b.
The function P_t/b must satisfy the following axiomatic properties:

Strong identity (I): p_t = p_b = p_0 implies F[p_t, p_b, Φ] = 1;

Commensurability (C): F[Ξ p_t, Ξ p_b, Φ] = F[p_t, p_b, Φ],
Ξ being the (n x n) diagonal matrix with positive elements ξ_i on its main diagonal and zeros elsewhere; this property implies the independence of the index number from any change in the physical unit of measure of the goods;

Linear homogeneity (H) with respect to p_t: a^-1 F[a p_t, p_b, Φ] = F[p_t, p_b, Φ],
a being the exchange rate (a scalar); this property implies that, multiplying the prices p_t by the coefficient a, the price index number changes proportionally.

The same properties must be satisfied by the cofactor function too.

The axiomatic properties above imply the following derived properties:

Strong proportionality: F[λ p_b, p_b, Φ] = λ, which follows from strong identity and linear homogeneity;

Homogeneity of degree -1 with respect to p_b: F[p_t, β p_b, Φ] = β^-1 F[p_t, p_b, Φ], which follows from commensurability and linear homogeneity;

Dimensionality: F[a p_t, a p_b, Φ] = F[p_t, p_b, Φ], which derives from homogeneity of degree -1 and linear homogeneity.

Considering commensurability and strong proportionality, any index number can be expressed as a mean of the price ratios weighted with Φ; by letting ξ_i = 1/p_b,i, we obtain:

P_t/b = M[p_t/b, Φ], where M is a mean in the Chisini sense.

Besides the axiomatic properties, there are two important desired properties that are to be satisfied if possible:

Base reversibility (B): P_t/b = (P_b/t)^-1;

Factor reversibility (F): V_t/b / P_t/b = Q_t/b, which is satisfied when the cofactor function equals the quantity index number Q_t/b calculated with the same formula applied for the price index number P_t/b.

Formulas for bilateral index numbers

The most important formulas of bilateral price index numbers are the following:

Laspeyres (1871): LP_t/b = sum_i p_t/b,i w_b,i    (ICH)
Paasche (1874): PP_t/b = (sum_i p_b/t,i w_t,i)^-1    (ICH)
Fisher (1911): FP_t/b = (LP_t/b PP_t/b)^1/2    (ICHBF)
Sato (1976) - Vartia (1976): SVP_t/b = exp[ sum_i ln(p_t/b,i) Φ_i / sum_i Φ_i ]    (ICHBF)
with Φ_i = (w_t,i - w_b,i) / (ln w_t,i - ln w_b,i).

ICH indices are bounded between the Laspeyres and Paasche indices, whereas the range of the ICHBF indices is defined by the Sato-Vartia and by the Fisher indices.
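The four formulas can be written down directly. A sketch in Python (prices and quantities as lists; all function and variable names are ours), checked against strong proportionality, i.e. doubling all current prices must double every index:

```python
import math

# Sketch of the four bilateral price index formulas above (names are ours).

def weights(p, q):
    v = [pi * qi for pi, qi in zip(p, q)]
    V = sum(v)
    return [vi / V for vi in v]

def laspeyres(pt, pb, qb):
    wb = weights(pb, qb)
    return sum((ti / bi) * w for ti, bi, w in zip(pt, pb, wb))

def paasche(pt, pb, qt):
    wt = weights(pt, qt)
    return 1.0 / sum((bi / ti) * w for ti, bi, w in zip(pt, pb, wt))

def fisher(pt, pb, qt, qb):
    return math.sqrt(laspeyres(pt, pb, qb) * paasche(pt, pb, qt))

def sato_vartia(pt, pb, qt, qb):
    wt, wb = weights(pt, qt), weights(pb, qb)
    phi = [(a - b) / (math.log(a) - math.log(b)) if a != b else a
           for a, b in zip(wt, wb)]                    # logarithmic-mean weights
    return math.exp(sum(f * math.log(ti / bi)
                        for f, ti, bi in zip(phi, pt, pb)) / sum(phi))

# Strong proportionality: if pt = 2 * pb, every index equals 2.
pb, qb, qt = [1.0, 2.0, 3.0], [5.0, 4.0, 3.0], [4.0, 5.0, 2.0]
pt = [2 * x for x in pb]
print(round(fisher(pt, pb, qt, qb), 6))       # 2.0
print(round(sato_vartia(pt, pb, qt, qb), 6))  # 2.0
```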

Multilateral comparison

Let t, band s be three countries to be compared; it is now necessary to


calculate a (3 x 3) matrix, which has unitary diagonal elements and the six
indices corresponding to the comparison of all the pairs of countries that is
possible to construct.
Comparisons between countries tand b can be conducted either by a direct
index p t / b, or by a transitive indirect index: P*t/bls = Pt/b/P b/ s ; when
P t/b = P* t/bls, the P formula is transitive.
The value index Vi/b always satisfies the transitivity condition, which assures
a global coherence among all comparisons performed upon a give set of situa-
tions, thus it is reasonable to require that both functions p t / b and Qt/b satisfy
the same condition.
It can be proved that a transitive price index cannot satisfy the strong identity
for the index and/or for the cofactor functions; consequently, to meet the tran-
sitivity condition in the comparison of three countries it is possible to choose
between two alternative procedures:

1) to build six indices (representing all the possible comparisons among sit-
uations) by applying the same transitive formula for every couple of
countries: in this case strong identity and strong proportionality of the
indices and/or of their cofactor do not hold for each countries pair;

2) to build a mixed system of four direct indices ICH, with cofactor ICH, for
two couples of countries and two indirect indices for the last countries
pair; the two pairs of countries to be directly compared must be chosen
so that the loose of identity be minimised: actually it is important to
choose the two countries in which prices are fairly equal or proportional.
A simple similarity measure of prices (and quantities) is the Bortkiewicz
formula (Bortkiewicz, 1922).

3) when the applied P formula is ICHBF, then the em mixed system requires
the calculation of two direct indices and of one indirect index only; the
other three comparisons (corresponding to the lower triangle of the ma-
trix) can be obtained simply by taking the reciprocal of the calculated
indices.

More generally, if the countries to be compared are k ≥ 3, the mixed system
requires the calculation of (k − 1) direct indices ICHBF and of (k − 1)(k − 2)/2
indirect indices; the other ones are obtained as the reciprocals. Obviously,
every country must be present in at least one of the (k − 1) couples to be
directly compared, to make possible the construction of the complete (k × k)
matrix through the calculation of transitive indirect indices; this condition
has to be remembered when choosing the most similar couples of countries for
the direct comparison.
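The counting above can be illustrated with a small numerical sketch (the figures are invented): given the k − 1 direct indices, all measured against a single base country, transitivity determines every remaining entry of the complete (k × k) matrix.

```python
import numpy as np

# Hypothetical data: k = 4 countries, with the k - 1 = 3 direct indices of
# countries 1, 2, 3 measured against base country 0 (numbers are invented).
direct = np.array([1.10, 0.95, 1.30])

p = np.concatenate(([1.0], direct))   # p[i] = index of country i versus country 0
P = p[:, None] / p[None, :]           # transitive completion: P[t, b] = p[t] / p[b]

# P is the complete (k x k) matrix: the diagonal is 1, the lower triangle is
# the reciprocal of the upper one, and P[t, b] = P[t, s] * P[s, b] for every s.
```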
The Bortkiewicz formula is:

B_{t/b} = |P^P_{t/b} − L^P_{t/b}| / L^P_{t/b} = |ρ_pq| σ_p σ_q / (L^P_{t/b} L^Q_{t/b})

where:
P^P_{t/b} = Paasche index of prices;
L^P_{t/b} = Laspeyres index of prices;
L^Q_{t/b} = Laspeyres index of quantities;
|ρ_pq| = |σ_pq| / (σ_p σ_q) = absolute value of the correlation coefficient;
σ_pq = covariance between the price and quantity ratios (p_{t/b,i}, q_{t/b,i});
σ_p = mean square error of the price ratios p_{t/b,i};
σ_q = mean square error of the quantity ratios q_{t/b,i}.
The Bortkiewicz formula is a synthetic measure of the dissimilarity of two
situations t and b. In fact:

- it equals zero when p_t = λ p_b, and then σ_p = 0 and σ_pq = 0, or when
q_t = γ q_b, and then σ_q = 0 and σ_pq = 0;

- it increases when the mean square errors σ_q and σ_p and the absolute value
of the correlation coefficient |ρ_pq| increase;

- it is invariant with respect to the exchange of situations t and b, so that
B_{t/b} = B_{b/t};

- it is invariant with respect to the exchange of prices and quantities, for:
|P^P_{t/b} − L^P_{t/b}| / L^P_{t/b} = |P^Q_{t/b} − L^Q_{t/b}| / L^Q_{t/b};

- it is independent of the order of magnitude of the populations and of the
conventional unit of money of the two situations.

In the multilateral comparison of k situations, it is possible to calculate k²
Bortkiewicz measures, in such a way as to construct a symmetric (k × k) matrix
with zero diagonal elements.
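As a numerical illustration, the measure B_{t/b} can be computed directly from price and quantity vectors. The sketch below uses invented data and checks two of the properties listed above (symmetry in t and b, and the value zero under proportional prices):

```python
import numpy as np

def bortkiewicz(p_t, p_b, q_t, q_b):
    """Sketch of the Bortkiewicz dissimilarity measure between situations t
    and b, computed as |Paasche - Laspeyres| / Laspeyres for prices."""
    laspeyres = np.sum(p_t * q_b) / np.sum(p_b * q_b)   # base-period quantities
    paasche = np.sum(p_t * q_t) / np.sum(p_b * q_t)     # current-period quantities
    return abs(paasche - laspeyres) / laspeyres

# Invented price and quantity vectors for the two situations
p_b = np.array([1.0, 2.0, 3.0]); q_b = np.array([5.0, 4.0, 1.0])
p_t = np.array([1.5, 2.5, 3.1]); q_t = np.array([4.0, 5.0, 2.0])

B = bortkiewicz(p_t, p_b, q_t, q_b)
# B_{t/b} == B_{b/t}, and proportional prices (p_t = 2 p_b) give B = 0
```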

References

Bortkiewicz, L., 1922. «Zweck und Struktur einer Preisindexzahl». Nordisk
Statistisk Tidskrift, 1.

Fisher, I., 1911. The purchasing power of money. Macmillan, New York.

Fisher, I., 1922. The making of index numbers. Houghton Mifflin, Boston.

Kingston, J. H., 1990. Algorithms and data structures: design, correctness,
analysis. Addison-Wesley, Sydney.

Kruskal, J. B., Jr., 1956. «On the shortest spanning subtree of a graph and
the traveling salesman problem». Proceedings of the American Mathe-
matical Society, 7: p. 48-50.

Laspeyres, E., 1871. «Die Berechnung einer mittleren Warenpreissteigerung».
Jahrbücher für Nationalökonomie und Statistik, Band XVI, Jena.

Martini, M., 1992. I numeri indice in un approccio assiomatico. Giuffrè,
Milano.

Paasche, H., 1874. «Über die Preisentwicklung der letzten Jahre nach den
Hamburger Börsennotierungen». Jahrbücher für die Nationalökonomie
und Statistik, Band XXIII, Jena.

Sato, K., 1976. «The ideal log-change index number». The Review of Eco-
nomics and Statistics, 58(2): p. 223-228.

Vartia, Y. O., 1976. «Ideal log-change index numbers». Scandinavian Jour-
nal of Statistics, 3: p. 121-126.

Zavanella, B. M., 1993a. «Comparison of consumption among EEC coun-
tries: prices, quantities and values». Bulletin of the International Sta-
tistical Institute, Contributed Papers, 49th Session, Book 2, Firenze,
p. 571-572.

Zavanella, B. M., 1993b. «The private consumptions in Europe: prices,
quantities and values». Internal report, Istituto Scienze Statistiche, Uni-
versita di Milano.

GRAPHICAL GAUSSIAN MODELS AND REGRESSION

G. JONA LASINIO, P. VICARD

Dipartimento di Statistica e Probabilita
Universita di Roma "La Sapienza"

This paper is basically a review of definitions and properties of graphs and their
use in statistical modelling. We focus our attention only on what we think is
essential for a good understanding of statistical graphical models. In the last
sections the relative merits of the regression and the graphical modelling ap-
proaches are compared theoretically and by means of an application to real data.

1 Introduction

Since the paper of Wright (1923), the idea of associating the knots (or vertices)
of an oriented graph with continuous random variables (r.v.) and its edges with
given measures of correlation and causality has received increasing attention.
Later on, the work of Darroch et al. (1980) showed the existence of a strong re-
lation between log-linear models and certain probability distributions (Markov
random fields) defined on the knots of a graph. The aim of this paper is to
analyze the relation between graphical gaussian models and regression.
We first briefly review the Iterative Proportional Fitting (IPF) procedure
and some basic concepts of graph theory; then we discuss the relation between
these models and regression theory. In particular we study the conditions
under which the two methodologies have the same capability of representing
the interaction in a set of random vectors.

2 Introduction to Graphical Models

Before defining graphical models, we need some definitions and properties from
graph theory.

Definition 1 A graph G = (K, E) is a mathematical object composed of two
sets: K, the set of knots, and E, the set of edges joining the knots.

A graph is said to be directed if its edges are arrows representing some
causality relation (in a broad sense) among the knots, and it is said to be
undirected if the edges are simple lines.
Let us now consider a k-dimensional r.vt. (random vector)
X = (X_1, ..., X_k) ∈ ℝ^k,

and associate to each knot of the graph G a component of X.

Definition 2 The conditional independence graph of X is an undirected graph
G = (K, E), where K = {1, ..., k} is the set of vertices, and the edge (i, j) does
not belong to E if and only if X_i is independent of X_j given the remaining
variables.
A graphical model for the r.vt. X is a family of probability distributions
on ℝ^k constrained to verify the conditional independence statements described
by the conditional independence graph of X, and arbitrary otherwise.
From the given definitions, the central role played by the concepts of inde-
pendence and conditional independence between r.vt. in the development of
graphical models theory is clear. Let us now review the models briefly.
Let X^t = (X_1, ..., X_k) and Y^t = (Y_1, ..., Y_k) be two r.vt. in ℝ^k, and denote
by f_X(x) and f_Y(y) their p.d.f. (probability density functions).

Definition 3 The r.vt. X and Y are independent, and we will write X ⊥ Y,
when their joint p.d.f. satisfies

f_XY(x, y) = f_X(x) f_Y(y)
From Definition 3 we can easily build the following factorization criterion:

Proposition 1 Two r.vt.s X and Y are independent if and only if there exist
two functions g and h such that

f_XY(x, y) = g(x) h(y)   ∀x, y ∈ ℝ^k   (1)


Independence is reached when the joint p.d.f. can be written as the
product of two functions depending only on the values of X and Y respectively;
furthermore g and h need not coincide with f_X(x) and f_Y(y).
Notice that the relation of independence is symmetric and that joint inde-
pendence implies marginal independence; to clarify, consider for k = 3 the vector
X = (X_1, X_2, X_3): if X_1 ⊥ (X_2, X_3), it follows that X_1 ⊥ X_2 and X_1 ⊥ X_3.
The idea of conditional independence between r.vt. (Dawid, 1979) is
even more central than the concept of independence.

Definition 4 Let X, Y and Z be three r.vt.s on ℝ^k. We say that Y is inde-
pendent of Z conditionally on X, and we write Y ⊥ Z | X, if the conditional
p.d.f. of Y and Z given X, f_{Y,Z|X}(y, z; x), can be written as

f_{Y,Z|X}(y, z; x) = f_{Y|X}(y; x) f_{Z|X}(z; x)

for all values of x such that f_X(x) > 0.

The last condition is necessary to ensure the existence of the conditional
p.d.f. f_{Y|X}(y; x) and f_{Z|X}(z; x).
Definition 4 is equivalent to the following:
Y ⊥ Z | X if and only if one of the two following propositions is true

a) f_{Y|X,Z}(y; x, z) = f_{Y|X}(y; x)

b) f_{X,Y,Z}(x, y, z) = f_{X,Y}(x, y) f_{X,Z}(x, z) / f_X(x)

From a) we have that to completely define the density of Y conditionally on
X and Z, when Y ⊥ Z | X, we only need to know the values of Y and X.
From a practical point of view, we only have to collect information about Y
and X instead of X, Y and Z, a fact that simplifies sampling procedures.
The factorization criterion in the case of conditional independence states
that, given three r.vt.s X, Y and Z, Y ⊥ Z | X if and only if there exist two
functions g and h such that

f_{X,Y,Z}(x, y, z) = g(x, y) h(x, z)   ∀y, z

for all values of x such that f_X(x) > 0.


The relevance of conditional independence statements becomes clear in the
study of regression models. Consider the random vector (X_1, X_2, X_3, Y) and
assume that its components are linked by the following relation:

E(Y | X_1, X_2, X_3) = a_0 + a_1 X_1 + a_2 X_2 + a_3 X_3.

If Y ⊥ X_1 | (X_2, X_3), then X_1 can be removed from the covariate set. In
this case the knowledge of X_1 does not add any information on the dependent
variable Y. The reduced model is E(Y | X_2, X_3) = a_0 + a_2 X_2 + a_3 X_3.
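The covariate-removal step just described can be checked numerically. The sketch below (with invented coefficients) simulates data in which Y ⊥ X_1 | (X_2, X_3) and fits the full model by least squares; the estimated coefficient of X_1 then turns out close to zero even though X_1 is marginally correlated with Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# Simulated example (invented coefficients): Y depends on X2 and X3 only,
# so that Y is independent of X1 given (X2, X3).
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)      # X1 is marginally correlated with Y
y = 1.0 + 2.0 * x2 - 1.5 * x3 + rng.normal(size=n)

# Least-squares fit of the full model E(Y | X1, X2, X3)
X = np.column_stack([np.ones(n), x1, x2, x3])
a, *_ = np.linalg.lstsq(X, y, rcond=None)
# a[1], the coefficient of X1, is close to zero: X1 can be removed
```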
A few more definitions and properties are needed in order to show how the
independence and the conditional independence of r.vt. are related to graphical
models.
Recalling that a graph is undirected when all its edges are lines, we can
give a more formal definition of this idea (Whittaker, 1990). Let i and j be two
knots of a graph G = (K, E); i and j are joined by an undirected edge (line) if
E contains (i, j) and (j, i); then G is undirected if all its edges are undirected.
We will say that i, j ∈ K are neighbours (we write i ∼ j) if they are joined by
an edge.
To build a graph from a p.d.f., a very important role is played by the
Markov property. This property is usually given in terms of the neighbourhood
relation between knots, so we have to clarify the concepts of boundary and
closure of a set of knots.
The boundary of a set a ⊆ K, bd(a), is the set of all knots in K not
contained in a that are neighbours of the knots in a. More formally, we write
K\a to denote the set of knots not contained in a. The closure of a set of
knots, ā, is the union of a and its boundary.
We are now able to state the Markov properties (Darroch et al., 1980).

Proposition 2 Let X = (X_1, ..., X_k)^t be a r.vt. with p.d.f. f_X(x) > 0
for all x ∈ ℝ^k. The vector X is said to be a Markov vector if, given a graph
G = (K, E) associated to it, one of the following equivalent statements is
verified:
i) If ī is the closure of i ∈ K, so that K\ī is the set of knots excluding all
neighbours of i and i itself, then for all i ∈ K, X_i ⊥ X_{K\ī} | X_{bd(i)}.
ii) For all i, j ∈ K such that i ≁ j we have that X_i ⊥ X_j | X_{K\{i,j}}.
iii) If a is a subset of knots in K, X_a ⊥ X_{K\ā} | X_{bd(a)}.
iv) If two disjoint subsets a and b of K are separated by a third subset d ⊆ K,
then X_a ⊥ X_b | X_d.
The equivalence of these four properties holds as long as f_X(x) > 0; otherwise
we cannot use them interchangeably. Property i) is usually called the local
Markov property and it is strictly related to regression models, because the
r.vt. X_i can be explained only by the variables belonging to its boundary in
the independence graph, i.e. the variables on which it depends directly.
Property ii), called the pairwise Markov property, has an important role in the
development of gaussian graphical models. In fact these models are based
on the multivariate normal distribution, which takes into account only pairwise
interactions. Then, using this property, we can easily build the independence
graph associated to such models.

3 Graphical Gaussian Models

A graphical gaussian (or multivariate normal) model is a graphical model in
which the r.vt. X = (X_1, ..., X_k)^t has a multivariate normal p.d.f. with
given mean vector m = (m_1, ..., m_k)^t and covariance matrix V, and we write
X ∼ N_k(m, V). Recall that the multivariate normal density is

f_X(x) = (2π)^{-k/2} det(D)^{1/2} exp{ −(x − m)^t D (x − m) / 2 }   (2)

where D is a k × k symmetric, positive definite matrix and D = V^{-1}.


The gaussian distribution has a central role in many statistical prob-
lems, not only because many statistical functions have, under mild conditions,
an asymptotic normal distribution (central limit theorem), but also for its
"good" properties. For example, if X ∼ N_k(m, V), the marginal p.d.f. of X_i is a
univariate normal p.d.f. with mean m_i and variance v_ii, where v_ii is the i-th
diagonal element of the matrix V. Furthermore, the conditional p.d.f. of, say,
X_i given all the remaining r.vt. is again a univariate normal. Clearly, as in
graphical models we always deal with conditional and marginal p.d.f., these
results are very useful.
Relations of independence and conditional independence in this kind of
model can be completely defined by the covariance matrix and its inverse, as
shown in the following propositions (Whittaker, 1990). In what follows, with-
out loss of generality, we take m = 0.

Proposition 3 Let X_a and X_b be two r.vt.s with multivariate normal joint
p.d.f., and write the covariance matrix V and its inverse D as block matrices,
i.e.

V = [ V_aa  V_ab ; V_ba  V_bb ],   D = [ D_aa  D_ab ; D_ba  D_bb ]   (3)

where V_aa is the covariance matrix of the r.vt. X_a, V_ab is the matrix of
covariances between X_a and X_b, and analogous definitions hold for V_ba and
V_bb (clearly V_ab = V_ba^t). Then X_a and X_b are independent if and only if one
of the two following conditions is true

i) V_ab = 0
ii) D_ab = 0.
Proposition 4 Let X_a, X_b and X_c be three r.vt.s with multivariate normal
joint p.d.f.. Under the same assumptions of Proposition 3, X_b ⊥ X_c | X_a if
and only if

i) V_bc − V_ba V_aa^{-1} V_ac = 0, or equivalently

ii) D_bc = 0.

In particular, if X_b and X_c are one-dimensional and K = {1, ..., k}, we have

X_i ⊥ X_j | X_{K\{i,j}} if and only if d_ij = 0

where d_ij is the (i, j)-entry of the matrix D.


From these two propositions it is clear that, to check independence and
conditional independence in gaussian models, we only need to know the co-
variance matrix. Notice that the elements of the matrix D = {d_ij} are the coef-
ficients of the crossproduct of the variables X_i and X_j, so they "measure" the
interaction between them. It follows that the independence graph associated
to X ∼ N_k(m, V) must contain only pairwise conditional independence state-
ments, because of the structure of the multivariate normal p.d.f..
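A small numerical sketch (with an invented precision matrix) makes the point: the zero pattern that encodes conditional independence lives in D = V^{-1}, not in V itself.

```python
import numpy as np

# Hypothetical precision matrix D for k = 3 with d_13 = 0, i.e. a model
# in which X1 and X3 are conditionally independent given X2.
D = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.5, -0.8],
              [ 0.0, -0.8,  1.5]])
V = np.linalg.inv(D)   # the corresponding covariance matrix

# V has no zero in position (1, 3): X1 and X3 are marginally correlated.
# The conditional independence is visible only in the inverse, where d_13 = 0.
```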

3.1 Estimation of the Covariance Matrix


In order to estimate the covariance matrix of such models we choose the maxi-
mum likelihood method (m.l.m.). Our aim is then to maximize the logarithm of
the p.d.f. (log-likelihood function) of an i.i.d. sample of size n from N_k(m, V)
under the restrictions given by some independence graph associated to X.
These restrictions will be given by setting suitably chosen elements of the inverse
covariance matrix to zero.
Our data are collected in an n × k matrix Y = (y_1, ..., y_n)^t = {y_ij},
i = 1, ..., n, j = 1, ..., k, with log-p.d.f.

2ℓ(m, V) = const − Σ_{i=1}^n (y_i − m)^t V^{-1} (y_i − m) − n log det(V)   (4)

As our main interest is the estimation of the covariance matrix, we set the
mean vector equal to its maximum likelihood estimate (m.l.e.), i.e. m = ȳ, so
that expression (4) simplifies to

2ℓ(V) = const − tr(V^{-1} S) − n log det(V)   (5)

where S = {s_ij} is the sample covariance matrix and tr(A) = Σ_{i=1}^k a_ii is the
trace of the k × k matrix A.
To maximize (5) under the constraints given by a conditional independence
graph G = (K, E) we have to solve a constrained system of equations. Notice
that this implies that the marginal p.d.f. of Y are given.
The system to be solved is

v_ij = s_ij   if (i, j) ∈ E or i = j   (6)

v^{ij} = 0   if (i, j) ∉ E and i ≠ j   (7)

where the v^{ij} are the elements of V^{-1}.

Equation (6) can equivalently be written in terms of the cliques of the graph
G. A clique of a graph is a subgraph in which all the knots are neighbours of each
other; in other words, in a clique all the knots are connected by an edge. Then,
if C(G) is the class of all cliques in G, (6) becomes V_cc = S_cc if
c ∈ C(G). This expression will be relevant in the construction of the IPF (Iter-
ative Proportional Fitting) procedure, which is based on the following theorem
(Speed and Kiiveri, 1986).

Theorem 1 Let Ḡ = (K, Ē) be the complementary graph of G = (K, E), such
that (i, j) ∈ Ē if and only if i ≠ j and (i, j) ∉ E. Furthermore, let C and C̄ be
the classes of cliques of G and Ḡ respectively. Then, given two positive definite
matrices L and M, defined on the vertex set K, there exists a unique positive
definite matrix V such that:

i) v_ij = l_ij if (i, j) ∈ E or i = j.

ii) v^{ij} = m^{ij} if (i, j) ∉ E and i ≠ j.

In terms of cliques, i) and ii) become:

i') V_cc = L_cc if c ∈ C

ii') (V^{-1})_cc = M_cc for c ∈ C̄, except on the diagonal.
In most cases equations (6) and (7), even applying Theorem 1, cannot be
solved directly. Dempster (1972) showed the existence and uniqueness of the
solution of the system, but he proposed a very complicated iterative procedure.
A simpler estimation technique is based on the IPF algorithm, which generates
a sequence of positive definite matrices that converges to the m.l.e.
The main tool in the IPF is the Kullback-Leibler information divergence
(Kullback, 1968).

Definition 5 The Kullback-Leibler information divergence between two p.d.f.
f_X(x) and g_X(x) for the r.vt. X taking values in ℝ^k is

I(f‖g) = ∫_{ℝ^k} f_X(x) log [f_X(x) / g_X(x)] dx

Notice that I is not a distance, since it is not symmetric. Usually I is used
to assess the quality of the approximation of g_X to f_X, and it becomes a good
way to study the convergence of an iterative procedure for the estimation of a
p.d.f..

In the gaussian case I has a very simple form. Consider two normal p.d.f.
f_X(x) and g_X(x) with the same mean vector and covariance matrices V_1 and
V_2 respectively; then

I(f‖g) = (1/2) [ tr(V_2^{-1} V_1) − log det(V_2^{-1} V_1) − k ]   (8)

Expression (8) depends only on the two covariance matrices, so we will write
I(f‖g) = I(V_1‖V_2). Furthermore, it can be seen as a divergence between
two positive definite matrices and used to assess the convergence of the IPF. In
order to do that, let us consider some of the properties of (8) useful in this
case.
Let |K| denote the cardinality of the set K and G = (K, E) be a graph.
Let P be the class of |K| × |K| positive definite matrices. Each matrix in P
identifies a normal p.d.f.; recall that a set Q ⊂ ℝ^k is said to be a compact set if,
given any sequence taking values in Q, we can build a subsequence converging
to some x_0 ∈ Q.

Proposition 5 The I-divergence has the following properties:

i) If L and M are two matrices belonging to P, then I(L‖M) ≥ 0, with
equality holding if and only if L = M;

ii) given L, M ∈ P, if there exists V ∈ P such that

a) v_ij = l_ij if (i, j) ∈ E and
b) v^{ij} = m^{ij} if (i, j) ∉ E, then

I(L‖M) = I(L‖V) + I(V‖M)

where E is a set of unordered pairs of elements (not necessarily distinct)
of K. If such a V exists, it is unique.

iii) If {M_n} and {L_n} are sequences of elements of compact subsets of P,
then I(L_n‖M_n) → 0 implies L_n − M_n → 0.

Property ii) is crucial for the solution of the constrained maximum likeli-
hood system of equations given above.
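To make the gaussian case concrete, here is a sketch of the standard closed form of the divergence between two same-mean normal densities, expressed through their covariance matrices (the numbers are invented); it checks property i) and the asymmetry noted in Definition 5.

```python
import numpy as np

def kl_gauss(V1, V2):
    """Divergence between two zero-mean gaussian densities with covariance
    matrices V1 and V2 (the standard closed form, given here as a sketch)."""
    k = V1.shape[0]
    M = np.linalg.solve(V2, V1)    # V2^{-1} V1
    return 0.5 * (np.trace(M) - np.log(np.linalg.det(M)) - k)

V1 = np.array([[2.0, 0.3], [0.3, 1.0]])
V2 = np.eye(2)
# kl_gauss(V1, V1) == 0 (property i), while the divergence is not symmetric
```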

3.2 Iterative Proportional Fitting


To build the estimate of the covariance matrix of Theorem 1 we are provided
with two different versions of the IPF. Both of them are obtained as special
cases of a more general cyclic algorithm designed to solve the following problem:
given a graph G = (K, E), let E b ... , Em be a sequence of sets of unordered
pairs of elements of K such that U~I En = Ej given two matrices D and H
belonging to P, we have to find a matrix F E P such that

A. fij = d ij if (i,j) E E,

B. fi j = h ij if (i,j) ~ E.
The general algorithm starts setting Fo = H- I and generates a sequence of
positive definite matrices {Fn} belonging to P, and then for n ? 1, working
on the set En" we have:

1) f(n),ij = dij if (i,j) E En'

2) ft~) = f(n-I),ij if (i,j) ~ En'

The idea is to let fixed 2) and to make 1) vary along the iterations. The first
proof of the convergence of this procedure was given by Csiszar (1975). A more
statistical approach is given in the paper of Speed and Kiivery (1986) to which
we refer. They gave the following proposition

Proposition 6 The sequence of matrices {F_n} generated by the general cyclic
algorithm converges to the unique F ∈ P that satisfies A. and B.

The most relevant aspect of this iterative procedure is that it allows one to
work only on the cliques of the graph. For more details see Vicard (1994).
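A minimal sketch of such a clique-wise scheme is given below (a standard variant of the Speed-Kiiveri updates; the sample matrix and the chain graph are invented for illustration):

```python
import numpy as np

def ipf_gaussian(S, cliques, n_cycles=50):
    """Sketch of a clique-wise IPF for a graphical gaussian model: the inverse
    covariance D is adjusted clique by clique until the fitted covariance
    V = D^{-1} matches the sample covariance S on every clique, while the
    zero pattern of D off the cliques is preserved."""
    D = np.eye(S.shape[0])                 # start from the independence model
    for _ in range(n_cycles):
        for c in cliques:
            idx = np.ix_(c, c)
            V = np.linalg.inv(D)
            # force (D^{-1})_cc = S_cc on the current clique
            D[idx] += np.linalg.inv(S[idx]) - np.linalg.inv(V[idx])
    return np.linalg.inv(D)

# Chain graph 1 - 2 - 3: cliques {1,2} and {2,3}, edge (1,3) missing
S = np.array([[1.00, 0.50, 0.25],
              [0.50, 1.00, 0.50],
              [0.25, 0.50, 1.00]])
V_hat = ipf_gaussian(S, [[0, 1], [1, 2]])
# V_hat reproduces S on both cliques and inv(V_hat)[0, 2] is zero
```

Since this chain graph is decomposable, the scheme settles after the first cycle, in line with property iii) of section 3.3.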

3.3 Decomposability
A special role in the theory of graphical models, and in particular in the anal-
ysis of the relations between graphical modelling and regression, is played by
decomposable graphical models. A decomposable graph is one that can be
successively decomposed into its cliques. We will say that a r.vt. X is de-
composable if we can associate to it a decomposable conditional independence
graph. This property of the graph has important consequences in terms of general
properties of the models (Whittaker, 1990):

i) Decomposable models are multiplicative, i.e. the p.d.f. of the r.vt. X
can be written as the product of the marginal p.d.f. of the cliques. This
factorization completely represents the properties of the corresponding
model.

ii) Decomposable models are recursive, that is, there exists an ordering of
the vertex set of the independence graph such that the p.d.f. of X can
be factorized in a simpler way.

iii) Decomposable models are such that their m.l.e. can be computed di-
rectly. For example, the IPF converges in just one cycle.

Property iii) clearly makes this kind of model easier to handle, as the maximum
likelihood system of equations can be solved directly. This is a consequence of
property i), i.e. of the complete factorization of the p.d.f. of X.

4 Model Choice

In graphical modelling, the model selection problem is given in terms of the
choice of the edges of the conditional independence graph G = (K, E) that have
to be removed from (or added to) the set E. This is usually done starting from
a complete graph, which corresponds to a saturated model, and removing one edge
at each step, until a "satisfactory" graph is obtained. Recall that by complete
graph we mean a graph in which all the knots are connected. Eventually we
can end up with an empty graph, i.e. a graph with no edges, which corresponds
to a complete independence model. The number of models among which we
have to make our choice, given a k-dimensional r.vt., is 2^{k(k−1)/2}, so even for a
small number of variables the problem is of relevant dimension.
We have to consider two fundamental aspects: first, we have to include
in the model a number of parameters large enough to properly represent the
relations existing between the r.vt.; secondly, we want to build a model that
can be handled easily from a computational point of view. Then we want to
choose a "parsimonious" model.
The model choice, in this context, is entirely based on the likelihood ratio
statistic. Let M_0 and M_1 be two nested models, i.e. M_1 ⊂ M_0 (M_1 is obtained
by removing one or more edges from the set of edges associated to M_0). We
call M_1 the current model and M_0 the base model, using the same naming as in the
program MIM.
Let 𝒱 = {V : f_X(x) ∈ M_i} be the set of covariance matrices such that
the gaussian p.d.f. f_X belongs to M_i, i = 0, 1. The likelihood ratio test is
called deviance when M_0 is the saturated model; otherwise it is called deviance
difference. We know that, when there are no restrictions on the elements of the
covariance matrix V, its m.l.e. is the sample covariance matrix S. Then let us
compute the log-likelihood difference to compare V and S; we obtain

ℓ(V) = ℓ(S) − n I(S‖V)

where n is the sample dimension. Notice that the I-divergence has an inverse
relation with the log-likelihood, so to minimize I(S‖V) is the same as to
maximize ℓ(V).
Let M_g be a class of graphical models defined by imposing conditional inde-
pendence constraints on V^{-1}.

Definition 6 The deviance of a model M ∈ M_g, dev(M), is twice the differ-
ence between the unconstrained maximum of ℓ, ℓ(S), and the maximum taken
over M_g:

dev(M) = 2 [ max_V ℓ(V) − max_{V∈𝒱} ℓ(V) ]

Remark that the deviance measures, in terms of log-likelihood, how much
the class M_g differs from the saturated model. The deviance difference between
two models M_1 and M_0, obtained one from the other by deleting just one edge, is
dev(M_1) − dev(M_0).
In order to make inferential statements we need to know the sampling dis-
tribution of the deviance and of the deviance difference.

Proposition 7 If f_X belongs to the class M_g, then the deviance has an asymp-
totic chi-square distribution, with ν degrees of freedom given by the number
of constraints set on V to ensure that f_X ∈ M_g.

The deviance differences are asymptotically independently distributed if
the limit is reached on a chain of models starting from the saturated one and
ending with the minimal one.
There are two procedures to construct the final model (or graph): the
backward and the forward procedures. The first one starts from the saturated
model and goes on deleting one edge at each step. In other words, starting with
the base model M_0 at a given significance level, we perform as many deviance
difference tests as there are edges in the graph associated to M_0. We compare
the sample value of the statistic with the value of the χ² distribution with one
degree of freedom, and we compute the corresponding p-value. We remove the
edge with the highest p-value, i.e. we choose the model M_1 that differs least
significantly from M_0. The backward procedure stops when all the remaining
edges are significant.

The forward procedure works in a similar way, starting from the minimal
graph and stopping when all the edges added to it are significant.
Remark that the two selection procedures are perfectly symmetric, and
thus they usually give the same answer.
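The deviance test for a single edge can be sketched numerically. From ℓ(V) = ℓ(S) − n I(S‖V), the deviance of a fitted model is 2n I(S‖V̂); the numbers below are invented, and the closed-form V̂ for one missing edge in a three-variable chain is an assumption of the sketch (it is the completion of S that sets d_13 = 0).

```python
import numpy as np

def deviance(S, V_hat, n):
    """dev(M) = 2 n I(S || V_hat): twice the drop in log-likelihood of the
    fitted model with respect to the saturated one (sketch; S is the sample
    covariance, V_hat the constrained m.l.e., n the sample size)."""
    k = S.shape[0]
    M = np.linalg.solve(V_hat, S)
    return n * (np.trace(M) - np.log(np.linalg.det(M)) - k)

# Invented example: test the deletion of edge (1,3) among k = 3 variables.
# For this single missing edge the constrained m.l.e. equals S except that
# v_13 is replaced by s_12 s_23 / s_22, the value that makes d_13 = 0.
S = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
n = 100
V_hat = S.copy()
V_hat[0, 2] = V_hat[2, 0] = S[0, 1] * S[1, 2] / S[1, 1]
dev = deviance(S, V_hat, n)
# dev is compared with 3.84, the 5% critical value of the chi-square
# distribution with one degree of freedom
```

With these hypothetical numbers the deviance falls well below 3.84, so at the 5% level the edge (1,3) would be deleted.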

5 An Application of a Graphical Gaussian Model

In this section we report an application to real data of what we discussed
previously. As in the example of Whittaker (1990), based on the data set of
Mardia et al. (1979) on mathematics marks, we analyze the scores obtained
by two samples of students of the Statistics Faculty of Rome in several exams
of two groups: statistics and mathematics. The first sample is taken from the
students of the Demographical Statistical Science (SSD) master course and the
second one is observed on the students of the Economical Statistical Science (SSE)
master course. All the students matriculated in the same academic year (1985-
86) and never changed from SSD to SSE or conversely. Another characteristic
common to the considered set of students is that they were specializing in
mathematical statistics.
We consider the following subjects:

• Statistics: statistics I, statistics II, methodological statistics, sampling
theory.

• Mathematics: calculus I, calculus II, geometry, probability.

We can use these observations in order to compare statistical and mathematical
teachings because, as all the students never changed from one course to the
other, the relations between the variables are due only to "internal mechanisms"
of the SSE or SSD courses. All the computations relative to graphical models
are made with the program MIM (Edwards, 1987). This program works under
the hypothesis that the continuous r.vt. considered are normally distributed.
MIM can also work on discrete multinomial r.vt. and mixed variables.
We want to show how the IPF works and what its outcomes are.
First consider the SSD sample. We denote by X = (X_1, ..., X_8) the r.vt.
whose components are associated to (in this order): calculus II, probability,
geometry, calculus I, statistics I, statistics II, methodological statistics and
sampling theory.
We obtain the estimated model by the backward procedure. We start
with the saturated model and we compute the partial correlation matrix in
order to find the pair of subjects with the lowest partial correlation. Once the
model is fitted, the backward procedure is implemented by the backselect option

Table 1: Results of the application

SSD EDGES    P-VALUE      SSE EDGES    P-VALUE
X1 - X3      0.0016       Y1 - Y2      0.0256
X2 - X3      0.0031       Y1 - Y3      0.0407
X1 - X4      0.0082       Y2 - Y3      0.0592
X2 - X5      0.0003       Y2 - Y6      0.0106
X4 - X6      0.0056       Y5 - Y6      0.0596
X5 - X6      0.0083
X2 - X7      0.0010
X5 - X7      0.0034
X1 - X8      0.0050
X7 - X8      0.0006

of MIM. We fit and test, through the likelihood ratio test, all the models we
obtain by deleting an edge from the base model; then the least significant edge is
removed. The significance level is taken to be 5% through the entire example.
The fitting procedure ends when we find only significant edges.
For the SSE sample, we denote by Y = (Y_1, ..., Y_8) the r.vt. of observa-
tions and we proceed in the same way as for the SSD sample.
It turns out that, while for SSD all the remaining edges are strongly sig-
nificant, for SSE the links between subjects are never very strong and we are
forced to keep two not highly significant edges in order to explain the relations
between the r.vt..
The final results are collected in Table 1.
In both cases we obtain two meaningful graphical models, whose graphs
are in Figure 1. The edges between Y_5 and Y_6, and Y_2 and Y_3, in graph (b)
are given by dotted lines because of the weakness of the linkage.
From graph (a) we can conclude that the eight subjects considered con-
stitute a rather homogeneous set. In fact each knot has no more than three
neighbours, and thus there are no "principal" subjects.
Graph (b) is decomposable; the vector Y can be decomposed in the fol-
lowing way: Y_4, Y_7, Y_8, (Y_2, Y_6), (Y_5, Y_6) and (Y_1, Y_2, Y_3). Its p.d.f. factorizes
as

f_Y(y) = f(y_4) f(y_7) f(y_8) g(y_5, y_6) h(y_1, y_2, y_3) g(y_6, y_2) / [f(y_6) f(y_2)]

where all f(·) are univariate normal p.d.f. and g(·,·) and h(·,·,·) are bivariate

Figure 1: SSD and SSE graphs

and trivariate normal p.d.f. respectively. The joint p.d.f. is a multivariate
normal.
In graph (b) (Figure 1) we have three marginal independence relations
and a conditional independence between the statistics and the mathematics
subjects given probability; subject Y_2 can be seen as a "separation" knot.
Notice that the decomposability of the SSE graph implied a sensible reduction
in the computational time employed by MIM.

6 Relations between Graphical Models and Regression

In the previous paragraphs we used the IPF to fit the model. Now we analyze
the same set of data by means of regression. Our aim is to verify under
which conditions the two methods have the same capability of representing
the interaction structure between r.vt..
Denote by X_(i) and Y_(i) (i = 1, ..., 8) the vectors X = (X_1, ..., X_8) and
Y = (Y_1, ..., Y_8) without the i-th component. In regression analysis we have
two groups of eight equations each:

E(X_i | X_(i)) = a_i0 + Σ_{j≠i} a_ij X_j   and   E(Y_i | Y_(i)) = b_i0 + Σ_{j≠i} b_ij Y_j,   i = 1, ..., 8

where E denotes the expectation taken with respect to the conditional distri-
butions of X_i given X_(i) and of Y_i given Y_(i) respectively. To estimate the
parameters of both systems we apply the reg procedure of SAS, with back-
ward selection to choose the model, the latter being similar to the backselect

procedure of MIM. The significance level is always 5% and at each step the
choice of the variable to be removed from the conditioning set is based on an
F-test with 1 and n − 2 degrees of freedom (n is the sample size).
The results of the IPF and of the regression analysis agree only for the SSE
sample. In the other case the outcomes are completely different; in other words, the con-
ditional independence relations highlighted by the regression analysis of the SSD
sample are different from the relations given by the graphical model fitted by
IPF. This is mainly due to the different "starting points" of the two method-
ologies. Regression considers conditional distributions to represent relations
between r.vt., while graphical models take into account the joint distribution of
the r.vt.. This implies the use of a different kind of conditional independence
graph for each method.
Conditional distributions imply some idea of causality relations between
the r.vt., and hence the use of a directed graph. Joint distributions give the
same "attention" to all the r.vt., and so we associate to them undirected
graphs.
It means that to compare graphical models and regression analysis, we
have to compare undirected and directed graphs. In order to do that we have
to introduce and discuss the properties of directed graphs.

6.1 Directed Graphs


In Section 2 we already gave the idea of a directed graph; now we have to define
a directed conditional independence graph. Implicit in this kind of graph is the
idea of "causality" between the random variables involved in the analysis. These
relations are represented by arrows pointing from the knots of independent
variables to the knots associated to dependent variables; i.e., if we consider
the equation

E(Y_1 | Y_(1)) = b_10 + Σ_{j=2}^{8} b_1j Y_j

the edge joining Y_1 to, say, Y_3 is an arrow from Y_3 to Y_1.


In order to define uniquely a joint p.d.f. on a directed graph G^≺ =
(K, E^≺), we have to give a complete ordering of the knots. This ordering,
in fact, allows us to avoid any ambiguity in the representation of the
conditional independence relations between the random variables, and includes
the causality relations characterizing the directed graph. More formally, we
assume the existence of an ordering relation ≺ on the set K; then for all i and
j belonging to K we have

i) i ≺ j or j ≺ i;

ii) ≺ is not reflexive;

iii) if i ≺ j and j ≺ h then i ≺ h (the ≺ relation is transitive).
The ≺ relation guarantees that each edge of the graph can only have one
direction. We will say that, if j ≺ i, then j belongs to the past of i, and we
denote by K(i) all the elements of K belonging to the past of i together with i
itself, i.e. the union of the present and the past of i. Now we are able to
define the directed independence graph (Whittaker, 1990):

Definition 7 The directed independence graph of a random vector X =
(X_1, ..., X_k) is the directed graph G^≺ = (K, E^≺), where K = {1, ..., k},
K(j) = {1, 2, ..., j}, and the edge (i, j), with i ≺ j, is not in the edge set
E^≺ if and only if X_j ⊥ X_i | X_{K(j)\{i,j}}.
The main difference between the undirected and directed independence
graph is in the composition of the conditioning set: in the first case we
deal only with the joint p.d.f. of the k random variables, while in the second
one we have to consider a sequence of marginal conditional p.d.f.'s that allow
us to reconstruct the joint distribution by the following recursive relation:

f_{1,2,...,k}(x) = f_{k|K(k)\k}(x_k; x_{K(k)\k}) f_{k-1|K(k-1)\(k-1)}(x_{k-1}; x_{K(k-1)\(k-1)}) ··· f_{2|1}(x_2; x_1) f_1(x_1)
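For two variables the recursive relation reduces to f_{1,2}(x) = f_{2|1}(x_2; x_1) f_1(x_1). This can be checked numerically for a bivariate Gaussian; the values of μ and Σ below are purely illustrative.

```python
import numpy as np

mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

def normal_pdf(x, m, v):
    """Univariate normal density with mean m and variance v."""
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2 * np.pi * v)

def joint_via_chain(x1, x2):
    """f_{1,2}(x1, x2) = f_{2|1}(x2; x1) * f_1(x1)."""
    f1 = normal_pdf(x1, mu[0], Sigma[0, 0])
    m21 = mu[1] + Sigma[1, 0] / Sigma[0, 0] * (x1 - mu[0])  # conditional mean
    v21 = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]      # conditional variance
    return normal_pdf(x2, m21, v21) * f1

def joint_direct(x1, x2):
    """The bivariate normal density evaluated directly."""
    d = np.array([x1, x2]) - mu
    P = np.linalg.inv(Sigma)
    return np.exp(-0.5 * d @ P @ d) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))
```

The two routes to the joint density agree to machine precision.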
The Markov properties given in Proposition 2 can be extended to directed
graphs. In order to do that we need some further definitions and properties of
directed graphs.
Let G^≺ = (K, E^≺) be a directed graph; we denote by G^u = (K, E^u) the
associated undirected graph, obtained from G^≺ by changing all the arrows in
E^≺ into simple lines. To define the Markov properties for a directed graph
using its undirected version, we need the following definition:

Definition 8 A directed graph satisfies the Wermuth condition if there are no
subgraphs such as the one in Figure 2.

If Wermuth configurations are present in G^≺ = (K, E^≺), we have to
remove them (by "moralizing" the graph) in order to ensure the equivalence
of the conditional independence relations represented in G^u (its undirected
version). The "moralization" of the graph is obtained by joining the knots 1, 2
in Figure 2. Once this operation has been completed we have the so-called
moral graph G^m. If G^u and G^m are the same, then the Markov properties holding
for G^m and for G^≺ coincide.

Figure 2: Wermuth's configuration
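The moralization step can be sketched mechanically, assuming (as in Figure 2) that a Wermuth configuration is a pair of unlinked knots pointing at a common child. The dictionary-of-parents representation below is an assumption of this illustration, not the authors' notation.

```python
def moralize(parents):
    """'Moralize' a directed graph: join the parents of every common child,
    then drop all edge directions.  The graph is given as {child: set of
    parents}; the result is a set of undirected edges (frozensets)."""
    und = set()
    for child, ps in parents.items():
        for p in ps:
            und.add(frozenset((p, child)))          # keep each arrow as a line
        ps = sorted(ps)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                und.add(frozenset((ps[i], ps[j])))  # "marry" unlinked parents
    return und
```

Applied to the configuration of Figure 2 (knots 1 and 2 both pointing at knot 3), the function adds the edge {1, 2} that the moral graph requires.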

Figure 3: An example of oriented cycle

6.2 Causal Graphs


A special role in the study of regression analysis through graphs is played by
causal graphs:

Definition 9 A causal graph G^c = (K, E^c) is given by a pair of sets:

K = K_x ∪ K_n   and   E^c = E^c_x ∪ E^c_n

where K_x and K_n are called the sets of exogenous and endogenous knots,
respectively; the edges in E^c_x are lines, while the edges in E^c_n are arrows;
more precisely, they are given by ordered pairs of vertices (i, j) with j ∈ K_n.
We denote by i — j and i → j an undirected and a directed edge, respectively.
If there are no oriented cycles (an example is in Figure 3), then the causal
graph is recursive.
Again, in order to extend the Markov properties to this kind of graph, we
consider G^u, built by changing all the directed edges in E^c_n into undirected
ones (E^c_n → E^u_n); then G^u = (K_x ∪ K_n, E^u_x ∪ E^u_n). Recalling
Definition 8 we give the following proposition:

Proposition 8 Assume that no Wermuth configurations are in the causal
graph G^c; then the Markov properties of G^c and G^u coincide. Moreover, if
the subgraph G_x relative to the exogenous vertices is decomposable, then G^u is
decomposable.

The equivalence of the conditional independence statements represented
by G^c and by its undirected version can be analyzed through the factorization
of the joint p.d.f. In fact, if the two graphs are equivalent from this point of
view, they have to give the same factorization of the joint p.d.f.

6.3 Recursive systems and their analogies with graphical Gaussian models

When we consider quantitative variables, the recursive graph can be represented
through recursive systems of equations with independent errors. These systems
have the following expression:

Y_q + a_{q,q-1} Y_{q-1} + a_{q,q-2} Y_{q-2} + ... + a_{q,1} Y_1 + b_{q,1} X_1 + ... + b_{q,p} X_p = U_q
Y_{q-1} + a_{q-1,q-2} Y_{q-2} + ... + a_{q-1,1} Y_1 + b_{q-1,1} X_1 + ... + b_{q-1,p} X_p = U_{q-1}
...

where Y_i, i = 1, ..., q, is an endogenous variable and X_i, i = 1, ..., p,
is an exogenous one. This system is recursive because in the i-th equation
Y_i depends only on Y_1, ..., Y_{i-1} and X_1, ..., X_p. Suppose that the errors
U_i, i = 1, ..., q, are jointly normally distributed with mean value taken to be
zero and diagonal covariance matrix. Then we are dealing with a recursive
Gaussian system. Recall that all the conditional independence statements
regarding the p + q variables can be expressed in terms of null partial
correlations, because the elements of the inverse covariance matrix are
proportional to partial correlation coefficients. Then we consider the
covariance matrix V for the vector (X, Y), whose elements are ordered so that
(Y, X) = (Y_q, Y_{q-1}, ..., Y_1, X_p, X_{p-1}, ..., X_1), and we assume that
E(Y, X) = 0; then the following lemma (whose proof is in Kiiveri et al., 1984)
holds:

Lemma 1 The inverse covariance matrix D of the recursive Gaussian system
of p + q variables has the unique representation D = L M L^t, where L and M
have the form

L = [ C  B ]        M = [ Ψ^{-1}    0    ]
    [ 0  I ]            [   0     Φ^{-1} ]

where C is a q × q lower triangular matrix with 1's on the diagonal, Ψ is a
q × q diagonal matrix with positive elements, and Φ is a p × p positive definite
matrix. In particular:

l_ij = -β_{ij·(1,...,i-1,i+1,...,j-1)}   for j > p, i < j
ψ_j = σ_{jj·(1,...,j-1)}   for p < j ≤ p + q
φ_gh = σ_{gh}   for 1 ≤ g, h ≤ p

where β_{ij·a} and σ_{jj·a} are the partial regression coefficient and the
residual variance of the j-th variable on the i-th one, eliminating all the
variables with indices in the set a.
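The proportionality between elements of the inverse covariance matrix and partial correlation coefficients, used throughout this section, can be verified numerically; the 3 × 3 covariance matrix V below is only an illustrative example.

```python
import numpy as np

# an illustrative covariance matrix (any symmetric positive definite V works)
V = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
D = np.linalg.inv(V)

def partial_corr_from_inverse(D, i, j):
    """Partial correlation of (i, j) given all the rest: -d_ij / sqrt(d_ii d_jj)."""
    return -D[i, j] / np.sqrt(D[i, i] * D[j, j])

def partial_corr_by_regression(V, i, j, rest):
    """Same quantity via the residual covariance of (i, j) after eliminating
    the variables with indices in `rest`."""
    a = [i, j]
    S = (V[np.ix_(a, a)]
         - V[np.ix_(a, rest)] @ np.linalg.inv(V[np.ix_(rest, rest)]) @ V[np.ix_(rest, a)])
    return S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])
```

Both routes give the same number, so a zero off-diagonal element of D is exactly a null partial correlation.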
This factorisation of D, denoted by (L, Ψ, Φ) or (C, B, Ψ, Φ), has been
introduced in order to obtain a faster way to analyse the relations between
the variables. Let us introduce an ordering w on the vertex set of the recursive
causal graph G^c such that w(i) < w(j) when j ∈ K_n and i → j.
Let us suppose that the random variables are indexed by the vertices of G^c
and that (Y, X) has a joint Normal p.d.f. P_V, with zero mean and covariance
matrix V.

Proposition 9 The distribution P_V of the random vector (Y, X) verifies the
equivalent Markov properties (Darroch et al., 1980) if and only if, for every
ordering of K compatible with G^c, the elements of the factorisation (L, Ψ, Φ)
satisfy the constraints

i) φ^gh = 0 whenever the edge (g, h) ∉ E_x;

ii) l_ij = 0 whenever (i, j) ∉ E_n, for all g, h ∈ K_x, j ∈ K_n and i ∈ K,

where φ^gh is the (g, h)-element of Φ^{-1}.

This is an extension of Proposition 3 to a recursive causal graph. In fact,
Φ is just the covariance matrix of the vector X of exogenous variables, and we
know that two of its components, X_g and X_h, are conditionally independent if
and only if the element (g, h) of Φ^{-1} is zero. Notice that the typical
condition for a recursive system to have zero partial regression coefficients
concerns the case when at least one of the variables considered is dependent on
some of the others, i.e. the model contains at least one endogenous variable.
Then in recursive systems we have one endogenous variable involved in each
equation; for example, in the i-th equation

Y_i + a_{i,i-1} Y_{i-1} + ... + a_{i,1} Y_1 + b_{i,1} X_1 + ... + b_{i,p} X_p = U_i

the role of endogenous variable is played only by Y_i. Then we consider a system
of q equations, i.e. the q relations of marginal conditional independence.
Before going further in the analysis of the relations existing between
regression and graphical Gaussian models, we state the following lemma, which
describes the connection between our approach to recursive systems and the
well-known one used in econometrics.

Lemma 2 If D is decomposed into (C, B, Ψ, Φ), then X and Y satisfy the
linear structural equations

CY + BX = U   (9)

where U and X are independent Gaussian vectors with covariance matrices Ψ
and Φ, respectively.
Conversely, if X and Y satisfy (9), if U and X are defined as above, if C is
a lower triangular matrix with 1's on the diagonal and if Ψ is diagonal,
then D = L M L^t with L and M defined as in Lemma 1.
We are now able to give the result that allows us to build the causal graph
associated to (9) when C and Φ^{-1} contain some zero elements.
We suppose that (Y, X) has a joint Gaussian distribution; then

Proposition 10 The Gaussian system (Y, X) that satisfies (9), with U, X, Ψ
and Φ defined as in Lemma 2, satisfies the Markov properties if and only if C
and Φ satisfy the conditional independence constraints of Proposition 9.

7 Application to our example

The study of the equivalence between the analysis of graphical models and
regression can now be carried out with respect to the sample of students from
SSE, because only the graph associated to SSE is decomposable and hence its
p.d.f. can be uniquely factorised.

Figure 4: Recursive causal graph of SSE

Let us introduce on the vertex set of the graph an ordering compatible with
the construction of a recursive system. Instead of Y = (Y_1, ..., Y_8), we
consider Z = (Z_1, ..., Z_8) where: Z_1 = calculus II, Z_2 = geometry,
Z_3 = probability, Z_4 = statistics II, Z_5 = statistics I, Z_6 = calculus I,
Z_7 = methodological statistics and Z_8 = sampling theory.
The SSE causal graph is then the one of Figure 4, and Z_1 is the only exogenous
variable, while Z_2, ..., Z_8 are endogenous (that is, p = 1 and q = 7).
This graph presents neither Wermuth configurations nor oriented cycles;
hence the conditional independence statements contained in the undirected and
the directed graphs are the same. The graph is recursive, as we could expect,
since this property is just a consequence of decomposability. The associated
system of equations is

Z_8 = δ_81 Z_1 + δ_82 Z_2 + δ_83 Z_3 + δ_84 Z_4 + δ_85 Z_5 + δ_86 Z_6 + δ_87 Z_7 + U_8
Z_7 = δ_71 Z_1 + δ_72 Z_2 + δ_73 Z_3 + δ_74 Z_4 + δ_75 Z_5 + δ_76 Z_6 + U_7
Z_6 = δ_61 Z_1 + δ_62 Z_2 + δ_63 Z_3 + δ_64 Z_4 + δ_65 Z_5 + U_6
Z_5 = δ_51 Z_1 + δ_52 Z_2 + δ_53 Z_3 + δ_54 Z_4 + U_5
Z_4 = δ_41 Z_1 + δ_42 Z_2 + δ_43 Z_3 + U_4
Z_3 = δ_31 Z_1 + δ_32 Z_2 + U_3
Z_2 = δ_21 Z_1 + U_2
Z_1 = U_1
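Because the system is triangular, each equation can be estimated separately by ordinary least squares. The sketch below uses simulated data in place of the students' marks (the coefficients 0.8, 0.5 and -0.3 are illustrative, not the estimates of the paper).

```python
import numpy as np

def fit_recursive_system(Z):
    """Equation-by-equation least squares for a triangular recursive system
    Z_i = sum_{j<i} delta_ij Z_j + U_i; returns the coefficient matrix delta."""
    n, q = Z.shape
    delta = np.zeros((q, q))
    for i in range(1, q):
        coef, *_ = np.linalg.lstsq(Z[:, :i], Z[:, i], rcond=None)
        delta[i, :i] = coef
    return delta

# simulate a small recursive Gaussian system with known coefficients
rng = np.random.default_rng(1)
n = 5000
U = rng.normal(size=(n, 3))
Z = np.zeros((n, 3))
Z[:, 0] = U[:, 0]
Z[:, 1] = 0.8 * Z[:, 0] + U[:, 1]
Z[:, 2] = 0.5 * Z[:, 0] - 0.3 * Z[:, 1] + U[:, 2]
delta = fit_recursive_system(Z)
```

With independent errors, the per-equation estimates recover the generating coefficients up to sampling error.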

Notice that the ordering we gave to the subjects allows the IPF to converge
in only one cycle.
Here, again, the recursive system is estimated through the backward option

Table 2: Regression analysis of SSE students

SSE EDGE               P-VALUE:            P-VALUE:
                       RECURSIVE SYSTEM    GRAPHICAL MODEL
Z4 → Z5 = Y5 - Y6      0.0703              0.0596
Z3 → Z4 = Y2 - Y6      0.0131              0.0106
Z2 → Z3 = Y2 - Y3      0.0730              0.0592
Z1 → Z3 = Y1 - Y2      0.0336              0.0256
Z1 → Z2 = Y1 - Y3      0.0048              0.0407

of the reg procedure of SAS, and the fixed significance level is 5%. Thanks to
Proposition 10, the drawing of the independence graph is based on the null
partial regression coefficients; then, looking at the output of SAS, we have to
take into account only the significance tests made on such coefficients. Through
the backward procedure, only the variables whose associated coefficient is
significantly different from zero remain in the covariate set.
We obtain the same graph seen in paragraph 5; in fact, for uniformity, we do
not delete the two not highly significant edges (linking statistics II to
statistics I and geometry to probability). The results of the analysis of the
recursive system are collected in Table 2, where they are also compared with
the P-values obtained through the analysis of the graphical model.
As in this case the conditions for the equivalence of causal graphs and
undirected graphs are satisfied and, furthermore, the graph is decomposable,
the two methodologies give the same results; that is, both methods are able
to represent the interaction structure of the variables. Notice that the least
significant edges of the graphical models are still weakly significant, and
their regression P-value is even bigger than the one obtained by the graphical
model analysis.
The sample from SSD has not been considered because its graph is not
decomposable. In fact, decomposability (by itself) is necessary to allow us to
introduce an ordering on the vertex set in order to simplify the factorisation
of the joint p.d.f. Furthermore, if the Wermuth condition is satisfied,
directed and undirected graphs admit the same interpretation in terms of
conditional independence.
Then graphical models seem to represent the relations between variables
better than regression, because they work on the joint distribution and
consequently do not need an ordering of the variable set. In fact, for
regression analysis to be meaningful, as it works on conditional distributions,
it is necessary to consider (a priori) causality relations between the
variables.
This is the main reason why the two methodologies agree only when the
directed graph and the undirected one have the same interpretation. Another
important feature of graphical models is that they allow us to visualize this
kind of relations easily.

References

Csiszar, I., 1975. «I-divergence geometry of probability distributions and
      minimisation problems» Ann. Prob., 3 (1): p. 146-158.
Darroch, J. N., S. L. Lauritzen and T. P. Speed, 1980. «Markov field and
      log-linear interaction models for contingency tables» Ann. Stat., 8:
      p. 522-539.
Dawid, A. P., 1979. «Conditional independence in statistical theory» (with
      discussion), J. Roy. Statist. Soc., B, 41 (1): p. 1-31.
Edwards, D. E., 1987. «A guide to MIM» Research Report 87/1, Statistical
      Research Unit, University of Copenhagen.
Kiiveri, H., T. P. Speed and J. B. Carlin, 1984. «Recursive causal models»
      J. Aust. Math. Soc., 36: p. 30-52.
Kullback, S., 1968. Information theory and statistics, John Wiley & Sons,
      New York.
Mardia, K. V., J. T. Kent and J. M. Bibby, 1979. Multivariate analysis,
      Academic Press, London.
Speed, T. P. and H. Kiiveri, 1986. «Gaussian Markov distributions over
      finite graphs» Ann. Statist., 14 (1): p. 138-150.
Vicard, P., 1994. «Modelli grafici Gaussiani: la procedura di Adattamento
      Proporzionale Iterativo» Quaderno E9, Dipartimento di Statistica,
      Probabilità e Statistiche Applicate, Università di Roma "La Sapienza".
Whittaker, J., 1990. Graphical models in applied multivariate statistics,
      John Wiley & Sons, New York.
Wright, S., 1923. «The theory of path coefficients: a reply to Niles'
      criticism» Genetics, 8: p. 239-255.

LINEAR STRUCTURAL DEPENDENCE OF DEGREE ONE AMONG DATA: A STATISTICAL MODEL

F. LAGONA
Dipartimento di Statistica e Probabilità
Università di Roma "La Sapienza"

Variance/covariance matrices and graphs are useful tools for studying stochastic
interaction among statistical data. In this paper the presence of some latent
observations is modelled using a hypothesis of linear structural dependence
among the data. The result is a particular Markovian Gaussian field, described
via conditional densities. The model covers the exchangeable hypothesis as a
particular case. As an application, a statistical linear model with dependent
errors is presented and estimation problems are discussed.

1 Introduction

Some years ago, Baldessari and Gallo (1982) introduced the concept of linear
structural dependence (LSD) among the components Y_i, i = 1, 2, ..., n, of an
n-dimensional Gaussian random vector Y. Roughly speaking, they assumed
that each Y_i is explained by a linear relation between Y_i and n independent
Gaussian variables U_i, i = 1, 2, ..., n, with E(U_i) = 0 and Var(U_i) = 1 for
each i. For instance, we have an LSD of degree 0 if there exist constants d_i
such that

Y_i = d_i U_i   i = 1, ..., n   (1)

Furthermore, if there exist constants d_i and a_i such that

Y_i = d_i U_i + Σ_{j=1}^{n} a_j U_j   i = 1, ..., n   (2)

we have an LSD of degree 1. It is clear that (1) is the assumption of
independence among the Y_i's. But what is the actual meaning of (2)? An answer
may be found by evaluating the conditional probability density of Y_i given the
event {Y_1 = y_1, ..., Y_{i-1} = y_{i-1}, Y_{i+1} = y_{i+1}, ..., Y_n = y_n} =
{Y_j = y_j, j ≠ i}, namely f(y_i | {Y_j = y_j, j ≠ i}). It is easy to show
(Lagona, 1992) that, if a_1 ≠ 0 and a_2 = ... = a_n = 0, then

a) f(y_1 | {Y_j = y_j, j ≠ 1}) is a function depending on all the values in
y = (y_1, ..., y_n);

b) for i ≥ 2, f(y_i | {Y_j = y_j, j ≠ i}) is a function depending on the value
y_1 only.

In other words, each random variable Y_i, i ≠ 1, does not depend on the
observed values in the set {y_j, j ≠ 1, j ≠ i}, given the value y_1. In general,
if a_j ≠ 0, j = 1, ..., k, and a_{k+1} = ... = a_n = 0, then each Y_i,
i = k + 1, ..., n, does not depend on the observed values in the set
{y_j, j ∈ {k + 1, ..., n} - {i}}, given the values in {y_j, j = 1, ..., k}. The
limit case is when all the a's are nonzero and all the conditional densities
f(y_i | {Y_j = y_j, j ≠ i}) are functions depending on all the values in y.
The conclusion is that the meaning of (2) is linked with the presence of
latent observations y_1, ..., y_k of the random vector Y.
Similar situations are very common in practical econometrics. For instance,
let us suppose we observe the price levels of the same good stated by n
companies. In the quasi-monopoly case, the price levels of n - 1 companies are
independent, given the price stated by one company. In a similar manner, in
the quasi-duopoly case, n - 2 companies will have an independent behaviour,
given that of the two companies which interact with each other.
In the following sections, a statistical model is defined for studying data
under an LSD hypothesis of degree one. For this a theoretical statement of the
problem is needed.

2 Formal statement of the problem

Let us suppose we observe a statistical variate over a finite point set N =
{1, ..., i, ..., n} in the d-dimensional Euclidean space ℝ^d. The set N is
arbitrary but fixed. If the real axis ℝ is the range of this variate, the
observations are collected in a real vector y = (y_1, ..., y_i, ..., y_n).
Further, we may look at y as the realization of a random vector Y taking values
on a measurable space (ℝ^n, B*), where B* is the Borel class of ℝ^n. Let P be a
probability measure on B* and F_{1,...,i,...,n}(y) = P(Y ≤ y) the distribution
function of Y. Because N is fixed, the sample space is simply ℝ^n. Further, the
data ordering in Y is purely formal, in order to define F: actually, a
neighbourhood relationship among the data is of interest (see Section 3).
Let us suppose that the data are not stochastically independent but that some
qualitative information about data interaction is available. There are
different ways of modelling this dependence: the most common are space-time
dependence models (time series, spatial series, space-time series). A less known
approach comes from the LSD hypothesis coined by Baldessari and Gallo
(Baldessari, 1983a; Baldessari and Gallo, 1982; Baldessari and Weber, 1986)
in some papers related to the theory of Intrinsic Inference (Baldessari, 1983b;
Baldessari, 1985; Baldessari et al., 1983).
The approach followed in this paper for modelling LSD splits into two
main steps (Section 3):

• formal definition of interaction, in order to describe a random field via
conditional distributions;

• description of a graphical model, in order to define a statistical model for
the random field.

As an application, a linear model with dependent errors will be considered
and estimation problems discussed.

3 Interaction and neighbourhood among data

3.1 A conditionally specified Gaussian model

In this section a formal definition of interaction is given.
We say that y_{i_1}, ..., y_{i_{n_i}} interact with y_i if Y_i is
stochastically independent of the data in the set {y_1, ..., y_n} -
{y_{i_1}, ..., y_{i_{n_i}}}, given the observation of y_{i_1}, ..., y_{i_{n_i}}.
Let us suppose we know which data interact with y_i, for each i. If so, each i
is linked to a set N(i) = {i_1, ..., i_{n_i}} of neighbourhoods. Let us call
{N(i), i = 1, ..., n} the neighbourhood structure of y. The class
{N(i), i = 1, ..., n} defines a binary relation on N or, in other words, there
is a set E ⊆ N² where (i, j) ∈ E if j ∈ N(i). Furthermore, the consistency
statement i ∈ N(j) ⇔ j ∈ N(i) is assumed. In this case the pair G = (N, E)
is an undirected (not oriented) graph, where N is the vertex set and E is the
edge set. Let us call G a conditional independence graph (Whittaker, 1990).
The next step consists in the definition of a probability measure P on ℝ^n
consistent with the graph G. Let us define a random field as the class of
conditional probabilities

{P(y_i | y_j, j ≠ i), i = 1, ..., n}   (3)

where
P(y_i | y_j, j ≠ i) = P(y_i | y_j, j ∈ N(i))   (4)

The statement (4) is a generalized Markovian hypothesis coined by Dobruschin
(1970): if it holds, we say that (3) is a Markovian field. In this paper Gaussian

fields are of interest and, as is well known, their conditional distributions
are defined by conditional densities. Thus the last assumption is

P(Y_i ∈ A | y_j, j ∈ N(i)) = ∫_A f(y_i | y_j, j ∈ N(i)) dy_i,   i = 1, ..., n

f(y_i | y_j, j ∈ N(i)) = (2πσ_i²)^{-1/2} exp{ -(1/(2σ_i²)) [y_i - μ_i - α Σ_{j∈N(i)} (y_j - μ_j)]² }   (5)

where σ_i², μ_i, α are unknown parameters. Let us observe that the neighbourhood
data enter the regression function E(Y_i | y_j, j ∈ N(i)): this is a consequence
of the Gaussian assumption.
It is easy to evaluate the joint density of Y from (5). Actually, if x and y
are different realizations of Y, we have that

f(y) / f(x) = Π_{i=1}^{n} [ f(y_i | y_1, ..., y_{i-1}, x_{i+1}, ..., x_n) / f(x_i | y_1, ..., y_{i-1}, x_{i+1}, ..., x_n) ]   (6)

hence, letting x = μ = (μ_1, ..., μ_n) and using matrix notation,

f(y) ∝ exp{ -(1/2) (y - μ)^t M^{-1} (I - A) (y - μ) }   (7)

where

M = diag(σ_i², i = 1, ..., n)
A = αD
D = {D_ij},   D_ij = 1 if j ∈ N(i), 0 otherwise   (8)
Formula (7) is consistent with the Hammersley-Clifford Theorem (Besag,
1974) because the negpotential function is

Q(y) = Σ_i y_i g(y_i) + Σ_i Σ_{j∈N(i)} y_i y_j g(y_i, y_j)   (9)

where

y_i g(y_i) = (y_i - μ_i)² / σ_i²   i = 1, ..., n   (10)
y_i y_j g(y_i, y_j) = -(α/σ_i²)(y_i - μ_i)(y_j - μ_j)   i = 1, ..., n, j ∈ N(i)   (11)
From (7) we can recognize the normalising constant and evaluate the joint
distribution of Y:

f(y) = N(μ, (I - A)^{-1} M)   (12)

which is the likelihood of a conditional autoregressive (CAR) model (Cressie,
1991). It should be observed that the LSD hypothesis is modelled in (12)
through the inverse variance/covariance matrix, which is a function of the
incidence matrix of the graph G.
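The covariance (I - A)^{-1} M of (12) can be assembled directly from a neighbourhood structure. The sketch below uses a 4-cycle and homoscedastic σ² as illustrative choices; the numbers are assumptions of this example.

```python
import numpy as np

def car_covariance(neigh, alpha, sigma2):
    """Build the CAR covariance (I - alpha*D)^{-1} M from a neighbourhood
    structure N(i) given as {vertex: list of neighbours}."""
    n = len(neigh)
    D = np.zeros((n, n))            # incidence matrix of the graph
    for i, ns in neigh.items():
        for j in ns:
            D[i, j] = 1.0
    M = sigma2 * np.eye(n)          # homoscedastic conditional variances
    return np.linalg.inv(np.eye(n) - alpha * D) @ M

# illustrative 4-cycle neighbourhood structure
neigh = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
Sigma = car_covariance(neigh, alpha=0.2, sigma2=1.0)
```

For this cycle the eigenvalues of D lie in [-2, 2], so α = 0.2 keeps I - αD positive definite and Σ is a valid covariance matrix.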

3.2 Linear structural dependence of degree one

The matrix (I - A) is symmetric by definition and positive definite by
hypothesis. Hence there exists its spectral decomposition (I - A) = E'E, where E
is nonsingular. Let e = y - μ be the error vector, where the joint density of Y
is N(μ, (I - A)^{-1} M): one has

e = E^{-1} M^{1/2} ε   (13)

where the joint density of ε is N(0, I), and the elements e_ij of E must
satisfy the following relationships:

e_ij = 1 if i = j,   e_ij = -α if j ∈ N(i),   e_ij = 0 otherwise   (14)
The statement (13) explains the dependence among the errors through the
linear filter E^{-1} M^{1/2}, which transforms the white noise signal ε into the
error signal e. Condition (14) models this filter through a parameter α. It
should be noted that α is an autocorrelation coefficient in the model, because
we may write (5) as

e_i = α Σ_{j∈N(i)} e_j + ψ_i   i = 1, ..., n   (15)

where ψ_i has density N(0, σ_i²), i = 1, ..., n. Thus, multiplying both sides of
(15) by Σ_{j∈N(i)} e_j and taking expectations, one has

α = E[ e_i Σ_{j∈N(i)} e_j ] / E[ (Σ_{j∈N(i)} e_j)² ]   (16)

3.3 A class D of dependence hypotheses


Let us recall the graph G: a clique of G is a subgraph C that is a single
vertex, or whose vertex set is {i_1, ..., i_j, ..., i_k} with, for each i_j,
N(i_j) ⊇ {i_1, ..., i_k} - {i_j}. In other words, in a clique each vertex is a
neighbour of all the other vertices in the same clique. Further, if C is a
clique and is not a subgraph of a larger clique, then we say that C is a
maximal clique.
The main topic of this section is a class 𝒢 = {G_h, h = 1, ..., n} of graphs
such that G_h contains n - h maximal cliques with h + 1 vertices and such that
the intersection of the n - h cliques is a subgraph with h vertices (see Fig.
1). Of course G_h is not unique, because there are (n choose h) ways of choosing
that intersection. But in 𝒢 we may define an equivalence relation in the
following sense: two graphs in 𝒢 are equivalent if, with a permutation of the
names of the vertices of one, we find the other one. Thus we will consider G_h
as a member of its equivalence class in the quotient set.

Figure 1: Some graphs of 𝒢. Because of the permutation of vertices' names,
the graph G_{n-1} ≅ G_n.
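The neighbourhood structure of one representative of the class of G_h can be generated mechanically. In this sketch the intersection of the maximal cliques is taken, without loss of generality, to be the first h vertices; this labelling is an assumption of the illustration.

```python
def g_h_neighbourhoods(n, h):
    """Neighbourhood structure {N_h(i)} of a representative G_h: vertices
    0..h-1 form the common intersection ("core"); each remaining vertex,
    together with the core, forms one of the n - h maximal cliques with
    h + 1 vertices."""
    core = set(range(h))
    N = {}
    for i in range(n):
        if i in core:
            N[i] = set(range(n)) - {i}   # core vertices see every other vertex
        else:
            N[i] = core.copy()           # outer vertices see only the core
    return N
```

For h = n - 1 (or h = n) the construction gives the complete graph, i.e. the exchangeable case discussed below.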

Let us consider the neighbourhood structure {N_h(i), i = 1, ..., n} of each
G_h ∈ 𝒢 and let it enter the likelihood (12). In this way, we are modelling a
set of hypotheses which contains the independence case (α = 0), the
one-latent-datum case (α ≠ 0 and neighbourhood structure {N_1(i), i = 1, ..., n}),
in general the k-latent-data case (α ≠ 0 and neighbourhood structure
{N_k(i), i = 1, ..., n}), and finally the exchangeable hypothesis if we use the
neighbourhood structure of the graph G_{n-1} ≅ G_n. Playing this role, the
family 𝒢 provides us with an intrinsic inference model (Baldessari, 1983b;
Baldessari, 1985; Baldessari et al., 1983).
Furthermore, an important consequence of the Gaussian assumption stated in
(5) should be noted. Let G_h ∈ 𝒢 be a graph with n vertices and n - h maximal
cliques, and let y_i be the observation corresponding to a vertex i belonging to
the intersection of the maximal cliques. Let us evaluate the following marginal
joint distribution of the random vector Y_{n-1} = (Y_1, ..., Y_{i-1},
Y_{i+1}, ..., Y_n):

f(y_{n-1}) = N(μ_{n-1}, Σ_{n-1})   (17)

where Σ_{n-1} is the square matrix obtained from Σ after deleting the i-th row
and column. Clearly, the conditional independence graph for f(y_{n-1}) is
G_{h-1}. Indeed, if we integrate N(0, Σ) with respect to an observation y_j,
with the vertex j not belonging to the intersection of the maximal cliques, we
still obtain a G_h-equivalent graph with n - 1 vertices.
In general, if we have the joint density of Y corresponding to some
conditional independence graph G_h ∈ 𝒢, we may say that, under the Gaussian
distribution hypothesis, integrating the joint density of Y with respect to y_i
produces a marginal density corresponding to the subgraph obtained from G_h by
deleting the i-th vertex and the edges having i as an endpoint.
But the algebraic structure of G_h changes only if the vertex i belongs to the
intersection of the maximal cliques of G_h.

4 Estimation problems

As an application of the previous results, in this section a particular linear
model with dependent errors is considered. More precisely, let us suppose
we have observed a data vector y, which is assumed to be the realization of a
random vector Y. Furthermore, let us suppose that E(Y) = Xβ, where X is
a known n × k matrix such that X'X is nonsingular, and β is a k-dimensional
vector of unknown parameters. For the sake of simplicity, we may state a
homoscedasticity hypothesis, namely σ_i² = σ², i = 1, ..., n. In this case the
likelihood (12) becomes

f(y) = N(Xβ, σ²(I - A)^{-1})   (18)

where k + 2 < n, n - k - 1 being the number of degrees of freedom. The
main problem is to find the maximum likelihood estimates of the parameters β, α
and σ². Letting the log-likelihood function be

l(η) = log f(y; η)   (19)

where η = (β, α, σ²), the problem to be solved is the following:

max_η l(η)   (20)

The simplest procedure for solving (20) is the Gauss-Newton iterative method

η^(p+1) = η^(p) + [H^(p)]^{-1} (∂l/∂η)|_{η = η^(p)}   (21)

where p is the step number and H^(p) is the information matrix, whose (i, j)-
elements are

h_ij = -E(∂²l / ∂η_i ∂η_j)   (22)

In the case studied, (21) updates η = (β, α, σ²) at each step; recalling that
A(α) = αD, the (σ², α)-block of the information matrix involves the quantities
(1/σ⁴)(y - Xβ)'D(y - Xβ) and tr[(I - A(α))^{-1} D] (formulas (23)-(24)).

The iterative method (21) provides a convergent sequence of estimates, but it
needs initial conditions (σ²)^(0) and α^(0). We may use an arbitrary value in
the region

{σ² > 0} ∩ {0 < α < 1/(n - 1)}   (25)

because it is possible to show that in the set (25) the variance/covariance
matrix σ²(I - A(α))^{-1} is positive definite. In fact, the var/cov matrix is
positive definite if its minimum eigenvalue is positive or, in other words, if

α < 1/λ_max   (26)

where λ_max is the maximum eigenvalue of D. But the Gershgorin disk (Magnus and
Neudecker, 1988) of the maximum eigenvalue of D gives

λ_max ≤ n - 1   (27)
hence

{σ² > 0} ∩ {α < 1/λ_max} ⊇ {σ² > 0} ∩ {0 < α < 1/(n - 1)}   (28)

or, in other words, the region (25) is a subset of the admissibility region for
the parameters α and σ². Finally, the maximum likelihood estimate η̂ can be
used for defining the concentration ellipsoid

(η - η̂)' H_η̂ (η - η̂) ≤ χ²_{k+2}   (29)

where H_η̂ is the information matrix evaluated at η = η̂ and χ²_{k+2} is the
quantile of the Chi-squared distribution with k + 2 degrees of freedom,
corresponding to the chosen significance level.
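As a cruder alternative to the Gauss-Newton scheme above, the log-likelihood can be profiled over a grid of α values inside the admissibility region: for each fixed α, β is the GLS estimate and σ² has a closed form. This is an illustrative sketch, not the authors' algorithm; the cycle neighbourhood structure and the simulated data are assumptions of the example.

```python
import numpy as np

def profile_mle(y, X, D, grid):
    """Profile the Gaussian log-likelihood of N(X beta, sigma^2 (I - alpha D)^{-1})
    over candidate alpha values; returns (loglik, alpha, beta, sigma2)."""
    n = len(y)
    best = None
    for alpha in grid:
        W = np.eye(n) - alpha * D                 # inverse correlation structure
        if float(min(np.linalg.eigvalsh(W))) <= 0:
            continue                              # outside the admissibility region
        XtW = X.T @ W
        beta = np.linalg.solve(XtW @ X, XtW @ y)  # GLS estimate for fixed alpha
        r = y - X @ beta
        sigma2 = float(r @ W @ r) / n             # closed-form variance estimate
        ll = 0.5 * np.linalg.slogdet(W)[1] - 0.5 * n * np.log(sigma2)
        if best is None or ll > best[0]:
            best = (ll, alpha, beta, sigma2)
    return best

# illustrative data on a 30-vertex cycle (lambda_max(D) = 2, so alpha < 0.5)
rng = np.random.default_rng(2)
n = 30
D = np.zeros((n, n))
for i in range(n):
    D[i, (i + 1) % n] = D[i, (i - 1) % n] = 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
best = profile_mle(y, X, D, np.linspace(0.0, 0.45, 10))
```

The positive definiteness check on I - αD plays the role of the admissibility region (25)-(28); the grid could equally be restricted to 0 < α < 1/(n - 1), which the Gershgorin bound guarantees to be admissible.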

5 Conclusion

In several concrete cases it is common to observe a series of dependent data.
In this paper the dependence among the data is supposed to be due to the
presence of latent observations. This case was first investigated by Baldessari
and Gallo (Baldessari and Gallo, 1982) via the concept of the LSD hypothesis,
although their paper did not provide a usable statistical model. With the
help of graph theory and random field theory, a new description of the LSD
assumption is practicable, which seems very natural and intuitive. The main
purpose of such a description is the modelling of the presence of latent data
with a statistical approach (Section 2). Here variance/covariance matrices
and conditional independence graphs play a central role: in particular, the
relationship between var/cov matrices and graphs is focused on. Furthermore, we
would like to estimate and test the presence of a latent observation in the
data set: this is very easy if we use a linear statistical model with dependent
errors, as the application in Section 4 shows.

Acknowledgments

This paper was supported by the MURST Research Group 1993 "Analisi dei
dati dipendenti".

References

Baldessari, B., 1983a. «Analisi dei dati dipendenti: robustezza completa
      nel modello lineare» Atti del Convegno sulla Robustezza, Orme, June 1983.
Baldessari, B., 1983b. «Intrinsic Dependence. Foundations and Research
      Approaches» (technical report) Dept. of Statistics, University of Arizona.
Baldessari, B., 1985. «Some Aspects of Intrinsic Inference» Metron, 43 (1-2).
Baldessari, B. and F. Gallo, 1982. «Linear Structural Dependence» Metron,
      40 (3-4).
Baldessari, B., G. B. Tranquilli and J. Weber, 1983. «Intrinsic Inference:
      Review of the Related Literature» (technical report) Dept. of Statistics,
      University of Arizona.
Baldessari, B. and J. Weber, 1986. «Statistical Models, Intrinsic Dependence
      and Intrinsic Inference» Metron, 44.
Besag, J., 1974. «Spatial Interaction and the Statistical Analysis of Lattice
      Systems» (with discussion) Journal of the Royal Statistical Society, 36.
Cressie, N., 1991. Statistics for Spatial Data. J. Wiley & Sons, New York.
Dobruschin, R. L., 1970. «Prescribing a System of Random Variables by
      Conditional Distributions» (tr. by A. R. Kraiman) Theory Prob. Appl.,
      15 (3).
Lagona, F., 1992. «Parametrizzazione di informazioni geografiche in processi
      gaussiani. Un'ipotesi di covarianza uniforme» Graduation thesis at the
      Rome University "La Sapienza", supervisor F. Gallo (in Italian).
Magnus, J. R. and H. Neudecker, 1988. Matrix Differential Calculus with
      Applications in Statistics and Econometrics. J. Wiley & Sons, New York.
Whittaker, J., 1990. Graphical Models in Applied Multivariate Statistics.
      J. Wiley & Sons, New York.

CLUSTER IDENTIFICATION IN A SIGNED GRAPH BY EIGENVALUE ANALYSIS

A. BELLACICCO
Dipartimento di Teoria dei Sistemi e delle Organizzazioni
Università di Teramo

V. TULLI
Dipartimento di Metodi Quantitativi, Facoltà di Economia
Università di Brescia

In recent years a number of clustering algorithms have been presented. Clustering
algorithms can be characterized by a set of choices regarding the constraints
and the objective function. Generally speaking, we can embed any clustering
algorithm in the wide framework of graph transformations, in terms of cuts and
insertions of arcs or edges, in order to obtain a given topology of the subgraphs,
like cliques (complete subgraphs), circuits, arborescences and so on. The shape
of a cluster may be defined in terms of the corresponding topology and can
therefore be characterized by the highest eigenvalue of the matrix associated to
the graph. In this paper we consider the characterization of clustering
algorithms on graphs in terms of eigenvalues. Real eigenvalues are related to
balanced subgraphs, where the notion of balanced graph will be introduced.

1 Introduction

Cluster analysis can be considered as an optimization algorithm able to produce
a partition (or covering) of a graph G representing the relations between
couples of units, under given constraints.

A usual way of treating a set of units described by a set of variables is to
consider a distance (dissimilarity) between couples of units and to set up a
graph, that is, a set of relationships between couples of units weighted by the
strength of the relationship, which can be a distance function. The requirements
on the weights can be relaxed, allowing non-negative functions which are
symmetric, like dissimilarity and interaction functions. As is well known, a
distance function should satisfy the previous requirements and the so-called
triangular inequality. We can generalize the previous notion of weight by
considering the whole set of real numbers, both positive and negative, including
the case of zero weight. A common interpretation of these weights (when
integers) is the flow of individuals from one place to another in spatial
systems.

A number of clustering algorithms have been presented in recent years, and a
common definition considers them as unsupervised classifiers (Bellacicco and
Labella, 1979; Bellacicco and Tulli, 1995). It is usual to characterize a clustering


algorithm by a set of choices regarding the constraints and the objective
function. Generally speaking, we can embed any clustering algorithm in the wide
framework of graph transformations, in terms of cuts and insertions of arcs or
edges, in order to obtain a given topology of the subgraphs, like cliques (i.e.
complete subgraphs), circuits, arborescences and so on.

In Bellacicco and Corbo (1982) and in Baldassarre and Bellacicco (1986) we
introduced the notion of the shape of a cluster in terms of statistical
constraints on each cluster.

In this paper we consider explicitly a graph G(S, X), where S is a finite set
of vertices and X ⊆ S × S is a finite set of arcs, each labelled with a spin
value, namely +1, 0 or -1, where 0 means the absence of the corresponding arc.

As weights, spin values on the arcs are very common and useful in
thermodynamics, in ferromagnetism and in sociograms. In the latter case the
value +1 means, e.g., that an individual in a classroom has a positive
preference for the other individual joined by the arc, the value -1 meaning a
repulsive preference and the value 0 indifference (Phillips, 1967; Roberts,
1976). The eigenanalysis of the matrices of weights (-1, 0, +1) is interesting
insofar as the main eigenvalue can give us information about the graph
topology, that is, the set of relationships among a set of individuals.

The shape of a cluster may be defined in terms of the corresponding topology
and can therefore be characterized by the highest eigenvalue of the matrix
associated to the graph.

In the following sections we consider the characterization of clustering
algorithms on graphs in terms of eigenvalues. Real eigenvalues are related to
balanced subgraphs; the notion of balanced graph is introduced in the next
section.

2 Clusters and balanced graphs

Let us consider a graph G(S, X), where S is a finite set of nodes representing
units, and X ⊆ S × S a finite set of arcs joining couples of nodes, each
labelled with a spin value +1, 0 or -1. We may consider both symmetric and
antisymmetric graphs. In particular, a clique Kh,h is a symmetric complete
graph whose cardinality is h.

We can consider the notion of balanced graph in terms of a distance based on
the number of common neighbours of two nodes. We first recall some concepts
from graph theory needed to define the notion of balanced graph.
235

Definition 1 The neighbourhood Γ(si) of a node si of G is the set of all nodes
of G linked by an arc to the node si.

Definition 2 The dissimilarity d(si, sj) = dij between two nodes si and sj of
G(S, X) is

    dij = ni + nj - 2nij                                    (1)

where ni is the number of nodes connected by an arc to si and nij is the number
of nodes connected simultaneously, by an arc, to both si and sj.

The measure of dissimilarity introduced by (1) can be easily understood if we
compare two strings composed of -1, 0 and +1: by Definition 1 we consider the
neighbourhoods Γ(si) and Γ(sj), and therefore the vectors representing each
neighbourhood. The distance between two nodes of the graph is then the
Euclidean distance, without the final square root, between two vectors whose
elements are -1, 0 and +1. This observation supports the interpretation of (1)
as a distance between two nodes, whose properties are the well-known distance
properties. A more general interpretation can be given in terms of graph
theory: the distance between two units is the number of units belonging to each
neighbourhood minus twice the number of elements belonging to both
simultaneously. This way of reasoning is quite common in set theory whenever
the distance between two sets must be evaluated.
It is easy to see that dij ≥ 0 and that dij = dji for all i, j. The first
property is assured by simple set-theoretical arguments, while the symmetry
follows from the definition of the number nij = nji.
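As an illustration, the dissimilarity of Definition 2 can be computed directly
from neighbourhood sets; the function name and the dictionary-of-sets
representation below are our own, not the paper's notation:

```python
def dissimilarity(adj, i, j):
    """Definition 2: d_ij = n_i + n_j - 2*n_ij, where adj maps each node
    to the set of nodes joined to it by an arc (its neighbourhood)."""
    ni, nj = adj[i], adj[j]
    return len(ni) + len(nj) - 2 * len(ni & nj)

# A triangle: every couple of nodes has d_ij = 2, so the dissimilarity is
# constant over all couples, i.e. the graph is balanced (Definition 3).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(dissimilarity(triangle, 0, 1))  # -> 2
```

Non-negativity and symmetry hold by construction, since the set sizes satisfy
nij = nji and nij ≤ min(ni, nj).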
A balanced graph, labelled BG, can be defined as follows:

Definition 3 A graph G(S, X) is a BG if and only if dij is equal for every
couple of nodes of G.

It is easy to see that a clique Kh,h is a balanced graph.

Moreover, there are balanced graphs which are not cliques, like, for instance,
any cycle whose size is greater than 3.

We can introduce the definition of a cluster in terms of balanced graphs.

Definition 4 A cluster G'(S, X) is a subgraph of G(S, X) which satisfies the
properties of a balanced graph.

Balanced graphs are studied in psychology, and a simple example can be given by
considering groups of three people, where each individual may either like or
dislike each other. A general definition of a balanced subgraph is related to
graph circuits (cycles): a small group is balanced if each circuit is positive.
This definition implies that we can characterize a cluster of units, whether or
not they share some general property, through the strength of the
relationships, both for signed graphs and for the structure of the matrix of
the values associated to the arcs.
A first generalization may be obtained if we consider cliques rather than
circuits, and the characterization of a cluster may be generalized without
considering an amount of homogeneity to be maximized, but limiting ourselves to
the shape of the signed subgraph, in terms of properties to be satisfied.
We can now consider spin values on the arcs of a graph G(S, X) and review the
notions of cliques and balanced graphs. We denote a graph with spin values on
the arcs by G*(S, X).

Definition 5 A clique of a spin graph G*(S, X) is a clique whose arcs are all
valued +1.

We call a clique as previously defined a spin clique Kh,h. We have the
following statement:
Proposition 1 The dissimilarity dij still works on spin graphs G*(S, X).

The truth of the previous statement on spin cliques can be easily checked if we
consider only the arcs with non-negative values.

In order to generalize (1) we can consider the adjacency matrix whose entries
are the values of the arcs connecting couples of nodes of G*(S, X). The
adjacency matrix M(G*) can be associated to the graph considering ni+ the
number of arcs ingoing or outgoing from si with value +1, ni- the number of
arcs ingoing or outgoing from si with value -1, and nij the number of arcs
connecting the common neighbours of the couple of nodes. The arcs valued 0 give
a null contribution.
We introduce the new terms di, dj, dij:

    dij = ni+ + ni- + nj+ + nj- - 2nij+ - 2nij-             (2)

where

    di = ni+ + ni-,    dj = nj+ + nj-                       (3)

thus

    nij = nij+ + nij-   and   dij = di + dj - 2nij          (4)

Thus (2) is the distance formula.

In the case of a spin clique Kh,h it is obvious that dij satisfies the same
conditions defining the notion of a dissimilarity index. We can also state that
in a clique both (1) and (2) are distance functions, whereas in the other cases
the triangular inequality does not hold. We can generalize the definition of a
balanced graph through the previous generalization of the adjacency matrix
M(G*).
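Since the printed form of equation (2) is partly garbled, the sketch below
should be read as our interpretation of it: each node contributes its counts of
positive and negative arcs, and twice the shared neighbours are subtracted.
The function name and set representation are ours:

```python
def spin_dissimilarity(pos, neg, i, j):
    """Our reading of equation (2):
    d_ij = n_i+ + n_i- + n_j+ + n_j- - 2*n_ij, where pos[i] / neg[i] are
    the sets of nodes joined to s_i by +1 / -1 arcs and n_ij counts the
    neighbours (of either sign) common to s_i and s_j."""
    all_i, all_j = pos[i] | neg[i], pos[j] | neg[j]
    return (len(pos[i]) + len(neg[i]) + len(pos[j]) + len(neg[j])
            - 2 * len(all_i & all_j))

# On a spin clique (all arcs +1) this reduces to Definition 2: d_ij = 2.
pos = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
neg = {0: set(), 1: set(), 2: set()}
print(spin_dissimilarity(pos, neg, 0, 1))  # -> 2
```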

Proposition 2 Balanced subgraphs with spin values on the arcs have a constant
dissimilarity measure dij for every couple of nodes.

The previous statement can be verified by considering the matrix M(G*).

We did not distinguish between ingoing and outgoing arcs, and equation (2)
implies that the degrees of the nodes should have the same algebraic summation
of the arcs. Given any graph, a balanced subgraph may be obtained by
introducing a threshold function for each node, supposed to be satisfied for
every node of G*(S, X). This notion of threshold function is quite new in graph
theory, and the problem is to characterize each cluster in terms of balanced
subgraphs and of an index strictly dependent on the degree of balance of
G*(S, X) (Roberts, 1976).

In the next section we consider the eigenvalues associated to a graph G*(S, X),
and we will see that balanced graphs BG have real eigenvalues while unbalanced
graphs may have complex eigenvalues. The transition from balanced to unbalanced
graphs can be viewed as a sudden transition in terms of eigenvalues: the
passage from real to complex eigenvalues may depend on a small change of the
values associated to the arcs.
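The real-to-complex transition can be checked numerically. In this small sketch
(the matrices are our own illustrative examples, using numpy, not the ones in
Section 4) a symmetric positive triangle has a purely real spectrum, while a
directed 3-cycle carrying a single -1 arc has a negative real eigenvalue plus a
complex conjugate pair:

```python
import numpy as np

# Positive triangle, symmetric spin matrix: eigenvalues 2, -1, -1 (real).
balanced = np.array([[0., 1., 1.],
                     [1., 0., 1.],
                     [1., 1., 0.]])

# Directed 3-cycle with one -1 arc: an asymmetric spin matrix whose
# characteristic polynomial is t^3 + 1, so the spectrum is -1 and a
# complex conjugate pair.
unbalanced = np.array([[0., 1., 0.],
                       [0., 0., 1.],
                       [-1., 0., 0.]])

print(np.linalg.eigvals(balanced))    # all real
print(np.linalg.eigvals(unbalanced))  # complex pair appears
```

Flipping a single arc weight is thus enough to push part of the spectrum off
the real axis, which is the "sudden transition" described above.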

3 Eigenvalue analysis of clusters in a graph

In the previous section we considered as a general definition of a cluster a
balanced subgraph BG of a spin graph G*(S, X). With this approach, we consider
as a cluster a subgraph owning some topological properties, without any
explicit reference to underlying optimization functions. In other words, the
definition of a cluster is based on the cluster shape in terms of its
topological properties (Roberts, 1976).

The optimization problem can be reduced to the search for a minimal set of cuts
such that we obtain either a partition or a covering by a set of cliques.
Following the previous definitions, the matrix M(G*) is decomposed into a set
of submatrices which satisfy the condition:
    dij = c  for every couple of nodes                      (5)
In this case we can define the disequilibrium of a spin graph G*(S, X) as

    D(G*) = w1 + w2 + ... + wk                              (6)

where k is the number of cliques able to represent the graph partitioning.
Since each clique can be decomposed into triangular cliques, all having w = 2,
we have:

    D(G*) = 2k                                              (7)

The disequilibrium D(G*) can be easily evaluated for a partitioning of G*(S, X)
into k balanced graphs. We have indeed:

    D(G*) = ck                                              (8)

As long as D(G*) is larger than ck, we have fewer than k clusters and the
problem is to isolate the clusters, if they exist.

We now introduce a concept which is quite new in the area of cluster analysis.

Besides the notion of cluster in terms of a balanced subgraph of a spin graph
G*(S, X), we consider the notion of an inactive subgraph G0, where

    mij = -mji  for every couple of nodes                   (9)

In this case the matrix M(G0) is asymmetric, because for each couple of nodes
we have a couple of arcs with opposite values.

The three notions introduced in this paper can be considered as three different
types of graphs. Considering the associated adjacency matrix, a sharp partition
can be made between subgraphs whose associated matrix has real eigenvalues and
subgraphs having complex eigenvalues.

Indeed, a symmetric matrix M(G*) whose elements are non-negative has all real
eigenvalues and, in particular, the largest one is positive. As a consequence:

Theorem 1 The first eigenvalue of the adjacency matrix of a clique is positive.

Proof In fact, for every clique of a spin graph equation (5) holds and the
symmetry of the matrix associated to the clique is guaranteed. □

It is easy to see that in a clique the neighbourhoods of all nodes are equal,
and therefore ni = nj; from (1),

    dij = 2ni - 2nij = 2nj - 2nij

and, since nij = ni = nj, as a consequence dij = 0. In case a clique is
embedded in a graph, we have to add the other nodes of the graph which are
neighbours of each node. For a graph which is decomposable into a set of
cliques, it is easy to see that dij has a constant value.

To both unbalanced subgraphs and inactive subgraphs of a spin graph G*(S, X)
asymmetric matrices are associated, and the main eigenvalue can be either
negative or complex.
Some problems come up when there is just one arc connecting a couple of nodes;
in these cases we can have ambiguous results. In the next section

we will consider some experiments on triangles, which are the smallest cliques.
Recalling that every clique larger than a triangle can be reduced through a
covering by triangles, the problem to face is to consider the previous
equations when we have a superposition of triangles in a larger clique. We
state the following theorem:

Theorem 2 Equation (5) holds when a clique Kh,h is covered by a set of
triangles K3,3.

Proof Let us consider a clique Kh,h and the set of cliques K3,3 which cover
Kh,h. For each couple of nodes of each clique K3,3, equation (2) holds. The
numbers di and dj are equal, respectively, to ni+ and nj+, and ni+ = nj+ = 4 =
gi = gj, where gi is the degree of the node si and gj the degree of the node
sj. The degree of each vertex is equal to 2h - 2, where h is the size of the
clique Kh,h and the loop associated to the vertex is not considered. The number
nij is equal to 4h - 2. We easily get c = 2. When we have two or more cliques
with the same couple of vertices in common, we have to sum up the new degrees
and to consider the new number nij. The balance of equation (5) is untouched by
adding new cliques covering the clique Kh,h. The same arguments hold both for
balanced spin subgraphs and for inactive spin subgraphs. □

4 Examples and conclusions

We consider here only three matrices, which suffice to show the truth of the
previous statements. The graph types and their largest eigenvalues are the
following:

TYPE           MAX EIGENVALUE
Clique              3
Balanced            3
Unbalanced       -0.324

It is easy to build up the set of all matrices from the set of all graphs
composed of triangles and to evaluate their largest eigenvalues; the examples
shown are samples of this set. Balanced graphs can be built like, for instance,
circuits and cycles.

The examples considered give a simple idea of the role of the maximum
eigenvalue corresponding to a clique, a balanced graph or an unbalanced graph,
respectively. In the case of an unbalanced graph the maximum eigenvalue is
negative, while for a clique and for a balanced graph the maximum eigenvalue is
a positive number.

We can observe that the clique and the balanced graph have the same maximum
eigenvalue in spite of their different sizes. The common feature is the
presence of a set of circuits whose arcs are all valued with the same
non-negative weight. In the unbalanced case the presence of a single weight -1
is sufficient for a sudden change of the eigenvalue.

We may easily interpret the previous results and identify a clique as a
balanced graph. Moreover, the value of the maximum eigenvalue can indicate
whether the corresponding subgraph is a balanced graph. The identification of
the clusters in a graph implies the search for the square submatrices having
positive max/min eigenvalues, and therefore the search may be reduced to a
sequence of reorderings of rows and columns of the matrix associated to the
graph and to a sequence of cuts of edges (arcs) in order to isolate the
balanced graphs.
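A brute-force version of this submatrix search can be sketched as follows; this
is our own illustration of the spectral criterion, not the paper's algorithm,
and the example matrix is hypothetical:

```python
import numpy as np
from itertools import combinations

def balanced_submatrices(M, k):
    """Return the k-node principal submatrices of a spin matrix M whose
    spectrum is real with a positive largest eigenvalue, the balance
    indicator discussed above."""
    hits = []
    for nodes in combinations(range(M.shape[0]), k):
        ev = np.linalg.eigvals(M[np.ix_(nodes, nodes)])
        ev = np.asarray(ev, dtype=complex)
        if np.allclose(ev.imag, 0) and ev.real.max() > 0:
            hits.append(nodes)
    return hits

# Block matrix: a positive triangle (nodes 0-2) next to a signed directed
# 3-cycle (nodes 3-5); only the triangle passes the balance test.
M = np.zeros((6, 6))
M[:3, :3] = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
M[3:, 3:] = [[0, 1, 0], [0, 0, 1], [-1, 0, 0]]
print((0, 1, 2) in balanced_submatrices(M, 3))  # True
print((3, 4, 5) in balanced_submatrices(M, 3))  # False
```

Note that disconnected vertex subsets can also pass this purely spectral test,
so in practice the search would be combined with the row/column reorderings and
edge cuts mentioned above.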
From the point of view of the interpretation of a balanced graph, we can see
that the absence of negative values, besides the symmetry of the matrix, is a
guarantee of the homogeneity of the subgraph elements, which is a basic feature
of a cluster.

Cluster identification through balanced subgraphs may consider both circuits
and cliques, as well as other types of graphs, like balanced graphs, which may
be interpreted as a homogeneous group of units, like individuals in a sociogram
or elementary physical units in ferromagnetism. In a sociogram, a balanced
graph means that all the individuals show a preference relationship toward the
other individuals, and the topology of the graph is inessential. The presence
of matrix elements with a value equal to zero forbids balance, as we can see
from the previous examples. Symmetry of the matrix associated to a graph is an
essential feature, besides the absence of elements whose value is -1. Such good
features can help in the search for submatrices, within a matrix associated to
a graph, which satisfy the previous requirements. In this paper we limited our
observation to the simple fact that, in order to search for suitable
submatrices, a sequence of cuts of the edges (arcs) with a negative weight is
necessary. Our analysis was limited to the evaluation of the maximum
eigenvalue, which can be described as a global index of balance.

This approach overcomes the usual way of treating the clustering problem,
mainly because we do not limit the study to non-negative weights on the arcs, a
usual constraint in cluster analysis. The dissimilarity relationship is
generalized in terms of spin (spatial interaction) between couples of units in
a graph.

Acknowledgments

Research supported by 40% "Analisi dei dati spaziali".

References

Baldassarre, B. and A. Bellacicco, 1986. «Identification of Linear Regression
Models by a Clustering Algorithm» COMPSTAT 1986, Physica-Verlag, Wien.

Bellacicco, A. and P. Corbo, 1982. «Exponential type clustering algorithm
RANDEX» Metron, XL, 2.

Bellacicco, A. and A. Labella, 1979. Le strutture matematiche dei dati.
Feltrinelli, Milano.

Bellacicco, A. and V. Tulli, 1995. «Clustering dinamico su grafi orientati con
segno: l'algoritmo CLUSDIN». Submitted for publication.

Phillips, J. L., 1967. «A model for cognitive balance». Psychological Review,
34.

Roberts, F. S., 1976. Discrete Mathematical Models. Prentice-Hall, Englewood
Cliffs, New Jersey.
